# sample function in R: The Complete Guide

Statisticians often need to draw samples from a dataset and then compute statistics on them. A sample is nothing more than a subset of the data, and R makes drawing one easy with the sample() function.

## sample function in R

sample() is a built-in R function that draws a sample of the specified size from the input elements, either with or without replacement. It takes four arguments: the data to sample from, size, replace, and prob.

By default, the sample() function randomly reorders the elements passed as the first argument. This is because the default size is the length of the given vector and the default for replace is FALSE.

### Syntax

``sample(x, size, replace = FALSE, prob = NULL)``

### Parameters

x: Either a vector of one or more elements from which to choose, or a positive integer. The examples in this guide store it in a variable named data.

n: A positive number, the number of items to choose from. It is used by the related sample.int() function rather than by sample() itself.

size: A non-negative integer giving the number of items to choose.

replace: Whether sampling should be done with replacement. The default is FALSE.

prob: A vector of probability weights for obtaining the elements of the vector being sampled.
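As a minimal sketch of how these arguments fit together, the call below names each one explicitly. Base R calls the first argument x; the vector name `pool` and the weights are illustrative assumptions, not part of the original examples.

```r
# Illustrative vector to sample from (the name `pool` is an assumption for this sketch)
pool <- c("a", "b", "c", "d")

# Draw 3 items with replacement, weighting "a" more heavily than the rest
draw <- sample(x = pool, size = 3, replace = TRUE, prob = c(0.4, 0.2, 0.2, 0.2))
print(draw)
```

Naming the arguments is optional, but it makes the role of each value easier to read.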

### Example

Let’s define a numeric vector using the colon operator (:) and sample 5 values from it.

``````data <- 1:20
sample(data, 5, replace = FALSE, prob = NULL)``````

#### Output

``[1] 17  6 13 11 19``

In this example, we create a vector with 20 values. We then call sample() with the data vector and a size of 5, so it picks five random elements from the vector and returns them.

For sample, the default for size is the number of items inferred from the first argument so that sample(x) generates a random permutation of the elements of x (or 1:x).
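The permutation behaviour can be checked with a short sketch. The vector `x` below is an illustrative assumption; the point is only that the result has the same elements in a possibly different order.

```r
x <- c(10, 20, 30, 40, 50)

# With no size given, sample() returns all elements of x in random order
shuffled <- sample(x)
print(shuffled)

# Same elements, same length: only the order may differ
print(setequal(shuffled, x))
print(length(shuffled) == length(x))
```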

If replace is FALSE and prob is supplied, the probabilities are applied sequentially; that is, the probability of choosing the next element is proportional to the weights amongst the remaining items.
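To make the sequential weighting concrete, here is a small sketch. The item names, weights, seed, and repetition count are all illustrative assumptions. It repeatedly draws two of three items without replacement and measures how often the heavily weighted item is picked first.

```r
set.seed(2024)  # arbitrary seed so the sketch is repeatable
items <- c("a", "b", "c")
weights <- c(0.8, 0.1, 0.1)

# Draw 2 of the 3 items without replacement, 2000 times, keeping the first pick
first_picks <- replicate(
  2000,
  sample(items, size = 2, replace = FALSE, prob = weights)[1]
)

# The first pick follows the original weights, so "a" should lead roughly 80% of the time
share_a_first <- mean(first_picks == "a")
print(share_a_first)

# A single draw never repeats an item when replace = FALSE
one_draw <- sample(items, size = 2, replace = FALSE, prob = weights)
print(anyDuplicated(one_draw) == 0)
```

Once an item has been drawn, the next pick is made among the remaining items in proportion to their weights.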

If the first argument has length 1, is numeric, and is >= 1, sampling takes place from 1:x, where x is that single value.
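This rule can be surprising when a vector happens to contain a single number. The sketch below shows the behaviour and one possible workaround that samples by position; the variable names are illustrative assumptions.

```r
# A single number n >= 1 is treated as the range 1:n
draw_from_range <- sample(10, size = 3)
print(draw_from_range)  # three distinct values between 1 and 10

# Gotcha: a length-one numeric vector is NOT sampled as a single element
candidates <- c(10)
print(sample(candidates, size = 1))  # any value in 1:10, not necessarily 10

# One way around it: sample a position, then index into the vector
safe_pick <- candidates[sample.int(length(candidates), size = 1)]
print(safe_pick)  # always 10
```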

## Using replace = TRUE in the sample() function

To simulate 12 rolls of a die, call sample() with a size of 12 and replace = TRUE. A die has only 6 different numbers, so drawing 12 values is possible only if numbers are allowed to repeat.

See the following code.

``````data <- 1:6
sample(data, 12, replace = TRUE)``````

#### Output

``[1] 4 3 5 1 2 2 2 3 6 6 5 6``

You can see that some numbers appear three times, some twice, and some only once.
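A short sketch can show both why replace = TRUE is needed here and how to count the repeats. The variable names are illustrative assumptions, and the error text in the handler is a placeholder string rather than R's own message.

```r
die <- 1:6

# Without replacement, asking for more draws than there are faces is an error
result <- tryCatch(
  sample(die, 12, replace = FALSE),
  error = function(e) "error: sample larger than population"
)
print(result)

# With replacement, 12 rolls are fine; tabulate how often each face appeared
rolls <- sample(die, 12, replace = TRUE)
counts <- table(factor(rolls, levels = die))  # factor() keeps faces that never came up
print(counts)
```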

Because sample() draws its values at random, calling it repeatedly gives different results each time.

``````➜ R RScript Pro.R
[1] 4 3 5 1 2 2 2 3 6 6 5 6
➜ R RScript Pro.R
[1] 1 2 6 5 6 6 3 1 5 5 6 6
➜ R RScript Pro.R
[1] 4 1 1 6 1 6 1 3 5 5 5 2
➜ R RScript Pro.R
[1] 2 2 4 5 6 1 1 3 3 5 1 5
➜ R RScript Pro.R
[1] 3 4 3 1 1 2 1 5 6 3 3 2``````

Every run of the program produces a different output.
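When repeatable results are needed, for example in a report or a test, one common approach is to fix the random seed with set.seed() before sampling. The sketch below assumes an arbitrary seed value of 42; any integer works.

```r
set.seed(42)  # 42 is an arbitrary seed chosen for this sketch
first_run <- sample(1:6, 12, replace = TRUE)

set.seed(42)  # resetting the same seed replays the same random draws
second_run <- sample(1:6, 12, replace = TRUE)

print(identical(first_run, second_run))  # TRUE
```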

## Random Subsampling of Data using sample() function

A common use of the sample() function is random subsampling of data. First, let’s subsample a vector by drawing 10 of its 20 values without replacement.

``````rv <- 1:20

sample(rv, size = 10)``````

#### Output

``[1] 16  8 11 20 19 10  4 17  2 12``

## Generating a Sample from a Dataset

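The sample() function can generate random row indices, which you can then use to pick sample rows from a dataset. Use nrow() to get the number of rows, because length() on a data frame returns the number of columns.

``````len <- nrow(mtcars)  # number of rows; length() would return the column count
sample_rows <- sample(len, 10)
print(sample_rows)``````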

#### Output

``[1]  8  1  6  9  2 10 11  3  5  7``
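The indices on their own are only row numbers. A minimal sketch of using them to pull the matching rows out of mtcars (a dataset that ships with R) might look like this. The names `row_ids` and `mtcars_sample` are illustrative assumptions.

```r
# nrow() gives the number of rows; length() on a data frame gives the column count
row_ids <- sample(nrow(mtcars), 10)

# Use the sampled indices to select rows, keeping every column
mtcars_sample <- mtcars[row_ids, ]
print(dim(mtcars_sample))  # 10 11
```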

## Sampling with uneven probabilities using sample() function

To change the probabilities of the random selection, pass the prob argument to the sample() function. The weights do not need to sum to one; R rescales them.

``````rv <- 1:11

sample(rv, size = 10, replace = TRUE, prob = c(0.6, rep(0.1, 10)))``````

#### Output

``[1]  1 11  1  1  5  3 10  1  1  2``
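Because the weights in this example add up to 1.6 rather than 1, the effective probability of drawing 1 is 0.6 / 1.6 = 0.375, not 0.6. The sketch below checks this empirically; the seed and the sample size of 10000 are arbitrary assumptions.

```r
set.seed(123)  # arbitrary seed so the sketch is repeatable
rv <- 1:11
weights <- c(0.6, rep(0.1, 10))

# Draw many values so the observed frequencies settle near the rescaled weights
draws <- sample(rv, size = 10000, replace = TRUE, prob = weights)

# Share of draws equal to 1, compared with the rescaled weight
share_of_one <- mean(draws == 1)
expected_share <- weights[1] / sum(weights)
print(share_of_one)
print(expected_share)  # 0.375
```

If you want 1 to come up 60% of the time, choose weights that sum to one, for example 0.6 for the first element and 0.04 for each of the other ten.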

## Random sampling of list elements using sample() function

You can also use the sample() function to pick random elements from a list in R by sampling the list's indices.

``````lst <- list(
1:5,
833,
c("K", "LLL", "Ouija"),
"Board",
5
)
len_list <- length(lst)
list_samp <- lst[sample(len_list, size = 3)]
list_samp``````

#### Output

``````[[1]]
[1] 1 2 3 4 5

[[2]]
[1] "K"     "LLL"   "Ouija"

[[3]]
[1] 833``````

## Random Sampling of data frame rows

To extract a random subset of rows from a data frame in R, sample the row indices with sample() and use them to index the data frame.

``````df <- data.frame(a1 = 1:10,
a2 = letters[1:10],
a3 = letters[1:10],
a4 = letters[1:10],
a5 = letters[1:10],
a6 = letters[1:10],
a7 = letters[1:10],
a8 = letters[1:10],
a9 = letters[1:10],
a10 = letters[1:10])

df_len <- nrow(df)  # number of rows; length(df) would return the column count

df_sample <- df[sample(seq_len(df_len), size = 3), ]

df_sample``````

#### Output

``````   a1  a2  a3  a4  a5  a6  a7  a8  a9  a10
8  8   h   h   h   h   h   h   h   h    h
1  1   a   a   a   a   a   a   a   a    a
10 10  j   j   j   j   j   j   j   j    j``````

That is it for the sample() function in R.