Statisticians often need to draw a sample from a dataset and then compute statistics on it. Drawing a sample is easy in R with the sample() function, because a sample is nothing more than a subset of the data.

## sample function in R

The sample() function in R is a built-in function that draws a sample of a specified size from the input elements, either with or without replacement. It takes **data**, **size**, **replace**, and **prob** as arguments (R's own documentation names the first argument x).

By default, the sample() function randomly reorders the elements passed as the first argument: the default **size** is the length of the given vector, and the default for replacement is **replace = FALSE**.

**Syntax**

`sample(data, size, replace = FALSE, prob = NULL)`

**Parameters**

**data:** It is either a vector of one or more elements from which to choose or a positive integer.

**n:** It is a positive number, the number of items to choose from. This parameter belongs to the related sample.int() function rather than to sample() itself.

**size:** It is a non-negative integer giving the number of items to choose.

**replace:** It is a logical value: should sampling be with replacement?

**prob:** It is a vector of probability weights for obtaining the elements of the vector being sampled.

**Example**

Let’s define a numeric vector using the colon operator (:) and sample five values from it.

```
data <- 1:20
sample(data, 5, replace = FALSE, prob = NULL)
```

**Output**

`[1] 17 6 13 11 19`

In this example, we create a vector with 20 values. We then call sample() with the data vector and a size of 5, so it picks five random elements from the vector and returns them.

For sample(), the default for size is the number of items inferred from the first argument, so sample(x) generates a random permutation of the elements of x (or of 1:x when x is a single number).
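The default behaviour can be checked with a short sketch. The vector `x` and the seed value are illustrative choices; the seed is there only to make the run repeatable.

```r
set.seed(42)                 # illustrative seed, only for repeatability
x <- c(10, 20, 30, 40, 50)

perm <- sample(x)            # size defaults to length(x): a random permutation
perm_int <- sample(5)        # a single number n is treated as 1:n

# Both results contain every original element exactly once.
setequal(perm, x)            # TRUE
setequal(perm_int, 1:5)      # TRUE
```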

If **replace** is **FALSE** and **prob** is supplied, the probabilities are applied sequentially; that is, the probability of choosing the next element is proportional to the weights amongst the remaining items.
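A small sketch of weighted sampling without replacement may make this concrete. The names `items` and `weights` and the weight values are assumptions made for illustration.

```r
set.seed(1)                          # illustrative seed
items <- c("a", "b", "c", "d")
weights <- c(0.7, 0.1, 0.1, 0.1)     # hypothetical weights

# Each draw removes the chosen item; the remaining weights are rescaled.
# If "a" happens to be drawn first, then "b", "c" and "d" each have
# probability 0.1 / 0.3 = 1/3 on the second draw.
draw <- sample(items, size = 3, replace = FALSE, prob = weights)
draw
```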

If the data vector has length 1, is numeric, and data >= 1, sample() draws from 1:data rather than from the single value.
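This rule can surprise you when a vector shrinks to a single number. The sketch below shows the effect together with a small wrapper around sample.int() that always samples from the elements themselves; `resample` is a helper name defined here, not a built-in function.

```r
set.seed(7)                          # illustrative seed
x <- 10
surprise <- sample(x, 3)             # draws from 1:10, not from the value 10

# Helper defined for this example: index into x with sample.int(),
# so a length-1 vector is never expanded to 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

safe <- resample(x, 3, replace = TRUE)
safe                                 # 10 10 10
```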

**The replace = TRUE in sample() function**

To simulate 12 rolls of a die, call the sample() function with a size of 12 and **replace = TRUE**. Because a die has only six different faces, some numbers must repeat.

See the following code.

```
data <- 1:6
sample(data, 12, replace = TRUE)
```

**Output**

` [1] 4 3 5 1 2 2 2 3 6 6 5 6`

You can see that some numbers appear three times, some twice, and some only once.

Because the values returned by the sample() function are chosen at random, calling it repeatedly gives different results each time.
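Rather than counting repeats by eye, you can tabulate them. This is a minimal sketch using table(); the names `rolls` and `counts` are illustrative.

```r
set.seed(123)                                   # illustrative seed
rolls <- sample(1:6, 12, replace = TRUE)

# Converting to a factor with all six levels keeps faces that never appeared.
counts <- table(factor(rolls, levels = 1:6))
print(counts)
```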

```
➜ R RScript Pro.R
[1] 4 3 5 1 2 2 2 3 6 6 5 6
➜ R RScript Pro.R
[1] 1 2 6 5 6 6 3 1 5 5 6 6
➜ R RScript Pro.R
[1] 4 1 1 6 1 6 1 3 5 5 5 2
➜ R RScript Pro.R
[1] 2 2 4 5 6 1 1 3 3 5 1 5
➜ R RScript Pro.R
[1] 3 4 3 1 1 2 1 5 6 3 3 2
```

Every time we run the program, we get a different output.
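When you need the same "random" result on every run, for example in a report or a test, you can fix the random number generator's seed with set.seed() before sampling. The seed value 2024 below is arbitrary.

```r
set.seed(2024)                                # arbitrary seed value
first <- sample(1:6, 12, replace = TRUE)

set.seed(2024)                                # reset to the same seed
second <- sample(1:6, 12, replace = TRUE)

identical(first, second)                      # TRUE
```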

**Random Subsampling of Data using sample() function**

The most common use of the sample() function is random subsampling of data. First, let’s subsample a vector.

```
rv <- 1:20
sample(rv, size = 10)
```

**Output**

`[1] 16 8 11 20 19 10 4 17 2 12`

**Generating a Sample from a Dataset**

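The **sample()** function can generate random row indices for a dataset. Use nrow() to count the rows; length() applied to a data frame returns the number of columns instead.

```
len <- nrow(mtcars)
sample_rows <- sample(len, 10)
print(sample_rows)
```

**Output**

` [1] 8 1 6 9 2 10 11 3 5 7`

The indices are drawn from 1 to 32, the number of rows in mtcars, and change on every run.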
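The indices become a sample of the dataset once they are used to subset its rows. A minimal sketch, where `row_ids` and `mtcars_sample` are names chosen here for illustration:

```r
set.seed(10)                            # illustrative seed
row_ids <- sample(nrow(mtcars), 10)     # 10 distinct row indices
mtcars_sample <- mtcars[row_ids, ]      # keep every column of the chosen rows
dim(mtcars_sample)                      # 10 11
```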

**Sampling with uneven probabilities using sample() function**

To change the probabilities of the random selection, pass the **prob** argument to the sample() function. The weights do not need to sum to one; sample() rescales them.

```
rv <- 1:11
sample(rv, size = 10, replace = TRUE, prob = c(0.6, rep(0.1, 10)))
```

**Output**

` [1] 1 11 1 1 5 3 10 1 1 2`
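To see what these weights mean in practice, you can compare the rescaled weights with the observed frequencies in a large sample. This is a rough empirical check, not an exact test; the sample size of 100000 and the variable names are arbitrary choices.

```r
set.seed(99)                              # illustrative seed
rv <- 1:11
w <- c(0.6, rep(0.1, 10))                 # sums to 1.6, so it gets rescaled
expected <- w / sum(w)                    # 0.375 for 1, 0.0625 for the rest

draws <- sample(rv, size = 100000, replace = TRUE, prob = w)
observed <- as.numeric(prop.table(table(factor(draws, levels = rv))))

# The two rows should agree closely for a sample this large.
round(rbind(expected, observed), 3)
```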

**Random sampling of list elements using sample() function**

You can use the sample() function to pick random elements from a list in R.

```
lst <- list(
  1:5,
  833,
  c("K", "LLL", "Ouija"),
  "Board",
  5
)
len_list <- length(lst)
# Sample 3 positions, then subset the list with them
list_samp <- lst[sample(len_list, size = 3)]
list_samp
```

**Output**

```
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] "K" "LLL" "Ouija"
[[3]]
[1] 833
```
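Since sample() subsets its first argument by position, it can also be applied to the list directly, which gives a shorter form of the same idea. A brief sketch; the names `picked` and `one` are illustrative.

```r
set.seed(5)                               # illustrative seed
lst <- list(1:5, 833, c("K", "LLL", "Ouija"), "Board", 5)

picked <- sample(lst, size = 3)           # a list holding 3 of the elements
one <- sample(lst, size = 1)[[1]]         # [[1]] unwraps the single element

length(picked)                            # 3
```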

**Random Sampling of data frame rows**

To extract a random subset of rows from a data frame in R, sample the row indices with the sample() function and use them to subset the data frame.

```
df <- data.frame(a1 = 1:10,
                 a2 = letters[1:10],
                 a3 = letters[1:10],
                 a4 = letters[1:10],
                 a5 = letters[1:10],
                 a6 = letters[1:10],
                 a7 = letters[1:10],
                 a8 = letters[1:10],
                 a9 = letters[1:10],
                 a10 = letters[1:10])
# nrow() counts rows; length() on a data frame would count columns
df_len <- nrow(df)
df_sample <- df[sample(seq_len(df_len), size = 3), ]
df_sample
```

**Output**

```
   a1 a2 a3 a4 a5 a6 a7 a8 a9 a10
8   8  h  h  h  h  h  h  h  h   h
1   1  a  a  a  a  a  a  a  a   a
10 10  j  j  j  j  j  j  j  j   j
```
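The same indexing pattern covers two related tasks: shuffling all rows and drawing rows with replacement. The sketch below uses a small hypothetical data frame; `shuffled` and `boot` are names chosen for illustration.

```r
set.seed(3)                                        # illustrative seed
df <- data.frame(id = 1:10, letter = letters[1:10])

shuffled <- df[sample(nrow(df)), ]                 # same rows, random order
boot <- df[sample(nrow(df), replace = TRUE), ]     # rows may repeat

nrow(shuffled)                                     # 10
nrow(boot)                                         # 10
```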

That is it for the sample() function in R.

Krunal Lathiya is an Information Technology Engineer by education and web developer by profession. He has worked with many back-end platforms, including Node.js, PHP, and Python. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in R Language. Krunal has written many programming blogs, which showcases his vast expertise in this field.