
R dplyr::slice() Function

The dplyr::slice() function subsets rows by their position (index) within a data frame. Pass the positions of the rows you want, and slice() returns a data frame containing only those rows.

For example, slice(3, 4) selects rows 3 and 4 by position, with no grouping involved.

The slice() function is useful when you want to select specific rows, get the top N or bottom N rows after sorting, exclude rows, pick rows within groups created by group_by(), or combine positional selection with other dplyr functions.

Syntax

slice(.data, ..., .by = NULL, .preserve = FALSE)

Parameters

Name       Value
.data      A data frame or an extension of one, such as a tibble.
...        Integer row positions. Use positive values to keep rows or negative values to drop them; expressions such as n() are allowed.
.by        (Optional) Columns to group by for this operation only, without leaving the result grouped.
.preserve  Applies to grouped input. If FALSE (the default), the grouping structure is recalculated from the resulting rows; if TRUE, it is kept as is.
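To illustrate the .by argument described above, here is a minimal sketch that takes the first row per subject with and without persistent grouping. It assumes dplyr 1.1.0 or later (where .by is available), and the scores data frame is made up for this example.

```r
library(dplyr)

# Hypothetical sample data, used only for this illustration
scores <- data.frame(
  name = c("A", "B", "C", "D"),
  subject = c("Math", "Math", "History", "History"),
  score = c(85, 90, 78, 95)
)

# Per-operation grouping: first row of each subject, result is not grouped
first_by <- scores %>% slice(1, .by = subject)

# Persistent grouping: same rows, but the result is still a grouped tibble
first_grouped <- scores %>% group_by(subject) %>% slice(1)

is_grouped_df(first_by)       # FALSE
is_grouped_df(first_grouped)  # TRUE
```

Use .by when you need the grouping for one step only, and group_by() when several later steps should work on the same groups.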

Basic row selection

Let’s select rows 3 and 4:

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

cat("Before slicing: ", "\n")
df

new_df <- df %>% slice(3, 4)

cat("-------------", "\n")
cat("After slicing: ", "\n")
cat("-------------", "\n")

new_df


Exclude Rows with Negative Indices

To exclude specific rows from the output data frame, prefix their positions with a minus sign (-). The indices must be either all positive or all negative; you cannot mix the two in one call.

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

cat("Before Dropping: ", "\n")
df

new_df <- df %>% slice(-1, -3, -5)

cat("-------------", "\n")
cat("After Dropping 1st, 3rd and 5th rows:", "\n")
cat("-------------", "\n")

new_df
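The same exclusion can be written with a single negated vector. The sketch below also shows what happens when positive and negative indices are mixed: slice() expects one sign only, so the mixed call is wrapped in tryCatch() to capture the error. The variable names dropped and mixed are chosen for this illustration.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92)
)

# A single negated vector drops the same rows as slice(-1, -3, -5)
dropped <- df %>% slice(-c(1, 3, 5))
dropped$name  # "Ankit" "Niva" "Vidisha"

# Mixing signs is rejected; tryCatch() captures the error instead of stopping
mixed <- tryCatch(
  df %>% slice(1, -2),
  error = function(e) "error"
)
```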


Slicing a range of rows

You can slice a contiguous range of rows using the colon operator (:).

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

df

new_df <- df %>% slice(2:5)

new_df


Here, slice(2:5) keeps the rows at positions 2 through 5, so the output data frame contains exactly those four rows.
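A range can also be combined with individual positions in the same call. As a small sketch, the following keeps row 1 together with rows 4 through 6; the variable name picked is chosen for this example.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92)
)

# Keep row 1 plus the range 4:6 in a single slice() call
picked <- df %>% slice(1, 4:6)
picked$name  # "Krunal" "Niva" "Hemang" "Vidisha"
```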

Dynamic row slicing

Inside slice(), the n() function returns the total number of rows in the data frame (or in the current group, when the data is grouped), so you can compute row positions dynamically.

Let’s select the last two rows using slice(), n(), and the colon operator (:).

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

df

# Usage of slice() with n() to select last two rows
new_df <- df %>% slice((n() - 1):n())

new_df


Here, n() evaluates to the number of rows, which is also the position of the last row, and (n() - 1) is the position of the second-to-last row. The range (n() - 1):n() therefore selects the last two rows, however many rows the data frame has.
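For this particular case, dplyr also offers slice_tail(), which expresses "the last n rows" directly. The sketch below compares the two approaches on the same data; the variable names with_n and with_tail are chosen for this example.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92)
)

# Last two rows computed from n()
with_n <- df %>% slice((n() - 1):n())

# Last two rows using the slice_tail() helper
with_tail <- df %>% slice_tail(n = 2)

with_tail$name  # "Hemang" "Vidisha"
```

Both calls select the same rows here. slice_tail() is usually easier to read, while n() remains useful for positions that the helpers do not cover, such as the middle row.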

Random slicing

The dplyr package provides a companion function, slice_sample(), that selects a given number of rows at random.

set.seed(123) # For reproducibility
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

df

# Randomly select two rows
new_df <- df %>% slice_sample(n = 2)

new_df


Because we passed n = 2, slice_sample() returns two rows chosen at random. Calling set.seed() beforehand makes the selection reproducible: the same seed yields the same rows on every run.
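The effect of the seed can be checked by sampling twice with the same seed and comparing the results. The sketch below also uses the prop argument of slice_sample() to request a fraction of the rows instead of a fixed count; the variable names are chosen for this example.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92)
)

# Same seed before each call, so both samples contain the same rows
set.seed(123)
first_draw <- df %>% slice_sample(n = 2)

set.seed(123)
second_draw <- df %>% slice_sample(n = 2)

identical(first_draw, second_draw)  # TRUE

# Sample half of the rows (3 of 6) instead of a fixed number
half <- df %>% slice_sample(prop = 0.5)
nrow(half)  # 3
```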

Slicing rows by group

The dplyr package also provides the group_by() function, which lets slice() pick rows within each group instead of from the whole data frame.

Our data frame contains two subjects, Math and History, so the following code returns the first row for each subject.

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

cat("Before slicing: ", "\n")
df

new_df <- df %>%
  group_by(subject) %>%
  slice(1)

cat("After slicing: ", "\n")
new_df
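Grouped slicing accepts ranges as well, and the result stays grouped until you call ungroup(). As a minimal sketch, the following takes the first two rows of each subject and then removes the grouping; the variable name first_two is chosen for this example.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History")
)

# First two rows within each subject, then drop the grouping
first_two <- df %>%
  group_by(subject) %>%
  slice(1:2) %>%
  ungroup()

nrow(first_two)           # 4
is_grouped_df(first_two)  # FALSE
```

Calling ungroup() at the end prevents later steps from operating per subject by accident.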


Combine with arrange()

If you want to sort the data before slicing, you can use dplyr’s arrange() function.

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92),
  subject = c("Math", "Math", "History", "History", "Math", "History"),
  grade = c("10th", "11th", "11th", "10th", "11th", "10th")
)

df

# Sort the data frame based on score in descending order
# and then slice first 3 rows
new_df <- df %>% arrange(desc(score)) %>% slice(1:3)

new_df


Here, arrange(desc(score)) sorts the data frame by the "score" column in descending order, and slice(1:3) then keeps the first three rows, which are the three highest scores.
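The sort-then-slice pattern can also be written with slice_max(), a dplyr helper that selects the rows with the largest values of a column. The sketch below compares both approaches; the variable names sorted_top and max_top are chosen for this example.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92)
)

# Sort descending, then keep the first three rows
sorted_top <- df %>% arrange(desc(score)) %>% slice(1:3)

# Ask directly for the three highest scores
max_top <- df %>% slice_max(score, n = 3)

# Both approaches select the same three students
setequal(sorted_top$name, max_top$name)  # TRUE
```

One difference to keep in mind: by default slice_max() keeps ties (with_ties = TRUE), so it can return more than n rows when scores are equal, whereas slice(1:3) always returns at most three.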

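The slice() function handles out-of-bounds indices gracefully: positions beyond the last row are silently ignored. For example, slice(100) on this six-row data frame returns a data frame with zero rows instead of throwing an error.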
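A short sketch of this behavior, using the same six-row data frame (the variable names none and partial are chosen for this example):

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva", "Hemang", "Vidisha"),
  score = c(85, 90, 78, 95, 80, 92)
)

# Position 100 does not exist, so the result has zero rows
none <- df %>% slice(100)
nrow(none)  # 0

# Only positions 5 and 6 exist in the range 5:10; the rest are ignored
partial <- df %>% slice(5:10)
partial$name  # "Hemang" "Vidisha"
```

Because no error is raised, it is worth checking nrow() on the result when an empty selection would indicate a problem in your pipeline.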
