R split() Function

The split() function is used to divide data into groups based on some criteria, typically defined by a factor or list of factors.

It is most useful in combination with lapply() or sapply(), which apply a function to each resulting subset of the data.
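As a minimal sketch of that split-then-apply pattern, the snippet below groups a small vector of scores by subject and summarises each group. The scores and subject vectors are made-up illustration data.

```r
# Hypothetical illustration data: five scores and the subject each belongs to
scores  <- c(85, 90, 78, 92, 88)
subject <- c("Math", "Math", "History", "History", "Math")

# One list element per subject
groups <- split(scores, subject)

# sapply() simplifies the per-group results into a named numeric vector
group_means <- sapply(groups, mean)
print(group_means)

# lapply() keeps the per-group results in a list
group_ranges <- lapply(groups, range)
print(group_ranges)
```

sapply() is convenient when each group reduces to a single value; lapply() is the safer choice when the per-group result has varying length.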

Syntax

split(x, f, drop = FALSE)

Parameters

  1. x: A vector or data frame to be divided into groups.
  2. f: A factor (or an object that can be coerced to one) that defines the grouping. A list of factors splits the data by their interaction.
  3. drop: A logical value. If TRUE, levels that do not occur in the data are dropped from the result.
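The effect of drop and of passing a list of factors to f can be sketched as follows. The values, grp, and size objects are hypothetical and exist only for this illustration.

```r
# Hypothetical illustration data
values <- c(10, 20, 30, 40)

# A factor with a level ("c") that never occurs in the data
grp <- factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))

with_empty    <- split(values, grp)               # keeps an empty "c" group
without_empty <- split(values, grp, drop = TRUE)  # drops the unused level

names(with_empty)      # "a" "b" "c"
length(with_empty$c)   # 0
names(without_empty)   # "a" "b"

# A list of factors splits by their interaction;
# group names join the levels with "." (for example "a.s")
size    <- c("s", "s", "l", "l")
by_both <- split(values, list(grp, size), drop = TRUE)
names(by_both)
```

With drop = FALSE (the default), the result keeps one element per level even when a level has no data, which matters if later code expects a fixed set of groups.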

Return value

It returns a list with one element per level of f; each element is named after its level.

The type of object returned by this function depends on the input type. For a vector, you get a list of vectors; for a data frame, a list of data frames.
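A quick way to check this is to inspect the class of the result and of its elements. The sketch below uses a small made-up vector and data frame (v, g, and toy_df are hypothetical names).

```r
# Hypothetical illustration data
v      <- c(1, 2, 3, 4)
g      <- c("a", "b", "a", "b")
toy_df <- data.frame(v = v, g = g)

vec_parts <- split(v, g)
df_parts  <- split(toy_df, toy_df$g)

class(vec_parts)     # "list"
class(vec_parts$a)   # "numeric"
class(df_parts$a)    # "data.frame"
```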

Example 1: Splitting the data frame

Figure: Splitting a data frame with the split() function in R

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Dhaval", "Tejas"),
  score = c(85, 90, 78, 92, 88),
  subject = c("Math", "Math", "History", "History", "Math"),
  grade = c("10th", "11th", "11th", "10th", "10th")
)

# Split the data frame by subject
split_df <- split(df, df$subject)

# Print the split data frame
print(split_df)

Output

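$History
     name score subject grade
3 Rushabh    78 History  11th
4  Dhaval    92 History  10th

$Math
    name score subject grade
1 Krunal    85    Math  10th
2  Ankit    90    Math  11th
5  Tejas    88    Math  10th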

You can use the unsplit() function to restore the original data frame.

# unsplit() takes the list returned by split(), not the original data frame
unsplit(split_df, f = df$subject)
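The round trip can be sketched end to end as below. Note that unsplit() expects the list produced by split() together with the same grouping factor. The parts and restored names are introduced here for illustration, and the data frame is a trimmed copy of the one in this example.

```r
df <- data.frame(
  name    = c("Krunal", "Ankit", "Rushabh", "Dhaval", "Tejas"),
  score   = c(85, 90, 78, 92, 88),
  subject = c("Math", "Math", "History", "History", "Math")
)

# Split by subject, then reassemble using the same grouping vector
parts    <- split(df, df$subject)
restored <- unsplit(parts, f = df$subject)

# Rows come back in their original order, so the columns match
identical(restored$name, df$name)
identical(restored$score, df$score)
```

The comparison is done column by column because the row names of the restored data frame may be stored differently from those of the original.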

Example 2: Splitting a vector

To split a named vector into groups of elements that share the same name, pass names(vec) as the f argument.

Figure: Splitting a vector with the split() function

vec <- c(x = 3, y = 5, x = 1, x = 4, y = 3)
vec

data <- split(vec, f = names(vec))
data

Output

$x
x x x
3 1 4

$y
y y
5 3

Example 3: Splitting a dataset into groups

data("ToothGrowth")

df <- head(ToothGrowth)

data <- split(df, f = df$len)
data

Output

$`4.2`
  len supp dose
1 4.2   VC  0.5

$`5.8`
  len supp dose
4 5.8   VC  0.5

$`6.4`
  len supp dose
5 6.4   VC  0.5

$`7.3`
  len supp dose
3 7.3   VC  0.5

$`10`
  len supp dose
6  10   VC  0.5

$`11.5`
   len supp dose
2 11.5   VC  0.5
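Because len is a continuous measurement, every value in these six rows is distinct and each group holds a single row. Splitting on a categorical column usually gives more useful groups. As a sketch, the snippet below splits the full ToothGrowth dataset by supp, and then by supp and dose together; the by_supp and by_both names are introduced here for illustration.

```r
data("ToothGrowth")

# One data frame per supplement type
by_supp <- split(ToothGrowth, ToothGrowth$supp)
sapply(by_supp, nrow)

# One data frame per supplement/dose combination
by_both <- split(ToothGrowth, list(ToothGrowth$supp, ToothGrowth$dose))
sapply(by_both, nrow)
```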

That’s it.
