The subset() function in R returns the part of a vector, matrix, or data frame that meets a condition. It can also be used to keep or drop columns of a data frame. The basic call is subset(df, expr), where df is the data frame and expr is a logical expression that specifies the rows to include in the subset.
Syntax
subset(x, subset, select, drop = FALSE, ...)
Parameters
- x – The object to be subsetted. It can be a vector, a data frame, or a matrix.
- subset – A logical expression indicating the elements or rows to keep. Missing values are treated as FALSE.
- select – An expression indicating the columns to select from a data frame or matrix.
- drop – Passed on to the [ indexing operator for matrices and data frames.
- ... – Further arguments passed to or from other methods.
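As a rough illustration of how the subset and select arguments work together, the sketch below uses a made-up data frame (the names scores, id, score, and group are assumptions, not part of the original examples). It also shows that rows where the condition evaluates to NA are dropped, which differs from logical indexing with [].

```r
# Hypothetical data for illustration only.
scores <- data.frame(
  id    = 1:5,
  score = c(90, NA, 72, 85, 60),
  group = c("a", "b", "a", "b", "a")
)

# subset keeps rows where score > 70; select keeps only id and score.
high <- subset(scores, subset = score > 70, select = c(id, score))
print(high)

# Logical indexing with [] keeps an all-NA row for the missing score,
# so it returns one more row than subset() here.
bracket <- scores[scores$score > 70, c("id", "score")]
print(nrow(high))     # 3
print(nrow(bracket))  # 4
```

Inside subset(), column names such as score can be written without the scores$ prefix because the expression is evaluated within the data frame.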
Return value
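The subset() function returns an object of the same kind as x that contains only the selected elements (for a vector) or the selected rows and columns (for a matrix or data frame). Rows can be chosen by row name, by a list of values, or by any other logical condition.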
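A short sketch of what comes back for different inputs may help here. The vector v is made-up data, and the drop = TRUE call is included only to show how the return type can change for a single column.

```r
# A numeric vector with one missing value (made-up data).
v <- c(5, 12, NA, 8, 20)
big <- subset(v, v > 7)
print(big)              # 12 8 20

df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# With the default drop = FALSE, a single selected column stays a data frame.
one_col <- subset(df, select = x)
print(class(one_col))   # "data.frame"

# With drop = TRUE, the single column is returned as a plain vector.
as_vec <- subset(df, select = x, drop = TRUE)
print(as_vec)           # 1 2 3
```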
Example 1: Using subset() by row name
You can use the subset() function to get a subset of rows from a data frame based on row names. Put the required row names in a vector and use the %in% operator to test which of the data frame's row names appear in that vector.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
rownames(df) <- c("A", "B", "C")
# Subset by row name
dfsub <- subset(df, rownames(df) %in% c("A", "C"))
dfsub
Output
  x y
A 1 a
C 3 c
In the above code example, we created a data frame df with three rows and two columns. The rows are named “A”, “B”, and “C”.
Then we used the subset() function to create a new data frame dfsub that contains only the rows with names “A” and “C”.
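The same idea can be turned around to exclude rows by name. The sketch below negates the %in% test and, for comparison, shows plain indexing by row name; the variable names without_b and reordered are chosen here for illustration.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
rownames(df) <- c("A", "B", "C")

# Keep every row except the one named "B".
without_b <- subset(df, !(rownames(df) %in% "B"))
print(rownames(without_b))  # "A" "C"

# Plain indexing by row name; unlike subset(), the order of the names
# given here determines the order of the rows returned.
reordered <- df[c("C", "A"), ]
print(rownames(reordered))  # "C" "A"
```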
Example 2: Using subset() with a list of values
You can use the subset() function to get a subset of rows from a data frame based on a list of values. Create a vector with the values and use the %in% operator in the condition passed to subset().
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
rownames(df) <- c("A", "B", "C")
# Subset by a list of values in column x
dfsub <- subset(df, x %in% c(1, 2))
dfsub
Output
  x y
A 1 a
B 2 b
In the above code, we created a data frame df with three rows and two columns. The rows are named “A”, “B”, and “C”.
Then, we used the subset() function to create a new data frame dfsub that contains only the rows where column x has the value 1 or 2.
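Conditions can also be combined with & (and) and | (or). The following is a minimal sketch with a slightly larger, made-up data frame; the names both and either are illustrative.

```r
df <- data.frame(x = 1:6, y = c("a", "b", "c", "a", "b", "c"))

# Rows where x is greater than 2 AND y is "a" or "b".
both <- subset(df, x > 2 & y %in% c("a", "b"))
print(both$x)    # 4 5

# Rows where x equals 1 OR y equals "c".
either <- subset(df, x == 1 | y == "c")
print(either$x)  # 1 3 6
```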
Example 3: Using subset() to select columns by name
You can use the subset() function to get a subset of columns from a data frame based on column names. You can use the select argument with either a single column name or a vector of column names.
df <- data.frame(x = 1:3, y = c("a", "b", "c"), z = c("A", "B", "C"))
# Subset by column name
dfsub <- subset(df, select = c("x", "z"))
dfsub
Output
  x z
1 1 A
2 2 B
3 3 C
You can see that we created a data frame df with three rows and three columns.
Then, we used the subset() function to create a new data frame dfsub that contains only columns “x” and “z”.
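The select argument also accepts unquoted column names, a minus sign to drop a column, and a name range with a colon. The sketch below shows these variations on the same data frame; the result names are chosen for illustration.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"), z = c("A", "B", "C"))

# Unquoted column names work inside select.
keep_xz <- subset(df, select = c(x, z))
print(names(keep_xz))   # "x" "z"

# A minus sign drops a column instead of keeping it.
drop_y <- subset(df, select = -y)
print(names(drop_y))    # "x" "z"

# A colon selects a range of adjacent columns.
x_to_y <- subset(df, select = x:y)
print(names(x_to_y))    # "x" "y"
```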
Example 4: Subsetting columns by index
The select argument of subset() also accepts column positions, for example subset(df, select = c(1, 2)). Standard subsetting with square brackets [] gives the same columns and is the approach shown here.
df <- data.frame(x = 1:3, y = c("a", "b", "c"), z = c("A", "B", "C"))
# Subset by column index
dfsub <- df[, c(1, 2)]
dfsub
Output
  x y
1 1 a
2 2 b
3 3 c
In the above code example, we created the data frame df with three rows and three columns.
Then, we used standard subsetting with square brackets to create a new data frame dfsub that contains only columns 1 and 2.
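Although square brackets are the usual tool for positional selection, the select argument of subset() also accepts column positions, because column names in select are translated to positions before indexing. A minimal sketch, with result names chosen for illustration:

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"), z = c("A", "B", "C"))

# Select the first two columns by position.
by_pos <- subset(df, select = c(1, 2))
print(names(by_pos))      # "x" "y"

# Select a range of positions.
by_range <- subset(df, select = 2:3)
print(names(by_range))    # "y" "z"

# A negative position drops that column.
drop_first <- subset(df, select = -1)
print(names(drop_first))  # "y" "z"
```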
That’s it.

Krunal Lathiya is a Software Engineer with over eight years of experience. He has developed a strong foundation in computer science principles and a passion for problem-solving. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in R Language.