How to Drop Data Frame Columns by Name in R

R offers several ways to drop data frame columns by name:

  1. Method 1: Using subset(): df_new <- subset(df, select = -c(col2, col4))
  2. Method 2: Using select() from dplyr: df_new <- df %>% select(-c(col2, col4))
  3. Method 3: Using the %in% operator: df_new <- df[, !(names(df) %in% drops)]
  4. Method 4: Using setDT() from data.table: setDT(df)[, c("col2", "col4") := NULL]
  5. Method 5: Using within() and rm(): df_new <- within(df, rm(col2, col4))

Method 1: Using the subset() function

The easiest way to drop data frame columns by name in base R is the subset() function. The syntax is df_new <- subset(df, select = -c(col2, col4)), where df is a data frame and col2 and col4 are the unquoted names of the columns to drop. subset() returns a new data frame and leaves df unchanged.

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

df_after_dropped <- subset(df, select = -c(age, salary))

df_after_dropped

Output

     name gender
1  Krunal      M
2   Ankit      M
3 Rushabh      M
4    Niva      F

We removed two columns by name, age and salary, so the returned data frame contains only the name and gender columns.
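As a small sketch of two variations on the same call: because subset() evaluates select with column names standing in for their positions, a single column can be dropped without c(), and a run of adjacent columns can be dropped with the : operator. The object names df_no_age and df_name_only are chosen here for illustration only.

```r
# Same sample data frame as in the example above
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

# Drop a single column; c() is not needed
df_no_age <- subset(df, select = -age)
names(df_no_age)

# Drop a run of adjacent columns, from age through salary
df_name_only <- subset(df, select = -(age:salary))
names(df_name_only)

# The original data frame still has all four columns
names(df)
```

Note that the range form depends on column order, so it is best kept for data frames whose layout you control.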

Method 2: Using the select() function

The select() function from the dplyr package keeps or drops columns by name, position, or pattern; prefixing a selection with a minus sign drops it. The pipe operator %>% lets you chain select() with other operations.

library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

df_after_dropped <- df %>% select(-c(age, salary))

df_after_dropped

Output

     name gender
1  Krunal      M
2   Ankit      M
3 Rushabh      M
4    Niva      F
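The sketch below illustrates the "name, position, or pattern" options mentioned above, using the selection helpers all_of(), any_of(), and starts_with() that dplyr re-exports from tidyselect. The vector drops and the column name "bonus" (which does not exist in df) are assumptions made for the example.

```r
library(dplyr)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

# Column names stored in a character vector
drops <- c("age", "salary")
by_vector <- df %>% select(-all_of(drops))

# By position: drop the second column (age)
by_index <- df %>% select(-2)

# By pattern: drop every column whose name starts with "sal"
by_pattern <- df %>% select(-starts_with("sal"))

# any_of() silently ignores names that are not present ("bonus" here),
# whereas all_of() would raise an error for them
lenient <- df %>% select(-any_of(c("age", "bonus")))

names(by_vector)
names(by_index)
names(by_pattern)
names(lenient)
```

all_of() is the stricter choice when a missing column should be treated as a mistake; any_of() suits scripts that run against data frames with slightly different layouts.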

Method 3: Using the %in% operator

The %in% operator checks whether each column name of df appears in the drops vector, and the ! operator negates the result. This yields a logical vector that is TRUE for the columns to keep and FALSE for the columns to drop. Passing that vector as the column index of the [ , ] operator subsets the data frame to the kept columns.

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

drops <- c("age", "salary")

df[, !(names(df) %in% drops)]

Output

     name gender
1  Krunal      M
2   Ankit      M
3 Rushabh      M
4    Niva      F
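One caveat with this approach is worth a short sketch: when only a single column survives, [ , ] simplifies the result to a plain vector unless drop = FALSE is supplied. The example below drops three of the four columns to show the difference; the object names as_vector and as_df are illustrative.

```r
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

# Dropping three columns leaves only "name"
drops <- c("age", "gender", "salary")

# Without drop = FALSE the single remaining column is returned as a vector
as_vector <- df[, !(names(df) %in% drops)]
is.data.frame(as_vector)

# With drop = FALSE the result stays a one-column data frame
as_df <- df[, !(names(df) %in% drops), drop = FALSE]
is.data.frame(as_df)
names(as_df)
```

Adding drop = FALSE by default is a reasonable habit whenever the set of dropped columns is computed at run time.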

Method 4: Using the setDT() function

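The data.table package offers a concise, fast syntax built around the [] operator. setDT() converts the data frame to a data.table by reference, without copying it, and the := operator then adds, updates, or deletes columns in place; assigning NULL deletes them. The columns on the left-hand side of := can be given by name or by position, and a pattern can be turned into names with grep(). Note that .SDcols is an argument of [] rather than a function: it restricts which columns .SD contains and is not needed for deleting columns.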

library(data.table)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

setDT(df)

df[, c("age", "salary") := NULL]

df

Output

    name    gender
1:  Krunal    M
2:  Ankit     M
3:  Rushabh   M
4:  Niva      F
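Because setDT() and := both modify the object in place, the original data frame no longer exists in its old form after the code above runs. A minimal sketch of one way to keep it, assuming you are willing to pay for a copy: convert with as.data.table() instead, and pass the column names from a character vector by wrapping it in parentheses. The names dt and drops are illustrative.

```r
library(data.table)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

# as.data.table() returns a copy, so df itself stays a plain data frame
dt <- as.data.table(df)

# Column names held in a variable: the parentheses tell := to use the
# contents of drops rather than create a column literally named "drops"
drops <- c("age", "salary")
dt[, (drops) := NULL]

names(dt)
names(df)
```

Use setDT() when the data is large and the original is no longer needed; use as.data.table() when both versions must coexist.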

Method 5: Using the within() and rm() functions

The within() function evaluates an expression in an environment built from the data frame and returns a modified copy. Calling rm() inside that expression removes the named columns from the copy, while the original data frame stays unchanged.

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

df_columns_dropped <- within(df, rm(age, salary))

df_columns_dropped

Output

     name gender
1  Krunal      M
2   Ankit      M
3 Rushabh      M
4    Niva      F
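When the column names are held in a character vector rather than typed out, the list argument of rm() can be used inside within(). This is a sketch under the assumption that the vector (called drops here) lives in the calling environment and does not share its name with a column of df.

```r
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Niva"),
  age = c(30, 28, 31, 26),
  gender = c("M", "M", "M", "F"),
  salary = c(8000, 7000, 5500, 6500)
)

# Names of the columns to remove, stored as strings
drops <- c("age", "salary")

# rm(list = ...) accepts a character vector of object names;
# inside within() those objects are the columns of df
df_columns_dropped <- within(df, rm(list = drops))

names(df_columns_dropped)

# df is not modified
names(df)
```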

Conclusion

The most common ways to drop data frame columns by name in R are the base subset() function and the select() function from dplyr.
