When you work on real-world projects, you often come across raw datasets whose columns have names like “v1”, “var2”, and “cust_id”. These names are cryptic and tell you little about what the column holds.
Renaming them to descriptive names (e.g., “age”, “income”, “customer_ID”) makes the data much easier to understand at a glance.
Here are four ways to rename a data frame column in R:
- Using colnames()
- Using names()
- Using rename() from dplyr
- Using rename_with() from dplyr
Method 1: Using colnames()
A simple way to rename a single column is base R’s colnames() function, which gets or sets the column names of a data frame. It lets you target a column either by name or by position.
To rename, assign new names to colnames(your_data_frame). If you replace all the names at once, the character vector must have exactly one entry per column. To rename a single column, index into colnames() first and assign only that element, as in the example below.
df <- data.frame(
name = c("Krunal", "Ankit", "Rushabh"),
score = c(85, 90, 78),
subject = c("Math", "Math", "History"),
grade = c("10th", "11th", "11th")
)
print("Before renaming the column")
df
colnames(df)[colnames(df) == "grade"] <- "class"
print("After renaming the column")
df
Output
The second printout shows that the column “grade” has been renamed to “class”.
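The by-position and whole-vector forms described at the start of this method can be sketched as follows. This is a minimal illustration; the replacement names “student”, “marks”, and “topic” are made up for the example and do not come from the original data.

```r
# Same example data frame as above
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh"),
  score = c(85, 90, 78),
  subject = c("Math", "Math", "History"),
  grade = c("10th", "11th", "11th")
)

# Rename by position: the 4th column becomes "class"
colnames(df)[4] <- "class"

# Replace every name at once: the vector needs one entry per column
# ("student", "marks", "topic" are illustrative names only)
colnames(df) <- c("student", "marks", "topic", "class")

colnames(df)
```

Renaming by name is usually the safer choice, because a position such as 4 silently points at a different column if the column order changes.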
Pros
- colnames() is part of base R, so it does not require any third-party package.
- It is quick and direct for a simple rename.
Cons
- It becomes cumbersome when renaming multiple columns, because each rename needs its own indexing expression.
- If you rename by position instead of by name, the code relies on the order of the columns. If the order changes, you might rename the wrong column without any warning.
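One way to reduce the multi-column pain in base R is a named lookup vector that maps old names to new ones. The sketch below is only one possible approach; the data frame and the `lookup` vector are hypothetical and reuse the cryptic names from the introduction.

```r
# Hypothetical raw data with cryptic column names
df <- data.frame(
  v1 = c(25, 31),
  var2 = c(52000, 61000),
  cust_id = c(101, 102)
)

# Hypothetical mapping: old name = "new name"
lookup <- c(v1 = "age", var2 = "income", cust_id = "customer_ID")

# Find the columns that appear in the lookup, then swap in their new names
hits <- colnames(df) %in% names(lookup)
colnames(df)[hits] <- unname(lookup[colnames(df)[hits]])

colnames(df)
```

Because the match is done by name, this keeps working when the column order changes, and columns missing from `lookup` are left untouched.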
Method 2: Using names()
names() is a general-purpose base function that gets or sets the names of various R objects, including the columns of a data frame.
For a data frame, names() returns the same column names as colnames(), so the pattern is identical: select the element you want to change and assign the new name to it.
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh"),
  score = c(85, 90, 78),
  subject = c("Math", "Math", "History"),
  grade = c("10th", "11th", "11th")
)
print("Before renaming the column")
df
names(df)[names(df) == "grade"] <- "class"
print("After renaming the column")
df
Output
As the second printout shows, the column “grade” has been renamed to “class” in the final data frame.
The pros and cons are the same as the “colnames()” method.
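To illustrate the "general-purpose" point, here is a small sketch of the same rename pattern applied to a named vector and a list, followed by a check that names() and colnames() agree for a data frame. The objects `scores`, `student`, and `df_small` are hypothetical examples.

```r
# names() on a named numeric vector
scores <- c(a = 85, b = 90, c = 78)
names(scores)[names(scores) == "c"] <- "z"

# names() on a list
student <- list(name = "Krunal", grade = "10th")
names(student)[names(student) == "grade"] <- "class"

# For a data frame, names() and colnames() return the same names
df_small <- data.frame(name = "Ankit", grade = "11th")
same_names <- identical(names(df_small), colnames(df_small))

names(scores)
names(student)
same_names
```

colnames() is intended for matrix-like objects, so names() is the function to reach for when the object is a plain vector or list.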
Method 3: Using rename() from dplyr
You can also use the dplyr::rename() function, where you specify the new column name on the left side of the = and the old name on the right side.
library(dplyr)
df <- data.frame(
name = c("Krunal", "Ankit", "Rushabh"),
score = c(85, 90, 78),
subject = c("Math", "Math", "History"),
grade = c("10th", "11th", "11th")
)
print("Before renaming the column")
df
print("After renaming the column")
df %>% rename(class = grade)
Output
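As the output shows, dplyr provides a concise syntax for renaming one or more columns of a data frame. Note that rename() returns a new data frame and leaves df unchanged, so assign the result back to df if you want to keep the new name.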
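Renaming several columns in one rename() call can be sketched like this. It assumes dplyr is installed; the data frame is a hypothetical one that reuses the cryptic names from the introduction.

```r
library(dplyr)

# Hypothetical raw data with cryptic column names
df <- data.frame(
  v1 = c(25, 31),
  var2 = c(52000, 61000),
  cust_id = c(101, 102)
)

# One new_name = old_name pair per column, all in a single call
renamed <- df %>%
  rename(age = v1, income = var2, customer_ID = cust_id)

colnames(renamed)
```

rename() stops with an error if one of the old names does not exist, which helps catch typos early.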
Pros
- It handles single and multiple columns equally well, since you can list several new = old pairs in one call.
- Its syntax is concise and easy to read.
Cons
- We need to install a separate package, “dplyr”, for this approach.
Method 4: Using rename_with() from dplyr package
The rename_with() function from the dplyr package renames columns by applying a function to their names, which makes it well suited to changing one or many columns at once.
Besides swapping one name for another, it is useful for bulk transformations such as converting column names to lowercase or replacing specific characters.
library(dplyr)
df <- data.frame(
name = c("Krunal", "Ankit", "Rushabh"),
score = c(85, 90, 78),
subject = c("Math", "Math", "History"),
grade = c("10th", "11th", "11th")
)
print("Before renaming the column")
df
print("After renaming the column")
df <- df %>% rename_with(~ ifelse(. == "grade", "class", .), .cols = "grade")
df
Pros
- It accepts any function that transforms names, so you can apply arbitrary operations to column names, not just one-to-one renames.
- It is efficient for renaming multiple columns at once.
Cons
- You must install the “dplyr” package in your R environment.
- This approach can be overkill if you just want to rename a single column.
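The lowercase conversion and character replacement mentioned in this method can be sketched as follows. This assumes dplyr is installed; the column names “Cust.ID” and “Total.Score” are hypothetical.

```r
library(dplyr)

# Hypothetical data with mixed-case, dot-separated column names
df <- data.frame(
  Cust.ID = c(101, 102),
  Total.Score = c(85, 90)
)

cleaned <- df %>%
  rename_with(tolower) %>%                          # every name to lowercase
  rename_with(~ gsub(".", "_", .x, fixed = TRUE))   # literal dots to underscores

# .cols limits the function to a subset of columns
shouting <- cleaned %>%
  rename_with(toupper, .cols = starts_with("total"))

colnames(cleaned)
colnames(shouting)
```

Passing `fixed = TRUE` to gsub() treats the dot as a literal character instead of a regular-expression wildcard.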
Summary
For renaming a single column, use the “colnames()” or “names()” function.
For renaming multiple columns at once in R, use the “rename_with()” or “rename()” function from the “dplyr” package.
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.