How to Rename a Column of a Data Frame in R

When you work with large datasets whose structure and naming conventions are not standard, you often need to clean them up and bring them into a standard form. A common part of that cleanup is renaming columns, so it helps to know how to rename a column of a data frame.

There are several ways to rename a data frame column. One is to use base R functions such as names() or colnames(); another is to use a third-party package such as dplyr and its rename() function. Let's see how to do both.

To rename a column of a data frame in R, use the following base R steps.

  1. Get column names using the function names() or colnames().
  2. Change the column names using the assignment operator.

The above method is helpful if you prefer not to depend on a third-party package.
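As a minimal sketch of these two steps, the snippet below uses colnames() for both reading and assigning. The data frame df and its column names (id, score, points) are hypothetical and serve only as an illustration.

```r
# Hypothetical data frame, used only for illustration
df <- data.frame(id = 1:3, score = c(90, 85, 77))

# Step 1: read the current column names (a character vector)
old_names <- colnames(df)

# Step 2: assign a new name to the element that equals "score"
colnames(df)[colnames(df) == "score"] <- "points"

new_names <- colnames(df)
print(old_names)
print(new_names)
```

For a data frame, names() and colnames() can be used interchangeably in both steps.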

A second way is to use a third-party package such as dplyr, which provides a rename() function for renaming the columns of a data frame.

Step 1: Using the names() function

To create a data frame in R, use the data.frame() function and pass the necessary data.

shows_data <- data.frame(
 show_id = c(1:5),
 show_name = c("Bridgerton", "Lucifer", "Stranger Things", "HoC", "Better Call Saul"),
 show_ratings = c(7, 7.5, 8.8, 8.5, 8.5),
 stringsAsFactors = FALSE
)

shows_data

Output

  show_id        show_name show_ratings
1       1       Bridgerton          7.0
2       2          Lucifer          7.5
3       3  Stranger Things          8.8
4       4              HoC          8.5
5       5 Better Call Saul          8.5

To get the column names of the above data frame, use the colnames() function.

shows_data <- data.frame(
 show_id = c(1:5),
 show_name = c("Bridgerton", "Lucifer", "Stranger Things", "HoC", "Better Call Saul"),
 show_ratings = c(7, 7.5, 8.8, 8.5, 8.5),
 stringsAsFactors = FALSE
)

colnames(shows_data)

Output

[1] "show_id"  "show_name"  "show_ratings"

You can also use the names() function to get the names of the columns.

names(shows_data)

Output

[1] "show_id"  "show_name"  "show_ratings"

And we get the column names of our data frame. Because the data frame has three columns, the function returns a character vector of three elements.
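A quick way to check this is to compare the two functions directly. The sketch below assumes a small hypothetical data frame df; it is separate from the shows_data example.

```r
# Hypothetical data frame with three columns
df <- data.frame(
  id = 1:3,
  score = c(90, 85, 77),
  grade = c("A", "B", "C")
)

col_names <- names(df)

# The result is a character vector ...
is_char <- is.character(col_names)

# ... identical to what colnames() returns for a data frame ...
same_result <- identical(col_names, colnames(df))

# ... with one element per column
one_per_column <- length(col_names) == ncol(df)

print(c(is_char, same_result, one_per_column))
```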

Step 2: Use data frame indexing to rename a column

We will rename the column show_ratings to imdb_ratings.

To do that, we compare the vector of column names with "show_ratings", which gives a logical vector that is TRUE only at the matching position. We then use that logical vector to index names(shows_data) and assign the new name, imdb_ratings, to the selected element.

shows_data <- data.frame(
 show_id = c(1:5),
 show_name = c("Bridgerton", "Lucifer", "Stranger Things", "HoC", "Better Call Saul"),
 show_ratings = c(7, 7.5, 8.8, 8.5, 8.5),
 stringsAsFactors = FALSE
)

shows_data

cat("----------------------------------------", "\n")
cat("After renaming a column", "\n")

names(shows_data)[names(shows_data) == "show_ratings"] <- "imdb_ratings"

cat("----------------------------------------", "\n")
shows_data

Output

  show_id        show_name show_ratings
1       1       Bridgerton          7.0
2       2          Lucifer          7.5
3       3  Stranger Things          8.8
4       4              HoC          8.5
5       5 Better Call Saul          8.5
----------------------------------------
After renaming a column
----------------------------------------
  show_id        show_name imdb_ratings
1       1       Bridgerton          7.0
2       2          Lucifer          7.5
3       3  Stranger Things          8.8
4       4              HoC          8.5
5       5 Better Call Saul          8.5

That is it. The column name has changed from show_ratings to imdb_ratings.
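To see why the comparison-based rename works, it can help to look at the intermediate logical vector. The following is a small sketch on a reduced version of the data frame; the variable name mask and the column name no_such_column are made up for illustration.

```r
# Reduced version of the example data frame
shows_data <- data.frame(
  show_id = 1:5,
  show_ratings = c(7, 7.5, 8.8, 8.5, 8.5)
)

# The comparison is TRUE only where the name matches
mask <- names(shows_data) == "show_ratings"
print(mask)

# Only the selected element of the names vector is replaced
names(shows_data)[mask] <- "imdb_ratings"
after_rename <- names(shows_data)

# If no name matches, the mask is all FALSE and nothing changes,
# and R raises no error, so a typo in the old name goes unnoticed
names(shows_data)[names(shows_data) == "no_such_column"] <- "x"
after_typo <- names(shows_data)

print(after_typo)
```

Because a mistyped old name fails silently, it is worth checking names() after the assignment.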

There is another way to change the column name.

In the previous example, we renamed a column by comparing column names.

Here, we use numeric indexing and assign the new name by referencing the position of the column.

names(shows_data)[3] <- "imdb_ratings"

The above code renames the third column of the data frame, show_ratings, to imdb_ratings.

Data frame column indices start at 1, not 0, so the last of our three columns has index 3.
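Hard-coded positions break if the column order changes, so one option is to look the position up first. The sketch below uses base R's match() for that and also renames two columns at once by position; the new names id and title are hypothetical.

```r
shows_data <- data.frame(
  show_id = 1:5,
  show_name = c("Bridgerton", "Lucifer", "Stranger Things", "HoC", "Better Call Saul"),
  show_ratings = c(7, 7.5, 8.8, 8.5, 8.5),
  stringsAsFactors = FALSE
)

# Look up the position of the column instead of hard-coding it
pos <- match("show_ratings", names(shows_data))
print(pos)

# Rename by position
names(shows_data)[pos] <- "imdb_ratings"

# Rename several columns at once by passing a vector of positions
names(shows_data)[c(1, 2)] <- c("id", "title")

final_names <- names(shows_data)
print(final_names)
```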

Using the dplyr package to rename a column

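dplyr is a third-party R package (part of the tidyverse, not built into R) that provides a grammar of data manipulation: a consistent set of verbs that help you solve the most common data manipulation challenges.

To load the package, add the following code at the start of the file. Loading tidyverse attaches dplyr together with the other core tidyverse packages; library(dplyr) on its own is enough for rename(). The package must be installed first, for example with install.packages("tidyverse").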

library(tidyverse)

To rename the column of a data frame using the dplyr package, use its rename() function.

shows_data %>%
  rename(imdb_ratings = show_ratings)

The above code renames the show_ratings column to imdb_ratings. Note that rename() takes new_name = old_name pairs and returns a modified copy of the data frame; to keep the change, assign the result to a variable.

The complete code is as follows.
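As a short sketch of how rename() behaves, the example below stores the result in a new variable and renames two columns in one call. It assumes dplyr is installed; the variable name renamed and the new column name title are hypothetical.

```r
library(dplyr)  # assumes the dplyr package is installed

shows_data <- data.frame(
  show_id = 1:5,
  show_name = c("Bridgerton", "Lucifer", "Stranger Things", "HoC", "Better Call Saul"),
  show_ratings = c(7, 7.5, 8.8, 8.5, 8.5),
  stringsAsFactors = FALSE
)

# rename() takes new_name = old_name pairs and returns a new data frame
renamed <- shows_data %>%
  rename(imdb_ratings = show_ratings, title = show_name)

print(names(renamed))

# The original data frame is left untouched unless the result is assigned back
print(names(shows_data))
```

Writing shows_data <- shows_data %>% rename(...) instead would overwrite the original with the renamed version.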

library(tidyverse)

shows_data <- data.frame(
 show_id = c(1:5),
 show_name = c("Bridgerton", "Lucifer", "Stranger Things", "HoC", "Better Call Saul"),
 show_ratings = c(7, 7.5, 8.8, 8.5, 8.5),
 stringsAsFactors = FALSE
)
shows_data
cat("----------------------------------------", "\n")
cat("After renaming a column", "\n")
cat("----------------------------------------", "\n")
shows_data %>%
  rename(imdb_ratings = show_ratings)

The output shows the data frame with the column renamed to imdb_ratings.

That is it for this tutorial on renaming a column of a data frame in R.
