Suppose a data frame column holds numbers stored as character strings, and you need to perform mathematical operations such as addition or subtraction on it. Arithmetic does not work on character data, so you first need to convert that column to numeric.
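As a minimal sketch of the problem (the data frame `prices` and its column `amount` are made-up names for illustration), arithmetic on a character column fails until the column is converted:

```r
# Hypothetical data frame: the numbers are stored as text
prices <- data.frame(amount = c("10", "20", "30"), stringsAsFactors = FALSE)

# The column is character, not numeric
print(class(prices$amount))

# Arithmetic on a character vector raises an error, so capture its message
msg <- tryCatch(
  prices$amount + 1,
  error = function(e) conditionMessage(e)
)
print(msg)

# After conversion the same kind of operation works
total <- sum(as.numeric(prices$amount))
print(total)
```

The `tryCatch()` wrapper is only there so the script keeps running after the failed addition; in an interactive session you would simply see the error.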
Here are four ways to convert a data frame column from character to numeric in R:
The transform() function modifies the columns of a data frame. Inside it, use as.numeric() to convert a character column to a numeric one.
If a string does not represent a numeric value, as.numeric() returns NA for that element and issues a warning.
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46")
)
# Print the original data frame
print("Original data frame:")
print(df)
# Convert the columns to numeric using transform() and as.numeric()
df_numeric <- transform(df,
  COL1 = as.numeric(COL1),
  COL2 = as.numeric(COL2)
)
# Print the converted data frame
print("Converted data frame:")
print(df_numeric)
sapply(df_numeric, class)
The sapply() call confirms the result: both “COL1” and “COL2” start out as character, and after applying transform() with as.numeric(), both are numeric.
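To illustrate the NA behavior mentioned earlier, here is a small sketch with one deliberately invalid entry. The vector `raw` and its values are assumptions made up for this example.

```r
# Hypothetical input: one entry is not a number
raw <- c("1", "2.5", "abc", "4")

# as.numeric() warns about the failed coercion; suppressWarnings() silences
# the warning here only because the NA is inspected explicitly below
converted <- suppressWarnings(as.numeric(raw))
print(converted)

# Find the entries that failed to convert (ignoring values that were already NA)
bad <- raw[is.na(converted) & !is.na(raw)]
print(bad)
```

Checking for newly introduced NA values like this before doing arithmetic is one way to catch dirty data early, rather than silencing the warning and moving on.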
The transform() function can also convert several columns in one call by applying as.numeric() to each of them.
# Sample data frame with character columns
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46"),
  COL3 = c("1.5", "2.7", "3.9", "4.1") # Added a column with decimal values
)
# Print the original data frame and its structure
print("Original data frame:")
print(df)
str(df) # Check the structure
# Convert multiple columns to numeric using transform() and as.numeric()
df_numeric <- transform(df,
  COL1 = as.numeric(COL1),
  COL2 = as.numeric(COL2),
  COL3 = as.numeric(COL3)
)
# Print the converted data frame and its structure
print("Converted data frame:")
print(df_numeric)
str(df_numeric) # Check the structure
# Verify the class of each column using sapply()
print("Column classes:")
sapply(df_numeric, class)
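When there are many columns, listing each one inside transform() gets repetitive. One possible alternative, which is a different technique from the transform() approach above, is to apply as.numeric() over a set of columns with base R's lapply(). The `cols` vector is an assumption chosen for this sketch.

```r
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46"),
  COL3 = c("1.5", "2.7", "3.9", "4.1"),
  stringsAsFactors = FALSE
)

# Names of the columns to convert (chosen here for illustration)
cols <- c("COL1", "COL2", "COL3")

# lapply() applies as.numeric() to each selected column;
# assigning back into df[cols] replaces those columns in place
df[cols] <- lapply(df[cols], as.numeric)

classes <- sapply(df, class)
print(classes)
```

Because `cols` is an ordinary character vector, it could also be built programmatically, for example from names(df).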
The $ operator selects a single column of a data frame by name. Pass that column to as.numeric() and assign the result back to the same column to convert it.
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46")
)
# Print the original data frame
print("Original data frame:")
print(df)
df$COL1 <- as.numeric(df$COL1)
# Print the converted data frame
print("Converted data frame:")
print(df)
sapply(df, class)
You can also use the [] operator, which is helpful when the column to convert is chosen dynamically, by name or by index.
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46")
)
# Print the original data frame
print("Original data frame:")
print(df)
df[, "COL1"] <- as.numeric(df[, "COL1"])
# Print the converted data frame
print("Converted data frame:")
print(df)
sapply(df, class)
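The "dynamic" use of bracket indexing might look like the following sketch, where the column is picked through a variable rather than typed literally. The names `target` and `idx` are hypothetical, and the example assumes a plain base R data frame.

```r
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46"),
  stringsAsFactors = FALSE
)

# Column chosen at run time by name; `target` is a hypothetical variable
target <- "COL1"
df[[target]] <- as.numeric(df[[target]])

# The same idea with a numeric position: convert the second column
idx <- 2
df[, idx] <- as.numeric(df[, idx])

classes <- sapply(df, class)
print(classes)
```

The double-bracket form `df[[target]]` always returns the column as a vector, which makes it a reasonable default when the column name is stored in a variable.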
The mutate() function from the dplyr package creates or modifies columns. Use it together with as.numeric() to convert specific columns to numeric.
library(dplyr)
df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46")
)
# Print the original data frame
print("Original data frame:")
print(df)
df <- df %>%
  mutate(COL1 = as.numeric(COL1))
# Print the converted data frame
print("Converted data frame:")
print(df)
sapply(df, class)
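To convert several columns in one mutate() call, dplyr's across() helper can be used. The sketch below assumes dplyr 1.0.0 or later (where across() is available); the `LABEL` column is a made-up addition to show that unlisted columns are left alone.

```r
library(dplyr)

df <- data.frame(
  COL1 = c("1", "2", "3", "4"),
  COL2 = c("11", "19", "21", "46"),
  LABEL = c("a", "b", "c", "d"),  # hypothetical text column left untouched
  stringsAsFactors = FALSE
)

# across() applies as.numeric() to each listed column inside mutate()
df <- df %>%
  mutate(across(c(COL1, COL2), as.numeric))

classes <- sapply(df, class)
print(classes)
```

Listing the columns explicitly keeps genuinely textual columns such as `LABEL` from being turned into NA values by accident.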
Take care when the column is a factor rather than a character vector. Calling as.numeric() directly on a factor returns its internal integer level codes, not the numbers shown in its labels, so convert the factor to character first and then to numeric.
df$COL1 <- as.numeric(as.character(df$COL1))
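The difference is easy to see on a small factor. In this sketch the factor `f` and its values are made up for illustration.

```r
# Hypothetical factor whose labels are numbers
f <- factor(c("46", "11", "19"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(f)
print(codes)

# Converting to character first recovers the actual numbers
values <- as.numeric(as.character(f))
print(values)
```

The level codes depend on the sorted order of the labels, so the direct conversion can produce plausible-looking but wrong numbers without any warning.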
That’s it!
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.