Introduction to Factors in R
Definition of factors in R
A factor is an R data type for storing categorical variables.
Internally, a factor is a vector of integer codes together with a set of character labels called the “levels”.
Each integer code points to one level, so every distinct category is stored only once as a label.
Use the factor() function to create a factor in R.
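As a minimal sketch of this integer-plus-levels storage, the following inspects a factor's levels and internal codes. The vector `sizes` is made-up example data, not taken from the text above.

```r
# Hypothetical example data: clothing sizes
sizes <- c("small", "large", "medium", "small")
size_factor <- factor(sizes)

# Levels are the sorted unique values
levels(size_factor)      # "large" "medium" "small"

# Each integer code is an index into the levels vector
as.integer(size_factor)  # 3 1 2 3

# The storage type is integer; the class is factor
typeof(size_factor)      # "integer"
class(size_factor)       # "factor"
```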
Why factors are essential in data analysis
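Factors are helpful in data analysis because they tell R to treat a variable as categorical rather than as free text, so functions such as summary() and table(), as well as plotting and modeling functions, group observations by level.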
Categorical data represents characteristics or attributes of observations, such as the type of a product, the ethnicity of a person, or the color of clothes.
Factors are also essential because many statistical and machine learning models expect numeric input. R modeling functions such as lm() and glm() recognize factors and automatically encode them as numeric indicator (dummy) variables, so you do not have to encode categorical data by hand.
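To illustrate how a factor becomes numeric model input, here is a small sketch using model.matrix() from the stats package. The `colour` vector is an assumed example, and the default treatment contrasts are assumed to be in effect.

```r
# Hypothetical categorical predictor
colour <- factor(c("red", "blue", "green", "blue"))

# model.matrix() expands the factor into 0/1 indicator columns.
# With the default contrasts, the first level ("blue") is the baseline
# and gets no column of its own.
design <- model.matrix(~ colour)

colnames(design)  # "(Intercept)" "colourgreen" "colourred"
design
```

Modeling functions build a matrix like this behind the scenes whenever a formula contains a factor.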
What is as.factor in R
as.factor() is a built-in R function that converts a vector, such as a character vector, a numeric vector, or a data frame column, to a factor.
It takes a single vector as its argument and returns a factor. To convert a data frame column, pass that column (for example, df$col) and assign the result back to it.
Syntax
as.factor(x)
Parameters
x: The vector to convert, for example a character vector, a numeric vector, or a single column of a data frame.
Return value
It returns a factor whose levels are the sorted unique values of x. If x is already a factor, it is returned unchanged.
Usage
- The as.factor() function can convert a character vector to a factor.
- The as.factor() function can convert a numeric vector to a factor.
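- Its levels are always the sorted unique values of the input; it has no argument for specifying them.
- To specify or reorder the levels of a factor, use factor() with the levels argument, or relevel().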
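A short sketch of the difference between as.factor() and factor() when level order matters. The `ratings` vector is an assumed example.

```r
# Hypothetical survey ratings
ratings <- c("low", "high", "medium", "low")

# as.factor() sorts the levels alphabetically
default_levels <- as.factor(ratings)
levels(default_levels)   # "high" "low" "medium"

# factor() lets you state the level order explicitly
custom_levels <- factor(ratings, levels = c("low", "medium", "high"))
levels(custom_levels)    # "low" "medium" "high"
```

The data values are the same in both factors; only the order of the levels differs, which affects sorting, plotting, and the reference level in models.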
Coding implementation
Let’s define a character vector using the c() function.
data <- c("m", "l", "a")
as.factor(data)
Output
[1] m l a
Levels: a l m
How to use as.factor in R
Using the as.factor() function on a numeric vector
Pass a numeric vector to the as.factor() function to convert it to a factor. The levels are the unique values sorted in numeric order.
data <- c(1.1, 11, 2.2, 19, 21)
as.factor(data)
Output
[1] 1.1 11 2.2 19 21
Levels: 1.1 2.2 11 19 21
Example of converting a character vector to a factor
You can convert a character vector to a factor using the as.factor() function.
The as.factor() function takes a vector of character values and returns the factor.
data <- c("zack", "synder", "cut")
as.factor(data)
Output
[1] zack synder cut
Levels: cut synder zack
Example of converting a factor to a character vector
Use the as.character() function to convert a factor to a character vector.
data <- c("zack", "synder", "cut")
factr <- as.factor(data)
chr <- as.character(factr)
chr
Output
[1] "zack" "synder" "cut"
Using the as.numeric() function, you can convert a factor to its underlying integer codes.
data <- c("zack", "synder", "cut")
factr <- as.factor(data)
intr <- as.numeric(factr)
intr
Output
[1] 3 2 1
We used the as.numeric() function to convert a factor to a numeric vector. The output shows each element's position in the sorted levels, not any original numeric value. For example, “zack” corresponds to 3, “synder” corresponds to 2, and “cut” corresponds to 1.
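Because as.numeric() returns level codes, applying it to a factor built from numbers does not give the numbers back. The sketch below reuses the values from the numeric example earlier and shows one common way to recover them by going through the level labels.

```r
# A factor built from numbers
num_factor <- as.factor(c(1.1, 11, 2.2, 19, 21))

# as.numeric() returns the integer codes, not the original numbers
codes <- as.numeric(num_factor)
codes       # 1 3 2 4 5

# Convert to character first to recover the original values
recovered <- as.numeric(as.character(num_factor))
recovered   # 1.1 11.0 2.2 19.0 21.0
```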
Example of applying the as.factor() function to a data frame column
You can use the as.factor() function to convert a specific data frame column to a factor.
df <- data.frame(Singer = c("MJ", "Justin", "Drake", "Selena", "Rema", "Ed"),
Age = c(64, 30, 40, 30, 25, 38))
df$Singer <- as.factor(df$Singer)
print(df$Singer)
Output
[1] MJ Justin Drake Selena Rema Ed
Levels: Drake Ed Justin MJ Rema Selena
In this example, we created a data frame using the data.frame() function which has two columns.
- Singer
- Age
We converted the “Singer” column to the factor using the as.factor() function and printed the factor with six levels.
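When several columns are categorical, one possible approach is to convert them in a single step with lapply(). This is a sketch only: the `Genre` column and the `cat_cols` vector are hypothetical additions for illustration.

```r
# Hypothetical data frame with two categorical columns
df <- data.frame(
  Singer = c("MJ", "Justin", "Drake"),
  Genre  = c("Pop", "Pop", "Rap"),
  Age    = c(64, 30, 40),
  stringsAsFactors = FALSE
)

# Names of the columns to convert
cat_cols <- c("Singer", "Genre")

# Apply as.factor() to each selected column and assign the results back
df[cat_cols] <- lapply(df[cat_cols], as.factor)

# Singer and Genre are now factors; Age stays numeric
sapply(df, class)
```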
In-depth look at as.factor() function and its parameters
Changing the levels of a factor
The levels() function provides access to the levels attribute of a factor. Assigning a new vector with levels(x) <- value relabels the existing levels by position; it does not reorder them.
char_vec <- c("k", "b", "l", "c", "n", "d")
factr <- as.factor(char_vec)
levels(factr) <- c("k", "b", "l", "f", "d", "m", "n")
factr
Output
[1] f k d b m l
Levels: k b l f d m n
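In this example, we modified the factor levels by assigning a new vector with the levels() function. The assignment is positional: the original levels “b”, “c”, “d”, “k”, “l”, and “n” are renamed to “k”, “b”, “l”, “f”, “d”, and “m”, which is why the element that was “k” now prints as “f”. The seventh entry, “n”, is added as an extra level that no element uses.
The new vector must contain at least as many entries as the factor has levels; otherwise R raises an error.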
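Since assignment through levels() works by position, a safer pattern for renaming one level is to select it by its current label. A minimal sketch, reusing the same input values:

```r
factr <- as.factor(c("k", "b", "l", "c", "n", "d"))
levels(factr)         # "b" "c" "d" "k" "l" "n"

# Rename only the level currently labelled "c"
levels(factr)[levels(factr) == "c"] <- "f"

# Only the element that was "c" changes
as.character(factr)   # "k" "b" "l" "f" "n" "d"
levels(factr)         # "b" "f" "d" "k" "l" "n"
```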
Reordering the levels of a factor
Use the relevel() function to move one level of a factor to the first position, where it serves as the reference level. The other levels keep their relative order.
char_vec <- c("k", "b", "l", "c", "n", "d")
factr <- as.factor(char_vec)
factr <- relevel(factr, ref = "b")
factr
Output
[1] k b l c n d
Levels: b c d k l n
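In this code example, relevel() makes “b” the first (reference) level.
Because “b” is already first in the default alphabetical order, the levels stay “b”, “c”, “d”, “k”, “l”, and “n”. Choosing another level, such as ref = "k", would move that level to the front instead.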
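The effect of relevel() is easier to see with a level that is not already first. The sketch below also shows factor() with an explicit levels argument as one way to set a complete new order. The names `moved` and `reversed` are illustrative.

```r
factr <- as.factor(c("k", "b", "l", "c", "n", "d"))

# Move "k" to the front; the remaining levels keep their order
moved <- relevel(factr, ref = "k")
levels(moved)         # "k" "b" "c" "d" "l" "n"

# Specify a complete new order with factor()
reversed <- factor(factr, levels = rev(levels(factr)))
levels(reversed)      # "n" "l" "k" "d" "c" "b"

# Reordering levels does not change the data values
as.character(moved)   # "k" "b" "l" "c" "n" "d"
```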
Combining multiple factors into a single factor
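You can combine multiple factors into a single factor with the c() function. Since R 4.1.0, c() applied to factors returns a factor whose levels are the union of the input levels, so the as.factor() call around it is optional. In earlier versions, c() returns the integer codes instead, and the result differs from the output shown here.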
char_vec_one <- c("k", "b", "l")
char_vec_two <- c("c", "n", "d", "s")
factor_one <- as.factor(char_vec_one)
factor_two <- as.factor(char_vec_two)
combined_factor <- as.factor(c(factor_one, factor_two))
combined_factor
Output
[1] k b l c n d s
Levels: b k l c d n s
After running the above code, we get the combined factor, and if you see the values and levels of that factor, you will see that it is the combination of both factor_one and factor_two.
The resulting factor “combined_factor” has seven levels: “b”, “k”, “l”, “c”, “d”, “n” and “s”.
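If the code has to behave the same way regardless of how c() handles factors in the installed R version, one option is to convert each factor to character before combining. This is a sketch of that approach; note that the levels come out sorted alphabetically rather than in concatenated order.

```r
factor_one <- as.factor(c("k", "b", "l"))
factor_two <- as.factor(c("c", "n", "d", "s"))

# Combine the labels, not the internal integer codes
combined <- factor(c(as.character(factor_one), as.character(factor_two)))

as.character(combined)   # "k" "b" "l" "c" "n" "d" "s"
levels(combined)         # "b" "c" "d" "k" "l" "n" "s"
```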
Performing advanced operations on factors
Splitting a factor into multiple factors
You can use the split() function to split a factor into a list of factors in R. The unlist() call in the code below leaves a factor unchanged, so split() does the actual work.
char_vec_one <- c("k", "b", "l", "s", "d", "n")
factor_one <- as.factor(char_vec_one)
split <- split(unlist(factor_one), rep(1:2, c(3, 3)))
split
Output
$`1`
[1] k b l
Levels: b d k l n s
$`2`
[1] s d n
Levels: b d k l n s
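In this code, we split a factor into two factors using the split() function. The rep(1:2, c(3, 3)) argument assigns the first three elements to group 1 and the last three to group 2.
Each element of the result holds three values: “k”, “b”, and “l” for the first element, and “s”, “d”, and “n” for the second. Both elements still carry all six levels of the original factor.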
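If each piece should keep only the levels it actually contains, one way is to apply droplevels() to every element of the list. A minimal sketch; the names `groups`, `parts`, and `trimmed` are illustrative.

```r
factor_one <- as.factor(c("k", "b", "l", "s", "d", "n"))
groups <- rep(1:2, c(3, 3))   # 1 1 1 2 2 2

# split() keeps all six levels in each piece
parts <- split(factor_one, groups)
nlevels(parts[["1"]])         # 6

# droplevels() trims each piece to the levels it uses
trimmed <- lapply(parts, droplevels)
levels(trimmed[["1"]])        # "b" "k" "l"
levels(trimmed[["2"]])        # "d" "n" "s"
```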
Removing unused levels from a factor
You can use the droplevels() function to remove unused levels from a factor in R. It will remove any levels that do not exist in the factor.
char_vec_one <- c("k", "b", "l", "s", "d", "n")
factor_one <- factor(char_vec_one, levels = c("k", "b", "l", "s", "d", "n", "f", "m"))
drop <- droplevels(factor_one)
drop
Output
[1] k b l s d n
Levels: k b l s d n
In this example, the factor_one has levels “k”, “b”, “l”, “s”, “d”, “n”, “f” and “m”.
The levels “f” and “m” don’t exist in the factor, so they are removed by the droplevels() function.
The resulting factor “drop” has only six levels: “k”, “b”, “l”, “s”, “d” and “n”.
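Unused levels most often appear after subsetting, because a subset keeps every level of the original factor. The following sketch, using an assumed four-element factor, shows the effect and the droplevels() fix.

```r
# Hypothetical factor with four levels
letters_factor <- factor(c("k", "b", "l", "s"))

# Subsetting keeps all original levels, even those no longer present
subset_factor <- letters_factor[letters_factor %in% c("k", "b")]
levels(subset_factor)    # "b" "k" "l" "s"
table(subset_factor)     # counts of 0 for "l" and "s"

# droplevels() removes the now-unused levels
cleaned <- droplevels(subset_factor)
levels(cleaned)          # "b" "k"
```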
Conclusion
The as.factor() function is a wrapper for factor, allowing quick return if the input vector is already a factor.
The as.factor() function is helpful for converting a numeric or character vector to a factor.
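The quick return for existing factors has a visible consequence, sketched here with an assumed factor `f` that has one unused level: as.factor() hands the factor back untouched, while factor() rebuilds it and drops the unused level.

```r
# Hypothetical factor with an unused level "c"
f <- factor(c("a", "b"), levels = c("a", "b", "c"))

# as.factor() returns an existing factor unchanged
identical(as.factor(f), f)   # TRUE
levels(as.factor(f))         # "a" "b" "c"

# factor() rebuilds the factor and drops the unused level
levels(factor(f))            # "a" "b"
```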

Krunal Lathiya is a Software Engineer with over eight years of experience. He has developed a strong foundation in computer science principles and a passion for problem-solving. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in R Language.