
Understanding the as.factor() Function in R: Converting to Categorical Data


Before diving into the as.factor() function, let’s understand categorical data and why it is helpful in statistical analysis.

Categorical data is data divided into distinct groups or categories. For example, colors (red, green, blue), brands (Nike, Puma, Asics), and types of cuisine (Italian, Chinese, Indian) are all categorical.

Data generally falls into two types:

  1. Numerical data: It consists of numbers or numeric values.
  2. Categorical data: It consists of labels you can divide into specific groups or categories, such as mode of transportation and colors.

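The main difference between numerical and categorical data is that numerical data measures a quantity, so arithmetic such as adding or averaging is meaningful, whereas categorical data represents characteristics or attributes and is descriptive in nature. Some categories have a natural order (Small, Medium, Large), but the distance between them is not a measurable quantity.

In R, categorical data is represented by factors. Factors are helpful in statistical modeling because modeling functions treat each factor level as a distinct category rather than as a continuous number or arbitrary text.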

What is the as.factor() function?

The as.factor() function converts a vector object to a factor in R. 

For example, if you have a vector of t-shirt sizes, as.factor() makes R treat “Small”, “Medium”, and “Large” as distinct categories rather than as plain text.
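The t-shirt case can be sketched as follows. The vector name `sizes` and its values are made up for illustration.

```r
# Hypothetical t-shirt sizes stored as plain text
sizes <- c("Small", "Large", "Medium", "Small", "Large")

# Convert to a factor so R treats the values as categories
size_factor <- as.factor(sizes)

# Levels are the distinct categories, sorted alphabetically by default
levels(size_factor)

# table() counts how many observations fall into each category
table(size_factor)
```

Because as.factor() sorts levels alphabetically, the levels here come out as Large, Medium, Small. When the natural order matters, factor() with an explicit levels argument is one way to set it.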




Syntax

as.factor(x)

Parameters

Name            Description
x (required)    The vector to convert to a factor.

Return value

It returns a factor object.
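One way to confirm the return type is with is.factor() and class(). This is a small sketch; the vector `x` is a made-up input.

```r
# Hypothetical character input
x <- c("a", "b", "a")

f <- as.factor(x)

is.factor(f)   # TRUE
class(f)       # "factor"

# A factor is stored internally as integer codes plus a levels attribute
typeof(f)      # "integer"

# Passing an existing factor to as.factor() returns it unchanged
identical(as.factor(f), f)   # TRUE
```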

Example 1: Converting numeric vector to factor

When you convert a numeric vector to a factor, each unique value of the vector becomes a level of the factor, and the levels are sorted in increasing numeric order.

mixed_vec <- c(1.1, 11, 2.2, 19)

# Convert the numeric vector to a factor and print it
as.factor(mixed_vec)

[1] 1.1 11 2.2 19
Levels: 1.1 2.2 11 19
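A common pitfall after this conversion is turning the factor back into numbers. The sketch below reuses the same vector; the name `f` is introduced here only for illustration.

```r
mixed_vec <- c(1.1, 11, 2.2, 19)
f <- as.factor(mixed_vec)

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(f)                   # 1 3 2 4

# Converting through as.character() first recovers the original numbers
values <- as.numeric(as.character(f))    # 1.1 11 2.2 19

codes
values
```

The codes are positions in the sorted levels (1.1, 2.2, 11, 19), which is why 11 maps to 3 and 2.2 maps to 2.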

Example 2: Converting character vector to factor

For a character vector, the as.factor() function turns each unique string into a level of the factor, with levels sorted alphabetically.

char_vec <- c("zack", "john", "jian")

# Convert the character vector to a factor and print it
as.factor(char_vec)

[1] zack john jian
Levels: jian john zack
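To inspect the result further, levels(), nlevels(), and as.integer() show how the factor is organized. This is a short sketch using the same vector, with `f` as an illustrative name.

```r
char_vec <- c("zack", "john", "jian")
f <- as.factor(char_vec)

# The distinct categories, in alphabetical order
levels(f)       # "jian" "john" "zack"

# The number of distinct categories
nlevels(f)      # 3

# The integer code of each element, that is, its position in levels(f)
as.integer(f)   # 3 2 1
```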

Example 3: Converting data frame column to factor

You can also use as.factor() to convert a single column of a data frame to a factor.

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh"),
  score = c(85, 90, 78),
  subject = c("Math", "Math", "History"),
  grade = c("10th", "11th", "11th")
)

# Convert the grade column to a factor
df$grade <- as.factor(df$grade)

# Print the converted column
df$grade

[1] 10th 11th 11th
Levels: 10th 11th
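If more than one column should be categorical, the same idea extends with lapply(). This is a minimal sketch, not the only approach; the choice of columns in `cat_cols` is an assumption made for illustration, and stringsAsFactors = FALSE is added so the result does not depend on the R version.

```r
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh"),
  score = c(85, 90, 78),
  subject = c("Math", "Math", "History"),
  grade = c("10th", "11th", "11th"),
  stringsAsFactors = FALSE  # keep text columns as character in every R version
)

# Columns to treat as categorical (an illustrative choice)
cat_cols <- c("subject", "grade")

# Apply as.factor() to each selected column and assign the results back
df[cat_cols] <- lapply(df[cat_cols], as.factor)

# Check which columns are now factors
sapply(df, is.factor)
```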

That’s all!
