**Introduction to Factors in R**

**Definition of factors in R**

A factor is an R data type for storing categorical variables.

Internally, a factor is a vector of integer codes paired with a set of character labels called the **“levels”**; each distinct integer maps to one label.

Use the **factor()** function to create a factor in R.
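To make the integer-plus-labels storage concrete, here is a minimal sketch that builds a small factor and inspects both parts. The `sizes` vector is made-up data used only for illustration.

```r
# Hypothetical data: clothing sizes with repeated values
sizes <- factor(c("small", "large", "medium", "small", "large"))

# The labels: the unique values, sorted alphabetically by default
levels(sizes)        # "large" "medium" "small"

# The integer codes, one per element, pointing into the levels
as.integer(sizes)    # 3 1 2 3 1

# The underlying storage type is integer
typeof(sizes)        # "integer"
```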

**Why factors are essential in data analysis**

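Factors are helpful in data analysis because they store each category once as a level and represent every observation by a small integer code, which makes categorical data straightforward to group, tabulate, and model.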

Categorical data represents characteristics or attributes of observations, such as the type of a product, the ethnicity of a person, or the color of clothes.

Factors are also essential because many statistical and machine learning models need numeric input. R's modelling functions recognise factors and encode each level numerically (for example, as 0/1 indicator columns), so categorical predictors can be used directly.
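As a rough illustration of that encoding, the sketch below passes a factor to `model.matrix()` from the stats package, which builds the numeric design matrix used by functions such as `lm()`. The `colour` vector is hypothetical, and the result assumes R's default treatment contrasts.

```r
# Hypothetical categorical predictor with three categories
colour <- factor(c("red", "green", "blue", "green"))

# model.matrix() expands the factor into an intercept plus one 0/1
# indicator column for each level except the first (reference) level
design <- model.matrix(~ colour)
design

# Column names: "(Intercept)" "colourgreen" "colourred"
colnames(design)
```

The first level ("blue" here, because levels are sorted alphabetically) becomes the baseline that the other levels are compared against.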

**What is as.factor in R**

The **as.factor()** function is a built-in **R** function that converts a vector, such as a character vector, a numeric vector, or a data frame column, to a factor.

The **as.factor()** function takes a single vector as its argument and returns it as a factor. To convert a data frame column, pass the column itself (for example `df$col`) rather than the whole data frame.

**Syntax**

`as.factor(x)`

**Parameters**

**x:** The vector to convert, for example a character vector, a numeric vector, or a column of a data frame.

**Return value**

It returns a factor with the same length as the input. Its levels are the sorted unique values of the input.

**Usage**

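- The **as.factor()** function can convert a character vector to a factor.
- The **as.factor()** function can convert a numeric vector to a factor.
- It sets the levels of the factor automatically from the sorted unique values.
- It has no argument for choosing levels; to specify or reorder the levels of a factor, use **factor()**, **levels()**, or **relevel()**.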

**Coding implementation**

Let’s define a character vector using the c() function.

```
data <- c("m", "l", "a")
as.factor(data)
```

**Output**

```
[1] m l a
Levels: a l m
```

**How to use as.factor in R**

**Using the as.factor() function on a numeric vector**

Apply the **as.factor()** function to a numeric vector to convert it to a factor. The levels are sorted numerically before they are turned into labels.

```
data <- c(1.1, 11, 2.2, 19, 21)
as.factor(data)
```

**Output**

```
[1] 1.1 11 2.2 19 21
Levels: 1.1 2.2 11 19 21
```

**Example of converting a character vector to a factor**

You can convert a character vector to a factor using the **as.factor()** function, which takes a vector of character values and returns a factor.

```
data <- c("zack", "synder", "cut")
as.factor(data)
```

**Output**

```
[1] zack synder cut
Levels: cut synder zack
```

**Example of converting a factor to a character vector**

Use the **as.character()** function to convert a factor to a character vector.

```
data <- c("zack", "synder", "cut")
factr <- as.factor(data)
chr <- as.character(factr)
chr
```

**Output**

`[1] "zack" "synder" "cut"`

Using the **as.numeric()** function, you can convert a factor to its underlying integer codes.

```
data <- c("zack", "synder", "cut")
factr <- as.factor(data)
intr <- as.numeric(factr)
intr
```

**Output**

`[1] 3 2 1`

We used the **as.numeric()** function to convert a factor to a numeric vector. The numeric codes are the positions of each value in the sorted levels (cut, synder, zack): “**zack**” corresponds to 3, “**synder**” corresponds to 2, and “**cut**” corresponds to 1.
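Because **as.numeric()** returns level codes, it does not recover the original numbers from a factor whose labels are numeric. A minimal sketch of the usual workaround, using made-up values, is to go through **as.character()** first:

```r
# A factor whose labels happen to look like numbers (hypothetical values)
f <- as.factor(c(10, 20, 10, 30))

# as.numeric() alone returns the level codes, not the labels
as.numeric(f)                 # 1 2 1 3

# Converting to character first recovers the original values
as.numeric(as.character(f))   # 10 20 10 30
```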

**Example of using the as.factor() function on a data frame column**

You can use the **as.factor()** function to convert a specific data frame column to a factor.

```
df <- data.frame(Singer = c("MJ", "Justin", "Drake", "Selena", "Rema", "Ed"),
Age = c(64, 30, 40, 30, 25, 38))
df$Singer <- as.factor(df$Singer)
print(df$Singer)
```

**Output**

```
[1] MJ Justin Drake Selena Rema Ed
Levels: Drake Ed Justin MJ Rema Selena
```

In this example, we created a data frame using the **data.frame()** function which has two columns.

- Singer
- Age

We converted the **“Singer”** column to a factor using the **as.factor()** function and printed the factor, which has six levels.
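When several columns need converting, one possible approach is to apply **as.factor()** to every character column at once. The sketch below is an illustration rather than part of the original example: the `Genre` column is hypothetical, and `stringsAsFactors = FALSE` is set so the columns start as character vectors in older R versions too.

```r
df <- data.frame(
  Singer = c("MJ", "Justin", "Drake"),
  Genre  = c("Pop", "Pop", "Hip hop"),   # hypothetical extra column
  Age    = c(64, 30, 40),
  stringsAsFactors = FALSE
)

# Flag the character columns, then convert each one with as.factor()
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], as.factor)

# Singer and Genre are now factors; Age is still numeric
sapply(df, class)
```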

**In-depth look at as.factor() function and its parameters**

**Changing the levels of a factor**

The **levels()** function provides access to the levels attribute of a variable. Assign a new vector to **levels()** to rename the levels of a factor in R.

```
char_vec <- c("k", "b", "l", "c", "n", "d")
factr <- as.factor(char_vec)
levels(factr) <- c("k", "b", "l", "f", "d", "m", "n")
factr
```

**Output**

```
[1] f k d b m l
Levels: k b l f d m n
```

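In this example, we assigned a new levels vector with the **levels()** function. The assignment renames levels by position rather than rearranging them: the original sorted levels “b”, “c”, “d”, “k”, “l”, “n” become “k”, “b”, “l”, “f”, “d”, “m”, which is why the printed values differ from **char_vec**. The seventh label, “n”, is added as an extra level that no value uses.

To change the set or order of levels without renaming any values, pass the desired levels to **factor()** through its `levels` argument instead.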
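Assigning to **levels()** relabels existing levels, while the `levels` argument of **factor()** chooses which levels exist and in what order. The following sketch contrasts the two on the same input; the variable names `renamed` and `extended` are illustrative.

```r
char_vec <- c("k", "b", "l", "c", "n", "d")

# Renaming: replace the label "c" with "f"; the stored values change
renamed <- as.factor(char_vec)            # levels: b c d k l n
levels(renamed)[levels(renamed) == "c"] <- "f"
as.character(renamed)                     # "k" "b" "l" "f" "n" "d"

# Reordering and extending: every value stays exactly as it was
extended <- factor(char_vec, levels = c("k", "b", "l", "c", "n", "d", "m"))
as.character(extended)                    # "k" "b" "l" "c" "n" "d"
levels(extended)                          # "k" "b" "l" "c" "n" "d" "m"
```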

**Reordering the levels of a factor**

Use the **relevel()** function to reorder the levels of a **factor** in **R**. It moves the level given in `ref` to the first position and leaves the other levels in their existing order.

```
char_vec <- c("k", "b", "l", "c", "n", "d")
factr <- as.factor(char_vec)
factr <- relevel(factr, ref = "b")
factr
```

**Output**

```
[1] k b l c n d
Levels: b c d k l n
```

In this code example, the **relevel()** function makes “b” the first (reference) level. Because **as.factor()** sorts levels alphabetically, “b” was already first, so the order is unchanged here.

The resulting factor has levels “b”, “c”, “d”, “k”, “l”, and “n”.
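To see **relevel()** actually change the order, pick a reference level that is not already first. This sketch reuses the same vector with `ref = "k"`; the name `moved` is illustrative.

```r
char_vec <- c("k", "b", "l", "c", "n", "d")
factr <- as.factor(char_vec)
levels(factr)                        # "b" "c" "d" "k" "l" "n"

# Move "k" to the front; the other levels keep their relative order
moved <- relevel(factr, ref = "k")
levels(moved)                        # "k" "b" "c" "d" "l" "n"

# Only the level order changed, not the values
identical(as.character(moved), char_vec)   # TRUE
```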

**Combining multiple factors into a single factor**

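You can combine multiple factors into a single factor with the **c()** function. Since R 4.1.0, **c()** applied to factors returns a factor whose levels are the union of the inputs' levels, so the surrounding **as.factor()** call returns that result unchanged. In earlier versions, **c()** returned the integer codes instead.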

```
char_vec_one <- c("k", "b", "l")
char_vec_two <- c("c", "n", "d", "s")
factor_one <- as.factor(char_vec_one)
factor_two <- as.factor(char_vec_two)
combined_factor <- as.factor(c(factor_one, factor_two))
combined_factor
```

**Output**

```
[1] k b l c n d s
Levels: b k l c d n s
```

The combined factor contains the values of **factor_one** followed by those of **factor_two**.

The resulting factor “combined_factor” has seven levels: “b”, “k”, “l”, “c”, “d”, “n” and “s”, that is, the levels of **factor_one** followed by the levels of **factor_two**.
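The behaviour of **c()** on factors changed in R 4.1.0, so a version-independent alternative is to combine the labels as character vectors and convert once at the end. This is a sketch of that approach, not the method used above; note that the levels come out fully sorted rather than in the order shown in the previous output.

```r
factor_one <- as.factor(c("k", "b", "l"))
factor_two <- as.factor(c("c", "n", "d", "s"))

# Combine the labels (not the integer codes), then convert to a factor
combined <- as.factor(c(as.character(factor_one), as.character(factor_two)))

as.character(combined)   # "k" "b" "l" "c" "n" "d" "s"
levels(combined)         # "b" "c" "d" "k" "l" "n" "s"
```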

**Performing advanced operations on factors**

**Splitting a factor into multiple factors**

You can use the **split()** function to split a factor into multiple factors in R. The **unlist()** call in the example is optional, because **unlist()** returns a factor unchanged.

```
char_vec_one <- c("k", "b", "l", "s", "d", "n")
factor_one <- as.factor(char_vec_one)
# Store the result as `parts` so the variable does not mask the split() function
parts <- split(unlist(factor_one), rep(1:2, c(3, 3)))
parts
```

**Output**

```
$`1`
[1] k b l
Levels: b d k l n s
$`2`
[1] s d n
Levels: b d k l n s
```

In this code, we split a factor into two factors using the **split()** and **unlist()** functions.

Each element is a factor with three values: **“k”**, **“b”**, and **“l”** for the first element, and **“s”**, **“d”**, and **“n”** for the second element. Both elements keep all six levels of the original factor, as the output shows.
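If each piece should carry only the levels it actually uses, one option is to apply **droplevels()** to every element of the list. The sketch below assumes the same input as above; the names `parts` and `trimmed` are illustrative.

```r
factor_one <- as.factor(c("k", "b", "l", "s", "d", "n"))
parts <- split(factor_one, rep(1:2, c(3, 3)))

# Each piece still carries all six original levels
nlevels(parts[["1"]])               # 6

# droplevels() on each piece keeps only the levels in use
trimmed <- lapply(parts, droplevels)
levels(trimmed[["1"]])              # "b" "k" "l"
levels(trimmed[["2"]])              # "d" "n" "s"
```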

**Removing unused levels from a factor**

You can use the **droplevels()** function to remove unused levels from a factor in R. It removes every level that none of the factor's values use.

```
char_vec_one <- c("k", "b", "l", "s", "d", "n")
factor_one <- factor(char_vec_one, levels = c("k", "b", "l", "s", "d", "n", "f", "m"))
drop <- droplevels(factor_one)
drop
```

**Output**

```
[1] k b l s d n
Levels: k b l s d n
```

In this example, the factor_one has levels “k”, “b”, “l”, “s”, “d”, “n”, “f” and “m”.

The levels **“f”** and **“m”** are not used by any value in the factor, so they are removed by the **droplevels()** function.

The resulting factor “drop” has only six levels: “k”, “b”, “l”, “s”, “d” and “n”.
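One place where unused levels become visible is in frequency tables, where they show up with a count of zero. A small sketch with made-up data:

```r
# Hypothetical factor with one declared level ("f") that never occurs
f <- factor(c("k", "b", "k"), levels = c("k", "b", "f"))

# table() reports every level, including the unused one with count 0
counts_all <- table(f)
counts_all

# After droplevels(), the unused level no longer appears
counts_used <- table(droplevels(f))
counts_used
```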

**Conclusion**

The **as.factor()** function is a wrapper for **factor()** that returns quickly when the input vector is already a factor.

The **as.factor()** function is helpful for converting a numeric or character vector to a factor.
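The quick-return behaviour has one visible consequence, sketched below with a hypothetical factor: **as.factor()** hands an existing factor back untouched, while **factor()** rebuilds it and drops unused levels.

```r
# Hypothetical factor with a declared but unused level
f <- factor(c("low", "high", "low"), levels = c("low", "high", "unused"))

# as.factor() returns an existing factor unchanged, unused levels included
identical(as.factor(f), f)   # TRUE

# factor() rebuilds the factor and keeps only the levels in use
levels(factor(f))            # "low" "high"
```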

