What is stringsAsFactors in R

stringsAsFactors is a logical argument of the data.frame() function. It controls whether the character columns of a data frame are converted to factor variables (TRUE) or kept as plain strings (FALSE).

netflix_data <- data.frame(
  show_id = c(1:4),
  show_name = c("Cabinet of Curiosities", "Stranger Things", 
                "Rick and Morty", "Locke and Key"),
  seasons = c(1, 4, 6, 3),
  stringsAsFactors = FALSE
)

print(netflix_data)

Output

  show_id              show_name seasons
1       1 Cabinet of Curiosities       1
2       2        Stranger Things       4
3       3         Rick and Morty       6
4       4          Locke and Key       3

We used stringsAsFactors = FALSE so that show_name stays a character column, which lets us change its strings or add new ones later without being limited to a fixed set of factor levels.

In R versions before 4.0.0, strings were converted to factors by default; since R 4.0.0, the default is stringsAsFactors = FALSE. A factor stores data compactly: each unique string becomes a level, and the column itself holds only the integer codes that point to those levels.
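As a rough illustration of how a factor stores its data, the sketch below builds a factor from a small character vector and inspects its levels and integer codes. The vector `genres` is a made-up example and not part of the data frame above.

```r
# Hypothetical example vector, used only for illustration
genres <- c("drama", "comedy", "drama", "sci-fi")
genre_factor <- factor(genres)

# The unique strings, sorted alphabetically
levels(genre_factor)      # "comedy" "drama" "sci-fi"

# The integer codes stored for each element
as.integer(genre_factor)  # 2 1 2 3
```

Each element is kept as a small integer, and the strings themselves are stored only once in the levels.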

If you assign a value to a factor column that is not one of its existing levels, R does not store it: you get a warning ("invalid factor level, NA generated") and the cell becomes NA.
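The following sketch shows what happens when a value outside the existing levels is assigned to a factor column, and one possible way around it. The data frame `shows` and the value "Dark" are illustrative assumptions, not taken from the example above.

```r
# Hypothetical data frame with show_name stored as a factor
shows <- data.frame(
  show_name = c("Stranger Things", "Rick and Morty"),
  stringsAsFactors = TRUE
)

# "Dark" is not an existing level, so R warns and stores NA instead
shows$show_name[2] <- "Dark"
was_na <- is.na(shows$show_name[2])  # TRUE

# One way around it: add the new level first, then assign
levels(shows$show_name) <- c(levels(shows$show_name), "Dark")
shows$show_name[2] <- "Dark"
```

With stringsAsFactors = FALSE, the same assignment would work directly because the column is a plain character vector.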

To avoid converting strings to factors when using base R functions such as data.frame(), pass stringsAsFactors = FALSE.
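A quick way to see the effect of the argument is to build the same column both ways and check its class. This is a minimal sketch. The object names are illustrative, and the read.csv() call uses an inline text string in place of a real file.

```r
# Same data, built with and without factor conversion
as_strings <- data.frame(
  show_name = c("Rick and Morty", "Locke and Key"),
  stringsAsFactors = FALSE
)
as_factors <- data.frame(
  show_name = c("Rick and Morty", "Locke and Key"),
  stringsAsFactors = TRUE
)

class(as_strings$show_name)  # "character"
class(as_factors$show_name)  # "factor"

# read.csv() accepts the same argument; inline text stands in for a file
csv_df <- read.csv(text = "show_name\nLocke and Key",
                   stringsAsFactors = FALSE)
class(csv_df$show_name)      # "character"
```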

Conclusion

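Before R 4.0.0, R converted character columns to factors by default when creating data frames; from R 4.0.0 onward, the default is stringsAsFactors = FALSE. If your code has to run on older versions, or you simply want to be explicit, pass stringsAsFactors = FALSE to keep strings as characters.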
