%in% in R: How to Check Element in Vector or DataFrame

Base R has no built-in %notin% operator; to negate %in%, apply ! to the whole expression, as in !(x %in% y). In this article, we will learn how to use %in% to check whether an element belongs to a vector or a data frame column, create a new column in a data frame, and remove a column from a data frame.

How to Check Element in Vector or data frame in R

To check if the element belongs to a vector or dataframe in R, use the %in% operator. 

%in% in R can be used with a data frame in the following ways.

  • To create a new column (variable) based on the values of existing columns.
  • To select columns of a data frame by name.
  • To delete or drop a column of a data frame.

%in% in R

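The %in% operator is a built-in R operator that tests membership: for each element on its left-hand side, it returns TRUE if that element appears in the vector on its right-hand side and FALSE otherwise. The values can be numbers, strings, or any other basic type, and the result has the same length as the left-hand side. To use it with a data frame, apply it to a column or to the column names.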
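As a minimal sketch of what the operator returns when the left-hand side has more than one element, consider the snippet below. The names x and lookup and their values are made up for illustration. The result is TRUE, TRUE, FALSE: one entry per element of x. The last lines compare the result with the equivalent match() call, which is how the R documentation describes %in%.

```r
# Illustrative values, not taken from the article
x <- c(2, 5, 12)
lookup <- 1:10

# One TRUE/FALSE per element of x
res <- x %in% lookup
print(res)

# %in% is documented as a thin wrapper around match()
same <- match(x, lookup, nomatch = 0) > 0
print(identical(res, same))
```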

%in% operator example

To use the %in% operator, define two single-element vectors and one sequence, and then check whether each vector's value occurs in the sequence.

v1 <- 4
v2 <- 11
t <- 1:10

print(v1 %in% t)
print(v2 %in% t)

Output

RScript Pro.R
[1] TRUE
[1] FALSE

You can see that v1 %in% t returns TRUE because 4 is one of the values in 1:10, whereas v2 %in% t returns FALSE because 11 is not.
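The opposite check, "is not in", can be sketched as follows. Base R does not ship a %notin% operator, so the helper defined here is a hypothetical convenience built with Negate(); the names v and s are also illustrative.

```r
v <- c(4, 11)
s <- 1:10

# Negate the whole %in% expression; the parentheses make the intent explicit
not_in <- !(v %in% s)
print(not_in)

# Hypothetical helper: %notin% is not part of base R, so it is defined here
`%notin%` <- Negate(`%in%`)
print(v %notin% s)
```

Both calls give the same logical vector, so defining the helper is purely a matter of readability.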

%in% operator in R for dataframe

A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable, and each row contains one set of values from each column.

Let’s define a dataframe.

df <- data.frame(
 group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
 name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
 price = c(3000, 1000, 300, 200, 500)
)

print(df)

Output

       group          name price
1     Laptop     Alienware  3000
2     Mobile iPhone 12 Pro  1000
3       Game          Xbox   300
4 Assistants         Alexa   200
5       Game Playstation 5   500

If you look at the data frame carefully, the Game group appears twice. Now, we will check which rows of the group column contain Game. The %in% operator returns one value per row: TRUE where the group is Game and FALSE elsewhere. Since Game appears twice, we will get two TRUEs and three FALSEs.

df <- data.frame(
 group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
 name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
 price = c(3000, 1000, 300, 200, 500)
)
print(df$group %in% "Game")

Output

[1] FALSE FALSE  TRUE FALSE  TRUE

You can see that the Game group appears in the third and fifth rows, which is why the result is TRUE at those two positions. Every row whose group is not Game gives FALSE.
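A logical vector like this is most often used as a row filter. The sketch below keeps only the rows whose group is one of several values; the choice of "Game" and "Mobile" and the variable names wanted, mask, and games_and_mobiles are illustrative assumptions.

```r
df <- data.frame(
  group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
  name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
  price = c(3000, 1000, 300, 200, 500)
)

# Logical mask: TRUE where group is one of the wanted values
wanted <- c("Game", "Mobile")  # illustrative choice
mask <- df$group %in% wanted

# Keep only the matching rows, with all columns
games_and_mobiles <- df[mask, ]
print(games_and_mobiles)
```

This keeps rows 2, 3, and 5 of the original data frame.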

Create a column of the data frame using the %in% operator in R

You can create a new column in the data frame using the %in% operator. Creating a new column here means creating a new variable.

df <- data.frame(
 group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
 name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
 price = c(3000, 1000, 300, 200, 500)
)

df_new <- within(df, {
 # Start with "No" for every row
 is_game <- rep("No", length(group))
 # Mark a row as "Yes" if its name or its group matches
 is_game[name %in% c("Xbox", "Playstation 5")] <- "Yes"
 is_game[group %in% c("Game")] <- "Yes"
})

print(df_new)

Output

       group          name price is_game
1     Laptop     Alienware  3000      No
2     Mobile iPhone 12 Pro  1000      No
3       Game          Xbox   300     Yes
4 Assistants         Alexa   200      No
5       Game Playstation 5   500     Yes

In this example, we create a new variable called is_game. Using the %in% operator, it is assigned Yes if the group is Game or the name is Xbox or Playstation 5; otherwise, it is assigned No. So, every value in the new is_game column is either Yes or No.
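The same column can also be built in one step with ifelse(). This is an alternative sketch, not the approach used above; the intermediate name is_console is an assumption introduced for readability.

```r
df <- data.frame(
  group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
  name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
  price = c(3000, 1000, 300, 200, 500)
)

# Single condition: group is Game, or name is one of the two consoles
is_console <- df$group %in% "Game" | df$name %in% c("Xbox", "Playstation 5")

# Map TRUE/FALSE to "Yes"/"No" and store the result as a new column
df$is_game <- ifelse(is_console, "Yes", "No")
print(df)
```

Combining the two %in% tests with | keeps the whole rule in one expression, which can be easier to audit than several successive assignments.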

Drop column using %in% operator

You can use the %in% operator to drop or delete a column from the data frame.

df <- data.frame(
 group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
 name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
 price = c(3000, 1000, 300, 200, 500)
)

print(df)
cat("After removing price column", "\n")
df[, !(colnames(df) %in% c("price"))]

Output

       group          name price
1     Laptop     Alienware  3000
2     Mobile iPhone 12 Pro  1000
3       Game          Xbox   300
4 Assistants         Alexa   200
5       Game Playstation 5   500

After removing price column

       group          name
1     Laptop     Alienware
2     Mobile iPhone 12 Pro
3       Game          Xbox
4 Assistants         Alexa
5       Game Playstation 5

You can see in the output that the third column, price, is absent from the result. Because the result is not assigned to a variable, df itself still contains the price column.
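To keep the change, the result has to be assigned. The sketch below does that and also shows the reverse case, where the same column mask without negation selects columns instead of dropping them. The names df_no_price and df_price_only are hypothetical, and drop = FALSE is added as a precaution so that the result stays a data frame even when only one column remains.

```r
df <- data.frame(
  group = c("Laptop", "Mobile", "Game", "Assistants", "Game"),
  name = c("Alienware", "iPhone 12 Pro", "Xbox", "Alexa", "Playstation 5"),
  price = c(3000, 1000, 300, 200, 500)
)

# Drop the price column and store the result; df itself is untouched
df_no_price <- df[, !(colnames(df) %in% c("price")), drop = FALSE]

# Without the negation, the same mask keeps only the listed columns
df_price_only <- df[, colnames(df) %in% c("price"), drop = FALSE]

print(colnames(df_no_price))
print(is.data.frame(df_price_only))
```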

Difference Between the == and %in% Operators in R

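The %in% operator tests membership: for each element on the left, it checks whether that element matches any value on the right. The == operator, on the other hand, is a comparison operator that compares two vectors element by element, recycling the shorter one if needed, and returns NA when either side is NA.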
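The difference is easiest to see side by side. The vectors below are made up for illustration; the sketch contrasts NA handling and shows what happens when == is given several values on the right.

```r
x <- c(1, 2, 3, NA)

# == compares element by element and propagates NA
eq <- x == 2
print(eq)

# %in% asks, for each element of x, whether it appears anywhere on the right;
# NA on the left simply gives FALSE here
mem <- x %in% c(2, 3)
print(mem)

# With several values on the right, == recycles them pairwise
# (1 == 2, 2 == 3, 3 == 2, 4 == 3) instead of testing membership
pairwise <- c(1, 2, 3, 4) == c(2, 3)
print(pairwise)
```

The last comparison is all FALSE even though 2 and 3 both occur on the left, which is why %in% is the safer choice when the right-hand side holds more than one value.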

Conclusion

The %in% operator in R checks whether each element of one vector appears in another vector, which can also be a data frame column or a vector of column names. Its typical use is to match values between two different vectors.
