Detailed Guide on %in% Operator in R

The %in% operator checks whether each element of the left-hand vector is present in the right-hand vector. It returns a logical vector of the same length as the left-hand argument: TRUE where the element has a match, and FALSE otherwise.

When you apply the %in% operator to character vectors, matching is case-sensitive. For example, "A" %in% c("a", "B") returns FALSE. The operator is helpful for subsetting, data filtering, and conditional checks in data analysis.
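To make the length rule and the case sensitivity concrete, here is a minimal sketch. The vectors fruits and allowed are made up for illustration, and lower-casing both sides with tolower() is just one possible way to get a case-insensitive check; %in% does not do this on its own.

```r
# Hypothetical example vectors (not from the article)
fruits  <- c("Apple", "banana", "Cherry", "kiwi")
allowed <- c("apple", "banana")

# The result has one entry per element of the left-hand vector
res <- fruits %in% allowed
print(res)
# [1] FALSE  TRUE FALSE FALSE

# Case-insensitive variant: lower-case both sides before matching
res_ci <- tolower(fruits) %in% tolower(allowed)
print(res_ci)
# [1]  TRUE  TRUE FALSE FALSE
```

Note that "Apple" only matches once both sides are normalized to the same case.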

Vectors

Figure 1: Usage of the %in% operator
vec <- 4 
main_vec <- 1:5 

print(vec %in% main_vec)

# [1] TRUE

Let’s check for an element that does not exist in the main vector.

Figure 2: Check for an element that does not exist
vec <- 6
main_vec <- 1:5

print(vec %in% main_vec)

# [1] FALSE

Handling NA

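With ==, NA == NA evaluates to NA, but %in% is built on match(), which treats NA as matching NA. That is why an NA on the left returns TRUE when the right-hand vector also contains NA.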

left_vec <- c(11, NA, 19)
right_vec <- c(21, NA, 18)

left_vec %in% right_vec

# [1] FALSE  TRUE FALSE

If the right-hand vector contains no NA, an NA on the left has nothing to match, and the result is FALSE (not NA).
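A short sketch contrasting == and %in% on missing values. It uses only base R, and the values are arbitrary.

```r
# == propagates missingness: the comparison itself is NA
print(NA == NA)
# [1] NA

# %in% relies on match(), which lets NA match NA
print(NA %in% c(1, NA))
# [1] TRUE

# With no NA on the right-hand side the answer is FALSE, never NA
print(NA %in% c(1, 2))
# [1] FALSE

# match() returns the position of the first match
print(match(NA, c(1, NA)))
# [1] 2
```

Because %in% never returns NA, its result can be used directly inside if() or as a subsetting index.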

Data frame

Figure 3: Using %in% operator with data frame
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Vector to check against 'col2'
vec <- c(4, 5, 8)

# Check if elements of vec are in 'col2' of df
is_contain <- vec %in% df$col2

# Print the result
print(is_contain)

Output

[1]  TRUE  TRUE FALSE

Adding a column with the help of %in%

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Vector to check against column 'col2'
vec <- c(4, 5, 8)

# Check whether each value in column 'col2' appears in vec
df$is_contain <- df$col2 %in% vec

# Print the updated data frame
print(df)

Creating a column of the data frame using the %in% operator

In this updated df, the new column is_contain will be TRUE for rows where the value in column col2 matches any element in the vector vec and FALSE otherwise.

For the given data, rows 1 and 2 will have is_contain as TRUE, while the third row is FALSE.
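The same logical vector can also filter rows, which is one of the subsetting uses mentioned at the start. Below is a minimal sketch that reuses the article's column names in a fresh data frame, df_demo (a hypothetical name), so that it runs on its own.

```r
df_demo <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
vec <- c(4, 5, 8)

# Keep only the rows whose col2 value appears in vec
kept <- df_demo[df_demo$col2 %in% vec, ]
print(kept)
#   col1 col2 col3
# 1    1    4    7
# 2    2    5    8

# Negate the test to keep the rows that do NOT match
dropped <- df_demo[!(df_demo$col2 %in% vec), ]
print(dropped)
#   col1 col2 col3
# 3    3    6    9
```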

Dropping column

Dropping a column from a data frame using the %in% operator involves a combination of this operator and the subsetting features.

While %in% is typically used for checking membership, it can indirectly exclude specific columns from a data frame.

For example, to drop a set of columns by name, use this operator to build a logical vector indicating which columns to keep (i.e., those whose names are not in your set), and then subset the data frame with it.

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("Before removing the column", "\n")
print(df)

# Column to drop
columns_to_drop <- c("col2")

# Subset the data frame to keep columns not in 'columns_to_drop'
# (drop = FALSE keeps the result a data frame even if only one column remains)
df <- df[, !(names(df) %in% columns_to_drop), drop = FALSE]

# Print the updated data frame
cat("After removing the column", "\n")
print(df)

Drop a column using %in% operator
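One detail worth knowing when dropping columns this way: if only one column survives, base R simplifies the result to a plain vector unless drop = FALSE is passed. Here is a sketch with a hypothetical two-column data frame, df_small:

```r
df_small <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 5, 6))
columns_to_drop <- c("col2")
keep <- !(names(df_small) %in% columns_to_drop)

# Without drop = FALSE the single remaining column becomes a vector
as_vector <- df_small[, keep]
print(is.data.frame(as_vector))
# [1] FALSE

# With drop = FALSE the result stays a data frame
as_df <- df_small[, keep, drop = FALSE]
print(is.data.frame(as_df))
# [1] TRUE
print(names(as_df))
# [1] "col1"
```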

Lists

You can also use the %in% operator to check whether a value is an element of a list.

main_list <- list(a = 1:3, b = "mongo", c = TRUE)

"mongo" %in% main_list
# Output: [1] TRUE (because "mongo" is itself a top-level element of the list)

x <- 3
x %in% main_list

# Output: [1] FALSE 
# Not a direct element

The string "mongo" is itself a top-level element of main_list, so the first check returns TRUE. The value 3 is not a top-level element; it only occurs inside the vector stored in element a, so the second check returns FALSE. The %in% operator does not search inside nested elements.
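If the goal is to find a value anywhere inside the list, one possible approach is to flatten the list or to test each element in turn. This sketch uses base R's unlist() and vapply(); neither is part of the original example, and the name found_in is hypothetical.

```r
main_list <- list(a = 1:3, b = "mongo", c = TRUE)

# Flatten the list first; every value is coerced to character here
flat <- unlist(main_list)
print(3 %in% flat)
# [1] TRUE

# Alternative: test each element separately and keep types intact
found_in <- vapply(main_list, function(el) 3 %in% el, logical(1))
print(names(found_in)[found_in])
# [1] "a"
```

The second form also reports which list element holds the value, at the cost of a little more code.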

That’s it!
