Detailed Guide on the %in% Operator in R

The %in% operator checks if elements of one vector are present in another vector. It returns a logical vector of the same length as the left-side argument.

Each element of the output is TRUE if the corresponding left-side value matches any element of the right-side vector, and FALSE otherwise.

If you apply the %in% operator to character vectors, matching is case-sensitive.

For example, "A" %in% c("a", "B") returns FALSE. The operator is helpful for subsetting, data filtering, and conditional checks in data analysis.
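To make the case-sensitivity point concrete, here is a minimal sketch. The vectors fruits and wanted are made up for illustration, and lower-casing both sides with tolower() is just one common workaround when case should be ignored.

```r
fruits <- c("apple", "Banana", "cherry")
wanted <- c("APPLE", "banana")

# Exact matching is case-sensitive, so neither value is found
exact <- wanted %in% fruits
print(exact)
# [1] FALSE FALSE

# Lower-casing both sides makes the comparison case-insensitive
relaxed <- tolower(wanted) %in% tolower(fruits)
print(relaxed)
# [1] TRUE TRUE
```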

Vectors

vec <- 4
main_vec <- 1:5

print(vec %in% main_vec)

# [1] TRUE

Let’s check for an element that does not exist in the main vector.

vec <- 6
main_vec <- 1:5

print(vec %in% main_vec)

# [1] FALSE
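Both examples above use a single value on the left side. As a small sketch with illustrative values, a longer left-side vector yields one logical value per element, and that result can be used directly for subsetting.

```r
vec <- c(2, 6, 4, 9)
main_vec <- 1:5

# One TRUE/FALSE per element of vec
result <- vec %in% main_vec
print(result)
# [1]  TRUE FALSE  TRUE FALSE

# Use the logical vector to keep only the values that were found
found <- vec[result]
print(found)
# [1] 2 4
```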

Handling NA

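The %in% operator treats NA as matching NA, so an NA on the left side returns TRUE when the right-side vector also contains NA. This differs from the == operator, where NA == NA evaluates to NA rather than TRUE.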

left_vec <- c(11, NA, 19)

right_vec <- c(21, NA, 18)

left_vec %in% right_vec

# Output: FALSE TRUE FALSE

If the right-side vector contains no NA, an NA on the left side returns FALSE. Likewise, a non-NA value never matches NA.
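The following sketch contrasts == with %in% when NA is involved. Unlike ==, %in% is built on match() and always returns TRUE or FALSE, never NA.

```r
# Direct comparison with NA yields NA, not TRUE
eq_result <- NA == NA
print(eq_result)
# [1] NA

# %in% relies on match(), which treats NA as matching NA
in_result <- NA %in% c(1, NA)
print(in_result)
# [1] TRUE

# A non-NA value does not match NA, and the result is FALSE rather than NA
other_result <- 5 %in% c(NA, 1)
print(other_result)
# [1] FALSE
```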

Data frame

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Vector to check against 'col2'
vec <- c(4, 5, 8)

# Check if elements of vec are in 'col2' of df
is_contain <- vec %in% df$col2

# Print the result
print(is_contain)

Output

[1]  TRUE  TRUE FALSE
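For data filtering, the column usually goes on the left side so that the result has one value per row. A minimal sketch using the same data frame (the name filtered is an illustrative choice):

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
vec <- c(4, 5, 8)

# Keep only the rows whose col2 value appears in vec
filtered <- df[df$col2 %in% vec, ]
print(filtered)
#   col1 col2 col3
# 1    1    4    7
# 2    2    5    8
```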

Adding a column with the help of %in%

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Vector to check against column 'col2'
vec <- c(4, 5, 8)

# Check, row by row, whether the value in 'col2' appears in vec
df$is_contain <- df$col2 %in% vec

# Print the updated data frame
print(df)

In this updated df, the new column is_contain will be TRUE for rows where the value in column col2 matches any element in the vector vec and FALSE otherwise.

For the given data, rows 1 and 2 will have is_contain as TRUE, while the third row is FALSE.
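Keeping the column on the left side of %in% gives one logical value per row, so the lookup vector does not need to have the same length as the data frame. A short sketch, where lookup is a hypothetical vector chosen for illustration:

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Hypothetical lookup vector with a different length than the data frame
lookup <- c(6, 4)

# One logical value per row, because df$col2 is on the left-hand side
df$is_contain <- df$col2 %in% lookup
print(df$is_contain)
# [1]  TRUE FALSE  TRUE
```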

Dropping column

To drop a column from a data frame, combine the %in% operator with subsetting.

While the %in% operator is typically used for checking membership, it can also be used to indirectly exclude specific columns from a data frame.

For example, if you want to drop several columns by name, you can use this operator to create a logical vector marking which columns should be kept (i.e., those not in your vector of names) and then subset the data frame accordingly.

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("Before removing the column", "\n")
df

# Column to drop
columns_to_drop <- c("col2")

# Subset the data frame to keep columns not in 'columns_to_drop'
# (drop = FALSE keeps the result a data frame even if only one column remains)
df <- df[, !(names(df) %in% columns_to_drop), drop = FALSE]

# Print the updated data frame
cat("After removing the column", "\n")
print(df)
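If you negate %in% often, one option is to wrap it in a small "not in" helper. The sketch below assumes a user-defined operator named %notin%, which is not part of base R, and drops two columns at once.

```r
# Hypothetical helper: a "not in" operator built from %in%
`%notin%` <- Negate(`%in%`)

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
columns_to_drop <- c("col2", "col3")

# drop = FALSE keeps the result a data frame even though one column remains
df <- df[, names(df) %notin% columns_to_drop, drop = FALSE]
print(names(df))
# [1] "col1"
```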

Lists

You can also use the %in% operator to check whether a value matches a top-level element of a list.

main_list <- list(a = 1:3, b = "mongo", c = TRUE)

"mongo" %in% main_list
# Output: [1] TRUE (because "mongo" is itself a top-level element of the list)

x <- 3
x %in% main_list

# Output: [1] FALSE
# 3 is not a top-level element of the list

The value "mongo" is itself a top-level element of main_list, so the first check returns TRUE.

However, the value 3 is not a top-level element of the list, so the second check returns FALSE. It only occurs inside the vector stored in element a.
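If the goal is to find a value nested inside any list element, one possible approach is to apply %in% to each element separately. This is a sketch under that assumption; the names hits and in_any_element are illustrative.

```r
main_list <- list(a = 1:3, b = "mongo", c = TRUE)
x <- 3

# Test each list element on its own; vapply() returns one logical per element
hits <- vapply(main_list, function(el) x %in% el, logical(1))

# TRUE if x occurs inside at least one element
in_any_element <- any(hits)
print(in_any_element)
# [1] TRUE

# Names of the elements that contain x
print(names(hits)[hits])
# [1] "a"
```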

That’s it!

Krunal Lathiya
