How to Select Rows by Single or Multiple Conditions in R

Filtering rows by condition isolates the data relevant to an analysis, which makes patterns and trends in it easier to identify. In R, this means selecting the rows of a data frame for which one or more conditions are true.

Here are the two most prominent ways to select rows by single or multiple conditions in R:

  1. Subsetting with [ ]
  2. Using dplyr::filter()

Method 1: Subsetting with [ ]

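Subsetting with square brackets ([ ]) is the basic way to select the rows in which a column meets a condition. For example, df[df$column_name > value, ]. The comma matters: the condition goes before it to select rows, and leaving the part after it empty keeps all columns.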

Here is the input data frame we will use throughout this tutorial:

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

Single condition

Let’s apply a single condition to filter rows using subsetting.

We want the rows where score is greater than 85.

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

df[df$score > 85, ]

Output

    name score subject grade
1 Millie    90 Biology    12
2 Yogita    95 Biology    12

The column we filter on is “score”, which we select with the df$score syntax. The greater-than (>) operator then keeps only the rows with scores greater than 85.
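
To make the mechanics visible, here is a minimal sketch that stores the condition in a variable before subsetting. The names keep and result are illustrative only and are not part of the original example.

```r
df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

# The comparison returns one TRUE/FALSE value per row
keep <- df$score > 85
print(keep)  # TRUE TRUE FALSE

# Rows where keep is TRUE are returned; the empty slot after the comma keeps all columns
result <- df[keep, ]
print(result)
```

Storing the logical vector first can make longer conditions easier to inspect and debug.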

Multiple conditions

To apply multiple conditions at once, combine them with the AND (&) or OR (|) operator. Make sure to use the vectorized logical operators & and |, not the scalar ones && and ||, which expect single values rather than one value per row.

Let’s select only the rows where score > 70 and grade == 11.

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

df[df$score > 70 & df$grade == 11, ]

Output

  name score subject grade
3  KMJ    77 Biology    11

Only row 3 satisfies both conditions, so it is the only row in the output.
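
The OR operator mentioned above works the same way. As a short sketch, with a threshold of 90 chosen purely for illustration, the following keeps rows that satisfy at least one of two conditions:

```r
df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

# Keep rows where score is above 90 OR grade is 11 (either condition is enough)
either <- df[df$score > 90 | df$grade == 11, ]
print(either)
```

Millie is excluded here because her score of 90 is not strictly greater than 90 and her grade is 12.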

Handling NA values

If a column contains NAs, a comparison on it returns NA for those rows. Subsetting with [ ] does not drop such rows; it returns them as rows filled with NA. To avoid this, check for missing values with is.na() as part of the condition.

Our data frame does not contain any NA values, so we will use a different data frame that does.

df <- data.frame(
  name = c("Millie", "Yogita", NA),
  score = c(90, 95, NA),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

df[!is.na(df$score) & df$score > 70, ]

Output

    name score subject grade
1 Millie    90 Biology    12
2 Yogita    95 Biology    12

The third row has an NA score, so the !is.na(df$score) check excludes it, and we get only the two rows with scores greater than 70.
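
To see why the NA check matters, here is a small sketch comparing an unguarded condition with one wrapped in which(), a base R function that returns only the positions where a condition is TRUE. The variable names unguarded and guarded are illustrative.

```r
df <- data.frame(
  name = c("Millie", "Yogita", NA),
  score = c(90, 95, NA),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

# Without a guard, the NA in the condition produces an extra row filled with NA
unguarded <- df[df$score > 70, ]
print(nrow(unguarded))  # 3

# which() keeps only the positions where the condition is TRUE, dropping NA
guarded <- df[which(df$score > 70), ]
print(nrow(guarded))  # 2
```

Either which() or an explicit !is.na() check should work; the explicit check states the intent more directly.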

String matching with grepl()

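The grepl() function returns TRUE for each element that matches a pattern, so we can use it to select rows based on the values of a string column. Here we keep the rows whose name contains "gita".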

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

df[grepl("gita", df$name), ]

Output

    name score subject grade
2 Yogita    95 Biology    12
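
Because grepl() returns a logical vector, it can be combined with other conditions like any comparison. The sketch below uses patterns chosen for illustration; it shows a case-insensitive match through the ignore.case argument and a pattern match combined with a numeric condition.

```r
df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

# Case-insensitive match: names that start with "m" or "M"
starts_with_m <- df[grepl("^m", df$name, ignore.case = TRUE), ]
print(starts_with_m)

# Pattern match combined with a numeric condition
i_and_high <- df[grepl("i", df$name) & df$score > 90, ]
print(i_and_high)
```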

Method 2: Using dplyr::filter()

The dplyr filter() function subsets a data frame, retaining all rows that satisfy your conditions. For example, df %>% filter(column_name > value).

dplyr is a third-party package, so we need to install it once and then load it in each session:

install.packages("dplyr")


library(dplyr)

Single condition

Let’s select the rows where score > 80 using filter() with the pipe operator (%>%).

library(dplyr)

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

df %>% filter(score > 80)

Output

    name score subject grade
1 Millie    90 Biology    12
2 Yogita    95 Biology    12
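
One difference from [ ] subsetting is worth a quick sketch: filter() drops rows where the condition evaluates to NA instead of returning them as NA rows. The data frame df_na and the result names below are illustrative, and the code assumes dplyr is installed.

```r
library(dplyr)

# Illustrative data frame with a missing name and score in the third row
df_na <- data.frame(
  name = c("Millie", "Yogita", NA),
  score = c(90, 95, NA),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

# The row with the NA score is dropped, since its condition is NA rather than TRUE
kept <- df_na %>% filter(score > 80)
print(kept)

# To keep rows with missing scores as well, ask for them explicitly
with_missing <- df_na %>% filter(is.na(score) | score > 80)
print(with_missing)
```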

Multiple conditions

For multiple conditions, separate them with commas, as in df %>% filter(first_column > value1, second_column == "value2"), or put & between them. Both forms keep only the rows that satisfy every condition.

library(dplyr)

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

df %>% filter(score > 70 & grade == 11)

Output

  name score subject grade
1  KMJ    77 Biology    11
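
As a brief sketch of the two spellings described above, the following checks that comma-separated conditions and & give the same rows, and shows | for an OR condition. The result names and the threshold of 90 are illustrative, and the code assumes dplyr is installed.

```r
library(dplyr)

df <- data.frame(
  name = c("Millie", "Yogita", "KMJ"),
  score = c(90, 95, 77),
  subject = c("Biology", "Biology", "Biology"),
  grade = c(12, 12, 11)
)

# Comma-separated conditions are combined with AND
with_comma <- df %>% filter(score > 70, grade == 11)
with_and <- df %>% filter(score > 70 & grade == 11)
print(isTRUE(all.equal(with_comma, with_and)))  # TRUE

# Use | when a row only needs to satisfy one of the conditions
either <- df %>% filter(score > 90 | grade == 11)
print(either)
```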

Both methods are flexible; choose whichever fits your workflow and reads best in your code.

Published by
Krunal Lathiya