R grep(): Finding the Position of a Matched Pattern

The grep() function in R searches for matches to a pattern within a character vector. It returns the indices of the matching elements or, with value = TRUE, the matching elements themselves. It belongs to a family of pattern-matching functions that also includes grepl(), regexpr(), gregexpr(), sub(), and gsub().
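As a quick illustration of how grep() relates to its closest sibling, the sketch below compares it with grepl() on a small example vector (the vector contents are just sample data). grep() reports positions, while grepl() reports one logical value per element.

```r
vec <- c("Amazon", "Apple", "Netflix", "Spotify")

# grep() returns the positions of the matching elements
idx <- grep("i", vec)
print(idx)

# grepl() returns one TRUE/FALSE per element instead
flags <- grepl("i", vec)
print(flags)

# which() on the logical vector recovers the same positions
print(identical(which(flags), idx))
```

grepl() is often the more convenient choice inside if() conditions or when filtering data frames, because its result always has the same length as the input.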

Syntax

grep(
  pattern,
  x,
  ignore.case = FALSE,
  perl = FALSE,
  value = FALSE,
  fixed = FALSE,
  useBytes = FALSE,
  invert = FALSE
)

Parameters

Name Description
pattern A regular expression or a fixed string to search for.
x The character vector in which the search occurs.
ignore.case FALSE by default. If set to TRUE, the match is case-insensitive.
perl FALSE by default. If set to TRUE, the pattern is interpreted as a Perl-compatible regular expression.
value FALSE by default. If set to TRUE, the matching elements are returned instead of their indices.
fixed FALSE by default. If set to TRUE, pattern is treated as a fixed string, not a regular expression.
useBytes FALSE by default. If set to TRUE, matching is done byte by byte rather than character by character.
invert FALSE by default. If set to TRUE, the elements that do not match are returned.

Return Value

By default, grep() returns an integer vector holding the indices of the elements of x that match the pattern. If you pass value = TRUE, it returns a character vector of the matching elements instead. If nothing matches, the result has length zero.
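The two return modes can be compared side by side. The following is a minimal sketch on sample data; the pattern "zzz" is chosen only because it matches nothing.

```r
x <- c("Amazon", "Apple", "Netflix", "Spotify")

idx  <- grep("pp", x)                # integer positions
vals <- grep("pp", x, value = TRUE)  # the matching strings

print(idx)
print(vals)

# No match: zero-length results rather than NA or an error
none_idx  <- grep("zzz", x)
none_vals <- grep("zzz", x, value = TRUE)
print(length(none_idx))
print(length(none_vals))
```

Checking length(result) > 0 is therefore a simple way to test whether anything matched.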

Basic usage

vec <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("i", vec))

Output

[1] 3 4

Case Insensitivity

By default, matching is case-sensitive. In the example below, we search for the character “a” in the input character vector. Without ignore.case, only “Amazon” would match, because “Apple” contains an uppercase “A” but no lowercase “a”.

Because we pass “ignore.case = TRUE”, the pattern matches either “a” or “A”, so grep() returns the indices of both “Amazon” and “Apple”.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("a", rv, ignore.case = TRUE))

# Output: [1] 1 2
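To see what ignore.case changes, it can help to run the same search both ways. This is a small sketch using the same sample vector:

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Case-sensitive (default): only "Amazon" contains a lowercase "a"
sensitive <- grep("a", rv)
print(sensitive)

# Case-insensitive: "Apple" now matches through its uppercase "A"
insensitive <- grep("a", rv, ignore.case = TRUE)
print(insensitive)
```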

Passing multiple patterns

In a regular expression, the | operator means “or”, so a single pattern can search for several alternatives at once. grep() returns the indices of the elements that contain at least one of them.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("o|i", rv, ignore.case = TRUE))

# Output: [1] 1 3 4
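When the alternatives live in a vector, one possible approach is to join them with paste() before calling grep(). In this sketch the terms vector is hypothetical, and it assumes the terms contain no regex metacharacters (otherwise they would need escaping first).

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Hypothetical list of search terms, joined with the regex "or" operator
terms <- c("flix", "pp")
pattern <- paste(terms, collapse = "|")
print(pattern)

# Positions of the elements containing "flix" or "pp"
hits <- grep(pattern, rv)
print(hits)
```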

Returning Matching Elements

If you set value = TRUE, grep() returns the matching elements themselves instead of their indices.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("o|i", rv, ignore.case = TRUE, value = TRUE))

# Output: [1] "Amazon"  "Netflix" "Spotify"
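For a plain character vector like this one, value = TRUE gives the same result as subsetting the vector with the returned indices. A short sketch to confirm the equivalence:

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Ask grep() for the values directly
by_value <- grep("o|i", rv, ignore.case = TRUE, value = TRUE)

# Or take the indices and subset the vector yourself
by_index <- rv[grep("o|i", rv, ignore.case = TRUE)]

print(identical(by_value, by_index))
```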

Fixed Strings

To treat the pattern as a literal string rather than a regular expression, use fixed = TRUE. Metacharacters such as . and | then have no special meaning:

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("A", rv, fixed = TRUE))

# Output: [1] 1 2
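The effect of fixed = TRUE is easier to see with a pattern that contains a regex metacharacter. In this sketch the file names are made up for illustration:

```r
files <- c("data.csv", "data_csv", "datacsv")

# As a regex, "." matches any single character
regex_hits <- grep("a.c", files, value = TRUE)
print(regex_hits)

# As a fixed string, "." matches only a literal dot
fixed_hits <- grep("a.c", files, value = TRUE, fixed = TRUE)
print(fixed_hits)
```

The regex version also picks up "data_csv", because the underscore satisfies the "any character" wildcard, while the fixed version keeps only "data.csv".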

Perl-compatible regular expressions

Setting perl = TRUE switches grep() to Perl-compatible regular expressions (PCRE), which support additional constructs, such as lookaheads, for more complex patterns.

IDs <- c("ID:219", "ID:4567", "ID:89")

# match exactly three digits
grep("^ID:\\d{3}$", IDs, perl = TRUE, value = TRUE)

# Output: [1] "ID:219"

Only "ID:219" has exactly three digits after "ID:", so it is the only element returned.
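Lookaheads are one example of a construct that relies on perl = TRUE. The sketch below, using made-up sample codes, keeps only strings that contain at least one digit and at least one lowercase letter:

```r
codes <- c("abc123", "abcdef", "123456")

# (?=.*[0-9]) requires a digit somewhere in the string,
# (?=.*[a-z]) requires a lowercase letter somewhere in the string
mixed <- grep("^(?=.*[0-9])(?=.*[a-z])", codes, perl = TRUE, value = TRUE)
print(mixed)
```

Without perl = TRUE, this pattern is expected to be rejected as an invalid regular expression, since R's default engine does not support the (?=...) syntax.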

Invert match (elements that don’t match)

Passing invert = TRUE reverses the result: grep() returns only the elements (or, without value = TRUE, their indices) that do not match the pattern.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

grep("A", rv, invert = TRUE, value = TRUE)

# Output: [1] "Netflix" "Spotify"
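invert = TRUE also works without value = TRUE, in which case the indices of the non-matching elements come back. As a sketch, the same filtering can be written with a negated grepl() call:

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Indices of the elements that do NOT contain "A"
inverted_idx <- grep("A", rv, invert = TRUE)
print(inverted_idx)

# Equivalent filtering with a negated logical match
same <- rv[!grepl("A", rv)]
print(same)
```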

Searching file names returned by list.files()

Let’s say you want to list only the CSV files in your current directory.

Here, you can combine the list.files() function with grep() to get precisely what you want.

csv_files <- grep("\\.csv$", list.files(), value = TRUE)

print(csv_files)

# Output:
# [1] "data_types.csv" "data.csv" "input_domains.csv"
# [4] "missing_data.csv"
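To try this without depending on the contents of your working directory, the sketch below builds a temporary folder with hypothetical file names. It also shows that list.files() accepts a pattern argument of its own, which gives the same result here.

```r
# Work in a throwaway directory with hypothetical file names
tmp <- file.path(tempdir(), "grep_demo")
dir.create(tmp, showWarnings = FALSE)
invisible(file.create(file.path(tmp, c("data.csv", "notes.txt", "csv_readme.md"))))

# Filter the listing with grep() ...
via_grep <- grep("\\.csv$", list.files(tmp), value = TRUE)

# ... or let list.files() apply the regular expression itself
via_pattern <- list.files(tmp, pattern = "\\.csv$")

print(via_grep)
print(identical(via_grep, via_pattern))
```

The $ anchor matters: "csv_readme.md" contains "csv" but does not end in ".csv", so it is excluded.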

That’s it.
