R grep(): Finding a Position of Matched Pattern

The grep() function in R searches for matches to a pattern within a character vector and returns either the indices or the values of the matching elements. It belongs to a family of pattern-matching functions that also includes grepl(), regexpr(), gregexpr(), sub(), and gsub().

Syntax

grep(
  pattern,
  x,
  ignore.case = FALSE,
  perl = FALSE,
  value = FALSE,
  fixed = FALSE,
  useBytes = FALSE,
  invert = FALSE
)

Parameters

Name Description
pattern The regular expression to search for, or a fixed string when fixed = TRUE.
x The character vector in which the search occurs.
ignore.case If TRUE, the match is case-insensitive. Defaults to FALSE.
perl If TRUE, the pattern is interpreted as a Perl-compatible regular expression. Defaults to FALSE.
value If TRUE, the matching elements are returned instead of their indices. Defaults to FALSE.
fixed If TRUE, the pattern is treated as a fixed string, not a regular expression. Defaults to FALSE.
useBytes If TRUE, matching is done byte by byte rather than character by character. Defaults to FALSE.
invert If TRUE, the indices or values of the elements that do not match are returned. Defaults to FALSE.

Return Value

By default, grep() returns an integer vector with the indices of the elements of x that matched. If you pass value = TRUE, it returns a character vector containing the matching elements themselves. If nothing matches, the result is a vector of length zero.
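As a quick check of the two return types, here is a minimal sketch. The vector `fruits` is made-up sample data, not taken from the examples in this post.

```r
# Hypothetical sample data, used only for illustration
fruits <- c("apple", "banana", "cherry")

idx  <- grep("an", fruits)                # integer indices of matching elements
vals <- grep("an", fruits, value = TRUE)  # the matching elements themselves

print(idx)
print(vals)

# When nothing matches, the result has length zero
none <- grep("zz", fruits)
print(length(none))
```

Checking `length()` on the result is a simple way to test whether anything matched before using the indices.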

Basic usage

vec <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("i", vec))

Output

[1] 3 4

Case Insensitivity

Setting ignore.case = TRUE makes the match case-insensitive. In the example below, we search for the character “a”. Without ignore.case, only “Amazon” would match, because it is the only element that contains a lowercase “a”. With ignore.case = TRUE, an uppercase “A” also counts as a match, so the index of “Apple” is returned as well.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("a", rv, ignore.case = TRUE))

# Output: [1] 1 2
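To see what the argument changes, the sketch below runs the same search with and without ignore.case. The variable names are chosen here for illustration.

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

case_sensitive   <- grep("a", rv)                      # only a lowercase "a" counts
case_insensitive <- grep("a", rv, ignore.case = TRUE)  # "a" or "A" counts

print(case_sensitive)
print(case_insensitive)
```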

Passing multiple patterns

To search for several patterns at once, separate them with the regular expression alternation operator |. grep() then returns the indices of the elements that contain at least one of the patterns.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("o|i", rv, ignore.case = TRUE))

# Output: [1] 1 3 4
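When the search terms are stored in a vector, one possible approach is to join them with paste() before calling grep(). The `terms` vector below is a hypothetical example.

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Hypothetical list of search terms
terms <- c("flix", "zon")

# Join the terms with "|" so that any one of them can match
pattern <- paste(terms, collapse = "|")

matches <- grep(pattern, rv)
print(pattern)
print(matches)
```

This sketch assumes the terms contain no regular expression metacharacters such as "." or "("; terms like that would need escaping first.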

Returning Matching Elements

If you set the argument value=TRUE, it will return the actual matched elements themselves instead of their indices.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("o|i", rv, ignore.case = TRUE, value = TRUE))

# Output: [1] "Amazon" "Netflix" "Spotify"
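For a plain character vector like this one, value = TRUE gives the same result as subsetting the vector with the returned indices. A small sketch to confirm it:

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

idx      <- grep("o|i", rv, ignore.case = TRUE)
by_index <- rv[idx]                                            # subset by index
by_value <- grep("o|i", rv, ignore.case = TRUE, value = TRUE)  # ask for values directly

print(identical(by_index, by_value))
```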

Fixed Strings

To match the pattern as a fixed string rather than a regular expression, use fixed = TRUE:

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

print(grep("A", rv, fixed = TRUE))

# Output: [1] 1 2
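The difference only shows up when the pattern contains a regular expression metacharacter. In the sketch below, the file names are hypothetical; "." matches any character as a regular expression, but only a literal dot when fixed = TRUE.

```r
# Hypothetical file names, used only for illustration
files <- c("report.csv", "report_csv", "reportXcsv")

as_regex <- grep(".csv", files)                # "." matches any single character
as_fixed <- grep(".csv", files, fixed = TRUE)  # "." matches only a literal dot

print(as_regex)
print(as_fixed)
```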

Perl-compatible regular expressions

Setting perl = TRUE switches grep() to Perl-compatible regular expressions (PCRE), which support more advanced constructs for writing complex patterns.

IDs <- c("ID:219", "ID:4567", "ID:89")

# match exactly three digits
grep("^ID:\\d{3}$", IDs, perl = TRUE, value = TRUE)

# Output: [1] "ID:219"

The pattern requires exactly three digits after “ID:”, so only the first element, “ID:219”, matches.
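Lookaheads are one construct that relies on the Perl-compatible engine. As a minimal sketch, the pattern below uses a negative lookahead to keep IDs whose number does not start with 4; the variable name `not_four` is chosen here for illustration.

```r
IDs <- c("ID:219", "ID:4567", "ID:89")

# (?!4) is a negative lookahead: the character after "ID:" must not be "4"
not_four <- grep("^ID:(?!4)\\d+$", IDs, perl = TRUE, value = TRUE)

print(not_four)
```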

Invert match (elements that don’t match)

Passing invert = TRUE reverses the selection: grep() returns only the elements that do not match the pattern.

rv <- c("Amazon", "Apple", "Netflix", "Spotify")

grep("A", rv, invert = TRUE, value = TRUE)

# Output: [1] "Netflix" "Spotify"
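One reason to prefer invert = TRUE over negative indexing is the case where nothing matches. The sketch below uses a pattern, "z9", that is assumed not to occur in any element.

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# No element contains "z9", so grep() returns integer(0)
hits <- grep("z9", rv)

dropped_by_index <- rv[-hits]  # -integer(0) selects nothing, so every element is lost
kept_by_invert   <- grep("z9", rv, invert = TRUE, value = TRUE)

print(length(dropped_by_index))
print(kept_by_invert)
```

With negative indexing the result is empty, while invert = TRUE keeps all four elements as intended.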

Searching file names returned by list.files()

Let’s say you want to list only the CSV files in your current directory. You can pass the output of the list.files() function to grep() and keep the names that end in “.csv”.

csv_files <- grep("\\.csv$", list.files(), value = TRUE)

print(csv_files)

# Output:
# [1] "data_types.csv" "data.csv" "input_domains.csv"
# [4] "missing_data.csv"
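To try this without depending on the contents of your working directory, the sketch below creates a few empty files in a temporary folder. The file names are hypothetical. It also compares the result with the pattern argument of list.files(), which filters file names by regular expression directly.

```r
# Work in a throwaway directory so the example does not touch real files
tmp <- tempfile("grep_demo_")
dir.create(tmp)
file.create(file.path(tmp, c("data.csv", "notes.txt", "csv_readme.md")))

via_grep    <- grep("\\.csv$", list.files(tmp), value = TRUE)
via_pattern <- list.files(tmp, pattern = "\\.csv$")

print(via_grep)
print(identical(via_grep, via_pattern))

unlink(tmp, recursive = TRUE)  # clean up the temporary directory
```

The "$" anchor matters here: without it, a name such as "csv_readme.md" would not match "\\.csv", but a name like "data.csv.bak" would.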

That’s it.
