The grep() function in R searches for matches to a pattern within a character vector. It returns indices or values of elements that match the pattern. It is part of a family of functions that includes grepl(), regexpr(), gregexpr(), sub(), and gsub().
Syntax
grep(
pattern,
x,
ignore.case = FALSE,
perl = FALSE,
value = FALSE,
fixed = FALSE,
useBytes = FALSE,
invert = FALSE
)
Parameters
Name | Description |
pattern | A regular expression or a fixed string to search for. |
x | The character vector in which the search occurs. |
ignore.case | FALSE by default. If set to TRUE, the match is case-insensitive. |
perl | FALSE by default. If set to TRUE, the pattern is interpreted as a Perl-compatible regular expression. |
value | FALSE by default. If set to TRUE, the matching elements are returned instead of their indices. |
fixed | FALSE by default. If set to TRUE, pattern is treated as a fixed string, not a regular expression. |
useBytes | FALSE by default. If set to TRUE, matching is done byte by byte rather than character by character. |
invert | FALSE by default. If set to TRUE, the elements that do not match are returned. |
Return Value
By default, it returns an integer vector of the indices of the elements of x that match. If you pass value = TRUE, it returns a character vector of the matching elements themselves. When nothing matches, the result is a zero-length vector.
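Here is a small sketch of the two return types. The vector fruits is a made-up example, not part of the data used elsewhere in this article.

```r
# Hypothetical example vector for illustration
fruits <- c("apple", "banana", "cherry")

idx  <- grep("an", fruits)                # integer indices of matching elements
vals <- grep("an", fruits, value = TRUE)  # the matching elements themselves
none <- grep("z", fruits)                 # no match: zero-length integer vector

print(idx)
# [1] 2
print(vals)
# [1] "banana"
print(length(none))
# [1] 0
```

Checking length() of the result is a simple way to test whether anything matched.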
Basic usage
vec <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("i", vec))
Output
[1] 3 4
Case Insensitivity
In the example below, we search for the character "a" in the input character vector. No element contains a lowercase "a" as its first letter, but two elements start with "A". Because we pass ignore.case = TRUE, grep() looks for either "a" or "A" and returns the indices of every element that contains one of them.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("a", rv, ignore.case = TRUE))
# Output: [1] 1 2
Passing multiple patterns
To search for several patterns at once, join them with the regular-expression alternation operator |. The function then returns the indices of the elements that contain at least one of the patterns.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("o|i", rv, ignore.case = TRUE))
# Output: [1] 1 3 4
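When the search terms live in a vector, one possible approach is to build the alternation with paste(). The terms vector below is hypothetical, and the sketch assumes the terms contain no regex metacharacters (otherwise they would need escaping).

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Hypothetical search terms; paste() joins them with "|" (regex OR)
terms <- c("zon", "fli")
pattern <- paste(terms, collapse = "|")  # "zon|fli"

matches <- grep(pattern, rv)
print(matches)
# [1] 1 3
```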
Returning Matching Elements
If you set the argument value=TRUE, it will return the actual matched elements themselves instead of their indices.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("o|i", rv, ignore.case = TRUE, value = TRUE))
# Output: [1] "Amazon" "Netflix" "Spotify"
Fixed Strings
To match the pattern as a fixed string rather than a regular expression, use fixed = TRUE:
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("A", rv, fixed = TRUE))
# Output: [1] 1 2
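The difference becomes visible when the pattern contains a character that has a special meaning in regular expressions. A minimal sketch, using made-up file names:

```r
# Hypothetical file names for illustration
files <- c("report.csv", "reportXcsv", "notes.txt")

as_regex <- grep(".csv", files)                # "." matches any single character
as_fixed <- grep(".csv", files, fixed = TRUE)  # "." must be a literal dot

print(as_regex)
# [1] 1 2
print(as_fixed)
# [1] 1
```

With fixed = TRUE, there is no need to escape the dot.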
Perl-compatible regular expressions
Setting perl = TRUE makes grep() interpret the pattern as a Perl-compatible regular expression (PCRE), which is useful for more complex patterns.
IDs <- c("ID:219", "ID:4567", "ID:89")
# match exactly three digits
grep("^ID:\\d{3}$", IDs, perl = TRUE, value = TRUE)
# Output: [1] "ID:219"
We searched for elements with exactly three digits after the "ID:" prefix. Only the first element, "ID:219", qualifies.
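One feature that, to my knowledge, requires perl = TRUE is lookahead. The sketch below reuses the same IDs vector; the choice of digits to look for is arbitrary.

```r
IDs <- c("ID:219", "ID:4567", "ID:89")

# Lookahead (?=...) checks what follows "ID:" without consuming it.
# Keep the IDs whose number starts with 2 or 4.
m <- grep("^ID:(?=[24])", IDs, perl = TRUE, value = TRUE)
print(m)
# [1] "ID:219"  "ID:4567"
```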
Invert match (elements that don’t match)
Passing invert = TRUE reverses the selection: grep() returns only the elements that do not match the pattern.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
grep("A", rv, invert = TRUE, value = TRUE)
# Output: [1] "Netflix" "Spotify"
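As a quick sanity check, the matching and non-matching indices together should cover every position of the vector. The sketch below also shows an equivalent form based on grepl(), which is one common alternative rather than the only way to do it.

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

matched     <- grep("A", rv)                 # indices 1 and 2
not_matched <- grep("A", rv, invert = TRUE)  # indices 3 and 4

# Together the two index sets cover every position exactly once
print(sort(c(matched, not_matched)))
# [1] 1 2 3 4

# Equivalent selection using a negated logical vector from grepl()
via_grepl <- rv[!grepl("A", rv)]
print(via_grepl)
# [1] "Netflix" "Spotify"
```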
Searching file names returned by list.files()
Let's say you want a list of only the CSV files in your current directory. You can pass the result of the list.files() function to grep() to get precisely that.
csv_files <- grep("\\.csv$", list.files(), value = TRUE)
print(csv_files)
# Output:
# [1] "data_types.csv" "data.csv" "input_domains.csv"
# [4] "missing_data.csv"
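Because the output above depends on what is in your working directory, here is a reproducible sketch that creates a throwaway directory first. The directory name grep_demo and the file names are made up for illustration.

```r
# Create a temporary directory with hypothetical file names
tmp <- file.path(tempdir(), "grep_demo")
dir.create(tmp, showWarnings = FALSE)
invisible(file.create(file.path(tmp, c("data.csv", "notes.txt", "csv_readme.md"))))

# Keep only names that end in ".csv"
csv_files <- grep("\\.csv$", list.files(tmp), value = TRUE)
print(csv_files)
# [1] "data.csv"

# list.files() can also do the filtering itself via its pattern argument
same <- list.files(tmp, pattern = "\\.csv$")

unlink(tmp, recursive = TRUE)  # clean up
```

Note that "csv_readme.md" is excluded because the pattern anchors ".csv" to the end of the name with $.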
That’s it.

Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.