The grep() function in R searches for matches to a pattern within a character vector. It returns indices or values of elements that match the pattern. It is part of a family of functions that includes grepl(), regexpr(), gregexpr(), sub(), and gsub().
grep(
  pattern,
  x,
  ignore.case = FALSE,
  perl = FALSE,
  value = FALSE,
  fixed = FALSE,
  useBytes = FALSE,
  invert = FALSE
)
Argument | Description |
pattern | A regular expression or a fixed string to search for. |
x | The character vector in which the search occurs. |
ignore.case | FALSE by default. If TRUE, the match is case-insensitive. |
perl | FALSE by default. If TRUE, the pattern is treated as a Perl-compatible regular expression. |
value | FALSE by default. If TRUE, grep() returns the matching elements instead of their indices. |
fixed | FALSE by default. If TRUE, pattern is matched as a fixed string, not a regular expression. |
useBytes | FALSE by default. If TRUE, matching is done byte by byte rather than character by character. |
invert | FALSE by default. If TRUE, grep() returns the elements that do not match. |
By default, grep() returns an integer vector with the indices of the elements of x that match the pattern. If you pass value = TRUE, it returns a character vector containing the matching elements themselves.
vec <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("i", vec))
# Output: [1] 3 4
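It can also help to see what happens when nothing matches, and how grep() compares with its sibling grepl(). The snippet below is a small sketch; the pattern "z9" and the variable names no_match and has_i are chosen only for illustration.

```r
vec <- c("Amazon", "Apple", "Netflix", "Spotify")

# No element contains "z9", so grep() returns an empty integer vector
no_match <- grep("z9", vec)
print(no_match)          # integer(0)
print(length(no_match))  # [1] 0

# grepl() returns one TRUE/FALSE per element instead of indices
has_i <- grepl("i", vec)
print(has_i)             # [1] FALSE FALSE  TRUE  TRUE
```

Checking length() of the result is a simple way to test whether anything matched before using the indices for subsetting.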
In the next example, we search for "a" with ignore.case = TRUE. By default, matching is case-sensitive, so "a" alone would match only "Amazon" (which contains a lowercase "a") and not "Apple" (which contains only an uppercase "A"). With ignore.case = TRUE, the pattern matches both "a" and "A", so grep() returns the indices of both elements.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("a", rv, ignore.case = TRUE))
# Output: [1] 1 2
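To make the effect of ignore.case concrete, the following sketch runs the same search with and without it. The variable names case_sensitive and case_insensitive are illustrative only.

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Default: case-sensitive, so only "Amazon" (lowercase "a") matches
case_sensitive <- grep("a", rv)
print(case_sensitive)    # [1] 1

# ignore.case = TRUE: "A" in "Apple" now matches as well
case_insensitive <- grep("a", rv, ignore.case = TRUE)
print(case_insensitive)  # [1] 1 2
```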
A pattern can combine several alternatives with the | operator. grep() then returns the indices of the elements that match any of the alternatives.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("o|i", rv, ignore.case = TRUE))
# Output: [1] 1 3 4
If you set the argument value=TRUE, it will return the actual matched elements themselves instead of their indices.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("o|i", rv, ignore.case = TRUE, value = TRUE))
# Output: [1] "Amazon" "Netflix" "Spotify"
To match the pattern as a fixed string rather than a regular expression, use fixed = TRUE:
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
print(grep("A", rv, fixed = TRUE))
# Output: [1] 1 2
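The difference made by fixed = TRUE is easier to see when the pattern contains a regular expression metacharacter such as ".". The sketch below uses a made-up vector called versions for illustration.

```r
# Hypothetical version strings used only for this illustration
versions <- c("v1.0", "v100", "v1-0")

# As a regular expression, "." matches any single character
as_regex <- grep("1.0", versions, value = TRUE)
print(as_regex)   # [1] "v1.0" "v100" "v1-0"

# With fixed = TRUE, "." is matched literally
as_fixed <- grep("1.0", versions, fixed = TRUE, value = TRUE)
print(as_fixed)   # [1] "v1.0"
```

When the search text comes from user input or contains punctuation, fixed = TRUE avoids having to escape each metacharacter by hand.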
Set perl = TRUE to use Perl-compatible regular expressions (PCRE), which support more advanced pattern syntax for complex matches.
IDs <- c("ID:219", "ID:4567", "ID:89")
# match exactly three digits
grep("^ID:\\d{3}$", IDs, perl = TRUE, value = TRUE)
# Output: [1] "ID:219"
Only "ID:219" has exactly three digits after the "ID:" prefix, so it is the only element returned.
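Lookahead assertions are one example of syntax that relies on the Perl-compatible engine. The sketch below, with an illustrative variable name four_digit, uses a lookahead to select IDs followed by exactly four digits.

```r
IDs <- c("ID:219", "ID:4567", "ID:89")

# (?=...) is a lookahead: it checks that exactly four digits follow "ID:"
# up to the end of the string, without consuming them
four_digit <- grep("^ID:(?=\\d{4}$)", IDs, perl = TRUE, value = TRUE)
print(four_digit)  # [1] "ID:4567"
```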
By passing invert = TRUE, you tell grep() to return the elements that do not match the pattern instead of those that do.
rv <- c("Amazon", "Apple", "Netflix", "Spotify")
grep("A", rv, invert = TRUE, value = TRUE)
# Output: [1] "Netflix" "Spotify"
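Without value = TRUE, invert = TRUE returns the indices of the non-matching elements. The same selection can also be written by negating grepl(), as this short sketch shows; the variable names are illustrative.

```r
rv <- c("Amazon", "Apple", "Netflix", "Spotify")

# Indices of the elements that do NOT contain "A"
non_match_idx <- grep("A", rv, invert = TRUE)
print(non_match_idx)  # [1] 3 4

# Equivalent selection using a negated logical vector from grepl()
non_match_val <- rv[!grepl("A", rv)]
print(non_match_val)  # [1] "Netflix" "Spotify"
```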
Suppose you want to list only the CSV files in your current directory. You can combine the list.files() function with grep() to get exactly that.
csv_files <- grep("\\.csv$", list.files(), value = TRUE)
print(csv_files)
# Output:
# [1] "data_types.csv" "data.csv" "input_domains.csv"
# [4] "missing_data.csv"
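Because the output above depends on the contents of your working directory, here is a reproducible sketch that uses a hypothetical vector of file names in place of list.files(). It also shows how ignore.case = TRUE picks up an upper-case extension.

```r
# Hypothetical file names standing in for the result of list.files()
files <- c("data.csv", "notes.txt", "report.CSV", "csv_readme.md")

# "\\.csv$" matches a literal ".csv" at the end of the name
csv_only <- grep("\\.csv$", files, value = TRUE)
print(csv_only)      # [1] "data.csv"

# Also accept upper-case extensions such as ".CSV"
csv_any_case <- grep("\\.csv$", files, value = TRUE, ignore.case = TRUE)
print(csv_any_case)  # [1] "data.csv"   "report.CSV"
```

As an alternative, list.files() accepts a pattern argument of its own, so list.files(pattern = "\\.csv$") filters the names directly.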
That’s it.
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.