R split string: How to Split String in R with Delimiter

A string is a sequence of characters. String manipulation is a common task in R programming, and concatenating, splitting, and joining are among the operations performed most often on strings.

R split string

To split a string in R, use the strsplit() function. strsplit() is a built-in (base R) function that splits each element of a character vector into substrings. It returns a list with one element per input string, and each element is a character vector holding the pieces of that string.

The split pattern is treated as a regular expression by default, or as a literal string when fixed = TRUE. The delimiter itself is not included in the returned substrings.
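As a minimal sketch of the return shape described above (the variable names and sample strings are illustrative), passing a vector of two strings gives back a list of two character vectors:

```r
# Two input strings, split on a single space.
sentences <- c("split this string", "and this one")
pieces <- strsplit(sentences, split = " ")

length(pieces)   # one list element per input string
pieces[[1]]      # pieces of the first string
pieces[[2]]      # pieces of the second string
```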

Syntax

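strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)

Parameters

  1. x: A character vector. Each element is split separately.
  2. split: A character string containing the pattern to split on. It is interpreted as a regular expression unless fixed = TRUE.
  3. fixed: A logical value. If TRUE, split is matched literally. If FALSE (the default), split is treated as a regular expression.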
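The effect of the fixed argument is easiest to see with a delimiter that is also a regular-expression metacharacter. The sketch below uses a made-up version string and splits it on a dot in two equivalent ways:

```r
version <- "4.3.2"   # illustrative input, not from the examples in this article

# Literal match: "." is treated as an ordinary character.
literal_parts <- strsplit(version, split = ".", fixed = TRUE)[[1]]

# Regular-expression match: the dot is placed in a character class.
regex_parts <- strsplit(version, split = "[.]")[[1]]

literal_parts   # "4" "3" "2"
regex_parts     # "4" "3" "2"
```

With fixed = FALSE, an unescaped "." matches any character, so every character is treated as a delimiter and the result contains only empty strings.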

Implementation of strsplit() function

The simplest call passes the string to split and the delimiter. The result is a list holding a character vector of the pieces.

rs <- "This is First R String Example"
strsplit(rs, split = " ")

Output

[[1]]
[1] "This"    "is"      "First"   "R"       "String"  "Example"

In this example, a space is the delimiter, so the string is split at every space.
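Because strsplit() returns a list, a plain character vector is often more convenient to work with. A short sketch of two common ways to get one (the names words and words_flat are illustrative):

```r
rs <- "This is First R String Example"

# [[1]] extracts the character vector for the first (and only) input string.
words <- strsplit(rs, split = " ")[[1]]

# unlist() flattens the whole list, which gives the same result for one input.
words_flat <- unlist(strsplit(rs, split = " "))

length(words)   # 6
words[3]        # "First"
```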

Use strsplit() function with delimiter in R

In programming, a delimiter is a character or sequence of characters that separates the words or fields in a piece of data.

Let’s use the & character as the delimiter and split the string wherever it occurs.

rs <- "This&is&First&R&String&Example"
strsplit(rs, split = "&")

Output

[[1]]
[1] "This"    "is"      "First"   "R"       "String"  "Example"

In this case, the words in the input are separated by &. The strsplit() function splits the string at each & and returns the pieces as a list, with the delimiter itself removed.
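The same approach extends to a vector of delimited records. The sketch below assumes hypothetical records in a "name&city&year" layout and pulls one field out of each:

```r
# Hypothetical records in "name&city&year" form.
records <- c("Ada&London&1815", "Grace&New York&1906")

# fixed = TRUE treats "&" as a literal character.
fields <- strsplit(records, split = "&", fixed = TRUE)

# Take the first and second piece of every record.
names_only  <- vapply(fields, function(p) p[1], character(1))
cities_only <- vapply(fields, function(p) p[2], character(1))

names_only    # "Ada" "Grace"
cities_only   # "London" "New York"
```

vapply() is used here rather than sapply() so that the result is guaranteed to be a character vector with one value per record.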

Use strsplit() function with Regular Expression delimiter

Regular expressions are a compact and flexible way of describing patterns in strings. By default, strsplit() treats split as a regular expression and splits each element of the character vector wherever the pattern matches.

rs <- "This12is3First4R5String6Example"
strsplit(rs, split = "[0-9]+")

Output

[[1]]
[1] "This"    "is"      "First"   "R"       "String"  "Example"

In this example, the words in the input are separated by runs of digits. The regular expression [0-9]+ matches one or more consecutive digits, so strsplit() splits at each run and drops the digits. The result is the list of strings shown above.
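A regular expression can also cover several delimiters at once, and it helps to know how matches at the edges of the string behave. A small sketch with made-up inputs:

```r
# Illustrative inputs, not taken from the article's examples.
messy <- "a, b;c ,d"

# Split on a comma or semicolon, swallowing any surrounding whitespace.
clean <- strsplit(messy, split = "[[:space:]]*[,;][[:space:]]*")[[1]]
clean   # "a" "b" "c" "d"

# A match at the very start produces a leading empty string,
# while a match at the very end does not add a trailing one.
edges <- strsplit("1a2b3", split = "[0-9]+")[[1]]
edges   # "" "a" "b"

# Drop empty pieces if they are not wanted.
edges[nzchar(edges)]   # "a" "b"
```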

Split each Character of a String in R

To split a string into its individual characters, pass an empty string as the delimiter.

rs <- "This12is3First4R5String6Example"
strsplit(rs, split = "")

Output

[[1]]
 [1] "T" "h" "i" "s" "1" "2" "i" "s" "3" "F" "i" "r" "s" "t" "4" "R" "5" "S" "t"
[20] "r" "i" "n" "g" "6" "E" "x" "a" "m" "p" "l" "e"
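Splitting into characters is a handy building block for other tasks. As one possible use, the sketch below reverses a string (the variable names are illustrative):

```r
word <- "Example"

# Split into single characters, reverse them, and glue them back together.
chars <- strsplit(word, split = "")[[1]]
reversed <- paste(rev(chars), collapse = "")

length(chars)   # 7, same as nchar(word)
reversed        # "elpmaxE"
```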

Conclusion

The strsplit() function is the standard base R tool for splitting strings. It accepts either a literal delimiter or a regular expression and returns a list of character vectors. To go the other way and concatenate strings, use the paste() function.
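To illustrate how the two functions complement each other, here is a small round-trip sketch that splits a made-up date string and joins the pieces again with paste():

```r
parts <- strsplit("2024-01-15", split = "-", fixed = TRUE)[[1]]
parts   # "2024" "01" "15"

# collapse joins the elements of one vector into a single string.
rejoined <- paste(parts, collapse = "-")
rejoined   # "2024-01-15"

# A different separator can be used when joining.
slashed <- paste(parts, collapse = "/")
slashed   # "2024/01/15"
```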
