Converting String to Uppercase in R

For string operations like comparing strings, data standardization, formatting output, or input validation, we may want to convert the input to uppercase (or lowercase, depending on the situation) to ensure consistency and prevent errors caused by inconsistent capitalization.

The toupper() function converts a string to uppercase. It accepts a character vector and returns a copy in which every lowercase letter is replaced by its uppercase equivalent.

The toupper() function will not change non-alphabetic characters (like numbers or punctuation).
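As a quick illustration of the point above, the sketch below runs toupper() on a made-up string that mixes letters, digits, spaces, and punctuation. It also shows tolower(), the base R counterpart for the lowercase direction. The variable names and the sample text are assumptions chosen for this example.

```r
# Hypothetical sample string mixing letters, digits, spaces, and punctuation
sample_text <- "r-lang 4.3: Hello, World!"

# Only the letters change; everything else passes through untouched
upper_text <- toupper(sample_text)
print(upper_text)
# [1] "R-LANG 4.3: HELLO, WORLD!"

# tolower() is the base R counterpart for the opposite direction
lower_text <- tolower(sample_text)
print(lower_text)
# [1] "r-lang 4.3: hello, world!"
```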

Syntax

toupper(x)

Parameters

Name    Description
x       A character vector (or an object that as.character() can coerce to one) to be converted to uppercase.

Basic conversion

[Figure: converting a string to uppercase in R]

The figure shows the input string rlang converted to uppercase RLANG.

str <- "rlang"

str_upper <- toupper(str)

str_upper

Output

[1] "RLANG"

Passing a string with a numeric value

Let’s define a string that contains both letters and digits.

mixed_str <- "c2v2kb"

uppercase_mixed <- toupper(mixed_str)

print(paste("Original:", mixed_str, "Uppercase:", uppercase_mixed))

# "Original: c2v2kb Uppercase: C2V2KB"

The commented output in the code above shows that toupper() left the digits unchanged while converting the lowercase letters to uppercase.

Converting a vector of strings

You can also pass a vector of strings. Because toupper() is vectorized, it converts every element in a single call, with no loop required.

# Vector of strings
string_vector <- c("Hello", "R-lang", "blog")

# Convert each string in the vector to uppercase
uppercase_vector <- toupper(string_vector)

# Print the result
print(uppercase_vector)

# [1] "HELLO" "R-LANG" "BLOG"
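The same vectorized behavior is what makes toupper() convenient for standardizing a whole column of data. The following is a minimal sketch; the data frame cities and its column names are hypothetical and exist only for this illustration.

```r
# Hypothetical data frame with inconsistently capitalized city names
cities <- data.frame(
  name = c("paris", "New York", "tOkYo"),
  stringsAsFactors = FALSE
)

# toupper() processes the whole column in one vectorized call
cities$name_upper <- toupper(cities$name)

# Check the result against the expected standardized values
identical(cities$name_upper, c("PARIS", "NEW YORK", "TOKYO"))
# [1] TRUE
```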

Handling mixed data types (with coercion)

Let’s build a vector from values of different types (strings, a number, and a logical) and pass it to toupper().

mixed_data <- c("text", 123, TRUE, "another string")

uppercase_mixed <- toupper(mixed_data)

print(paste("Original:", mixed_data, "Uppercase:", uppercase_mixed))

# [1] "Original: text Uppercase: TEXT"
# [2] "Original: 123 Uppercase: 123"
# [3] "Original: TRUE Uppercase: TRUE"
# [4] "Original: another string Uppercase: ANOTHER STRING"

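The output shows that 123 and TRUE look unchanged. An atomic vector in R holds a single type, so c() had already coerced them to the strings "123" and "TRUE" before toupper() ran. Those strings contain no lowercase letters, so there was nothing to convert, while the lowercase letters in the other elements were converted to uppercase.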
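To see what toupper() actually receives in this example, one can inspect the vector before converting it. This small sketch reuses the mixed_data vector from above and relies only on base R functions.

```r
mixed_data <- c("text", 123, TRUE, "another string")

# c() has already coerced every element to character
class(mixed_data)
# [1] "character"

# The number and the logical are now ordinary strings
identical(mixed_data, c("text", "123", "TRUE", "another string"))
# [1] TRUE

# Passing a non-character value directly also returns a string
class(toupper(123))
# [1] "character"
```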

Case-insensitive comparison

To compare two strings while ignoring differences in capitalization, convert both to uppercase first and then compare the results.

string1 <- "kaynes technology"
string2 <- "Kaynes Technology"

# Incorrect comparison (case-sensitive)
print(paste("Case-sensitive comparison:", string1 == string2))

# Correct comparison (case-insensitive)
print(paste("Case-insensitive comparison:", toupper(string1) == toupper(string2)))

# [1] "Case-sensitive comparison: FALSE"
# [1] "Case-insensitive comparison: TRUE"
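If this comparison is needed in several places, it could be wrapped in a small helper, and the same idea extends to validating input against a list of accepted values. The sketch below is one possible approach; the function equal_ignore_case() and the vectors allowed and answers are hypothetical names introduced for this example.

```r
# Hypothetical helper: TRUE when two strings match ignoring case
equal_ignore_case <- function(a, b) {
  toupper(a) == toupper(b)
}

equal_ignore_case("kaynes technology", "Kaynes Technology")
# [1] TRUE

# Input validation against hypothetical accepted values (stored in uppercase)
allowed <- c("YES", "NO")
answers <- c("yes", "No", "maybe")

is_valid <- toupper(answers) %in% allowed
print(is_valid)
# [1]  TRUE  TRUE FALSE
```

Storing the accepted values in uppercase means only the user input has to be converted at comparison time.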

Handling NA values

If an element in the input vector is NA, toupper() returns NA for that element instead of raising an error, so missing values stay missing in the output vector.

data_na <- c("text", NA, "another string")

uppercase_mixed <- toupper(data_na)

print(paste("Original:", data_na, "Uppercase:", uppercase_mixed))

# [1] "Original: text Uppercase: TEXT"
# [2] "Original: NA Uppercase: NA"
# [3] "Original: another string Uppercase: ANOTHER STRING"

That’s all!
