How to Remove Duplicates from a Vector in R

In R, unique() and subsetting with !duplicated() are efficient ways to remove duplicates.

Duplicate elements are values that appear more than once in a vector.

Unintended duplicates can skew an analysis, for example by inflating counts or biasing averages. Removing them gives more reliable results.

Method 1: Using unique()

The unique() function removes duplicate elements from a vector in a single call, keeping the first occurrence of each value in its original order. It is implemented in compiled code and uses hashing internally, so it stays fast on large vectors.

vec <- c(11, 21, 19, 19, 21, 19, 18, 18, 18)

unique_vec <- unique(vec)

unique_vec

# Output: [1] 11 21 19 18

The value 11 appears once, 21 twice, and 19 and 18 three times each. The final output keeps a single copy of each value.
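To check counts like these before removing anything, one option is table(), which tallies how often each value occurs. This is a minimal sketch; the names counts and repeated are only illustrative.

```r
vec <- c(11, 21, 19, 19, 21, 19, 18, 18, 18)

# Tally how many times each distinct value occurs (sorted by value)
counts <- table(vec)
counts

# Values that occur more than once, returned as character labels
repeated <- names(counts)[counts > 1]
repeated

# Output: [1] "18" "19" "21"
```

Note that table() reports values in sorted order, not in order of first appearance, so it is useful for inspection rather than as a replacement for unique().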

Handling NA

If a vector contains multiple NA values, unique() treats them as equal to one another, so it keeps the first NA and removes the rest.

vec_na <- c(11, 21, 19, 19, NA, 19, 18, 18, NA)

unique_vec_na <- unique(vec_na)

unique_vec_na

# Output: [1] 11 21 19 NA 18
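If the single remaining NA is not wanted either, one possible approach is to filter out missing values with is.na() before calling unique(). This is a sketch under the assumption that NA carries no meaning in the data; the name no_na is illustrative.

```r
vec_na <- c(11, 21, 19, 19, NA, 19, 18, 18, NA)

# Drop missing values first, then remove duplicates from what is left
no_na <- unique(vec_na[!is.na(vec_na)])

no_na

# Output: [1] 11 21 19 18
```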

Method 2: Subsetting with !duplicated()

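The duplicated() function returns a logical vector of the same length as its input. An element is marked TRUE if it repeats a value that appeared earlier in the vector, so the first occurrence of each value is always FALSE.

The ! operator performs logical negation. Therefore, !duplicated(vec) is TRUE only for first occurrences, and I can use it to subset the original vector to obtain only the unique elements.

Using vec[!duplicated(vec)] keeps the first occurrence of each value and removes the later duplicates.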

vec <- c(11, 21, 19, 19, 21, 19, 18, 18, 18)

unique_vec <- vec[!duplicated(vec)]

unique_vec

# Output: [1] 11 21 19 18
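To make the subsetting step less opaque, the sketch below prints the intermediate logical vector for the same input. The name dup_mask is only illustrative.

```r
vec <- c(11, 21, 19, 19, 21, 19, 18, 18, 18)

# TRUE marks an element that repeats an earlier value
dup_mask <- duplicated(vec)

dup_mask

# Output: [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

# Positions of the elements that would be removed
which(dup_mask)

# Output: [1] 4 5 6 8 9

# The removed elements themselves
vec[dup_mask]

# Output: [1] 19 21 19 18 18
```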

Handling NA

If a vector contains multiple NA values, duplicated() also treats them as equal to one another, so vec[!duplicated(vec)] keeps the first NA and removes the rest.

vec_na <- c(11, 21, 19, 19, NA, 19, 18, 18, NA)

unique_vec_na <- vec_na[!duplicated(vec_na)]

unique_vec_na

# Output: [1] 11 21 19 NA 18
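Both methods give the same result for a plain vector, so the choice is mostly a matter of taste. One case where duplicated() may be more convenient is keeping the last occurrence instead of the first, using its fromLast argument. The sketch below uses a small character vector chosen for illustration; the names x, first_kept, and last_kept are hypothetical.

```r
x <- c("a", "b", "a", "c", "b")

# Default: keep the first occurrence of each value
first_kept <- x[!duplicated(x)]

first_kept

# Output: [1] "a" "b" "c"

# fromLast = TRUE scans from the end, so the last occurrence is kept
last_kept <- x[!duplicated(x, fromLast = TRUE)]

last_kept

# Output: [1] "a" "c" "b"

# For an unnamed vector, the default form matches unique()
identical(unique(x), first_kept)

# Output: [1] TRUE
```

Because duplicated() returns a logical mask rather than the values themselves, the same mask can also be reused to filter other vectors of the same length.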

That’s all!
