How to Remove NA From Vector in R

A vector is a data structure that holds elements of the same type. Real-world data often contains missing values, which R represents as NA (not available).

For proper data analysis, we often need to exclude NA values from the vector. But what if every element of the vector is NA?

Well, if a vector is filled with only NA values, removing NAs will return an empty vector.

If there are no NAs in the vector, removing NAs does not alter it: the output is the same as the input.

Here are three main ways to remove NA values from a vector in R:
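The two edge cases described above can be checked with a minimal sketch. The variable names (all_na, no_na, and so on) are made up for illustration, and logical indexing is used here only as one convenient way to drop the NA values.

```r
# Edge case 1: every element is NA, so nothing is left after removal
all_na <- c(NA, NA, NA)
cleaned_all_na <- all_na[!is.na(all_na)]
length(cleaned_all_na)
# [1] 0

# Edge case 2: no NA at all, so the vector comes back unchanged
no_na <- c(11, 21, 41)
cleaned_no_na <- no_na[!is.na(no_na)]
identical(cleaned_no_na, no_na)
# [1] TRUE
```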

  1. Using logical indexing with !is.na(x)
  2. Using na.omit(x)
  3. Using complete.cases(x)

Also, some functions, such as sum() and mean(), accept the argument na.rm = TRUE, which ignores NA values during that computation. However, this only affects that one function call; it does not remove NA values from the vector itself, so it is not a general solution.
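As a small illustration of that point, the sketch below uses the same example vector as the rest of this article. It shows that na.rm = TRUE changes the result of the call but leaves the vector untouched.

```r
vec <- c(11, 21, NA, 41, NA, 51)

# Without na.rm, a single NA makes the whole result NA
sum(vec)
# [1] NA

# With na.rm = TRUE, NA values are skipped for this call only
sum(vec, na.rm = TRUE)
# [1] 124

mean(vec, na.rm = TRUE)
# [1] 31

# The vector itself still holds all six elements, NAs included
length(vec)
# [1] 6
```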

Method 1: Using logical indexing with !is.na(x)

The is.na() function returns TRUE for each NA value and FALSE otherwise. You can negate it and use the result to subset the original vector, which keeps only the non-NA values: x[!is.na(x)], where x is a vector.

vec <- c(11, 21, NA, 41, NA, 51)

vec[!is.na(vec)]

# [1] 11 21 41 51

In the above code, we removed the NA values by subsetting the vector, which returns a clean vector. Here, a clean vector means one that contains no NA values and carries no extra attributes.

If you want a clean vector quickly, logical indexing is a good default: it does very little work and is typically the fastest of the three methods.

It is also readable: the syntax makes it fairly obvious that we want the vector without its NA values.
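To see what happens at each step, the one-liner can be split into its parts. This is just a sketch; the intermediate names keep and clean are introduced here for illustration.

```r
vec <- c(11, 21, NA, 41, NA, 51)

# Step 1: is.na() marks the missing positions
is.na(vec)
# [1] FALSE FALSE  TRUE FALSE  TRUE FALSE

# Step 2: negate the mask so TRUE means "keep this element"
keep <- !is.na(vec)

# Step 3: subset with the mask
clean <- vec[keep]
clean
# [1] 11 21 41 51

# The result carries no attributes
attributes(clean)
# NULL
```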
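Speed depends on your machine, R version, and data, so it is worth measuring rather than assuming. The following is a rough sketch using base R's system.time() on a made-up vector of one million elements; the names big, r1, r2, r3, and the timing variables are illustrative, and no particular timings are claimed here.

```r
set.seed(1)

# Hypothetical test data: one million integers with NAs mixed in
big <- sample(c(1:10, NA), 1e6, replace = TRUE)

# Time each approach and keep the results for comparison
t_index <- system.time(r1 <- big[!is.na(big)])
t_omit  <- system.time(r2 <- as.vector(na.omit(big)))
t_cc    <- system.time(r3 <- big[complete.cases(big)])

# Elapsed seconds for each method (values will vary between runs)
c(index          = t_index[["elapsed"]],
  na_omit        = t_omit[["elapsed"]],
  complete_cases = t_cc[["elapsed"]])

# All three approaches should return the same values
identical(r1, r2) && identical(r1, r3)
```

A single system.time() call is noisy, so repeat the measurement a few times before drawing conclusions.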

Method 2: Using na.omit()

The na.omit() function not only removes NA values from an input vector but also attaches an attribute to the result that lists the omitted positions (most operations simply ignore this attribute).

When the result is printed, the first line is the vector without NA values. Below it, the “na.action” attribute lists the positions of the removed NA values (3 and 5 for the example vector used here), which can be helpful for debugging.

The class “omit” is set on the “na.action” attribute and records how the missing values were handled (they were omitted).

Here is the code implementation:

vec <- c(11, 21, NA, 41, NA, 51)

vec_clean <- na.omit(vec)

vec_clean

# [1] 11 21 41 51
# attr(,"na.action")
# [1] 3 5
# attr(,"class")
# [1] "omit"
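If you want to use that metadata programmatically rather than just read it from the console, base R's attr() can pull it out. This is a minimal sketch; the variable name dropped is introduced here for illustration.

```r
vec <- c(11, 21, NA, 41, NA, 51)
vec_clean <- na.omit(vec)

# Extract the positions that na.omit() removed
dropped <- attr(vec_clean, "na.action")

as.integer(dropped)
# [1] 3 5

# The "omit" class lives on the na.action attribute
class(dropped)
# [1] "omit"

# Ordinary computations ignore the attribute
sum(vec_clean)
# [1] 124
```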

The main difference between vec[!is.na(vec)] and na.omit() is that na.omit() returns this metadata along with the values, whereas logical indexing returns a plain vector with no added attributes.

If you do not need the attributes, strip them with as.vector(na.omit(x)).
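The sketch below checks that stripping the attributes this way gives the same plain vector as logical indexing. The name stripped is used only for illustration.

```r
vec <- c(11, 21, NA, 41, NA, 51)

# as.vector() drops the na.action attribute added by na.omit()
stripped <- as.vector(na.omit(vec))

attributes(stripped)
# NULL

# The result matches what logical indexing returns
identical(stripped, vec[!is.na(vec)])
# [1] TRUE
```

Note that as.vector() removes all attributes from an atomic vector, including names, so for a named vector the two approaches are not interchangeable.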

Method 3: Using complete.cases()

The complete.cases() function is generally used with data frames, where it returns a logical vector indicating which rows have no missing values.

However, it also works on a plain vector: vec[complete.cases(vec)] removes the NA values and returns a clean vector.

vec <- c(11, 21, NA, 41, NA, 51)

vec_clean <- vec[complete.cases(vec)]

vec_clean

# [1] 11 21 41 51

You can see that the NA values have been filtered out and only the remaining elements are printed in the console.
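Since complete.cases() is mostly used with data frames, here is a short sketch of that more typical case. The data frame df and its columns (id, score, grade) are invented for this example.

```r
# Hypothetical data frame with NAs in two different columns
df <- data.frame(
  id    = 1:4,
  score = c(90, NA, 75, 60),
  grade = c("A", "B", NA, "D")
)

# TRUE only for rows where no column is NA
complete.cases(df)
# [1]  TRUE FALSE FALSE  TRUE

# Keep only the complete rows (rows 1 and 4 here)
df_clean <- df[complete.cases(df), ]
nrow(df_clean)
# [1] 2
```

The same call works for a vector and a data frame, which is why this approach is useful when you want one consistent idiom for both.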

Final thoughts

  1. If you are looking for a fast and simple solution, use x[!is.na(x)].
  2. If you need a record of which positions were dropped, for example for debugging or further modeling, use the na.omit() function.
  3. If your input is a data frame and you want a consistent approach across vectors and data frames, use complete.cases().
