How to Replace NA in R

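The replace_na() function comes from the tidyr package, which is part of the tidyverse alongside dplyr. The dplyr package is the next iteration of plyr and focuses on tools for working with data frames. The key object in dplyr is a tbl, a representation of a tabular data structure.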

Replace NA in R

To replace NA with specified values in R, use the replace_na() function from the tidyr package. It replaces each NA with a value you choose, such as 0 or any other value that is compatible with the type of the data.

Syntax

replace_na(data, replace, ...)

Arguments

data: A data frame or vector.

replace: If data is a vector, replace takes a single value. If data is a data frame, replace takes a named list of values, with one value for each column that has NA values to be replaced.
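
The two call shapes described above can be sketched as follows. This is a minimal illustration that assumes tidyr is installed; the names v and d and the replacement values are made up for the example and are not part of the original tutorial.

```r
library(tidyr)

# Vector input: `replace` is a single value.
v <- c(1, NA, 3)
v_filled <- replace_na(v, 0)

# Data frame input: `replace` is a named list, one entry per column to fill.
d <- data.frame(a = c(1, NA), b = c(NA, "k"), stringsAsFactors = FALSE)
d_filled <- replace_na(d, list(a = 0, b = "missing"))
```

Columns that are not named in the list are left as they are.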

Return value

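If the input data is a data frame, the replace_na() function returns a data frame.

If the input data is a vector, the replace_na() function returns a vector. Since tidyr 1.2.0, the result keeps the type of data and replace is cast to that type. In earlier versions, the class of the result was given by the union of data and replace.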
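
The return type can be checked with a short sketch. The variable names here are illustrative. What happens when replace has a different type from data depends on the installed tidyr version, so the second call is wrapped in tryCatch() and no particular outcome is assumed.

```r
library(tidyr)

x <- c(11, 21, NA)

# Same-type replacement: the result stays numeric.
same_type <- replace_na(x, 0)

# Different-type replacement: newer tidyr versions raise an error, older
# versions return a character vector. tryCatch() captures either outcome.
other_type <- tryCatch(
  replace_na(x, "NonNA"),
  error = function(e) conditionMessage(e)
)
```

If other_type holds an error message, the installed version enforces type compatibility, and the column needs to be converted (for example with as.character()) before a string can be inserted.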

Example

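The examples below use the tibble() function, which dplyr re-exports, and the replace_na() function from tidyr, so you need to install both packages. You can do it with the following command.

install.packages(c("dplyr", "tidyr"))

To use the dplyr package in R, load it at the top of your R file.

library("dplyr")

Now, use the tibble() function to create a data frame that contains NA values.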

library("dplyr")
df <- tibble(x = c(11, 21, NA), y = c("x", NA, "y"))
print(df)

Output

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

 filter, lag

The following objects are masked from ‘package:base’:

 intersect, setdiff, setequal, union

# A tibble: 3 × 2
      x y
  <dbl> <chr>
1    11 x
2    21 NA
3    NA y

Now, replace NA values with “NonNA” in a data frame using the tidyr::replace_na() function. Since tidyr 1.2.0, each replacement must be compatible with the type of its column, so the numeric column x is converted to character first.

library("dplyr")
df <- tibble(x = c(11, 21, NA), y = c("x", NA, "y"))
print(df)
cat("After replacing NAs", "\n")
# x is numeric, so convert it to character before inserting a string
df %>%
  mutate(x = as.character(x)) %>%
  tidyr::replace_na(list(x = "NonNA", y = "NonNA"))

Output

# A tibble: 3 × 2
      x y
  <dbl> <chr>
1    11 x
2    21 NA
3    NA y

After replacing NAs

# A tibble: 3 × 2
  x     y
  <chr> <chr>
1 11    x
2 21    NonNA
3 NonNA y

As you can see, the NA values in both columns have been replaced with NonNA, and column x is now of type character.

Replace NAs in a Vector

You can also use the replace_na() function to replace NA values in a vector. A data frame column is a vector, so this example uses the data frame above and replaces the NA in column x only. Because x is numeric, it is converted to character before the string NonNA is inserted.

library("dplyr")
df <- tibble(x = c(11, 21, NA), y = c("x", NA, "y"))
print(df)
cat("After replacing NA in vector", "\n")
# replace_na() receives the column x as a plain vector inside mutate()
df %>% dplyr::mutate(x = tidyr::replace_na(as.character(x), "NonNA"))

Output

# A tibble: 3 × 2
      x y
  <dbl> <chr>
1    11 x
2    21 NA
3    NA y

After replacing NA in vector

# A tibble: 3 × 2
  x     y
  <chr> <chr>
1 11    x
2 21    NA
3 NonNA y

In this example, only the NA in column x is replaced with NonNA. Inside mutate(), the column x is passed to replace_na() as a vector, so the NA in column y is left untouched.

Replace NA with 0 in R

To replace NA with 0 in a data frame, pass replace_na() a list that maps each column to its replacement. Use the number 0 for the numeric column x and the string "0" for the character column y, because each replacement must be compatible with the type of its column.

library("dplyr")
df <- tibble(x = c(11, 21, NA), y = c("x", NA, "y"))
print(df)
cat("After replacing NA with 0", "\n")
# 0 for the numeric column, "0" for the character column
df %>% tidyr::replace_na(list(x = 0, y = "0"))

Output

# A tibble: 3 × 2
      x y
  <dbl> <chr>
1    11 x
2    21 NA
3    NA y

After replacing NA with 0

# A tibble: 3 × 2
      x y
  <dbl> <chr>
1    11 x
2    21 0
3     0 y
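
When many columns need the same replacement, listing each one by hand gets tedious. One possible sketch, assuming dplyr 1.0.0 or later (for across() and where()), fills every numeric column with 0 and leaves the other columns alone. The extra column z is added here purely for illustration and is not part of the original example.

```r
library(dplyr)

df <- tibble(x = c(11, 21, NA), y = c("x", NA, "y"), z = c(NA, 2, 3))

# Apply replace_na() to each numeric column; character columns are skipped.
filled <- df %>%
  mutate(across(where(is.numeric), ~ tidyr::replace_na(.x, 0)))
```

Here x and z have their NA values replaced with 0, while the NA in the character column y is kept.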

That’s it for this tutorial.
