The is.na() function checks for missing values (NA) in an R object. It returns TRUE for each NA value and FALSE otherwise. The object can be a vector, list, matrix, or data frame.
It is useful in data cleaning and preparation because it identifies the missing values in a dataset that you need to handle.
is.na(obj)
Name | Value |
obj | The input object to test for NA values. It can be a vector, list, matrix, or data frame. |
If you have a data frame and are not sure how many NA values it contains, pass it to the is.na() function. The function returns a logical matrix with the same dimensions as the data frame, with TRUE where a value is NA and FALSE elsewhere.
df <- data.frame(
  col1 = c(1, NA, 3),
  col2 = c(NA, 5, NA),
  col3 = c(7, NA, 9)
)
is.na(df)
Output
      col1  col2  col3
[1,] FALSE  TRUE FALSE
[2,]  TRUE FALSE  TRUE
[3,] FALSE  TRUE FALSE
In the following example, we create a vector with two NA values and pass it to the is.na() function, which returns TRUE for the NA values and FALSE otherwise.
vec <- c(11, 21, 19, NA, 46, NA)
is.na(vec)
Output
[1] FALSE FALSE FALSE TRUE FALSE TRUE
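If you need the positions of the missing values rather than a logical vector, one option is to wrap is.na() in base R's which(). The sketch below reuses the vector from the example above; the variable name na_positions is only illustrative.

```r
# Same vector as above: NA sits at positions 4 and 6
vec <- c(11, 21, 19, NA, 46, NA)

# which() converts the logical vector into the indices that are TRUE
na_positions <- which(is.na(vec))
na_positions
# [1] 4 6
```

These indices can then be used to inspect or overwrite the missing elements.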
Wrapping is.na() in the any() function tells you whether the input object contains at least one NA value.
data <- c(11, 21, 19, NA, 46, NA)
any(is.na(data))
Output
[1] TRUE
In this example, any() returns TRUE because the vector data contains at least one NA value. If the vector contains no NA values, it returns FALSE.
data <- c(11, 21, 19, 46, 18)
any(is.na(data))
Output
[1] FALSE
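Base R also provides anyNA(), which gives the same answer as any(is.na(x)) in a single call. A minimal sketch using the two vectors from the examples above (the names with_na and without_na are illustrative):

```r
# Vectors from the two examples above
with_na <- c(11, 21, 19, NA, 46, NA)
without_na <- c(11, 21, 19, 46, 18)

# anyNA(x) is a shortcut for any(is.na(x))
has_na_first <- anyNA(with_na)
has_na_second <- anyNA(without_na)

has_na_first
# [1] TRUE
has_na_second
# [1] FALSE
```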
In exploratory data analysis, finding and handling NA values is an important early step, and these functions help you locate them.
You can count the total number of NA values in a data frame by combining the is.na() and sum() functions.
Let’s take an example data frame df and count the NA values.
df <- data.frame(
  col1 = c(1, NA, 3),
  col2 = c(NA, 5, NA),
  col3 = c(7, NA, 9)
)
num_na_df <- sum(is.na(df))
num_na_df
Output
[1] 4
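The total does not show where the missing values are concentrated. To get a count per column, one common approach is to replace sum() with colSums(), since is.na() on a data frame returns a logical matrix. This sketch uses the same example data frame; the name na_per_column is illustrative.

```r
df <- data.frame(
  col1 = c(1, NA, 3),
  col2 = c(NA, 5, NA),
  col3 = c(7, NA, 9)
)

# colSums() treats TRUE as 1, so each entry is the NA count of that column
na_per_column <- colSums(is.na(df))
na_per_column
# col1 col2 col3 
#    1    2    1 
```

The per-column counts add up to the total of 4 returned by sum(is.na(df)).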
You can count the number of NA values in a vector by combining the sum() and is.na() functions.
vec <- c(11, 21, 19, NA, 46, NA)
sum(is.na(vec))
Output
[1] 2
To deal with NA values, you can use na.omit() to remove rows that contain NA, or use replace() together with mean(), median(), and similar functions to impute the missing values.
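As a rough illustration of both approaches, the sketch below uses a small hypothetical data frame named scores (not part of the examples above). It first drops incomplete rows with na.omit(), then fills the missing value with the column mean using replace().

```r
# Hypothetical data: one missing score in row 2
scores <- data.frame(
  id = 1:4,
  score = c(10, NA, 30, 20)
)

# Option 1: drop every row that contains at least one NA
complete_rows <- na.omit(scores)
nrow(complete_rows)
# [1] 3

# Option 2: impute the missing score with the mean of the observed values
# (na.rm = TRUE is required, otherwise mean() itself returns NA)
imputed_score <- replace(
  scores$score,
  is.na(scores$score),
  mean(scores$score, na.rm = TRUE)
)
imputed_score
# [1] 10 20 30 20
```

Which option is appropriate depends on the data: dropping rows discards information, while imputation changes the distribution of the column.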
We can remove the NA values from a vector using the “!” operator and is.na() function.
vec <- c(11, 21, 19, NA, 46, NA)
vec[!is.na(vec)]
Output
[1] 11 21 19 46
You can see from the above output that we removed NA values from the vector.
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.