Variance in R: How to Use var() Function in R

Variance is a measure of variability. It is calculated by taking the average of the squared deviations from the mean, and it shows the degree of spread in your dataset: the more spread out the data are around the mean, the larger the variance. Let’s start with the definition of variance.

What is Variance

Variance is defined as the average of the squared differences from the mean. To calculate the variance, follow these steps:

  1. First, calculate the mean, which is the average of the numbers.
  2. Second, for each number, subtract the mean and square the result (the squared difference).
  3. Finally, take the average of those squared differences.
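
As a minimal sketch, the three steps can be written out in R. The vector values and variable names here are made up for illustration. Note that taking a plain average in the last step divides by N, so the result is the population variance rather than the sample variance.

```r
# Illustrative data (not from the article)
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Step 1: the mean of the numbers
avg <- mean(values)

# Step 2: the squared difference of each number from the mean
squared_diffs <- (values - avg)^2

# Step 3: the average of those squared differences
variance_by_hand <- mean(squared_diffs)
variance_by_hand
```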

How to Calculate Variance in R

To calculate the variance in R, use the var() function. var() is a built-in function that computes the sample variance of a vector, which measures how far the values lie from their mean.

Syntax

var(x, y=NULL, na.rm=FALSE, use)

Parameters

x,y

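x is a numeric vector, matrix, or data frame. y is NULL (the default) or a vector, matrix, or data frame with dimensions compatible with x.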

na.rm

A logical value. The default, FALSE, leaves NA values in place, so the result is NA if any are present; TRUE removes them before the variance is computed.

use

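An optional character string giving the method for computing covariances in the presence of missing values. It must be (an abbreviation of) one of "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".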
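
The effect of na.rm is easiest to see with a vector that contains a missing value. The following is a small sketch; the vector weights_with_na is a hypothetical example, not data from this article.

```r
# Hypothetical vector with one missing value
weights_with_na <- c(60, 55, NA, 65, 59)

# With the default na.rm = FALSE, the missing value propagates to the result
var_default <- var(weights_with_na)

# With na.rm = TRUE, the NA is dropped before the variance is computed
var_clean <- var(weights_with_na, na.rm = TRUE)

# This matches removing the NA by hand and then calling var()
var_manual <- var(weights_with_na[!is.na(weights_with_na)])

var_default
var_clean
```

Here var_default is NA, while var_clean and var_manual hold the same sample variance of the four remaining values.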

Compute the variance of a numeric vector

To create a numeric vector, use the c() function and pass it several numeric arguments. Then pass that vector to the var() function, which returns its variance.

weights <- c(60, 55, 50, 65, 59)
var(weights)

Output

[1] 31.7

The var() function calculates the sample variance (with N–1 in the denominator). To get the population variance, with N in the denominator, multiply this number by (N–1)/N.
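
A short sketch of that rescaling, using the same weights vector. The variable names n, sample_var, and pop_var are chosen here for illustration.

```r
weights <- c(60, 55, 50, 65, 59)

# Number of observations (N)
n <- length(weights)

# Sample variance, computed with N - 1 in the denominator
sample_var <- var(weights)

# Rescale by (N - 1) / N to get the variance with N in the denominator
pop_var <- sample_var * (n - 1) / n
pop_var
```

For this vector the result is 25.36, that is, 31.7 multiplied by 4/5.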

Calculate the Variance of a Dataset Column in R

We will use the built-in iris dataset in this example. It ships with R in the datasets package and is available by default; calling data(dataset name) loads a built-in dataset explicitly into your workspace.

data(iris)

We will find the variance of the Petal.Length column of the iris dataset.

data(iris)

ln <- iris$Petal.Length

var(ln)

Output

[1] 3.116278

The variance of Petal.Length is 3.116278.
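
If you want the variance of every numeric column rather than just one, one possible approach is to apply var() to each column with sapply(). This is only a sketch: it assumes the first four columns of iris are the numeric measurements, and col_vars is a name chosen here.

```r
data(iris)

# Columns 1 to 4 of iris are the numeric measurements; column 5 is Species
numeric_cols <- iris[, 1:4]

# Apply var() to each column; the result is a named numeric vector
col_vars <- sapply(numeric_cols, var)
col_vars
```

The Petal.Length entry of col_vars is the same value that var(iris$Petal.Length) returns.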

Sample Variance vs. Population Variance

The difference between the sample variance and the population variance is the denominator: the sample variance divides the sum of squared deviations by N – 1 (Bessel's correction), while the population variance divides by N. The correction makes little difference for large samples, but it matters when the sample is small.
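
To get a feel for how quickly the correction stops mattering, you can look at the factor (N – 1)/N that links the two variances for a few sample sizes. The sizes below are arbitrary and chosen only for illustration.

```r
# Arbitrary sample sizes, from small to large
n <- c(5, 30, 150, 1000)

# Ratio of population variance to sample variance for each size
correction <- (n - 1) / n
round(correction, 4)
```

For N = 5 the factor is 0.8, a 20% difference, while for N = 1000 it is 0.999.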

To calculate the population variance of a vector x, use the following expression.


mean((x - mean(x)) ^ 2)

Let’s see how to calculate population variance in R.

population_variance <- function(rv) {
 mean((rv - mean(rv)) ^ 2)
}

weights <- c(60, 55, 50, 65, 59)
population_variance(weights)

Output

[1] 25.36

Conclusion

The var() function computes the sample variance of a numeric vector, not the population variance. To get the population variance, either rescale the result of var() by (N–1)/N or compute the mean of the squared deviations directly.
