Variance in R: How to Use var() Function in R

The variance is a measure of variability. It is calculated as the average of the squared deviations from the mean, and it tells you how spread out your dataset is: the more widely the values are scattered around the mean, the larger the variance. Let’s start with the definition of variance.

What is Variance

Variance is the average of the squared differences from the mean. It is a single number that shows how widely the individual values in a dataset are spread around the mean.

To calculate the variance by hand:

  1. First, calculate the mean, which is the average of the numbers.
  2. Second, for each number, subtract the mean and square the result (the squared difference).
  3. Finally, take the average of those squared differences.
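The three steps above can be sketched directly in R. This is only a minimal illustration on a made-up vector (the name x and its values are assumptions for the example, not data from this article). It divides by the number of values, matching the plain average described in the steps.

```r
# Hypothetical example data (not from the article)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Step 1: the mean of the numbers
x_mean <- mean(x)

# Step 2: subtract the mean from each number and square the result
sq_diff <- (x - x_mean)^2

# Step 3: the average of the squared differences
avg_sq_diff <- mean(sq_diff)

x_mean       # 5
avg_sq_diff  # 4
```

Dividing the same sum of squared differences by length(x) - 1 instead of length(x) gives the sample variance.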

How to Calculate Variance in R

To calculate the variance in R, use the var() function. var() is a built-in function that computes the sample variance of a numeric vector, that is, how far the values are spread from their mean.

Syntax

var(x, y=NULL, na.rm=FALSE, use)

Parameters

x,y

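x is a numeric vector, matrix, or data frame. y is NULL (the default) or a vector, matrix, or data frame with dimensions compatible with x.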

na.rm

Logical, FALSE by default. With FALSE, NA values are left in place, so the result is NA whenever any are present; with TRUE, NA values are removed before the variance is computed.

use

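An optional character string giving the method for computing covariances in the presence of missing values. It must be (an abbreviation of) one of "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".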
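As a brief sketch of how these arguments behave, the example below uses small made-up vectors (readings, a, and b are hypothetical names chosen for illustration). It shows that a missing value makes the result NA unless na.rm = TRUE, and that supplying a second vector y returns the covariance between x and y.

```r
# Hypothetical data containing a missing value
readings <- c(1, 2, NA, 4)

v_default <- var(readings)                # NA, because the NA is kept
v_clean   <- var(readings, na.rm = TRUE)  # variance of 1, 2, 4

# With two vectors, var(x, y) gives their covariance
a <- c(1, 2, 3, 4)
b <- c(2, 4, 6, 9)
v_xy <- var(a, b)                         # same value as cov(a, b)
```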

Calculate the variance of a numeric vector in R

To create a numeric vector, use the c() function and pass it the numeric values. Then pass that vector to the var() function, which returns its variance.

weights <- c(60, 55, 50, 65, 59)
var(weights)

Output

[1] 31.7

The var() function calculates the sample variance (with N–1 in the denominator). To get the population variance, with N in the denominator, multiply this number by (N–1)/N.
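The rescaling described above can be sketched as follows, reusing the weights vector from the example. The names n and pop_var are assumptions introduced for illustration.

```r
weights <- c(60, 55, 50, 65, 59)

# Number of observations
n <- length(weights)

# Rescale the sample variance (N - 1 denominator) to use N instead
pop_var <- var(weights) * (n - 1) / n
pop_var  # 25.36
```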

Calculate the Variance of the dataset in R

We will use the built-in iris dataset in this example. Calling data(dataset_name) loads a built-in dataset into your workspace. For iris the call is optional, because the datasets package makes it available by default, but it does no harm.

data(iris)

We will find the variance of the Petal.Length column of the iris dataset.

data(iris)

ln <- iris$Petal.Length

var(ln)

Output

[1] 3.116278

The variance of Petal.Length is 3.116278.
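The same idea extends to several columns at once. One possible approach, sketched here under the assumption that you want the variance of every numeric column, is to apply var() column by column with sapply(); the names numeric_cols and col_vars are hypothetical.

```r
data(iris)

# Keep only the numeric columns (Species is a factor and is dropped)
numeric_cols <- iris[sapply(iris, is.numeric)]

# Apply var() to each remaining column; the result is a named numeric vector
col_vars <- sapply(numeric_cols, var)
col_vars
```

The Petal.Length entry of col_vars matches the value computed above with var(iris$Petal.Length).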

Sample Variance vs. Population Variance

The difference between sample variance and population variance lies in the denominator. Population variance is calculated from data on the entire population and divides the sum of squared deviations by N, the number of elements. Sample variance is calculated from a sample and divides by n–1 instead.

This n–1 correction barely matters for large samples, but it makes a noticeable difference when the sample is small.
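To get a feel for how quickly the correction fades, the sketch below computes the factor (n–1)/n, which is the ratio of the population formula to the sample formula, for a few sample sizes. The sizes and the names n and correction are arbitrary choices for illustration.

```r
# Hypothetical sample sizes: small, medium, large
n <- c(5, 50, 5000)

# Ratio between the N-denominator and (N - 1)-denominator variances
correction <- (n - 1) / n
correction  # 0.8000 0.9800 0.9998
```

With five observations the two formulas differ by 20 percent; with five thousand they differ by 0.02 percent.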

To calculate the population variance, use the following expression.

mean((x - mean(x)) ^ 2)

Let’s see how to calculate population variance in R.

population_variance <- function(rv) {
  # Mean of the squared deviations from the mean (divides by N, not N - 1)
  mean((rv - mean(rv)) ^ 2)
}

weights <- c(60, 55, 50, 65, 59)
population_variance(weights)

Output

[1] 25.36

Conclusion

The var() function computes the sample variance of a numeric vector, not the population variance. To get the population variance, either multiply the result of var() by (N–1)/N or compute mean((x - mean(x)) ^ 2) directly.
