What is Variance in R and How to Use var() Function

var() is a built-in R function that computes the sample variance of a numeric vector. When it is given a matrix or data frame, it returns the covariance matrix of the columns. The basic call is var(x), where x is the data.

To calculate the variance in R, use the var() function. Variance measures how far the values in a data set lie from their mean.

The variance is the average of the squared differences from the mean. It is a single number that shows how widely the individual values in a data set spread around the mean.

Syntax

``var(x, y=NULL, na.rm=FALSE, use)``

Parameters

x,y

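x is a numeric vector, matrix, or data frame. y is optional (default NULL); if supplied, it must be a vector, matrix, or data frame with dimensions compatible with x, and var(x, y) returns the covariance of x and y.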

na.rm

A logical value. The default FALSE keeps NA values, so the result is NA if any are present. TRUE removes them before the calculation.

use

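An optional character string that sets how missing values are handled when computing covariances. It must be one of "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs" (abbreviations are accepted).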
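The effect of na.rm and use is easiest to see on a vector with a missing value. The following is a minimal sketch; the weights_na vector is made up for illustration.

```r
# Hypothetical data: one missing measurement
weights_na <- c(60, 55, NA, 65, 59)

# With the default na.rm = FALSE, the NA propagates to the result
var(weights_na)                        # NA

# na.rm = TRUE drops the missing value before computing the variance
var(weights_na, na.rm = TRUE)

# For a single vector, use = "complete.obs" has the same effect
var(weights_na, use = "complete.obs")
```

Both of the last two calls should return the variance of the four observed values.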

Example 1: Calculating the variance of a numeric vector in R

To create a numeric vector, use the c() function and pass it the numeric values. Then pass that vector to the var() function, which returns its variance.

```r
weights <- c(60, 55, 50, 65, 59)
var(weights)
```

Output

``[1] 31.7``

The var() function calculates the estimated variance (with N–1 in the denominator).

To get the variance with N in the denominator instead, multiply this number by (N–1)/N.
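As a short sketch of that rescaling, using the same weights vector (the variable n is introduced here only for illustration):

```r
weights <- c(60, 55, 50, 65, 59)
n <- length(weights)

# Rescale the sample variance (n - 1 denominator) to an n denominator
var(weights) * (n - 1) / n   # 31.7 * 4 / 5 = 25.36
```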

Example 2: Calculating the Variance of the dataset in R

We will use the built-in iris dataset in this example. To load a built-in dataset explicitly, call data(dataset_name) at the start of your script. After that, you can use the dataset.

``data(iris)``

We will find the variance of the Petal.Length column of the iris dataset.

```r
data(iris)

# Extract the Petal.Length column as a numeric vector
ln <- iris$Petal.Length

var(ln)
```

Output

``[1] 3.116278``

The variance of Petal.Length is 3.116278.
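The same idea extends to several columns at once. The sketch below assumes the standard layout of iris, where columns 1 to 4 are numeric and column 5 is the Species factor. The name cov_matrix is chosen here for illustration.

```r
data(iris)

# Variance of each numeric column, one value per column
sapply(iris[, 1:4], var)

# Passing the numeric columns directly returns the 4 x 4 covariance matrix;
# its diagonal holds the same per-column variances
cov_matrix <- var(iris[, 1:4])
diag(cov_matrix)
```

Use sapply() when only the per-column variances are needed, and var() on the whole data frame when the covariances between columns matter too.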

Sample Variance vs. Population Variance

The difference between sample and population variance lies in the denominator. Population variance is calculated from data that covers the entire population and divides the sum of squared deviations by N, the number of elements. Sample variance is calculated from a sample drawn from the population and divides by n–1 instead, which corrects for the tendency of a sample to understate the spread of the population.

The correction has little effect for large sample sizes, because (n–1)/n is then close to 1. However, it does matter when the sample is small.
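To see how quickly the correction fades, the sketch below computes the ratio (n–1)/n between the two variances for a few arbitrarily chosen sample sizes.

```r
# Ratio of the n-denominator variance to the (n - 1)-denominator variance
n <- c(5, 50, 500, 5000)
correction <- (n - 1) / n

data.frame(n = n, ratio = correction)
# ratio: 0.8, 0.98, 0.998, 0.9998
```

With five observations the two variances differ by 20 percent; with 5000 they differ by 0.02 percent.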

To calculate the population variance, use the following expression.

``mean((x - mean(x)) ^ 2)``

Let’s see how to calculate population variance in R.

```r
population_variance <- function(rv) {
  mean((rv - mean(rv)) ^ 2)
}

weights <- c(60, 55, 50, 65, 59)
population_variance(weights)
```

Output

``[1] 25.36``

Conclusion

The var() function computes the sample variance of a numeric vector, or the covariance matrix of a matrix or data frame. The first argument, x, is the data. The optional second argument, y, is a second vector, matrix, or data frame whose covariance with x is computed.

The variance is a measure of variability. It is calculated by averaging the squared deviations from the mean, and it shows the degree of spread in your dataset. The more spread out the data are around the mean, the larger the variance.

To calculate the variance by hand:

1. First, calculate the mean, which is the average of the numbers.
2. Second, for each number, subtract the mean and square the result (the squared difference).
3. In the last step, take the average of those squared differences.
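
The steps above can be sketched in R as follows, reusing the weights vector from Example 1. The names m and sq_diff are illustrative only.

```r
weights <- c(60, 55, 50, 65, 59)

# Step 1: the mean
m <- mean(weights)                      # 57.8

# Step 2: squared differences from the mean
sq_diff <- (weights - m)^2

# Step 3: average of the squared differences (population variance, divides by n)
sum(sq_diff) / length(weights)          # 25.36

# Dividing by n - 1 instead reproduces the result of var()
sum(sq_diff) / (length(weights) - 1)    # 31.7
```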