# What is the colMeans() Function in R (5 Examples)

The colMeans() function is a built-in R function that calculates the mean of each column of a matrix, array, or numeric data frame. Its syntax is colMeans(x, na.rm = FALSE, dims = 1), where x is the matrix, array, or data frame, na.rm controls whether NA values are ignored, and dims specifies which dimensions are treated as columns. It returns the mean of each column of the input.

### Syntax

``colMeans(x, na.rm = FALSE, dims = 1)``

### Parameters

x: It is an array of two or more dimensions containing numeric, complex, integer, or logical values, or a numeric data frame.

dims: It is an integer that specifies which dimensions are regarded as ‘columns’ to average over. The mean is taken over dimensions 1:dims. The default is 1.

na.rm: It is a logical argument. If TRUE, NA values are ignored.
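To make the dims argument more concrete, here is a minimal sketch using a hypothetical 2 x 2 x 2 array (the name arr3 and its values are illustrative only). With dims = 1 the mean is taken over the first dimension only; with dims = 2 it is taken over the first two dimensions together.

```r
# Hypothetical 2 x 2 x 2 array holding the values 1 to 8
arr3 <- array(1:8, dim = c(2, 2, 2))

# dims = 1 (default): average over dimension 1 only,
# leaving a 2 x 2 matrix indexed by dimensions 2 and 3
by_column <- colMeans(arr3)
print(by_column)

# dims = 2: average over dimensions 1 and 2 together,
# leaving one mean per slice along dimension 3
by_slice <- colMeans(arr3, dims = 2)
print(by_slice)
```

For a plain matrix or data frame there are only two dimensions, so the default dims = 1 is normally what you want.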

### Example 1: Use the colMeans() function on matrix

Let’s create a matrix using the matrix() function and calculate the mean of each of its columns.

``````rv <- rep(1:4)

mtrx <- matrix(rv, 2, 2)
mtrx
cat("The mean of columns is: ", "\n")
colMeans(mtrx)``````

Output

``````     [,1] [,2]
[1,]    1    3
[2,]    2    4

The mean of columns is:

[1] 1.5 3.5``````

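The rep() function replicates numeric values, text, or the values of a vector a specific number of times. Here, rep(1:4) is called without a times argument, so it simply returns the vector 1 2 3 4.

The matrix() function then arranges these four values into a 2 x 2 matrix, filling it column by column.

The mean of the first column is 1.5 because 1 + 2 = 3 and 3 / 2 = 1.5. In the same way, the mean of the second column is 3.5 because 3 + 4 = 7 and 7 / 2 = 3.5.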
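The arithmetic above can be cross-checked in R. This is a small sketch that recomputes the column means by hand and with apply(); the variable names fast, by_hand, and via_apply are illustrative.

```r
mtrx <- matrix(1:4, 2, 2)

# Column means computed by colMeans()
fast <- colMeans(mtrx)

# The same means written out by hand: (1 + 2) / 2 and (3 + 4) / 2
by_hand <- c(sum(mtrx[, 1]) / 2, sum(mtrx[, 2]) / 2)

# The same means via apply() over the column margin (MARGIN = 2)
via_apply <- apply(mtrx, 2, mean)

print(fast)
print(all.equal(fast, by_hand))
print(all.equal(fast, via_apply))
```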

### Example 2: Calculate the mean of columns of the array in R

To create an array in R, use the array() function. Let’s create an array and use the colMeans() function to calculate the mean of columns of the array.

``````arr <- array(1:4, c(2, 2, 2))
arr
cat("The mean of columns is: ", "\n")
colMeans(arr)``````

Output

``````, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    1    3
[2,]    2    4

The mean of columns is:
     [,1] [,2]
[1,]  1.5  1.5
[2,]  3.5  3.5``````
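The result is a matrix rather than a vector because, with the default dims = 1, colMeans() averages over the first dimension only. As a sketch of how to read it, entry [j, k] of the result is the mean of arr[, j, k]. The apply() call below is used purely as a check, and the names res and check are illustrative.

```r
arr <- array(1:4, c(2, 2, 2))
res <- colMeans(arr)

# Entry [2, 1] is the mean of column 2 in slice 1: (3 + 4) / 2
print(res[2, 1])
print(mean(arr[, 2, 1]))

# apply() over dimensions 2 and 3 computes the same means
check <- apply(arr, c(2, 3), mean)
print(all.equal(res, check))
```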

### Example 3: Calculating the mean of columns of a data frame in R

To create a data frame in R, use the data.frame() function. To calculate the mean of columns of the data frame, use the colMeans() function.

``````x <- c(2:4)
y <- c(2:4 * 2)
z <- c(2:4 * 3)
w <- c(2:4 * 4)

df <- data.frame(x, y, z, w)
df
cat("The mean of columns of df is: ", "\n")
colMeans(df)``````

Output

``````  x y  z  w
1 2 4  6  8
2 3 6  9 12
3 4 8 12 16

The mean of columns of df is:

 x  y  z  w
 3  6  9 12``````
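colMeans() expects every column of a data frame to be numeric, so a data frame that also holds text columns needs some preparation first. As a hedged sketch of one common workaround (the data frame df_mixed and its name column are hypothetical), the non-numeric columns can be dropped before calling the function:

```r
# Hypothetical data frame with one character column
df_mixed <- data.frame(
  name = c("a", "b", "c"),
  x = c(2, 3, 4),
  y = c(4, 6, 8)
)

# Flag the numeric columns, keep only those, then average them
is_num <- sapply(df_mixed, is.numeric)
num_means <- colMeans(df_mixed[, is_num, drop = FALSE])
print(num_means)
```

The drop = FALSE argument keeps the selection a data frame even if only one numeric column remains.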

### Example 4: Calculate the mean of columns of a data set in R

You can calculate the mean of columns of the dataset in R using the colMeans() function. We will use the USArrests dataset.

``colMeans(USArrests)``

Output

``````  Murder  Assault UrbanPop     Rape
   7.788  170.760   65.540   21.232``````

### Example 5: Handling NA Values (na.rm) in colMeans() function

One of the most common issues with the R colMeans() function is the presence of NAs (i.e., missing values) in the data. Let’s see what happens when we apply the function to data with missing values.

``````x <- c(1, 2, NA, 3)
y <- c(NA, 4, 5, 6)
z <- c(7, NA, 8, 9)
w <- c(10, 11, NA, 13)

df <- data.frame(x, y, z, w)
df
cat("The mean of columns of df is: ", "\n")
colMeans(df)``````

Output

``````   x  y  z  w
1  1 NA  7 10
2  2  4 NA 11
3 NA  5  8 NA
4  3  6  9 13

The mean of columns of df is:

 x  y  z  w
NA NA NA NA``````

All four means are NA because every column contains an NA, and with the default na.rm = FALSE a single missing value makes the whole column mean NA.

But no worries, there is an easy solution: pass na.rm = TRUE to the colMeans() function.

``````x <- c(1, 2, NA, 3)
y <- c(NA, 4, 5, 6)
z <- c(7, NA, 8, 9)
w <- c(10, 11, NA, 13)

df <- data.frame(x, y, z, w)
cat("The mean of columns of df is: ", "\n")
colMeans(df, na.rm = TRUE)``````

Output

``````The mean of columns of df is:
       x        y        z        w
 2.00000  5.00000  8.00000 11.33333``````

As you can see, it ignored the NA values and calculated the mean of the remaining values in each column.

Please note that handling missing values is a research topic in itself. Just ignoring NA values is usually not the best idea.
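Before dropping missing values, it can help to see how many each column has. A minimal sketch, reusing the data frame from this example together with the base R functions is.na() and colSums() (the names na_counts and na_share are illustrative):

```r
x <- c(1, 2, NA, 3)
y <- c(NA, 4, 5, 6)
z <- c(7, NA, 8, 9)
w <- c(10, 11, NA, 13)
df <- data.frame(x, y, z, w)

# Count the missing values in each column
na_counts <- colSums(is.na(df))
print(na_counts)

# Share of each column that is missing
na_share <- colMeans(is.na(df))
print(na_share)
```

If a column is mostly NA, a mean computed with na.rm = TRUE rests on very few values and should be read with caution.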

## Conclusion

The colMeans() function in R calculates the arithmetic mean of the values in each column of a matrix, array, or numeric data frame. For a matrix or data frame, it returns a numeric vector with one mean per column; for a higher-dimensional array, it returns an array of means.