The **colMeans()** function is a built-in **R** function that calculates the mean of each column of a matrix, array, or numeric data frame. Its basic call is **colMeans(x, na.rm = FALSE)**, where **x** is the matrix, array, or data frame, and **na.rm** controls whether **NA** values are ignored. It returns the column means of the input.

**Syntax**

`colMeans(x, na.rm = FALSE, dims = 1)`

**Parameters**

**x:** An array of two or more dimensions containing numeric, complex, integer, or logical values, or a numeric data frame.

**dims:** An integer specifying which dimensions are regarded as ‘**columns**’ to average over. For colMeans(), the mean is taken over dimensions 1:dims.

**na.rm:** A logical value. If **TRUE**, NA values are ignored. The default is **FALSE**.
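The effect of **dims** is easiest to see on a three-dimensional array. The following is a minimal sketch; the array `a` and the variable names are made up for illustration and do not appear elsewhere in this article.

```r
# A 2 x 2 x 2 array filled with the values 1 to 8 (illustrative data)
a <- array(1:8, c(2, 2, 2))

# dims = 1 (the default): average over the first dimension only,
# so the result keeps the remaining 2 x 2 shape
m1 <- colMeans(a)
dim(m1)

# dims = 2: average over dimensions 1 and 2 together,
# leaving one mean per slice along the third dimension
m2 <- colMeans(a, dims = 2)
m2   # 2.5 6.5
```

With `dims = 2`, each 2 x 2 slice is collapsed to a single number: the first slice holds 1 to 4 (mean 2.5) and the second holds 5 to 8 (mean 6.5).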

**Example 1: Use the colMeans() function on matrix**

Let’s create a matrix with the matrix() function and calculate the mean of each of its columns.

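```
# rep(1:4) with no repeat count simply returns 1 2 3 4
rv <- rep(1:4)
# Fill a 2 x 2 matrix column by column
mtrx <- matrix(rv, 2, 2)
mtrx
cat("The mean of columns is: ", "\n")
colMeans(mtrx)
```

**Output**

```
     [,1] [,2]
[1,]    1    3
[2,]    2    4
The mean of columns is:
[1] 1.5 3.5
```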

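The **rep()** function replicates the values of a vector a given number of times. Called without a repeat count, as in rep(1:4), it returns the vector 1:4 unchanged.

The **matrix()** function arranges these values into a **2 x 2** matrix, filling it column by column.

The mean of the first column is 1.5 because **1 + 2 = 3** and **3 / 2 = 1.5**. In the same way, the mean of the second column is 3.5 because **3 + 4 = 7** and **7 / 2 = 3.5**.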
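The arithmetic above can be checked directly in R. This sketch rebuilds the same 2 x 2 matrix under the illustrative name `m` and compares colMeans() with a hand computation and with apply(), which is a slower but equivalent way to take a mean per column.

```r
# Same 2 x 2 matrix as in the example (the name m is only for illustration)
m <- matrix(1:4, 2, 2)

# Column sums divided by the number of rows, written out by hand
by_hand <- c(sum(m[, 1]) / nrow(m), sum(m[, 2]) / nrow(m))

# apply() with MARGIN = 2 calls mean() once per column
via_apply <- apply(m, 2, mean)

all.equal(colMeans(m), by_hand)    # TRUE
all.equal(colMeans(m), via_apply)  # TRUE
```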

**Example 2: Calculate the mean of columns of the array in R**

To create an array in R, use the array() function. Let’s create an array and use the **colMeans()** function to calculate the **mean** of columns of the array.

```
arr <- array(1:4, c(2, 2, 2))
arr
cat("The mean of columns is: ", "\n")
colMeans(arr)
```

**Output**

```
, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    1    3
[2,]    2    4

The mean of columns is:
     [,1] [,2]
[1,]  1.5  1.5
[2,]  3.5  3.5
```
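The result here is a matrix rather than a vector because, with the default dims = 1, colMeans() averages only over the first dimension of the array. A short sketch of how to read one entry of that result (the name `res` is an assumption for illustration):

```r
# Same array as in the example: 1:4 is recycled to fill both slices
arr <- array(1:4, c(2, 2, 2))
res <- colMeans(arr)

# Entry [j, k] of the result is the mean of column j in slice k.
# Column 2 of slice 1 holds the values 3 and 4.
res[2, 1]
mean(arr[, 2, 1])
```

Both expressions give 3.5, which is the value in row 2, column 1 of the output above.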

**Example 3: Calculating the mean of columns of a data frame in R**

To create a data frame in R, use the data.frame() function. To calculate the mean of columns of the data frame, use the **colMeans()** function.

```
x <- c(2:4)
y <- c(2:4 * 2)
z <- c(2:4 * 3)
w <- c(2:4 * 4)
df <- data.frame(x, y, z, w)
df
cat("The mean of columns of df is: ", "\n")
colMeans(df)
```

**Output**

```
  x y  z  w
1 2 4  6  8
2 3 6  9 12
3 4 8 12 16
The mean of columns of df is:
 x  y  z  w
 3  6  9 12
```
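As the description of **x** says, the data frame must be numeric. If a data frame also has a text column, colMeans() stops with an error. One possible workaround, sketched here with a hypothetical data frame `df2` and its made-up `id` column, is to keep only the numeric columns first:

```r
# Hypothetical data frame with one character column (id) and two numeric ones
df2 <- data.frame(id = c("a", "b", "c"), x = 2:4, y = c(4, 6, 8))

# colMeans(df2) would fail here because id is not numeric

# Flag the numeric columns, then average only those
num_cols <- sapply(df2, is.numeric)
num_means <- colMeans(df2[, num_cols, drop = FALSE])
num_means
```

The `drop = FALSE` argument keeps the result a data frame even when only one numeric column remains, so colMeans() still receives two-dimensional input.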

**Example 4: Calculate the mean of columns of a data set in R**

You can also calculate the column means of a dataset that ships with R using the **colMeans()** function. We will use the built-in **USArrests** dataset.

`colMeans(USArrests)`

**Output**

```
  Murder  Assault UrbanPop     Rape
   7.788  170.760   65.540   21.232
```

**Example 5: Handling NA Values (na.rm) in colMeans() function**

A common issue when using the **R colMeans()** function is the presence of **NAs** (i.e., missing values) in the data. Let’s see what happens when we apply the function to data with missing values.

```
x <- c(1, 2, NA, 3)
y <- c(NA, 4, 5, 6)
z <- c(7, NA, 8, 9)
w <- c(10, 11, NA, 13)
df <- data.frame(x, y, z, w)
df
cat("The mean of columns of df is: ", "\n")
colMeans(df)
```

**Output**

```
   x  y  z  w
1  1 NA  7 10
2  2  4 NA 11
3 NA  5  8 NA
4  3  6  9 13
The mean of columns of df is:
 x  y  z  w
NA NA NA NA
```

You can see that every result is **NA**. Each column contains one **NA**, and with the default **na.rm = FALSE**, a single missing value makes the mean of that column **NA**.

The fix is simple: pass **na.rm = TRUE** to the function.

```
x <- c(1, 2, NA, 3)
y <- c(NA, 4, 5, 6)
z <- c(7, NA, 8, 9)
w <- c(10, 11, NA, 13)
df <- data.frame(x, y, z, w)
cat("The mean of columns of df is: ", "\n")
colMeans(df, na.rm = TRUE)
```

**Output**

```
The mean of columns of df is:
       x        y        z        w
 2.00000  5.00000  8.00000 11.33333
```

As you can see, it ignored the **NA** values and calculated the mean of the remaining values in each column.

Please note that handling missing values is a research topic in itself. Just ignoring **NA** values is usually not the best idea.
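Before dropping missing values, it can help to check how many there are in each column. The sketch below is one simple way to do that with colSums() and is.na() on the same data as above. The names `na_count` and `n_used` are illustrative, and this is only a quick diagnostic, not a full strategy for missing data.

```r
# Same data as in the example above
df <- data.frame(
  x = c(1, 2, NA, 3),
  y = c(NA, 4, 5, 6),
  z = c(7, NA, 8, 9),
  w = c(10, 11, NA, 13)
)

# is.na() gives a logical matrix; summing its columns counts the NAs
na_count <- colSums(is.na(df))

# Number of values each mean is actually based on when na.rm = TRUE
n_used <- colSums(!is.na(df))

na_count
n_used
```

Here every column has one missing value, so each mean computed with na.rm = TRUE is based on three values rather than four.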

**Conclusion**

The **colMeans()** function in **R** calculates the arithmetic mean of the values in each column of a matrix, array, or numeric data frame. For a matrix or data frame, it returns a numeric vector with one mean per column.

Krunal Lathiya is a Software Engineer with over eight years of experience. He has developed a strong foundation in computer science principles and a passion for problem-solving. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in R Language.