The colMeans() function in R calculates the arithmetic mean of each column in a numeric matrix, data frame, or array. For each column, it sums the elements and divides by the number of elements (or by the number of non-missing elements when na.rm = TRUE).
In the above figure, we defined an input data frame, df, that contains three columns.
The colMeans() function accepts the whole data frame and returns a column-wise mean.
Here is the program that demonstrates the above figure:
df <- data.frame(
col1 = c(1, 2, 3),
col2 = c(4, 5, 6),
col3 = c(7, 8, 9)
)
# Calculate the mean of every column.
colMeans(df)
# Output:
# col1 col2 col3
# 2 5 8
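To make the sum-and-divide description concrete, here is a minimal sketch that recomputes the same means by hand and compares them with colMeans(). The variable names (manual, via_sapply, builtin) are illustrative only.

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Column sums divided by the number of rows.
manual <- colSums(df) / nrow(df)

# The same result using mean() on each column.
via_sapply <- sapply(df, mean)

# The built-in shortcut.
builtin <- colMeans(df)

print(all.equal(manual, builtin))      # TRUE
print(all.equal(via_sapply, builtin))  # TRUE
```

All three approaches agree here. colMeans() is usually the shortest way to write it.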
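colMeans(x, na.rm = FALSE, dims = 1)
| Argument | Description |
| x | An array of two or more dimensions containing numeric, complex, integer, or logical values, or a numeric data frame. |
| na.rm | A logical value. If TRUE, NA values are ignored. The default is FALSE. |
| dims | An integer that sets which dimensions are treated as columns; it matters only for arrays with more than two dimensions. The default is 1. |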
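Since x may hold logical values, a small sketch can show what the mean of a logical column looks like: TRUE counts as 1 and FALSE as 0, so the result is the share of TRUE values. The matrix flags and its column names are made up for illustration.

```r
# A logical matrix with three rows and two columns (hypothetical data).
flags <- matrix(
  c(TRUE, FALSE, TRUE,    # first column
    TRUE, FALSE, FALSE),  # second column
  nrow = 3,
  dimnames = list(NULL, c("passed", "flagged"))
)

# Each column mean is the proportion of TRUE values in that column.
share_true <- colMeans(flags)
print(share_true)
```

Here passed contains two TRUE values out of three and flagged contains one out of three.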
What if a data frame contains NA values, and we try to calculate the mean of every column?
df <- data.frame(
col1 = c(NA, 2, 3),
col2 = c(4, NA, 6),
col3 = c(7, 8, NA)
)
# Calculate the mean of every column.
colMeans(df)
# Output:
# col1 col2 col3
# NA NA NA
As the above figure and code show, every column returns NA. Each column contains a missing value and na.rm defaults to FALSE, so the column sum is NA and the mean is NA as well.
To ignore NA values in a data frame, you need to pass na.rm = TRUE.
# Create a data frame.
df <- data.frame(
col1 = c(NA, 2, 3),
col2 = c(4, NA, 6),
col3 = c(7, 8, NA)
)
# Calculate the mean of every column.
colMeans(df, na.rm = TRUE)
# Output:
# col1 col2 col3
# 2.5 5.0 7.5
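With na.rm = TRUE, each column is divided by its own count of non-missing values rather than by the number of rows. The sketch below checks that by hand and also shows what happens when a column has no non-missing values at all. The all_missing data frame is a hypothetical example.

```r
df <- data.frame(
  col1 = c(NA, 2, 3),
  col2 = c(4, NA, 6),
  col3 = c(7, 8, NA)
)

# Count the non-missing values in each column (2 in every column here).
non_missing <- colSums(!is.na(df))

# Sum the non-missing values and divide by their count.
manual <- colSums(df, na.rm = TRUE) / non_missing
builtin <- colMeans(df, na.rm = TRUE)
print(all.equal(manual, builtin))  # TRUE

# A column that is entirely NA has nothing left to average.
all_missing <- data.frame(a = c(NA_real_, NA_real_), b = c(1, 3))
edge_case <- colMeans(all_missing, na.rm = TRUE)
print(edge_case)  # a is NaN, b is 2
```

The NaN for column a is worth checking for after the call if your data may contain fully missing columns.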
You can calculate the mean of specific columns by subsetting the data frame, by column name or by index, before passing it to colMeans().
df <- data.frame(
col1 = c(1, 2, 3),
col2 = c(4, 5, 6),
col3 = c(7, 8, 9)
)
# Calculate the mean of col1 and col3.
colMeans(df[c("col1", "col3")])
# Output:
# col1 col3
# 2 8
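A few variations on column selection, sketched under the assumption that the same df is in use. The mixed data frame and its column names are hypothetical. The key point is that colMeans() needs a two-dimensional input, so a single column must stay a data frame rather than collapse into a vector.

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Select by position instead of by name.
by_index <- colMeans(df[c(1, 3)])
print(by_index)  # col1 = 2, col3 = 8

# A single column: df["col1"] stays a data frame, so this works.
one_col <- colMeans(df["col1"])

# df[, "col1"] drops to a plain vector, which colMeans() rejects.
vector_attempt <- tryCatch(
  colMeans(df[, "col1"]),
  error = function(e) "error"
)
print(vector_attempt)  # "error"

# Keep only the numeric columns of a mixed data frame (hypothetical data).
mixed <- data.frame(id = c("a", "b", "c"), score = c(10, 20, 30))
numeric_only <- colMeans(mixed[sapply(mixed, is.numeric)])
print(numeric_only)  # score = 20
```

For a single vector, mean() is the simpler choice.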
For row-wise means, you can use the rowMeans() function.
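As a brief sketch of the row-wise counterpart, here is rowMeans() applied to the same three-column data frame used earlier. The variable name row_avg is illustrative.

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Average across the columns within each row:
# row 1 is (1, 4, 7), row 2 is (2, 5, 8), row 3 is (3, 6, 9).
row_avg <- rowMeans(df)
print(row_avg)
```

The result has one value per row (4, 5, and 6 here) instead of one per column.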
A matrix has rows and columns, and colMeans() averages down each column, just as it does for a data frame.
mat <- matrix(1:6,
nrow = 2, ncol = 3,
dimnames = list(NULL, c("A", "B", "C"))
)
# matrix() fills column by column, so A = (1, 2), B = (3, 4), C = (5, 6).
colMeans(mat)
# Output:
# A B C
# 1.5 3.5 5.5
And it returns the mean of each column.
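colMeans() also works on arrays with more than two dimensions, where its dims argument (default 1) decides how many leading dimensions are averaged over. The following is a minimal sketch with a made-up 2 x 3 x 4 array. The names arr, per_slice, and per_block are illustrative.

```r
# A 2 x 3 x 4 array filled with 1 to 24 (hypothetical data).
arr <- array(1:24, dim = c(2, 3, 4))

# Default dims = 1: average over the first dimension only.
# The result keeps the remaining dimensions, so it is a 3 x 4 matrix.
per_slice <- colMeans(arr)
print(dim(per_slice))   # 3 4
print(per_slice[1, 1])  # mean of 1 and 2

# dims = 2: average over the first two dimensions together.
# The result has one value per 2 x 3 slice, so it has length 4.
per_block <- colMeans(arr, dims = 2)
print(per_block)
```

With dims = 2, each value is the mean of six consecutive numbers, for example 1 to 6 for the first slice.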
If the input matrix contains a single element, colMeans() returns that element, because the mean of a single value is the value itself.
single_mat <- matrix(5)
colMeans(single_mat)
# Output: 5
That’s all!
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.