
colMeans(): Calculating the Mean of Columns in R Data Frame

The colMeans() function in R calculates the arithmetic mean of each column in a numeric matrix, data frame, or array. For each column, it sums the elements and divides by the number of elements (or by the number of non-missing elements when na.rm = TRUE).

In the above figure, we defined an input data frame, df, that contains three columns.

The colMeans() function accepts the whole data frame and returns a column-wise mean.

Here is the program that demonstrates the above figure:

df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Calculate the mean of every column.
colMeans(df)

#  Output:
#  col1    col2    col3
#    2       5      8
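
As a quick sanity check on the definition above, the following sketch compares colMeans() with two equivalent base R computations: dividing colSums() by the row count, and applying mean() to each column with sapply(). The variable names (via_colmeans, via_colsums, via_sapply) are illustrative only.

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)

# Built-in column means.
via_colmeans <- colMeans(df)

# Same result by hand: column sums divided by the number of rows.
via_colsums <- colSums(df) / nrow(df)

# Same result by applying mean() to each column.
via_sapply <- sapply(df, mean)

print(via_colmeans)
print(all.equal(via_colmeans, via_colsums))
print(all.equal(via_colmeans, via_sapply))
```

All three approaches should agree here; colMeans() is usually the most convenient choice when every column is numeric.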

Syntax

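colMeans(x, na.rm = FALSE, dims = 1)

Parameters

Argument    Description
x           An array of two or more dimensions containing numeric, complex, integer, or logical values, or a numeric data frame.
na.rm       Logical. If TRUE, NA values are omitted from the calculation. The default is FALSE.
dims        Integer. For colMeans(), the mean is taken over dimensions 1:dims. The default, 1, averages over the rows and returns one mean per column of a matrix.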
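
To make the description of x concrete, here is a small sketch with two inputs that are not plain numeric data frames: a logical matrix, where each column mean is the proportion of TRUE values, and a three-dimensional array. The array call also uses dims, an optional third argument of colMeans() in base R that sets how many leading dimensions are averaged over. The object names (passed, arr, m1, m2) are made up for illustration.

```r
# Logical matrix: TRUE counts as 1 and FALSE as 0,
# so each column mean is the share of TRUE values.
passed <- matrix(
  c(TRUE, FALSE, TRUE, TRUE,
    FALSE, FALSE, TRUE, FALSE),
  nrow = 4,
  dimnames = list(NULL, c("test1", "test2"))
)
share_true <- colMeans(passed)
print(share_true)  # test1 = 0.75, test2 = 0.25

# A 2 x 3 x 4 array filled with 1..24.
arr <- array(1:24, dim = c(2, 3, 4))

# Default dims = 1: average over the first dimension only,
# which leaves a 3 x 4 matrix of means.
m1 <- colMeans(arr)
print(dim(m1))

# dims = 2: average over the first two dimensions,
# which leaves one mean per 2 x 3 slice.
m2 <- colMeans(arr, dims = 2)
print(m2)  # 3.5 9.5 15.5 21.5
```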

    Mean of each column, excluding NA values

    What if a data frame contains NA values, and we try to calculate the mean of every column?

    df <- data.frame(
      col1 = c(NA, 2, 3),
      col2 = c(4, NA, 6),
      col3 = c(7, 8, NA)
    )
    
    # Calculate the mean of every column.
    colMeans(df)
    
    #  Output:
    #  col1  col2   col3
    #   NA    NA     NA

    Every column returns NA, as shown in the above figure and code, because each column contains at least one NA (a missing value) and, by default, the mean of a set of values that includes NA is itself NA.

    To ignore NA values in a data frame, you need to pass na.rm = TRUE.

    # Create a data frame.
    df <- data.frame(
      col1 = c(NA, 2, 3),
      col2 = c(4, NA, 6),
      col3 = c(7, 8, NA)
    )
    
    # Calculate the mean of every column.
    colMeans(df, na.rm = TRUE)
    
    # Output:
    #  col1   col2   col3
    #  2.5    5.0    7.5
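
The sketch below illustrates what na.rm = TRUE changes in the arithmetic: each column sum is divided by the number of non-missing values in that column, not by the total row count. It also shows an edge case worth knowing: a column made up entirely of NA values has nothing left to average, so its mean is NaN. The all_na column and the helper names (non_missing, manual) are hypothetical additions for illustration.

```r
df <- data.frame(
  col1   = c(NA, 2, 3),
  col2   = c(4, NA, 6),
  all_na = c(NA_real_, NA_real_, NA_real_)
)

means <- colMeans(df, na.rm = TRUE)

# Manual equivalent: sum of non-missing values
# divided by the count of non-missing values.
non_missing <- colSums(!is.na(df))
manual <- colSums(df, na.rm = TRUE) / non_missing

print(non_missing)  # col1 = 2, col2 = 2, all_na = 0
print(means)        # col1 = 2.5, col2 = 5, all_na = NaN
print(manual)
```

If an all-NA column is possible in your data, checking the result with is.nan() before using it downstream is a reasonable precaution.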
    

    Calculating the mean of specific columns

    You can calculate the mean of specific columns by subsetting the data frame with their names or indices before passing it to colMeans().

    df <- data.frame(
      col1 = c(1, 2, 3),
      col2 = c(4, 5, 6),
      col3 = c(7, 8, 9)
    )
    
    # Calculate the mean of col1 and col3.
    colMeans(df[c("col1", "col3")])
    
    #  Output:
    #  col1    col3
    #    2      8
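
Column selection also matters when a data frame mixes numeric and non-numeric columns, because colMeans() stops with an error if any column is non-numeric. A minimal sketch, assuming a hypothetical data frame with an id column, that selects columns by position and then keeps only the numeric ones:

```r
df <- data.frame(
  id   = c("a", "b", "c"),
  col1 = c(1, 2, 3),
  col3 = c(7, 8, 9)
)

# Select by position: columns 2 and 3.
by_index <- colMeans(df[c(2, 3)])

# Keep only the numeric columns, whatever their names are.
is_num <- sapply(df, is.numeric)
numeric_only <- colMeans(df[is_num])

print(by_index)      # col1 = 2, col3 = 8
print(numeric_only)  # col1 = 2, col3 = 8
```

Filtering with is.numeric avoids hard-coding column names, which may be convenient when the set of columns changes between data sets.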

    For row-wise means, you can use the rowMeans() function.

    With a Matrix

    A matrix contains rows and columns, and we are only interested in the columns. colMeans() works the same way on a matrix as it does on a data frame.

    mat <- matrix(1:6,
      nrow = 2, ncol = 3,
      dimnames = list(NULL, c("A", "B", "C"))
    )
    
    colMeans(mat)
    
    # Output:
    #    A     B     C
    #   1.5   3.5   5.5

    And it returns the mean of each column.
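
Because matrix() fills values column by column unless byrow = TRUE is set, it helps to print the matrix before reading the means. The sketch below does that and cross-checks colMeans() against apply() with mean over margin 2 (columns). The names fast and slow are illustrative and are not a benchmark result.

```r
mat <- matrix(1:6,
  nrow = 2, ncol = 3,
  dimnames = list(NULL, c("A", "B", "C"))
)

# Column-major fill: A holds 1 and 2, B holds 3 and 4, C holds 5 and 6.
print(mat)

fast <- colMeans(mat)
slow <- apply(mat, 2, mean)  # margin 2 = columns

print(fast)  # A = 1.5, B = 3.5, C = 5.5
print(all.equal(fast, slow))
```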

    Matrix with a single element

    If the input matrix contains a single element, colMeans() returns that element, because the mean of a single value is the value itself.

    single_mat <- matrix(5)
    
    colMeans(single_mat)
    
    # Output: 5
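
A related edge case: colMeans() needs an input with at least two dimensions, so a plain vector is rejected even though a 1 x 1 matrix is accepted. The sketch below, using illustrative names, catches that error and shows two ways around it.

```r
v <- c(5, 7, 9)

# A bare vector has no dim attribute, so colMeans() raises an error.
result <- tryCatch(colMeans(v), error = function(e) "error")
print(result)  # "error"

# Option 1: use mean() for a single vector.
vector_mean <- mean(v)
print(vector_mean)  # 7

# Option 2: reshape the vector into a one-column matrix first.
matrix_mean <- colMeans(matrix(v, ncol = 1))
print(matrix_mean)  # 7
```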

    That’s all!
