R across(): Apply a Function (or Functions) Across Multiple Columns

The across() function from the dplyr package in R applies one or more functions to multiple columns of a data frame or tibble in a single call.

It is used inside verbs such as mutate() and summarise() to transform or summarize many columns at once.

You can specify columns using tidy selection helpers such as everything(), starts_with(), ends_with(), contains(), and where().
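As a minimal sketch of column selection, the snippet below uses a hypothetical data frame (scores, with made-up column names) and assumes dplyr 1.0.0 or later is installed. It rounds only the columns whose names start with "test_", then averages every numeric column selected with where(is.numeric).

```r
library(dplyr)

# Hypothetical data frame for illustration only
scores <- data.frame(
  name   = c("a", "b", "c"),
  test_1 = c(1.26, 2.54, 3.81),
  test_2 = c(4.15, 5.55, 6.95),
  bonus  = c(10, 20, 30)
)

# Round only the columns whose names start with "test_"
rounded <- scores %>%
  mutate(across(starts_with("test_"), round))

# Average every numeric column; the character column `name` is not selected
averages <- scores %>%
  summarise(across(where(is.numeric), mean))

rounded
averages
```

Columns that the helper does not match (here name and bonus in the first call) are left untouched by mutate().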

Syntax

across(.cols, .fns, ..., .names = NULL)

Parameters

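  1. .cols: The columns to transform, given as a tidy selection.
  2. .fns: A function, a purrr-style lambda (for example, ~ .x / 2), or a named list of functions to apply to each selected column.
  3. ...: Additional arguments passed to the functions in .fns. This argument is deprecated as of dplyr 1.1.0; supply extra arguments inside an anonymous function instead.
  4. .names: A glue specification for naming the output columns, useful when applying multiple functions.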
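To make the parameters concrete, here is a small sketch assuming dplyr 1.0.0 or later and a made-up data frame (df_na, a hypothetical name) that contains a missing value. The extra argument na.rm = TRUE is supplied inside an anonymous function rather than through ..., since dplyr 1.1.0 deprecates passing arguments that way, and .names sets the output column names.

```r
library(dplyr)

# Hypothetical data with one missing value
df_na <- data.frame(
  col1 = c(1, NA, 3),
  col2 = c(4, 5, 6)
)

# .cols = everything(), .fns = anonymous function, .names = naming pattern
result <- df_na %>%
  summarise(across(
    everything(),
    function(x) mean(x, na.rm = TRUE),
    .names = "mean_{.col}"
  ))

result
```

The result should have one row with the columns mean_col1 and mean_col2.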

Return value

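It returns a tibble with one column for each column in .cols and each function in .fns. The surrounding verb, such as mutate() or summarise(), merges these columns into its own result.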
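The shape of the final result depends on the verb that wraps across(). The following sketch, which uses a small made-up data frame (shape_df, a hypothetical name), compares the dimensions produced by mutate() and summarise().

```r
library(dplyr)

# Hypothetical data frame for illustration only
shape_df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6)
)

# Inside mutate(): the row count is unchanged and the selected
# columns are overwritten in place (no .names given)
doubled <- shape_df %>%
  mutate(across(everything(), function(x) x * 2))

# Inside summarise(): each column collapses to a single value
totals <- shape_df %>%
  summarise(across(everything(), sum))

dim(doubled)
dim(totals)
```

With a single function and no .names argument, the output columns keep the names of the input columns.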

Visual representation

[Figure: Visual representation of dplyr across() function in R]

Example 1: Applying a single function

library(dplyr)

# Sample data frame
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("---Before applying single function to multiple columns", "\n")

df

cat("---After applying single function to multiple columns", "\n")

# Square every column in place
df %>%
  mutate(across(everything(), function(x) x^2))

Output

[Figure: Output of using across() function to apply single function on multiple data frame columns]

Example 2: Summarizing data with multiple functions

library(dplyr)

# Sample data frame
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("---Before applying multiple functions to multiple columns", "\n")

df

cat("---After applying multiple functions to multiple columns", "\n")

# Compute the mean and standard deviation of every column
df %>%
  summarise(across(everything(), list(mean = mean, sd = sd)))

Output

[Figure: Output of using across() function to apply multiple functions to multiple columns in a data frame]

Example 3: Applying different functions to different columns

library(dplyr)

# Sample data frame
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("---Before applying different functions to different columns", "\n")

df

cat("---After applying different functions to different columns", "\n")

# Take the square root of col1; divide col2 and col3 by 1000
df %>%
  mutate(
    across(c(col1), sqrt),
    across(c(col2, col3), ~ .x / 1000)
  )

Output

[Figure: Output of applying different functions to different columns]

Example 4: Using conditional logic within across()

[Figure: Visual representation of using conditional logic within across()]

library(dplyr)

# Sample data frame
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("---Before using conditional logic", "\n")

df

cat("---After using conditional logic", "\n")

# Multiply values greater than 6 by 10; leave all other values unchanged
df %>%
  mutate(across(where(is.numeric), ~ ifelse(.x > 6, .x * 10, .x)))

Output

[Figure: Output of applying conditional logic to the columns of data frame]

Example 5: Renaming columns when applying multiple functions

library(dplyr)

# Sample data frame
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6),
  col3 = c(7, 8, 9)
)
cat("---Before renaming multiple columns", "\n")

df

cat("---After renaming multiple columns", "\n")

# Add the minimum and maximum of every column as new, named columns
df %>%
  mutate(across(
    everything(),
    list(min = min, max = max),
    .names = "{.col}_{.fn}"
  ))

Output

[Figure: Output of renaming multiple columns]

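Here, the minimum and maximum values are calculated for each column and added as new columns named with the pattern "original column name_function name", for example col1_min and col1_max. Because mutate() keeps every row, each minimum or maximum is repeated down its column, and the original columns are kept.

When using .names, {.col} and {.fn} are placeholders for the original column name and the function name, respectively. The pattern "{.col}_{.fn}" is also the default when .fns is a named list.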
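To illustrate how the placeholders can be rearranged, the sketch below uses a different, made-up pattern ("{.fn}_of_{.col}") on a small hypothetical data frame (names_df). It assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical data frame for illustration only
names_df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(4, 5, 6)
)

# Put the function name first and the column name last
renamed <- names_df %>%
  summarise(across(
    everything(),
    list(min = min, max = max),
    .names = "{.fn}_of_{.col}"
  ))

names(renamed)
```

The output columns are ordered by input column first and by function second, so both col1 results come before the col2 results.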
