Here are three ways to calculate the mean by group in R:
- Using aggregate()
- Using group_by() with summarize() from the dplyr package
- Using data.table
Method 1: Using aggregate()
The aggregate() function splits a data frame into groups, applies a function to each group, and returns the results in a new data frame.
df <- data.frame(
name = c("Krunal", "Ankit", "Rushabh", "Krunal"),
score = c(85, 90, 78, 95),
subject = c("Math", "Math", "History", "Sanskrit"),
grade = c("10th", "11th", "11th", "10th")
)
print(df)
cat("----Calculating the average by name----", "\n")
aggregate(df$score, list(df$name), FUN = mean)
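Output
The result has one row per name, with the group label in column Group.1 and the mean in column x: Ankit 90, Krunal 90 (the mean of 85 and 95), and Rushabh 78.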
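The call above returns the generic column names Group.1 and x. As a minimal sketch of an alternative, aggregate() also accepts a formula, which keeps the original column names in the result. The variable name by_name is an illustrative choice, not part of the original example.

```r
df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Krunal"),
  score = c(85, 90, 78, 95)
)

# score ~ name reads as "score, grouped by name"
by_name <- aggregate(score ~ name, data = df, FUN = mean)
print(by_name)
```

The formula form is often easier to read when grouping by several columns, for example score ~ name + grade.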
Method 2: Using group_by() with summarize()
You can also use the group_by() and summarize() functions from the dplyr package to calculate the mean score for each name.
library("dplyr")
df <- data.frame(
name = c("Krunal", "Ankit", "Rushabh", "Krunal"),
score = c(85, 90, 78, 95),
subject = c("Math", "Math", "History", "Sanskrit"),
grade = c("10th", "11th", "11th", "10th")
)
print(df)
cat("----Calculating the average by name----", "\n")
df %>% group_by(name) %>% summarize(average = mean(score))
Output
The result is a tibble with one row per name: Ankit 90, Krunal 90, and Rushabh 78.
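Real data often contains missing scores, and mean() returns NA for any group that includes one. The sketch below shows one way to handle that with na.rm = TRUE. The data frame df_na and the column name average are hypothetical, made up for this illustration.

```r
library(dplyr)

# Hypothetical data: same layout as df, but Krunal's second score is missing
df_na <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Krunal"),
  score = c(85, 90, 78, NA)
)

# na.rm = TRUE drops missing values before the mean is computed per group
avg_by_name <- df_na %>%
  group_by(name) %>%
  summarize(average = mean(score, na.rm = TRUE))

print(avg_by_name)
```

Without na.rm = TRUE, the average for Krunal would be NA rather than 85.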
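Method 3: Using data.table
The data.table package groups rows with the by argument inside the square brackets. Convert the data frame with setDT(), then compute mean(score) for each name.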
library("data.table")
df <- data.frame(
name = c("Krunal", "Ankit", "Rushabh", "Krunal"),
score = c(85, 90, 78, 95),
subject = c("Math", "Math", "History", "Sanskrit"),
grade = c("10th", "11th", "11th", "10th")
)
print(df)
setDT(df)
cat("----Calculating the average by group using data.table----", "\n")
df[, list(average = mean(score)), by = name]
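Output
The result has one row per name, in order of first appearance, with the mean in the average column: Krunal 90, Ankit 90, and Rushabh 78.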
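The same pattern works for any grouping column. As a rough sketch, the example below groups by grade instead of name and uses .(), data.table's shorthand for list(). It also uses as.data.table(), which returns a copy, so the original data frame is left unchanged. The names dt and avg_by_grade are illustrative only.

```r
library(data.table)

df <- data.frame(
  name = c("Krunal", "Ankit", "Rushabh", "Krunal"),
  score = c(85, 90, 78, 95),
  grade = c("10th", "11th", "11th", "10th")
)

# as.data.table() returns a copy, so df itself stays a plain data frame
dt <- as.data.table(df)

# .() is shorthand for list(); by = grade computes one mean per grade
avg_by_grade <- dt[, .(average = mean(score)), by = grade]
print(avg_by_grade)
```

Here the 10th grade averages 90 (from 85 and 95) and the 11th grade averages 84 (from 90 and 78).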
That’s it!
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.