How to Calculate Relative Frequencies in R

Here are two ways to calculate relative frequencies in R:

  1. Using the dplyr package
  2. Using frequency tables

Method 1: Using the dplyr package

  1. Count the number of occurrences of each value or group.
  2. Calculate the sum of all counts to get the total.
  3. Divide each count by the total to get the relative frequency.
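The three steps above can be sketched with intermediate objects so each one is visible. This is a minimal illustration only: the `fruit` data frame and its `kind` column are hypothetical and do not come from the examples in this article.

```r
library(dplyr)

# Hypothetical data, used only to illustrate the three steps
fruit <- data.frame(kind = c("apple", "pear", "apple", "apple", "pear"))

# Step 1: count the occurrences of each value
counts <- fruit %>%
  group_by(kind) %>%
  summarise(count = n())

# Step 2: sum the counts to get the total
total <- sum(counts$count)

# Step 3: divide each count by the total
counts$relative_frequency <- counts$count / total
counts
```

The examples below fold these steps into a single pipeline.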

Example 1: Calculating the Relative Frequency of One Variable

[Figure: relative frequency of one variable]

library(dplyr)

df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  Score = c(85, 90, 88, 78, 92, 80, 87)
)

df %>%
  group_by(Gender) %>%
  summarise(count = n()) %>%
  mutate(relative_frequency = count / sum(count))

Output

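The result has one row per gender: Female with a count of 3 and a relative frequency of about 0.429, and Male with a count of 4 and a relative frequency of about 0.571.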

In this code:

  1. group_by(Gender) groups the data by Gender.
  2. summarise(count = n()) counts the number of rows in each gender group.
  3. mutate(relative_frequency = count / sum(count)) calculates the relative frequency by dividing each count by the total count.
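As a shorter variant of the same calculation, dplyr's count() combines the grouping and counting steps. The sketch below assumes the same Gender values as the example above; note that count() names its count column n by default, and the name `result` is just an illustrative choice.

```r
library(dplyr)

df <- data.frame(
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male")
)

# count(Gender) is shorthand for group_by(Gender) followed by summarise(n = n())
result <- df %>%
  count(Gender) %>%
  mutate(relative_frequency = n / sum(n))

result
```

Both forms should give the same proportions; the group_by() and summarise() version is easier to extend when you need more summary columns.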

Example 2: Calculating the Relative Frequency of Multiple Variables

To calculate the relative frequency of multiple variables simultaneously using dplyr, use the group_by() function with multiple grouping variables followed by the summarise() and mutate() functions.

[Figure: relative frequency of multiple variables]

library(dplyr)

df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  Score = c(85, 90, 88, 78, 92, 80, 87)
)

df %>%
  group_by(Gender, Age) %>%
  summarise(count = n()) %>%
  mutate(relative_frequency = count / nrow(df))

Output

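The resulting data frame contains each unique combination of Gender and Age along with its count and relative frequency. In this dataset every combination occurs exactly once, so the output has seven rows, each with a count of 1 and a relative frequency of about 0.143 (1/7).

The denominator here is nrow(df) rather than sum(count) because summarise() removes only the last grouping level by default. The result is still grouped by Gender, so sum(count) would give the total within each gender instead of the total for the whole dataset.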
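If you do want proportions within each gender rather than across the whole dataset, one possible sketch is to keep the Gender grouping explicitly and divide by sum(count). The variable name `within_gender` is an illustrative choice, and `.groups = "drop_last"` simply states dplyr's default behaviour explicitly.

```r
library(dplyr)

df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male")
)

within_gender <- df %>%
  group_by(Gender, Age) %>%
  # Drop only the Age level, so the result stays grouped by Gender
  summarise(count = n(), .groups = "drop_last") %>%
  # sum(count) is now evaluated separately for each Gender
  mutate(relative_frequency = count / sum(count)) %>%
  ungroup()

within_gender
```

With this version the relative frequencies sum to 1 within each gender, so they sum to 2 over the whole table.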

Example 3: Displaying Relative Frequencies as Percentages

To display relative frequencies as percentages, you can multiply the relative frequencies by 100. Continuing from the previous example, where we calculated the relative frequencies for both Gender and Age in the dataframe, you can add the mutate() function to convert the relative frequencies into percentages:

library(dplyr)

df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  Score = c(85, 90, 88, 78, 92, 80, 87)
)

df %>%
  group_by(Gender, Age) %>%
  summarise(count = n()) %>%
  mutate(
    relative_frequency = count / nrow(df),
    percentage = relative_frequency * 100
  )

Output

The output has the same seven rows as in Example 2, now with an added percentage column. Each combination occurs once, so every percentage is about 14.3.
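For display purposes you may prefer a rounded label with a percent sign. The following is a small sketch using base R's sprintf(); the column name `label` and the one-decimal format are assumptions chosen for illustration, and it uses the single Gender variable to keep the output short.

```r
library(dplyr)

df <- data.frame(
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male")
)

pct <- df %>%
  count(Gender) %>%
  mutate(
    percentage = n / sum(n) * 100,
    # "%.1f" rounds to one decimal place; "%%" prints a literal percent sign
    label = sprintf("%.1f%%", percentage)
  )

pct
```

Keep the numeric percentage column for further calculations, since the label column is character data.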

Method 2: Using frequency tables

The table() function counts how often each unique value occurs in a vector. Dividing these counts by the total number of values (length() of the vector, or nrow() of the data frame it comes from) converts them into relative frequencies. Note that calling length() on a whole data frame returns the number of columns, not the number of rows.

Each relative frequency represents the proportion of times a particular value appears in the dataset.

Example 1: Frequency table for one data frame column

df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  Score = c(85, 90, 88, 78, 92, 80, 87)
)

# Calculate relative frequency table for 'Gender' column
table(df$Gender) / length(df$Gender)

Output

   Female      Male 
0.4285714 0.5714286 
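Base R also provides prop.table(), which performs the division for you. The sketch below uses the same Gender values as a plain vector; `gender` and `rel_freq` are illustrative names. In R 4.0.0 and later, proportions() is available as an equivalent name for the same function.

```r
gender <- c("Male", "Female", "Male", "Female", "Male", "Female", "Male")

# prop.table() divides every count in the table by the sum of all counts
rel_freq <- prop.table(table(gender))

rel_freq
```

This should give the same proportions as dividing table() by length(), without having to name the denominator yourself.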

Example 2: Calculating for all data frame columns

df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  Score = c(85, 90, 88, 78, 92, 80, 87)
)

# Calculate a relative frequency table for each column.
# The columns have different numbers of unique values, so sapply() returns a list here.
sapply(df, function(x) table(x) / nrow(df))

Output

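The result is a list with one relative frequency table per column. In Age, the values 20 and 21 each have a relative frequency of about 0.286 (2/7) and the other ages about 0.143 (1/7). Gender gives 0.429 for Female and 0.571 for Male. Every Score is unique, so each has a relative frequency of about 0.143.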
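Frequency tables can also cover two variables at once, mirroring the multi-variable dplyr example earlier. The sketch below is one way to do this with prop.table(); the names `joint` and `by_gender` are illustrative.

```r
df <- data.frame(
  Age = c(20, 21, 19, 22, 23, 20, 21),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Female", "Male")
)

# Two-way table of counts: rows are Gender, columns are Age
counts <- table(df$Gender, df$Age)

# Each cell divided by the grand total, so all cells sum to 1
joint <- prop.table(counts)

# margin = 1 divides by row totals, so each Gender row sums to 1
by_gender <- prop.table(counts, margin = 1)

joint
by_gender
```

Use the first form for proportions of the whole dataset and the second for proportions within each gender.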

That’s all!
