To measure the central tendency of a vector in R, you can use one of the following techniques, depending on your requirements.
- Mean: the average of the values in the vector.
- Median: the middle value of the vector.
- Mode: the value that appears most often in the vector.
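As a quick illustration of the three measures, here is a minimal sketch on a small made-up vector. mean() and median() are built into R. The stat_mode() helper is hypothetical and written only for this example, because base R's mode() reports an object's storage mode rather than the statistical mode.

```r
# Made-up example vector with one repeated value
x <- c(2, 4, 4, 5, 10)

# Hypothetical helper (not part of base R): returns the most frequent value.
# If several values tie, the smallest one is returned.
stat_mode <- function(v) {
  counts <- table(v)
  # names() of a table are character strings, so convert back to numeric
  as.numeric(names(counts)[which.max(counts)])
}

x_mean   <- mean(x)       # 5
x_median <- median(x)     # 4
x_mode   <- stat_mode(x)  # 4
```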
The median of an observation variable is the value in the middle when the data are sorted in ascending order. It is an ordinal measure of the central location of the data values.
Median in R
To calculate the sample median in R, use the built-in median() function. The median is the middle number, found by ordering all data points and picking the one in the middle. If there are two middle numbers, median() returns the mean of those two numbers.
The median splits the dataset into two halves: half of the values lie at or below it, and half lie at or above it.
median(x, na.rm = FALSE, ...)
x: It is an object for which a method has been defined or a numeric vector containing the values whose median is computed.
na.rm: It is a logical value indicating whether NA values should be stripped before the computation proceeds.
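...: Potentially further arguments for methods; not used in the default method.
The median() function returns a length-one object of the same type as the input, except when the input is a logical or integer vector of even length, in which case the result is a double.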
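The return type can be checked with typeof(). The following is a small sketch using integer vectors; the variable names are chosen only for this illustration.

```r
odd_int  <- median(1:5)  # odd length: the middle element is returned as-is
even_int <- median(1:6)  # even length: the mean of 3 and 4

typeof(odd_int)   # "integer"
typeof(even_int)  # "double"
```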
Define a vector with 5 elements and find the median of that vector.
rv <- 1:5
median(rv)
Our vector contains 1, 2, 3, 4, 5. It has five elements, which is an odd count, so median() returns the single middle value, 3.
If the number of elements is even, median() calculates the mean of the two middle values. For the vector 1:6 those values are 3 and 4, so the result is 3.5.
rv <- 1:6
median(rv)
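To make the odd and even cases concrete, the rule can be sketched by hand. manual_median() is a hypothetical helper written for this illustration, not how median() is implemented internally. Note that sort() drops NA values by default, so the helper's NA handling differs from median().

```r
# Hypothetical helper that mirrors the definition of the median
manual_median <- function(v) {
  s <- sort(v)   # ascending order; sort() drops NA values by default
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                # odd count: the single middle value
  } else {
    mean(s[c(n / 2, n / 2 + 1)])  # even count: mean of the two middle values
  }
}

manual_median(1:5)  # 3
manual_median(1:6)  # 3.5
```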
Finding a median of sorted numbers
The median() function does not require sorted input, because it orders the values internally. If you want to see the ordered values alongside the result, sort the vector with the sort() function and then apply median() to the sorted vector.
rv <- c(11, 19, 21, 18, 46)
srv <- sort(rv)
median(srv)
In this example, the input vector is unsorted. The sort() function arranges it in ascending order (11, 18, 19, 21, 46), and median() then returns the middle value, 19.
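A short check, using the same vector, suggests that sorting beforehand does not change the result:

```r
rv <- c(11, 19, 21, 18, 46)

unsorted_median <- median(rv)        # 19, computed on the unsorted vector
sorted_median   <- median(sort(rv))  # 19, computed on the sorted vector

identical(unsorted_median, sorted_median)  # TRUE
```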
Passing NA Option
If there are missing values, the median() function returns NA.
rv <- c(1, 2, 3, NA, 4, 5)
median(rv)
The output is NA.
You can exclude missing values by setting na.rm = TRUE.
rv <- c(1, 2, 3, NA, 4, 5)
median(rv, na.rm = TRUE)
The median is known as a robust estimator of location because it is largely unaffected by outliers.
rv <- c(rnorm(900), rnorm(100, sd = 1000))
median(rv)
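Because rnorm() draws random numbers, the output above changes from run to run. A deterministic sketch with made-up values shows the same effect: replacing one value with an extreme one shifts the mean a lot but leaves the median where it was.

```r
clean   <- c(1, 2, 3, 4, 5)
outlier <- c(1, 2, 3, 4, 500)  # last value replaced by an extreme one

mean(clean)      # 3
mean(outlier)    # 102
median(clean)    # 3
median(outlier)  # 3
```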
Calculate the Median by Group in R
To get the median by group, combine the aggregate() function with the median() function. Let’s use the iris dataset.
aggregate(iris$Sepal.Length, list(iris$Species), median)
     Group.1   x
1     setosa 5.0
2 versicolor 5.9
3  virginica 6.5
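The generic column names Group.1 and x come from passing bare vectors to aggregate(). As an alternative sketch, the formula interface of aggregate() keeps the original column names. The variable name by_species is chosen only for this example.

```r
# Median sepal length per species, using the formula interface
by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = median)
by_species

# tapply() gives the same medians as a named vector
tapply(iris$Sepal.Length, iris$Species, median)
```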
You can also use the built-in ChickWeight dataset and find the median by group.
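A minimal sketch for ChickWeight, assuming the grouping of interest is the Diet column and the measured value is weight; other groupings, such as by Time, follow the same pattern.

```r
# Median chick weight for each diet
chick_median <- aggregate(weight ~ Diet, data = ChickWeight, FUN = median)
chick_median
```

The result is a data frame with one row per diet and the columns Diet and weight.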
To summarize, the median is found by arranging the elements of a vector in ascending or descending order and taking the element in the middle of the sorted vector, or the mean of the two middle elements when the count is even.
That is it for this tutorial on calculating the median in R.