To measure the central tendency of a vector in R, you can use one of the following measures, depending on your requirements.
- Mean: The average of the vector.
- Median: The middle value of the vector.
- Mode: The value that appears most often in the vector.
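As a minimal sketch, the three measures can be computed as shown here. R has built-in mean() and median() functions, but its mode() function reports an object's storage mode rather than the statistical mode, so stat_mode is a hypothetical helper written only for this illustration. The example values are also made up.

```r
# Example vector (values chosen for illustration)
x <- c(2, 4, 4, 5, 7, 9, 4)

# Hypothetical helper: most frequent value; ties resolve to the first one seen
stat_mode <- function(v) {
  uv <- unique(v)
  uv[which.max(tabulate(match(v, uv)))]
}

mean(x)       # arithmetic average
median(x)     # middle value after sorting
stat_mode(x)  # most frequent value
```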
The median of an observation variable is the value in the middle when the data is sorted in ascending order. It is an ordinal measure of the central location of the data values.
To find it, order all the data points and pick the one in the middle. If there are two middle numbers, the median is the mean of those two numbers.
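The definition above can be sketched by hand in a few lines. The function name manual_median is hypothetical and exists only to illustrate the rule; in practice you would use the built-in function described next.

```r
# Hypothetical helper that mirrors the textbook definition of the median
manual_median <- function(v) {
  s <- sort(v)                      # ascending order; sort() drops NA values by default
  n <- length(s)
  if (n == 0) return(NA_real_)      # nothing to summarise
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                  # odd count: the single middle value
  } else {
    mean(s[c(n / 2, n / 2 + 1)])    # even count: mean of the two middle values
  }
}

odd_result  <- manual_median(c(7, 1, 5))     # sorted: 1 5 7
even_result <- manual_median(c(7, 1, 5, 3))  # sorted: 1 3 5 7
```

Note that this sketch silently drops missing values because of how sort() behaves, which is a simplification compared with the built-in function.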
Median in R
To calculate a median in R, use the built-in median() function. It takes a numeric vector (or another object for which a median method is defined) and returns the value that splits the sorted data into two halves of equal size.
median(x, na.rm = FALSE, ...)
x: It is an object for which a method has been defined or a numeric vector containing the values whose median is computed.
na.rm: It is a logical value indicating whether NA values should be stripped before the computation proceeds.
...: Further arguments passed to methods; they are not used by the default method.
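The median() function returns a length-one object of the same type as the input, except when the input is a logical vector or an integer vector of even length, in which case the result is a double.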
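One detail worth checking yourself is the type of the result for integer input. The short sketch below uses typeof() to compare an odd-length and an even-length integer vector; the variable names are illustrative.

```r
odd_int  <- median(1:5)   # integer input, odd length
even_int <- median(1:6)   # integer input, even length

typeof(odd_int)    # type of the result for the odd-length case
typeof(even_int)   # type of the result for the even-length case
```

For the even-length vector the result is the mean of two integers, so it is stored as a double rather than an integer.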
Define a vector with 5 elements and find the median of that vector.
rv <- 1:5
median(rv)
Our vector contains 1, 2, 3, 4, 5. The number of elements is 5, which is odd, so the median is the middle value, and median() returns 3.
If the number of elements is even, median() returns the mean of the two middle values. For the vector 1:6, those values are 3 and 4, so the result is 3.5.
rv <- 1:6
median(rv)
Finding a median of sorted numbers
The median() function does not need sorted input because it orders the values internally. If you also want to work with the sorted values, use the sort() function first and then apply median() to the result.
rv <- c(11, 19, 21, 18, 46)
srv <- sort(rv)
median(srv)
In this example, the input vector is unsorted. The sort() function arranges it in ascending order (11, 18, 19, 21, 46), and median() returns 19, the middle value. Calling median(rv) directly gives the same result.
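A quick check, sketched with the same vector, confirms that sorting beforehand does not change the result. The variable names median_unsorted and median_sorted are illustrative.

```r
rv <- c(11, 19, 21, 18, 46)

median_unsorted <- median(rv)        # applied to the vector as given
median_sorted   <- median(sort(rv))  # applied after explicit sorting

identical(median_unsorted, median_sorted)  # compares the two results
```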
Passing NA Option
If the vector contains missing values, the median() function returns NA by default.
rv <- c(1, 2, 3, NA, 4, 5)
median(rv)
It returns NA in the output.
You can exclude missing values by setting na.rm = TRUE.
rv <- c(1, 2, 3, NA, 4, 5)
median(rv, na.rm = TRUE)
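The median is known as a robust estimator of location because extreme values (outliers) have little effect on it. In the example below, 100 of the 1,000 values are drawn with a very large standard deviation, yet the median typically stays close to 0. The values are random, so the exact result differs between runs.
rv <- c(rnorm(900), rnorm(100, sd = 1000))
median(rv)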
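To make the effect reproducible, here is a small sketch with fixed, made-up values that compares the mean and the median before and after one value is replaced by an outlier.

```r
clean        <- c(10, 12, 13, 15, 20)
with_outlier <- c(10, 12, 13, 15, 2000)  # last value replaced by an extreme one

mean(clean)           # average of the clean data
mean(with_outlier)    # average is pulled strongly toward the outlier
median(clean)         # middle value of the clean data
median(with_outlier)  # middle value is unchanged by the outlier
```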
Calculating the Median by Group in R
To find the median by group, combine the aggregate() function with the median() function.
Let’s use the iris dataset.
aggregate(iris$Sepal.Length, list(iris$Species), median)
     Group.1   x
1     setosa 5.0
2 versicolor 5.9
3  virginica 6.5
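Two alternative ways to get the same grouped medians are sketched below: tapply(), which returns a named result instead of a data frame, and the formula interface of aggregate(), which keeps the original column names. The variable name species_medians is illustrative.

```r
# tapply() applies median to Sepal.Length within each Species level
species_medians <- tapply(iris$Sepal.Length, iris$Species, median)
species_medians

# Formula interface of aggregate(): the output columns are named Species and Sepal.Length
aggregate(Sepal.Length ~ Species, data = iris, FUN = median)
```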
You can also use the ChickWeight inbuilt dataset and find the median of the group.
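A possible version for ChickWeight is sketched below. Grouping the weight column by Diet is an assumption made for this illustration, since the grouping variable is not specified above; the variable name chick_medians is also illustrative.

```r
# Median chick weight for each diet (grouping by Diet is an assumed choice)
chick_medians <- aggregate(weight ~ Diet, data = ChickWeight, FUN = median)
chick_medians
```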
The median is calculated by arranging the elements of a vector in ascending or descending order and taking the element in the middle of the sorted vector. If the vector has an even number of elements, the median is the mean of the two middle elements.
That is it for the tutorial.
Krunal Lathiya is an Information Technology Engineer by education and a web developer by profession. He has worked with many back-end platforms, including Node.js, PHP, and Python. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in the R language. Krunal has written many programming blogs, which showcase his vast expertise in this field.