summary() Function in R with Example

To get a quick picture of how the variables in a dataset are distributed, use the summary() function. If you only need an overview of the dataset's structure, such as column names, types, and the first few values, use the str() function instead.

summary() Function in R

summary() is a built-in generic function in R. It is most often described as producing result summaries for model-fitting functions, but it works on many kinds of objects. The generic invokes a specific method that depends on the class of its first argument.
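As a rough illustration of this dispatch, the sketch below calls summary() on a numeric vector and on a factor. The object names x_num and x_fac are made up for this example. The two results have different shapes because a different method handles each class.

```r
# Hypothetical example objects, not taken from the tutorial's own examples
x_num <- c(2, 4, 6)                 # numeric vector, handled by summary.default()
x_fac <- factor(c("a", "b", "a"))   # factor, handled by summary.factor()

summary(x_num)   # six descriptive statistics
summary(x_fac)   # one count per factor level

# List the summary() methods available in the current session
methods(summary)
```

The list printed by methods(summary) grows as you load packages, since packages can register their own summary() methods.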

Syntax

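summary(object, maxsum = 7, digits = max(3, getOption("digits")-3), ...)

The generic itself is declared as summary(object, ...). The form above follows the data frame method, and other methods accept their own additional arguments and defaults.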

Parameters

object: It is an object for which a summary is desired.

maxsum: It is an integer, indicating how many levels should be shown for factors.

digits: It is an integer, used for number formatting with signif().
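A minimal sketch of how these two arguments behave. The factor f and the vector x are illustrative assumptions, not part of the original examples. Because the default for maxsum differs between methods, it is passed explicitly here.

```r
# Illustrative factor with four levels (a, b, c, d)
f <- factor(c("a", "a", "a", "b", "b", "c", "d"))

# Show the 2 most frequent levels and lump the remaining ones into "(Other)"
s_fac <- summary(f, maxsum = 3)
s_fac

# Request fewer significant digits in a numeric summary
x <- c(1.23456, 2.34567, 3.45678)
summary(x, digits = 3)
```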

Return Value

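The summary() function returns a value whose form depends on the class of its argument. For example, a numeric vector yields a set of named statistics, while a fitted model yields a list-like summary object.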
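To make the difference in return values concrete, the sketch below inspects the class of three results. It uses the built-in cars dataset purely as a convenient stand-in, and the names s_vec, s_df, and s_lm are assumptions for this example.

```r
s_vec <- summary(1:5)                             # numeric vector
s_df  <- summary(cars)                            # data frame (built-in dataset)
s_lm  <- summary(lm(dist ~ speed, data = cars))   # fitted linear model

class(s_vec)   # "summaryDefault" "table"
class(s_df)    # "table"
class(s_lm)    # "summary.lm"

# Individual statistics of a vector summary can be pulled out by name
s_vec[["Median"]]
```

Knowing the class matters when you want to reuse the numbers in later code rather than only print them.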

Example

Let's apply the summary() function to a numeric vector.

vec <- 1:5
vec
cat("The summary() of vector is", "\n")
summary(vec)

Output

[1] 1 2 3 4 5
The summary() of vector is
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1       2       3       3       4       5

As the output shows, the summary() of a numeric vector returns descriptive statistics: the minimum, the first quartile (1st Qu.), the median, the mean, the third quartile (3rd Qu.), and the maximum of the input data.
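The same six numbers can be computed one at a time with base R functions, which may help clarify what each label means. The sketch below does that and also shows, as an additional assumption-free check, how missing values are reported. The names s and manual are made up for this example.

```r
vec <- 1:5
s <- summary(vec)

# Compute each statistic separately with base R functions
manual <- c(min(vec), quantile(vec, 0.25), median(vec),
            mean(vec), quantile(vec, 0.75), max(vec))

# TRUE when both approaches agree
all.equal(as.numeric(s), as.numeric(manual))

# Missing values are counted in an extra "NA's" entry
s_na <- summary(c(1, NA, 3))
s_na
```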

summary() function on R List

To summarize a list in R, pass it to the summary() function. To define a list, use the list() function and pass the elements as arguments. For a list, summary() does not compute descriptive statistics. Instead, it reports the length, class, and mode of each element.

vec <- 1:5
lst <- list(vec)
cat("The summary() of list is", "\n")
summary(lst)

Output

The summary() of list is
     Length Class  Mode
[1,] 5      -none- numeric
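Because summary() called on a list describes the elements rather than the numbers inside them, per-element statistics need one more step. One option, sketched here with a made-up list named scores, is to apply summary() to each element with lapply().

```r
# Hypothetical list with two numeric elements
scores <- list(a = 1:5, b = c(10, 20, 30))

# One row per element: Length, Class, Mode
summary(scores)

# Descriptive statistics for each element separately
per_element <- lapply(scores, summary)
per_element$a
per_element$b[["Mean"]]
```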

summary() function on R Array

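To get the summary of an array in R, use the summary() function. To create an array in R, use the array() function. The array() function takes a vector as an argument and uses the dim parameter to set the dimensions. In the example below, the four values are recycled to fill all eight cells, and summary() treats the array as one set of numbers.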

rv <- c(19, 21)
rv2 <- c(46, 4)
arr <- array(c(rv, rv2), dim = c(2, 2, 2))
cat("The summary() of array is", "\n")
summary(arr)

Output

The summary() of array is
 Min.  1st Qu. Median  Mean   3rd Qu.  Max.
 4.00  15.25   20.00   22.50  27.25   46.00
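If you want a summary per slice instead of one for the whole array, a possible approach is apply() over the third dimension. This is a sketch under that assumption, and per_slice is a name invented for the example.

```r
rv <- c(19, 21)
rv2 <- c(46, 4)
arr <- array(c(rv, rv2), dim = c(2, 2, 2))

# The whole-array summary matches the summary of the flattened values
all.equal(as.numeric(summary(arr)), as.numeric(summary(as.vector(arr))))

# One column of statistics per 2 x 2 slice along the third dimension
per_slice <- apply(arr, 3, summary)
per_slice
```

Both slices hold the same recycled values here, so the two columns of per_slice are identical.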

summary() function on R Matrix

To get the summary of a matrix in R, use the summary() function. To create a matrix in R, use the matrix() function and pass the vector, nrow, and ncol parameters. Unlike an array, a matrix is summarized column by column.

rv <- c(11, 18, 19, 21)
mtrx <- matrix(rv, nrow = 2, ncol = 2)
cat("The summary() of matrix is", "\n")
summary(mtrx)

Output

The summary() of matrix is
       V1              V2
 Min.   :11.00   Min.   :19.0
 1st Qu.:12.75   1st Qu.:19.5
 Median :14.50   Median :20.0
 Mean   :14.50   Mean   :20.0
 3rd Qu.:16.25   3rd Qu.:20.5
 Max.   :18.00   Max.   :21.0
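When a column-wise summary is not what you need, two alternatives are sketched below: flattening the matrix to get one overall summary, and using apply() to summarize each row. The names overall and by_row are assumptions for this example.

```r
mtrx <- matrix(c(11, 18, 19, 21), nrow = 2, ncol = 2)

# Treat all four values as one set of numbers
overall <- summary(as.vector(mtrx))
overall

# Summarize each row instead of each column
by_row <- apply(mtrx, 1, summary)
by_row
```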

summary() function on R data frame

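To get the summary of a data frame in R, use the summary() function. To create a data frame in R, use the data.frame() function. Numeric columns get descriptive statistics, while character columns only report their length, class, and mode.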

df <- data.frame(
 service_id = c(1:5),
 service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
 service_price = c(18, 10, 15, 7, 12),
 stringsAsFactors = FALSE
)
cat("The summary() of data frame is", "\n")
summary(df)

Output

The summary() of data frame is
   service_id service_name       service_price
 Min.   :1    Length:5           Min.   : 7.0
 1st Qu.:2    Class :character   1st Qu.:10.0
 Median :3    Mode  :character   Median :12.0
 Mean   :3                       Mean   :12.4
 3rd Qu.:4                       3rd Qu.:15.0
 Max.   :5                       Max.   :18.0
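Two variations are often useful with data frames: summarizing a single column, and converting a text column to a factor so that summary() counts its levels. The sketch below assumes an extra tier column that is hypothetical and not part of the data above.

```r
df <- data.frame(
  service_name  = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  tier          = c("premium", "basic", "premium", "basic", "basic"),  # hypothetical column
  stringsAsFactors = FALSE
)

# Summary of one numeric column
price_summary <- summary(df$service_price)
price_summary

# As a factor, the column is summarized as counts per level
df$tier <- factor(df$tier)
tier_counts <- summary(df$tier)
tier_counts

# The data frame summary now shows the counts for tier as well
summary(df)
```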

summary() function on Linear Regression Model

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable.

A common use of summary() is to report summary statistics for a fitted statistical model. The following code simulates data, fits a linear model with lm(), and summarizes it.

set.seed(93274)
l_x <- rnorm(1000)
l_y <- rnorm(1000) + l_x
mod <- lm(l_y ~ l_x)
summary(mod)

Output

Call:
lm(formula = l_y ~ l_x)

Residuals:
    Min      1Q  Median      3Q     Max
-3.7337 -0.6964 -0.0047  0.7333  3.3489

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.02159    0.03292  -0.656    0.512
l_x          1.00156    0.03262  30.707   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.041 on 998 degrees of freedom
Multiple R-squared: 0.4858, Adjusted R-squared: 0.4853
F-statistic: 942.9 on 1 and 998 DF, p-value: < 2.2e-16

The example data consists of two numeric vectors of 1,000 normally distributed random values each, with l_y constructed as l_x plus random noise. The lm() call fits a linear regression of l_y on l_x and stores the result in mod.

Applying summary() to the model object prints the call, the distribution of the residuals, the coefficient table, the residual standard error, the R-squared values, and the F-statistic.
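The printed report is only one view of the result. The object returned by summary() on an lm fit also stores these quantities, so they can be extracted for later use. A short sketch follows, where sm and coefs are names chosen for this example.

```r
set.seed(93274)
l_x <- rnorm(1000)
l_y <- rnorm(1000) + l_x
mod <- lm(l_y ~ l_x)

sm <- summary(mod)

# Coefficient table as a numeric matrix
# (columns: Estimate, Std. Error, t value, Pr(>|t|))
coefs <- coef(sm)
coefs["l_x", "Estimate"]   # estimated slope

# Goodness-of-fit values stored in the summary object
sm$r.squared       # multiple R-squared
sm$adj.r.squared   # adjusted R-squared
sm$sigma           # residual standard error
```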

That is it for the summary() function tutorial.

 
