R summary() Function

The summary() function in R is a generic function that produces a summary of an object. For a numeric vector, it returns the minimum value, first quartile (25th percentile), median (50th percentile), mean, third quartile (75th percentile), and maximum value.

In practice, it is often one of the first functions applied after importing data or fitting a model, to get an initial understanding of the data or the model results.

To get a high-level overview of an object like a dataset, a vector, or the results of a statistical model, use this function.

Syntax

summary(object, maxsum)

Parameters

  1. object: The R object for which you want a summary.
  2. maxsum: An integer indicating how many levels should be shown for factors.
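
The effect of maxsum is easiest to see on a factor with more levels than the limit. The following is a minimal sketch, assuming a small made-up factor (the name plans and its values are hypothetical and are not used elsewhere in this article). When the factor has more levels than maxsum, the most frequent levels are kept and the rest are pooled under "(Other)".

```r
# Hypothetical factor with four levels: basic, family, premium, student
plans <- factor(c("basic", "basic", "basic", "premium", "premium", "family", "student"))

# Default: one count per level
summary(plans)

# With maxsum = 3, the two most frequent levels are kept and
# the remaining levels are pooled into a single "(Other)" entry
capped <- summary(plans, maxsum = 3)
capped
```

Here "family" and "student" each occur once, so they are the levels folded into "(Other)".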

Return Value

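The return value depends on the class of the object. Because summary() is generic, it dispatches to a method such as summary.data.frame() or summary.lm(), so the output varies greatly depending on the type of object it is applied to.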
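
One way to see this dependence is to inspect the class of the returned object. The sketch below uses small throwaway inputs and the built-in cars dataset purely for illustration; the variable name fit is an assumption of this example.

```r
# Numeric vector: handled by summary.default()
class(summary(c(2, 4, 6)))

# Data frame: handled by summary.data.frame(), which returns a table
class(summary(data.frame(x = 1:3)))

# Fitted linear model: handled by summary.lm()
fit <- lm(dist ~ speed, data = cars)  # cars ships with R's datasets package
class(summary(fit))
```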

Example 1: Summary of data frame

Using summary() with a data frame

df <- data.frame(
  service_id = 1:5,
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

cat("The summary() of data frame is", "\n")
summary(df)

Output

The summary() of data frame is
   service_id service_name       service_price
 Min.   :1    Length:5           Min.   : 7.0
 1st Qu.:2    Class :character   1st Qu.:10.0
 Median :3    Mode  :character   Median :12.0
 Mean   :3                       Mean   :12.4
 3rd Qu.:4                       3rd Qu.:15.0
 Max.   :5                       Max.   :18.0
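
Character columns are summarized only by length, class, and mode, as service_name shows above. If a text column is stored as a factor instead, summary() reports a count per level. The following sketch assumes a hypothetical tier column that is not part of the original data.

```r
df_factor <- data.frame(
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  # Hypothetical factor column added for illustration
  tier = factor(c("premium", "basic", "premium", "basic", "basic")),
  stringsAsFactors = FALSE
)

# The character column shows Length/Class/Mode; the factor column shows counts
summary(df_factor)

# Counts per level for the factor column on its own
tier_counts <- summary(df_factor$tier)
tier_counts
```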

Example 2: Summary of list

Using summary() with a list

vec <- 1:5
lst <- list(vec)  # named lst to avoid masking the base list() function
cat("The summary() of list is", "\n")
summary(lst)

Output

The summary() of list is
     Length Class  Mode
[1,] 5      -none- numeric

Example 3: Summary of array

Using summary() with an array

rv <- c(19, 21)
rv2 <- c(46, 4)
arr <- array(c(rv, rv2), dim = c(2, 2, 2))

cat("The summary() of array is", "\n")

summary(arr)

Output

The summary() of array is
 Min.  1st Qu. Median  Mean   3rd Qu.  Max.
 4.00  15.25   20.00   22.50  27.25   46.00
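
Two things happen in this example: the four input values are recycled to fill all eight cells of the 2 x 2 x 2 array, and summary() then treats the array as one numeric vector, ignoring its dimensions. As a rough check, the quartiles can be reproduced with quantile() and its default settings (the variable q is just a name chosen for this sketch).

```r
rv <- c(19, 21)
rv2 <- c(46, 4)
arr <- array(c(rv, rv2), dim = c(2, 2, 2))

# Four values recycled into eight cells
length(arr)

# First and third quartiles of the flattened values
q <- quantile(as.vector(arr), probs = c(0.25, 0.75))
q
```

The two values returned are 15.25 and 27.25, matching the 1st Qu. and 3rd Qu. entries in the output above.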

Example 4: Summary of matrix

rv <- c(11, 18, 19, 21)
mtrx <- matrix(rv, nrow = 2, ncol = 2)

cat("The summary() of matrix is", "\n")

summary(mtrx)

Output

The summary() of matrix is
       V1              V2
 Min.   :11.00   Min.   :19.0
 1st Qu.:12.75   1st Qu.:19.5
 Median :14.50   Median :20.0
 Mean   :14.50   Mean   :20.0
 3rd Qu.:16.25   3rd Qu.:20.5
 Max.   :18.00   Max.   :21.0

Example 5: Summary of vector

Using summary() with a vector

vec <- 1:5
vec

cat("The summary() of vector is", "\n")

summary(vec)

Output

[1] 1 2 3 4 5
The summary() of vector is
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1       2       3       3       4       5
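
The result for a numeric vector is a named numeric object, so individual statistics can be pulled out by name rather than read off the printed output. A short sketch (the variable s is just a name chosen here):

```r
vec <- 1:5
s <- summary(vec)

# Names of the six statistics
names(s)

# Extract single values by name
s[["Mean"]]
s[["3rd Qu."]]
```

Note that the names include the trailing period and abbreviation exactly as printed, for example "1st Qu." and "Max.".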

Example 6: Summary of linear regression model

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered an explanatory variable, and the other is a dependent variable.

A common use of summary() is to report summary statistics for a fitted statistical model, such as coefficient estimates, standard errors, and goodness-of-fit measures.

set.seed(93274)
l_x <- rnorm(1000)
l_y <- rnorm(1000) + l_x

mod <- lm(l_y ~ l_x)

summary(mod)

Output

Call:
lm(formula = l_y ~ l_x)

Residuals:
    Min      1Q  Median      3Q     Max
-3.7337 -0.6964 -0.0047  0.7333  3.3489

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.02159    0.03292  -0.656    0.512
l_x          1.00156    0.03262  30.707   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.041 on 998 degrees of freedom
Multiple R-squared: 0.4858, Adjusted R-squared: 0.4853
F-statistic: 942.9 on 1 and 998 DF, p-value: < 2.2e-16
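
The printed report is convenient to read, but the same numbers are also stored as components of the summary object. The sketch below refits the model from this example and extracts a few of them; mod_summary and coefs are names introduced here for illustration.

```r
set.seed(93274)
l_x <- rnorm(1000)
l_y <- rnorm(1000) + l_x
mod <- lm(l_y ~ l_x)

mod_summary <- summary(mod)

# Coefficient table as a numeric matrix
coefs <- mod_summary$coefficients
coefs["l_x", "Estimate"]     # slope estimate
coefs["l_x", "Pr(>|t|)"]     # p-value for the slope

# Goodness-of-fit measures
mod_summary$r.squared        # Multiple R-squared
mod_summary$adj.r.squared    # Adjusted R-squared
mod_summary$sigma            # Residual standard error
```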

For more detailed or specific summaries, other functions like str(), table(), or specialized packages for statistical modeling might be necessary.
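
As a brief illustration of two of these alternatives, the sketch below applies str() and table() to a data frame like the one from Example 1. The price_band grouping is a hypothetical variable made up for this example.

```r
df <- data.frame(
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# str() shows the type and first values of each column
str(df)

# table() counts how often each value occurs
# price_band is a hypothetical grouping used only for illustration
price_band <- ifelse(df$service_price >= 12, "12 or more", "under 12")
band_counts <- table(price_band)
band_counts
```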
