
R scale(): Scaling and Centering of Matrix-like Objects

The scale() function in R centers (subtracts the column mean) and/or scales (divides by the column standard deviation) the columns of a numeric matrix or data frame. By default it does both, so each column ends up with a mean of 0 and a standard deviation of 1, which is equivalent to computing z-scores column by column.

vec <- c(1, 2, 3, 4, 5, 6)

scale(vec)

# Output:
#            [,1]
# [1,] -1.3363062
# [2,] -0.8017837
# [3,] -0.2672612
# [4,]  0.2672612
# [5,]  0.8017837
# [6,]  1.3363062
# attr(,"scaled:center")
# [1] 3.5
# attr(,"scaled:scale")
# [1] 1.870829
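To make the z-score equivalence concrete, here is a small sketch that computes the same values by hand and compares them with scale(). The variable names manual_z and scaled_vec are only illustrative.

```r
vec <- c(1, 2, 3, 4, 5, 6)

# Manual z-scores: subtract the mean, then divide by the sample standard deviation
manual_z <- (vec - mean(vec)) / sd(vec)

# scale() returns a one-column matrix with attributes,
# so convert it to a plain vector before comparing
scaled_vec <- as.vector(scale(vec))

all.equal(manual_z, scaled_vec)
```

Note that scale() always returns a matrix, even for a plain vector, which is why the output above is printed with a [,1] column header.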

Syntax

scale(x, center = TRUE, scale = TRUE)

Parameters

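Argument    Description

x           (required) The data to be scaled: a numeric matrix, or an object that can be treated as one, such as a data frame or a vector.

center      A logical value or a numeric vector with one value per column of x. If TRUE, the column means are subtracted from each column. If FALSE, no centering is applied. If a numeric vector is supplied, each value is subtracted from the corresponding column.

scale       A logical value or a numeric vector with one value per column of x. If TRUE and the data has been centered, each column is divided by its standard deviation (as computed by sd(), with an n-1 denominator). If TRUE and center = FALSE, each column is divided by its root mean square instead. If FALSE, no scaling is applied. If a numeric vector is supplied, each column is divided by the corresponding value.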
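One detail that is easy to miss: when center = FALSE and scale = TRUE, scale() divides each column by its root mean square, sqrt(sum(x^2) / (n - 1)), rather than by its standard deviation. The following sketch checks this on a small matrix; the names rms and scaled_no_center are only illustrative.

```r
mat <- matrix(1:9, ncol = 3)

# Root mean square of each column, using the n - 1 denominator
rms <- apply(mat, 2, function(col) sqrt(sum(col^2) / (length(col) - 1)))

# Scale without centering
scaled_no_center <- scale(mat, center = FALSE, scale = TRUE)

# The divisor stored by scale() matches the root mean square, not sd()
all.equal(attr(scaled_no_center, "scaled:scale"), rms)
```

The two divisors coincide only when the column means are already zero, so if you need true standard-deviation scaling without centering, one option is to pass the standard deviations explicitly, for example scale = apply(mat, 2, sd).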

Scale the column values in a data frame

df <- data.frame(
  "col1" = c(21, 19, 11),
  "col2" = c(18, 46, 10),
  "col3" = c(5, 15, 25)
)

scale(df)

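Output: scale() returns a numeric matrix with the same column names (col1, col2, col3). Each column is centered and scaled independently, and the column means and standard deviations that were used are stored in the "scaled:center" and "scaled:scale" attributes.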
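Because scale() returns a matrix even when it is given a data frame, you may want to convert the result back. A minimal sketch, using the same df as above (the names scaled and scaled_df are only illustrative):

```r
df <- data.frame(
  col1 = c(21, 19, 11),
  col2 = c(18, 46, 10),
  col3 = c(5, 15, 25)
)

scaled <- scale(df)

# The result is a matrix, not a data frame
is.matrix(scaled)

# Convert back to a data frame; the column names are preserved
scaled_df <- as.data.frame(scaled)
names(scaled_df)
```

Keep in mind that scale() expects numeric columns, so a data frame with character or factor columns needs those columns removed or handled separately first.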

Scaling the values in a matrix

Let’s define a 3×3 matrix and scale it.

mat <- matrix(1:9, ncol = 3)

scale(mat)

# Output:
#      [,1] [,2] [,3]
# [1,]   -1   -1   -1
# [2,]    0    0    0
# [3,]    1    1    1
# attr(,"scaled:center")
# [1] 2 5 8
# attr(,"scaled:scale")
# [1] 1 1 1

Setting scale = FALSE turns off scaling, so only centering is applied. In the example below, center is a numeric vector, so the supplied values 1, 2, and 3 (rather than the column means) are subtracted from the first, second, and third columns.

mat <- matrix(1:9, ncol = 3)

scale(mat, center = c(1, 2, 3), scale = FALSE)

# Output:
#      [,1] [,2] [,3]
# [1,]    0    2    4
# [2,]    1    3    5
# [3,]    2    4    6
# attr(,"scaled:center")
# [1] 1 2 3

Passing center = FALSE

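Setting center = FALSE turns off centering, so the values are only scaled. In the example below, scale is a numeric vector, so the first, second, and third columns are divided by 1, 2, and 3.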

mat <- matrix(1:9, ncol = 3)

scale(mat, center = FALSE, scale = c(1, 2, 3))

# Output:
#      [,1] [,2]     [,3]
# [1,]    1  2.0 2.333333
# [2,]    2  2.5 2.666667
# [3,]    3  3.0 3.000000
# attr(,"scaled:scale")
# [1] 1 2 3

Using scale() without scaling or centering

mat <- matrix(1:9, ncol = 3)

scale(mat, center = FALSE, scale = FALSE)

# Output:
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

After scaling, each column will have a mean of zero and a standard deviation of one, assuming center = TRUE and scale = TRUE.
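A quick way to check this property is to compute the column means and standard deviations of the scaled result. This is a sketch using the same 3×3 matrix as in the earlier examples:

```r
mat <- matrix(1:9, ncol = 3)

scaled <- scale(mat)

# Column means are zero (up to floating-point rounding)
colMeans(scaled)

# Column standard deviations are one
apply(scaled, 2, sd)
```

On other data the means may print as very small numbers such as 1e-17 rather than exactly 0, which is ordinary floating-point rounding.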

It’s essential to apply the same scaling parameters to new data (e.g., test data in machine learning) that were used to scale the training data.
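One way to do this is to read the "scaled:center" and "scaled:scale" attributes from the scaled training data and pass them back to scale() for the new data. The sketch below uses small made-up train and test matrices purely for illustration; they are not from any real dataset.

```r
# Hypothetical training and test sets, made up for illustration
train <- matrix(c(10, 20, 30, 40, 1, 3, 5, 7), ncol = 2)
test  <- matrix(c(25, 35, 2, 6), ncol = 2)

train_scaled <- scale(train)

# Parameters learned from the training data
train_center <- attr(train_scaled, "scaled:center")
train_scale  <- attr(train_scaled, "scaled:scale")

# Reuse them on the test data instead of letting scale() recompute them
test_scaled <- scale(test, center = train_center, scale = train_scale)
test_scaled
```

Calling scale(test) on its own would instead use the test set's own means and standard deviations, so the two datasets would no longer be on the same scale.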
