Data frames grew out of intensive experimental research in the world of statistical software. A data frame represents tabular data.
R Data Frame
A data frame in R is a table, or two-dimensional array-like structure, in which each row contains one set of values and each column holds the values of one variable. Internally, a data frame is a list: each component of the list forms a column, and the contents of the components form the rows.
In short, a data frame is a data structure that describes a set of cases as observations (rows) of several measurements (columns). Together, the rows and columns form a tabular data structure.
The main use of a data frame in R is to store data tables: it is a list of vectors that all have the same length.
Characteristics of Data Frame in R
- The data stored in a data frame can be of numeric, factor, or character type.
- Each column includes an equal number of data elements.
- The row names must be unique.
- The column names must not be empty.
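The characteristics above can be checked directly. The following is a minimal sketch; the data frame df and its columns are made up purely for illustration and do not appear elsewhere in this article.

```r
# Hypothetical data frame used only for illustration.
df <- data.frame(
  id = 1:3,
  label = c("a", "b", "c"),
  score = c(2.5, 3.0, 4.5),
  stringsAsFactors = FALSE
)

# Column types: one entry per column.
col_types <- sapply(df, class)
print(col_types)

# Every column holds the same number of elements.
col_lengths <- sapply(df, length)
print(col_lengths)

# Row names are unique: anyDuplicated() returns 0 when nothing repeats.
print(anyDuplicated(rownames(df)))

# Columns whose lengths cannot be reconciled raise an error,
# which is caught here so the script keeps running.
result <- tryCatch(
  data.frame(x = 1:3, y = 1:2),
  error = function(e) "error: differing number of rows"
)
print(result)
```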
How to Create Data Frame in R
To create a data frame in R, use the data.frame() function. It creates data frames: tightly coupled collections of variables that share many of the properties of matrices and lists, and that serve as the fundamental data structure in most of R's modeling software.
streaming <- data.frame(
service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE
)
# Print the data frame.
print(streaming)
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
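In versions of R before 4.0.0, the data.frame() function converted character vectors into factors by default.
To suppress this behavior, pass the argument stringsAsFactors = FALSE. Since R 4.0.0, FALSE is the default, so the argument matters only on older versions or when you want to state the intent explicitly.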
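To see what the argument changes, the sketch below builds the same column twice, setting stringsAsFactors explicitly each time so the result does not depend on the R version. The variable names as_factor and as_character are illustrative.

```r
# The same character vector stored with and without factor conversion.
as_factor <- data.frame(
  name = c("Hulu", "Peacock"),
  stringsAsFactors = TRUE
)
as_character <- data.frame(
  name = c("Hulu", "Peacock"),
  stringsAsFactors = FALSE
)

print(class(as_factor$name))     # "factor"
print(class(as_character$name))  # "character"
```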
Get the Structure of the Data Frame
To get the structure of the data frame in R, use the str() function.
streaming <- data.frame(
service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE
)
# Display the structure of the data frame.
# str() prints its report itself, so wrapping it in print() would only add a trailing NULL.
str(streaming)
Output
'data.frame': 5 obs. of 3 variables:
$ service_id : int 1 2 3 4 5
$ service_name : chr "Netflix" "Disney+" "HBOMAX" "Hulu" ...
$ service_price: num 18 10 15 7 12
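When only the shape or the column names are needed, smaller helpers than str() may be enough. A brief sketch using the same streaming data frame:

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

print(dim(streaming))    # number of rows, then number of columns
print(nrow(streaming))   # rows only
print(ncol(streaming))   # columns only
print(names(streaming))  # column names
```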
Summary of Data in Data Frame
To get the statistical summary and nature of data in the data frame, use the summary() function.
streaming <- data.frame(
service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE
)
print(summary(streaming))
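The output reports the minimum, first quartile, median, mean, third quartile, and maximum of each numeric column; for the character column service_name, it reports the length, class, and mode.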
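The summary() function also works on a single column. As a small sketch, the code below summarizes service_price and then computes a few of the same statistics individually for comparison.

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Summary of one numeric column.
price_summary <- summary(streaming$service_price)
print(price_summary)

# The same figures computed one at a time.
print(mean(streaming$service_price))    # 12.4
print(median(streaming$service_price))  # 12
print(range(streaming$service_price))   # 7 18
```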
How to access Components of a Data Frame
To access the columns of a data frame, use the [, [[, or $ operator. Components of a data frame can be accessed like those of a list or a matrix.
streaming <- data.frame(
service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE
)
streaming["service_name"]
streaming$service_price
streaming[["service_name"]]
Output
service_name
1 Netflix
2 Disney+
3 HBOMAX
4 Hulu
5 Peacock
[1] 18 10 15 7 12
[1] "Netflix" "Disney+" "HBOMAX" "Hulu" "Peacock"
In this example, we accessed the columns in three ways.
Accessing with [[ or $ is similar: both reduce the column to a vector. Indexing with [ differs in that it returns a data frame.
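The difference between the three operators can be confirmed by checking the class of each result. A short sketch with the same data frame:

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Single brackets keep the data frame structure.
print(class(streaming["service_name"]))    # "data.frame"

# Double brackets and $ return the column itself as a vector.
print(class(streaming[["service_name"]]))  # "character"
print(class(streaming$service_price))      # "numeric"
```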
Accessing data frame like a matrix
You can access a data frame like a matrix by providing an index for the row and the column.
To demonstrate this, we use datasets already available in R. The available datasets can be listed with the command library(help = "datasets"). We will use the women dataset.
You can examine the data set using functions like str() and head().
str(women)
Output
'data.frame': 15 obs. of 2 variables:
$ height: num 58 59 60 61 62 63 64 65 66 67 ...
$ weight: num 115 117 120 123 126 129 132 135 139 142 ...
We can see the first three rows of the women dataset using the head() function.
head(women, n=3)
Output
height weight
1 58 115
2 59 117
3 60 120
Now we will access the data frame like a matrix.
Let’s select only the 2nd and 3rd rows.
women[2:3,]
Output
height weight
2 59 117
3 60 120
Let’s select the rows with heights greater than 70.
women[women$height > 70,]
Output
height weight
14 71 159
15 72 164
Let’s select rows 10 to 14 of the second column.
women[10:14, 2]
Output
[1] 142 146 150 154 159
In this case, the returned type is a vector since we extracted data from a single column. This behavior can be avoided by passing the argument drop=FALSE as follows.
women[10:14, 2, drop = FALSE]
Output
weight
10 142
11 146
12 150
13 154
14 159
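Matrix-style indexing also accepts column names and combined logical conditions. The sketch below uses the built-in women dataset; the variable name tall_and_light is illustrative.

```r
# Rows where both conditions hold: height above 65 and weight below 150.
tall_and_light <- women[women$height > 65 & women$weight < 150, ]
print(tall_and_light)

# Columns can be chosen by name instead of by position.
print(women[1:2, "weight"])                # a vector: 115 117
print(women[1:2, c("height", "weight")])   # still a data frame
```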
How to modify a Data Frame in R
The data frames can be modified by using reassignment, just like matrices.
streaming <- data.frame(
service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE
)
streaming
cat("After changing Netflix price", "\n")
streaming[1, "service_price"] <- 20
streaming
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After changing Netflix price
service_id service_name service_price
1 1 Netflix 20
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
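Reassignment is not limited to a single cell. As a sketch, the code below updates every row that meets a condition; the threshold of 12 and the increase of 1 are arbitrary values chosen for illustration.

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Logical vector marking the rows priced below 12.
cheap <- streaming$service_price < 12

# Raise only those prices by 1; the other rows are left unchanged.
streaming$service_price[cheap] <- streaming$service_price[cheap] + 1
print(streaming)
```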
How to add a row in R Data Frame
To add rows in the data frame in R, use the rbind() function.
streaming <- data.frame(service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE)
streaming
cat("After adding a row", "\n")
rbind(streaming, list(6, "Quibi", 5))
The rbind() function takes a data frame and the new row, passed as an R list. It returns a new data frame and leaves the original unchanged. Running the code produces the following output.
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After adding a row
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
6 6 Quibi 5
You can see that the new 6th row has been added.
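Because rbind() returns a new data frame, the result has to be assigned back if the added row should persist. The sketch below passes the new row as a one-row data frame with matching column names, an alternative to the list form; the variable name new_row is illustrative.

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# A one-row data frame whose column names match those of streaming.
new_row <- data.frame(
  service_id = 6L,
  service_name = "Quibi",
  service_price = 5,
  stringsAsFactors = FALSE
)

# Assign the result back to keep the added row.
streaming <- rbind(streaming, new_row)
print(nrow(streaming))  # 6
```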
How to add a column in R Data Frame
To add a column in R Data Frame, use the cbind() function.
streaming <- data.frame(service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE)
streaming
cat("After adding a column", "\n")
cbind(streaming, service_show=c("Stranger Things", "The Mandalorian", "Friends", "Castle Rock", "The Office"))
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After adding a column
service_id service_name service_price service_show
1 1 Netflix 18 Stranger Things
2 2 Disney+ 10 The Mandalorian
3 3 HBOMAX 15 Friends
4 4 Hulu 7 Castle Rock
5 5 Peacock 12 The Office
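A column can also be added by assigning to a new name with the $ operator, which modifies the data frame in place rather than returning a copy. In this sketch, annual_price is a hypothetical derived column introduced only for illustration.

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Hypothetical derived column: twelve times the listed price.
streaming$annual_price <- streaming$service_price * 12
print(streaming)
```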
How to delete a column in R Data Frame
To remove a column in the R data frame, assign NULL to that column.
streaming <- data.frame(service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE)
streaming
cat("After removing a service_price column", "\n")
streaming$service_price <- NULL
streaming
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After removing a service_price column
service_id service_name
1 1 Netflix
2 2 Disney+
3 3 HBOMAX
4 4 Hulu
5 5 Peacock
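Assigning NULL changes the data frame itself. If the original should stay intact, one option is to select every column except the unwanted one and store the result under a new name. A sketch, with trimmed as an illustrative variable name:

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Keep all columns whose name is not "service_price".
keep <- setdiff(names(streaming), "service_price")
trimmed <- streaming[keep]

print(names(trimmed))    # the two remaining columns
print(names(streaming))  # the original still has all three
```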
How to remove a row in R Data Frame
To remove a row from a data frame, use a negative row index: streaming[-1, ] returns the data frame without its first row.
streaming <- data.frame(service_id = c(1:5),
service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
service_price = c(18, 10, 15, 7, 12),
stringsAsFactors = FALSE)
streaming
cat("After removing the first row", "\n")
streaming <- streaming[-1, ]
streaming
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After removing the first row
service_id service_name service_price
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
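Rows can also be removed by condition rather than by position. The sketch below keeps only the rows priced at 10 or more (an arbitrary threshold chosen for illustration) and then renumbers the row names, which otherwise keep their original values.

```r
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Keep the rows that satisfy the condition; Hulu (price 7) is dropped.
kept <- streaming[streaming$service_price >= 10, ]
print(rownames(kept))  # original row names are preserved

# Reset the row names to a fresh sequence starting at 1.
rownames(kept) <- NULL
print(kept)
```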
That is it for data frames in R Language.

Krunal Lathiya is a Software Engineer with over eight years of experience. He has developed a strong foundation in computer science principles and a passion for problem-solving. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in R Language.