A data frame in R is a “table or two-dimensional array-like structure in which a row contains a set of values, and each column holds values of one variable”. In a data frame, each component forms a column, and the contents of the components form the rows.
In short, a data frame is a data structure that describes cases (rows) with several observations or measurements (columns). Rows and columns together form a tabular data structure.
How to Create a Data Frame in R
To create a data frame in R, use the data.frame() function. It creates data frames: tightly coupled collections of variables that share many of the properties of matrices and lists, used as the fundamental data structure by most of R's modeling software.
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)
# Print the data frame.
print(streaming)
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
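Once the data frame exists, it can help to confirm its shape before going further. The following is a minimal sketch using the base R functions dim(), nrow(), ncol(), and names() on the same streaming data frame; it is only one way to check the result.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Number of rows and columns as a two-element integer vector.
dim(streaming)

# Rows and columns individually.
nrow(streaming)
ncol(streaming)

# Column names.
names(streaming)
```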
Get the Structure of the Data Frame
To get the structure of the data frame in R, you can use the str() function.
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)
# Display the structure of the data frame.
# str() prints its result directly, so wrapping it in print() is unnecessary.
str(streaming)
Output
'data.frame': 5 obs. of 3 variables:
$ service_id : int 1 2 3 4 5
$ service_name : chr "Netflix" "Disney+" "HBOMAX" "Hulu" ...
$ service_price: num 18 10 15 7 12
Summary of Data in Data Frame
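To get a statistical summary of each column in the data frame, use the summary() function. Numeric columns are summarized by their minimum, quartiles, mean, and maximum, while character columns report only their length, class, and mode.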
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)
print(summary(streaming))
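Output
   service_id service_name       service_price
 Min.   :1    Length:5           Min.   : 7.0
 1st Qu.:2    Class :character   1st Qu.:10.0
 Median :3    Mode  :character   Median :12.0
 Mean   :3                       Mean   :12.4
 3rd Qu.:4                       3rd Qu.:15.0
 Max.   :5                       Max.   :18.0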
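The summary() function also works on a single column. As a small sketch, assuming the same streaming data frame, the snippet below summarizes only service_price and computes its mean directly; the variable name price_summary is just an illustrative choice.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

# Summary statistics for one numeric column.
price_summary <- summary(streaming$service_price)
price_summary

# Individual statistics can also be computed directly.
mean(streaming$service_price)
median(streaming$service_price)
```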
How to access Components of a Data Frame
To access the columns of a data frame, use the [, [[, or $ operator. Components of a data frame can be accessed like those of a list or a matrix. Note that [ returns a data frame, while [[ and $ return the column itself as a vector.
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)
streaming["service_name"]
streaming$service_price
streaming[["service_name"]]
Output
service_name
1 Netflix
2 Disney+
3 HBOMAX
4 Hulu
5 Peacock
[1] 18 10 15 7 12
[1] "Netflix" "Disney+" "HBOMAX" "Hulu" "Peacock"
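The difference in return types between these operators is easy to miss in printed output. The sketch below checks it explicitly with class() and identical(); the variable names by_single, by_double, and by_dollar are only illustrative.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(
  service_id = c(1:5),
  service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
  service_price = c(18, 10, 15, 7, 12),
  stringsAsFactors = FALSE
)

by_single <- streaming["service_name"]    # one-column data frame
by_double <- streaming[["service_name"]]  # character vector
by_dollar <- streaming$service_name       # character vector

class(by_single)
class(by_double)

# [[ and $ give the same vector for the same column.
identical(by_double, by_dollar)
```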
Accessing data frames like a matrix
You can access a data frame like a matrix by providing indices for the row and column.
To demonstrate this, we use a dataset already available in R. The available datasets can be listed with the command library(help = "datasets"). We will use the women dataset.
You can examine the data set using functions like str() and head().
str(women)
Output
'data.frame': 15 obs. of 2 variables:
$ height: num 58 59 60 61 62 63 64 65 66 67 ...
$ weight: num 115 117 120 123 126 129 132 135 139 142 ...
We can see the first three rows of the women dataset using the head() function.
head(women, n=3)
Output
height weight
1 58 115
2 59 117
3 60 120
Now we will access the data frame like a matrix.
Let’s select only the 2nd and 3rd rows.
women[2:3,]
Output
height weight
2 59 117
3 60 120
Let’s select the rows with heights greater than 70.
women[women$height > 70,]
Output
height weight
14 71 159
15 72 164
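Logical conditions can also be combined with & (and) or | (or). The sketch below, using the same built-in women dataset, keeps rows with a height above 65 and a weight below 150, and shows the base R subset() function as an alternative way to write the same filter. The thresholds are arbitrary values chosen for illustration.

```r
# Combine two conditions with & inside the row index.
tall_light <- women[women$height > 65 & women$weight < 150, ]
tall_light

# subset() expresses the same filter without repeating women$.
same_rows <- subset(women, height > 65 & weight < 150)
same_rows
```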
Let’s see another example.
women[10:14, 2]
Output
[1] 142 146 150 154 159
In this case, the returned type is a vector since we extracted data from a single column. This behavior can be avoided by passing the argument drop=FALSE as follows.
women[10:14, 2, drop = FALSE]
Output
weight
10 142
11 146
12 150
13 154
14 159
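Columns can be selected by name as well as by position, which tends to be easier to read. As a brief sketch on the women dataset, the snippet below selects columns by name and confirms that drop = FALSE keeps the result as a data frame; first_rows and by_name are illustrative variable names.

```r
# Select the first three rows and both columns by name.
first_rows <- women[1:3, c("height", "weight")]
first_rows

# Select a single column by name and keep the data frame structure.
by_name <- women[10:14, "weight", drop = FALSE]
is.data.frame(by_name)

# Without drop = FALSE, the same selection is a plain numeric vector.
is.data.frame(women[10:14, "weight"])
```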
How to add a row in the R Data Frame
To add rows to a data frame in R, use the rbind() function.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)
streaming
cat("After adding a row", "\n")
rbind(streaming, list(6, "Quibi", 5))
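The rbind() function takes a data frame and the new row, which you must pass as a list. Note that rbind() returns a new data frame and does not modify streaming itself. Running the code produces the following output.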
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After adding a row
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
6 6 Quibi 5
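Because rbind() returns a new data frame, the result has to be assigned if the added row should persist. The following sketch passes the new row as a one-row data frame with matching column names, which is one common alternative to a list; new_row is an illustrative variable name.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)

# Build the new row with the same column names as the original.
new_row <- data.frame(service_id = 6L,
                      service_name = "Quibi",
                      service_price = 5,
                      stringsAsFactors = FALSE)

# Assign the result back so the extra row is kept.
streaming <- rbind(streaming, new_row)
nrow(streaming)
```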
How to add a column in the R Data Frame
To add a column to an R data frame, use the cbind() function.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)
streaming
cat("After adding a column", "\n")
cbind(streaming, service_show = c("Stranger Things", "The Mandalorian",
                                  "Friends", "Castle Rock", "The Office"))
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After adding a column
service_id service_name service_price service_show
1 1 Netflix 18 Stranger Things
2 2 Disney+ 10 The Mandalorian
3 3 HBOMAX 15 Friends
4 4 Hulu 7 Castle Rock
5 5 Peacock 12 The Office
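A column can also be added in place by assigning to a new name with the $ operator. This is a minimal sketch of that alternative; unlike cbind(), it modifies the streaming data frame directly.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)

# Assigning to a column name that does not exist yet creates the column.
streaming$service_show <- c("Stranger Things", "The Mandalorian",
                            "Friends", "Castle Rock", "The Office")

names(streaming)
```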
How to Delete a Column in an R Data Frame
To remove a column in the R data frame, assign NULL to that column.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)
streaming
cat("After removing a service_price column", "\n")
streaming$service_price <- NULL
streaming
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After removing a service_price column
service_id service_name
1 1 Netflix
2 2 Disney+
3 3 HBOMAX
4 4 Hulu
5 5 Peacock
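If the original data frame should stay intact, a column can instead be left out when subsetting. The sketch below uses setdiff() on the column names to build a copy without service_price; without_price is an illustrative variable name.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)

# Keep every column except service_price; drop = FALSE guarantees a data frame.
keep <- setdiff(names(streaming), "service_price")
without_price <- streaming[, keep, drop = FALSE]

names(without_price)

# The original data frame still has all three columns.
names(streaming)
```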
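How to remove a row in an R Data Frame
To remove a row from a data frame, select all rows except that one with a negative row index and assign the result back. Unlike columns, rows cannot be removed by assigning NULL.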
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)
streaming
cat("After removing the first row", "\n")
streaming <- streaming[-1, ]
streaming
Output
service_id service_name service_price
1 1 Netflix 18
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
After removing the first row
service_id service_name service_price
2 2 Disney+ 10
3 3 HBOMAX 15
4 4 Hulu 7
5 5 Peacock 12
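Rows can also be removed by condition rather than by position. As a rough sketch, the snippet below drops the row whose service_name is "Hulu" and then resets the row names so they run from 1 again; remaining is an illustrative variable name.

```r
# Recreate the example data frame so this block runs on its own.
streaming <- data.frame(service_id = c(1:5),
                        service_name = c("Netflix", "Disney+", "HBOMAX", "Hulu", "Peacock"),
                        service_price = c(18, 10, 15, 7, 12),
                        stringsAsFactors = FALSE)

# Keep only the rows that do not match the condition.
remaining <- streaming[streaming$service_name != "Hulu", ]

# Row names keep their original values (1, 2, 3, 5) until they are reset.
rownames(remaining) <- NULL
remaining
```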
That is it.

Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.