To remove the first row of a data frame, you select every row except the first one. The result is a new data frame; the original stays unchanged unless you assign the result back to it.
Data frames are like tables with rows and columns. To remove rows, you can use indexing; rows are the first dimension inside the square brackets, as in df[row_index, ]. Row indexing starts at 1.
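As a quick illustration of row indexing, here is a minimal sketch. The data frame demo and its columns are hypothetical and exist only for this example.

```r
# Hypothetical three-row data frame, used only for illustration
demo <- data.frame(id = c(1, 2, 3), label = c("a", "b", "c"))

# The first slot in the brackets selects rows; the empty second slot keeps all columns
first_row <- demo[1, ]        # keep only row 1
two_rows  <- demo[c(1, 3), ]  # keep rows 1 and 3

nrow(first_row)  # 1
nrow(two_rows)   # 2
```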
Here are three ways:
Negative indexing is a way to exclude specific rows based on your requirements. Since our requirement is to remove the first row, we can use df[-1, ], where -1 means to exclude the first row.
df <- data.frame(
  age = c(20, 21, 19, 22, 21),
  gender = c("Male", "Female", "Male", "Female", "Male"),
  score = c(85, 90, 88, 78, 92)
)
df[-1, ]
Output
#   age gender score
# 2  21 Female    90
# 3  19   Male    88
# 4  22 Female    78
# 5  21   Male    92
The output shows that the first row, the one with row index 1, has been removed from the data frame. We are removing a row by its position.
It’s important to note that the row indices have not been reset, meaning that the row index starts at 2 instead of 1.
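If you want the row names to start at 1 again, one common option is to assign NULL to rownames(). The sketch below reuses the article's data frame; the variable name trimmed is only illustrative.

```r
df <- data.frame(
  age = c(20, 21, 19, 22, 21),
  gender = c("Male", "Female", "Male", "Female", "Male"),
  score = c(85, 90, 88, 78, 92)
)

trimmed <- df[-1, ]
rownames(trimmed)  # "2" "3" "4" "5"

# Assigning NULL restores the default sequential row names
rownames(trimmed) <- NULL
rownames(trimmed)  # "1" "2" "3" "4"
```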
When a data frame contains only one row, df[-1, ] returns an empty data frame with the same columns; it does not throw an error. The drop = FALSE argument matters when the data frame has a single column: without it, df[-1, ] simplifies the result to a vector. With two or more columns, the result stays a data frame either way, so drop = FALSE is a defensive habit here.
df <- data.frame(
  age = c(20),
  gender = c("Male"),
  score = c(85)
)
df[-1, , drop = FALSE]
# [1] age    gender score
# <0 rows> (or 0-length row.names)
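The effect of drop = FALSE is easiest to see on a data frame with a single column. The following is a small sketch; one_col is a hypothetical data frame introduced only for this illustration.

```r
# Hypothetical single-column data frame
one_col <- data.frame(score = c(85, 90, 88))

as_vector <- one_col[-1, ]                # simplified to a numeric vector
as_frame  <- one_col[-1, , drop = FALSE]  # stays a one-column data frame

is.data.frame(as_vector)  # FALSE
is.data.frame(as_frame)   # TRUE
```

When the number of columns is not known in advance, passing drop = FALSE keeps the return type predictable.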
If the data frame is already empty and you try to remove the first row, it will still return an empty data frame.
df <- data.frame()
df[-1, , drop = FALSE]
# data frame with 0 columns and 0 rows
Removing the first row of an empty data frame does not raise an error by itself, but code that silently receives an empty result can fail later in unexpected ways. Checking nrow() before the removal makes that case explicit.
Let’s create a reusable function to handle the edge cases.
df <- data.frame()
safe_remove_first_row <- function(df) {
  # Nothing to remove: return the input unchanged
  if (nrow(df) == 0) {
    return(df)
  }
  # drop = FALSE keeps the result a data frame even with a single column
  df[-1, , drop = FALSE]
}
safe_remove_first_row(df)
# data frame with 0 columns and 0 rows
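To see how the helper behaves on other inputs, here is a short usage sketch. The function is repeated so the snippet runs on its own; the data frames scores and single are hypothetical.

```r
# Same helper as above, repeated so this snippet is self-contained
safe_remove_first_row <- function(df) {
  if (nrow(df) == 0) {
    return(df)
  }
  df[-1, , drop = FALSE]
}

scores <- data.frame(name = c("a", "b", "c"), score = c(85, 90, 88))  # hypothetical data
single <- data.frame(score = 85)                                       # one row, one column

nrow(safe_remove_first_row(scores))         # 2
nrow(safe_remove_first_row(single))         # 0
is.data.frame(safe_remove_first_row(single))  # TRUE, thanks to drop = FALSE
nrow(safe_remove_first_row(data.frame()))   # 0
```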
The dplyr slice() function selects rows from a data frame, and using slice(df, -1), we can remove the first row. However, you must ensure that dplyr is installed and loaded in your environment.
library(dplyr)
df <- data.frame(
  age = c(20, 21, 19, 22, 21),
  gender = c("Male", "Female", "Male", "Female", "Male"),
  score = c(85, 90, 88, 78, 92)
)
df <- df %>% slice(-1)
df
Output
#   age gender score
# 1  21 Female    90
# 2  19   Male    88
# 3  22 Female    78
# 4  21   Male    92
After removing the first row, you’ll notice that the row index has been reset, starting again at 1.
If the data frame has only one row and you remove that row, slice() returns an empty data frame that still has the original columns. Unlike base R indexing without drop = FALSE, it always preserves the data frame structure.
library(dplyr)
df <- data.frame(
  age = c(20),
  gender = c("Male"),
  score = c(85)
)
df <- df %>% slice(-1)
df
# [1] age    gender score
# <0 rows> (or 0-length row.names)
If your data frame is empty and you use the slice(-1) method, it will still return an empty data frame and will not give any error.
library(dplyr)
df <- data.frame()
df <- df %>% slice(-1)
df
# data frame with 0 columns and 0 rows
The tail() function (from the utils package that ships with R) returns the last six rows of a data frame by default. With a negative n, it instead returns all rows except the first |n|, so tail(df, n = -1) excludes the first row.
df <- data.frame(
  age = c(20, 21, 19, 22, 21),
  gender = c("Male", "Female", "Male", "Female", "Male"),
  score = c(85, 90, 88, 78, 92)
)
df <- tail(df, n = -1)
df
Output
#   age gender score
# 2  21 Female    90
# 3  19   Male    88
# 4  22 Female    78
# 5  21   Male    92
The output shows that the row index does not reset after the removal. It starts from row 2.
If the data frame has a single row, it will return an empty data frame.
df <- data.frame(
  age = c(20),
  gender = c("Male"),
  score = c(85)
)
df <- tail(df, n = -1)
df
# [1] age    gender score
# <0 rows> (or 0-length row.names)
If the data frame is already empty, it will return an empty data frame without any errors.
df <- data.frame()
df <- tail(df, n = -1)
df
# data frame with 0 columns and 0 rows
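As a quick sanity check, the two base R approaches can be compared side by side. This is a small sketch using the article's data frame; the names by_index and by_tail are illustrative only.

```r
df <- data.frame(
  age = c(20, 21, 19, 22, 21),
  gender = c("Male", "Female", "Male", "Female", "Male"),
  score = c(85, 90, 88, 78, 92)
)

by_index <- df[-1, ]          # negative indexing
by_tail  <- tail(df, n = -1)  # tail() with a negative n

# Both keep the same rows and the original row names
identical(rownames(by_index), rownames(by_tail))  # TRUE
identical(by_index$score, by_tail$score)          # TRUE

# With the default n, tail() returns at most the last six rows
nrow(tail(df))  # 5, because df has only five rows
```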
That’s it!
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He boasts deep knowledge in Data Science and Machine Learning. Versed in Python, JavaScript, PHP, R, and Golang. Skilled in frameworks like Angular and React and platforms such as Node.js. His expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.