Tibbles in R: Working with Simplified Data Frames

A tibble is a modern approach to a data frame in R, part of the tidyverse set of packages designed to make data manipulation more straightforward and less error-prone than traditional data frames.

Tibbles keep the core functionality of data frames but change some older behaviors to make data analysis more predictable.

Key features of tibbles

  1. Tibbles never change the type of their inputs; a column keeps the type it was created with.
  2. Character columns are never converted to factors automatically. Traditional data frames did this by default before R 4.0.0 (stringsAsFactors = TRUE).
  3. Tibbles can hold list columns.
  4. Non-syntactic column names are supported; tibbles keep names exactly as given instead of rewriting them.
  5. A column name may start with a digit or contain spaces, although such names must be wrapped in backticks when you refer to them.
  6. Only vectors of length one are recycled; any other length mismatch is an error.
  7. Unlike traditional data frames, tibbles do not use row names.
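The features above can be seen in a short sketch. The column names (id, group, scores, `2023 total`) are made up for illustration, and the exact wording of the error raised by tibble() may differ between package versions, so the code only records that an error occurred.

```r
library(tibble)

# A list column (`scores`), a non-syntactic name (`2023 total`),
# and a length-one value (`group`) that is recycled to three rows.
tbl <- tibble(
  id = 1:3,
  group = "a",
  scores = list(1:2, 3:5, 6L),
  `2023 total` = c(10.5, 20, 30)
)
print(tbl)

# Character input stays character; it is not turned into a factor.
class(tbl$group)

# Backticks are needed to refer to the non-syntactic column.
tbl$`2023 total`

# A length-2 vector is not recycled into 4 rows: tibble() raises an error.
res <- tryCatch(
  tibble(x = 1:4, y = 1:2),
  error = function(e) "size error"
)
print(res)

# data.frame() silently recycles the same input.
df <- data.frame(x = 1:4, y = 1:2)
print(df)
```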

Creating a tibble

To create a tibble in R, you can use either the “tibble()” or “as_tibble()” function.

Method 1: Using tibble()

library(dplyr)  # dplyr re-exports tibble() from the tibble package

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

print(tbl)

Output

# A tibble: 5 × 2
   col1 col2
  <int> <chr>
1     1 a
2     2 b
3     3 c
4     4 d
5     5 e

Method 2: Converting a data frame to tibble using as_tibble()

library(dplyr)  # dplyr re-exports as_tibble() from the tibble package

df <- data.frame(
  col1 = 1:5,
  col2 = letters[1:5]
)

cat("Converting a data frame to tibble using as_tibble()", "\n")

tbl <- as_tibble(df)

print(tbl)

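Output

Converting a data frame to tibble using as_tibble()
# A tibble: 5 × 2
   col1 col2
  <int> <chr>
1     1 a
2     2 b
3     3 c
4     4 d
5     5 e

In R versions before 4.0.0, data.frame() converts strings to factors by default, so col2 would print as <fct> instead of <chr>.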
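Because tibbles do not use row names, a data frame that stores information in its row names loses it on conversion unless you ask as_tibble() to keep it. Here is a minimal sketch using the rownames argument of as_tibble(); the data and the column name "name" are invented for illustration.

```r
library(tibble)

# A data frame whose row names carry information
df <- data.frame(
  score = c(90, 85, 77),
  row.names = c("ann", "bob", "cy")
)

# By default, as_tibble() drops the row names.
dropped <- as_tibble(df)
print(dropped)

# Passing `rownames =` moves them into a regular column instead.
kept <- as_tibble(df, rownames = "name")
print(kept)
```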

Working with tibbles

Subsetting

Extracting single column

library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

# Extracting a single column
vec <- tbl$col1

print(vec)

Output

[1] 1 2 3 4 5
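Besides $, there are a few other ways to pull out a column, and they differ in what they return. The sketch below compares them on the same tbl; the variable names (v1, t1, and so on) are only for illustration.

```r
library(tibble)
library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

v1 <- tbl$col1          # a plain vector
v2 <- tbl[["col1"]]     # the same vector, selected by name
v3 <- pull(tbl, col1)   # the same vector, convenient inside a pipe

t1 <- tbl["col1"]       # a one-column tibble
t2 <- tbl[, "col1"]     # also a one-column tibble, never a vector

print(identical(v1, v2))
print(is_tibble(t2))
```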

Extracting multiple columns

library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

# Extracting multiple columns
sub_tb <- tbl %>% select(col1, col2)

print(sub_tb)

Because the tibble has only two columns, selecting both returns it unchanged:

# A tibble: 5 × 2
   col1 col2
  <int> <chr>
1     1 a
2     2 b
3     3 c
4     4 d
5     5 e
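select() is more interesting when it actually narrows the tibble. The following sketch adds a hypothetical third column, col3, so there is something to leave out, and shows selecting by name, dropping a column, and renaming while selecting.

```r
library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5],
  col3 = col1 * 2      # tibble() can refer to columns defined earlier
)

only_chr <- tbl %>% select(col2)             # keep a single column
no_chr   <- tbl %>% select(-col2)            # drop a column
renamed  <- tbl %>% select(id = col1, col3)  # rename while selecting

print(only_chr)
print(no_chr)
print(renamed)
```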

Adding columns

To add a column to a tibble, use the dplyr::mutate() function.

library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

tbl %>% mutate(col3 = col1 * 2)

Output

# A tibble: 5 × 3
   col1 col2   col3
  <int> <chr> <dbl>
1     1 a         2
2     2 b         4
3     3 c         6
4     4 d         8
5     5 e        10
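mutate() can also create several columns in one call, reuse a column it has just created, and overwrite an existing column. A small sketch follows; col4 and the variable out are names chosen for illustration. Note that mutate() returns a new tibble and leaves tbl untouched.

```r
library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

out <- tbl %>%
  mutate(
    col3 = col1 * 2,       # new column
    col4 = col3 + 1,       # uses col3, created on the line above
    col2 = toupper(col2)   # overwrites an existing column
  )

print(out)
print(tbl)  # the original tibble still has two columns
```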

Filtering rows

To filter rows of a tibble, use the dplyr::filter() function.

library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

tbl %>% filter(col1 > 3)

Output

# A tibble: 2 × 2
   col1 col2
  <int> <chr>
1     4 d
2     5 e
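filter() accepts more than one condition. Conditions separated by commas must all hold (logical AND), while | expresses OR. The sketch below shows both on the same tbl; the variable names both and either are illustrative.

```r
library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

# Both conditions must be true: keeps the rows for "b" and "e"
both <- tbl %>% filter(col1 > 1, col2 %in% c("a", "b", "e"))
print(both)

# Either condition may be true: keeps the first and last rows
either <- tbl %>% filter(col1 < 2 | col1 > 4)
print(either)
```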

Summarizing data

To collapse a tibble into summary values, use the dplyr::summarise() function.

library(dplyr)

tbl <- tibble(
  col1 = 1:5,
  col2 = letters[1:5]
)

tbl %>% summarise(mean_col = mean(col1))

Output

# A tibble: 1 × 1
  mean_col
     <dbl>
1        3
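summarise() can compute several values at once, and it is often paired with dplyr::group_by() to get one row per group. The sketch below assumes a hypothetical grouping column, grp, which is not part of the tibble used above.

```r
library(dplyr)

tbl <- tibble(
  grp  = c("x", "x", "y", "y", "y"),
  col1 = 1:5
)

# Several summaries in one call: a single-row tibble
overall <- tbl %>%
  summarise(n = n(), mean_col = mean(col1), max_col = max(col1))
print(overall)

# One row per group
by_grp <- tbl %>%
  group_by(grp) %>%
  summarise(n = n(), mean_col = mean(col1))
print(by_grp)
```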

Differences between a tibble and a data frame

Aspect | Tibble | Data Frame
Printing Behavior | Prints the first ten rows and only the columns that fit on the screen, along with each column's type. | Prints every row (up to getOption("max.print")) and every column, wrapping wide output.
Column Subsetting | [ always returns a tibble; use [[ or $ to extract a vector. | [ may return a data frame or a vector, depending on how many columns are selected.
Row Names | Does not use row names. | Supports row names.
Non-standard Columns | Keeps non-syntactic column names as given; refer to them with backticks. | data.frame() rewrites non-syntactic names by default (check.names = TRUE).
Type Consistency | Never changes column types on creation. | Converted characters to factors by default before R 4.0.0.
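Two of these differences are easy to check side by side. The sketch below builds the same data as a data frame and as a tibble, then compares the column names and the result of single-column subsetting with [. The column name `my col` is made up for illustration.

```r
library(tibble)

df  <- data.frame(`my col` = 1:3, other = c("a", "b", "c"))
tbl <- tibble(`my col` = 1:3, other = c("a", "b", "c"))

# data.frame() rewrites the non-syntactic name; tibble() keeps it.
names(df)    # "my.col" "other"
names(tbl)   # "my col" "other"

# Selecting one column with [ , ]:
class(df[, 1])    # "integer": the data frame is simplified to a vector
class(tbl[, 1])   # "tbl_df" "tbl" "data.frame": still a tibble
```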

That’s all!
