How to Combine Two Data Frames in R

In practice, data rarely lives in one place. It is often spread across several servers or files, and you need to combine it before you can analyze it. To do that, you merge data frames on one or more common key variables.

Combine Two Data Frames in R

To combine two data frames in R, use the merge() function. merge() is a base R function that joins two data frames on common columns or row names.
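Merging on row names is the less common case, so here is a minimal sketch of it. The data frames scores and ages and their contents are hypothetical, made up only for illustration.

```r
# Hypothetical example data: two small data frames that share row names.
scores <- data.frame(math = c(90, 75), row.names = c("ann", "bob"))
ages   <- data.frame(age = c(31, 28), row.names = c("bob", "ann"))

# by = "row.names" asks merge() to match on row names instead of a column.
# The matched row names come back in an extra column called Row.names.
by_rows <- merge(scores, ages, by = "row.names")
print(by_rows)
```

The order of the rows in the two inputs does not matter here, since merge() matches them by row name rather than by position.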

Syntax

merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by,
      sort = TRUE, ...)

Arguments

x, y: They are data frames or objects to be coerced to one.

by, by.x, by.y: They are the specifications of the columns used for merging.

sort: An optional logical argument. Should the result be sorted on the by columns? The default is TRUE.

...: Other optional arguments passed to or from methods.
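When the key column has a different name in each data frame, by.x and by.y can name the two columns separately. The following is a small sketch under that assumption; the data frames employees and salaries and their columns are hypothetical.

```r
# Hypothetical data: the key column is named differently in each data frame.
employees <- data.frame(emp_id = c(1, 2, 3),
                        name = c("Ana", "Ben", "Cy"),
                        stringsAsFactors = FALSE)
salaries <- data.frame(id = c(3, 1, 2),
                       salary = c(50, 70, 60))

# by.x names the key column in x, by.y names the key column in y.
# The merged result keeps the key under the name used in x (emp_id).
joined <- merge(employees, salaries, by.x = "emp_id", by.y = "id")
print(joined)
```

If both data frames use the same key name, a single by = "..." argument is enough and by.x and by.y can be left at their defaults.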

Return Value

It returns a data frame. By default, the rows are sorted lexicographically on the common columns. With sort = FALSE, the row order is unspecified.

Example

To create a data frame in R, use the data.frame() function.

Let’s create a first data frame with the following columns.

  1. name
  2. nationality
  3. retired

Next, create a second data frame with the following columns.

  1. heronames
  2. superpowers
  3. name
  4. villians

Both data frames have one column in common: name.

We will merge these two data frames based on the name column.

heroes <- data.frame(
 name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
 nationality = c("US", "Kryptonite", "Amazonian", "US"),
 retired = c("no", "no", "no", "no"),
 stringsAsFactors = FALSE)

powers <- data.frame(
 heronames = c("Batman", "Superman", "Wonder Woman", "Shazam"),
 superpowers = c("Rich", "Flying", "Combat", "Electric"),
 name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
 villians = c("Joker", "Lex Luther", "Aris", "Dr. Sivana"),
 stringsAsFactors = FALSE)

merge(heroes, powers, by = "name")

Output

          name nationality retired    heronames superpowers   villians
1 Billy Batson          US      no       Shazam    Electric Dr. Sivana
2  Bruce Wayne          US      no       Batman        Rich      Joker
3   Clark Kent  Kryptonite      no     Superman      Flying Lex Luther
4        Diana   Amazonian      no Wonder Woman      Combat       Aris

As the output shows, the two data frames are merged on the name column. The rows are already sorted by name because the sort argument defaults to TRUE. Let's look at that argument more closely.

Applying sort = TRUE to the merge() function

To make the sorting explicit, pass sort = TRUE to merge(). Because TRUE is the default, the result is the same as in the previous example.

merge(heroes, powers, by = "name", sort = TRUE)

It returns the same output, sorted by name. To skip the sorting step, pass sort = FALSE. The row order is then unspecified.
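The effect of the sort argument can be checked with a short sketch. The data frames left and right below are hypothetical and exist only for this comparison.

```r
# Hypothetical data with keys deliberately out of order.
left  <- data.frame(key = c("c", "a", "b"), x = 1:3,
                    stringsAsFactors = FALSE)
right <- data.frame(key = c("b", "c", "a"), y = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# sort = TRUE is the default, so these two calls give the same result.
default_sorted  <- merge(left, right, by = "key")
explicit_sorted <- merge(left, right, by = "key", sort = TRUE)
same_result <- identical(default_sorted, explicit_sorted)

# With sort = FALSE the row order is unspecified, so reorder explicitly
# with order() if a particular order is needed afterwards.
unsorted  <- merge(left, right, by = "key", sort = FALSE)
reordered <- unsorted[order(unsorted$key), ]

print(default_sorted)
print(reordered)
```

Since the order with sort = FALSE is not guaranteed, it is safer to sort the result yourself than to rely on whatever order happens to come back.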

That is it for combining two data frames in R.

