In practice, data rarely lives on a single server. It is distributed across several servers, and to analyze it you need to combine the pieces. As a data scientist, you often need to merge data frames on one or more common key variables before an analysis of the dataset makes sense.
Combine Two Data Frames in R
To combine two data frames in R, use the merge() function. merge() is a built-in R function that merges two data frames by common columns or row names.
merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, sort = TRUE, …)
x, y: They are data frames or objects to be coerced to one.
by, by.x, by.y: They are the specifications of the columns used for merging.
sort: It is an optional logical argument. Should the result be sorted on the by columns? It defaults to TRUE.
…: The other optional arguments for merge() function.
It returns a data frame. By default, the rows are sorted lexicographically on the common columns; with sort = FALSE, the row order is unspecified.
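When the key column has a different name in each data frame, by.x and by.y name the key on each side. The following is a minimal sketch of that case; the employees and salaries data frames and their columns are hypothetical and exist only for illustration.

```r
# Hypothetical example data: the key is called emp_id in one frame and id in the other.
employees <- data.frame(
  emp_id = c(3, 1, 2),
  name = c("Cara", "Ann", "Bob"),
  stringsAsFactors = FALSE
)
salaries <- data.frame(
  id = c(1, 2, 3),
  salary = c(50000, 62000, 58000),
  stringsAsFactors = FALSE
)

# by.x names the key column in the first frame, by.y the key column in the second.
merged <- merge(employees, salaries, by.x = "emp_id", by.y = "id")

# The key column keeps the name it has in the first frame (emp_id),
# and the rows come back sorted on it because sort defaults to TRUE.
print(merged)
```

The merged key column takes its name from the first data frame, so the result has the columns emp_id, name, and salary.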
To create a data frame in R, use the data.frame() function.
Let’s create a first data frame, heroes, with the columns name, nationality, and retired.
Then create a second data frame, powers, with the columns heronames, superpowers, name, and villians.
Both data frames have one column in common: name.
We will merge these two data frames on the name column.
heroes <- data.frame(
  name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
  nationality = c("US", "Kryptonite", "Amazonian", "US"),
  retired = c("no", "no", "no", "no"),
  stringsAsFactors = FALSE)

powers <- data.frame(
  heronames = c("Batman", "Superman", "Wonder Woman", "Shazam"),
  superpowers = c("Rich", "Flying", "Combat", "Electric"),
  name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
  villians = c("Joker", "Lex Luther", "Aris", "Dr. Sivana"),
  stringsAsFactors = FALSE)

merge(heroes, powers, by.x = "name")
          name nationality retired    heronames superpowers   villians
1 Billy Batson          US      no       Shazam    Electric Dr. Sivana
2  Bruce Wayne          US      no       Batman        Rich      Joker
3   Clark Kent  Kryptonite      no     Superman      Flying Lex Luther
4        Diana   Amazonian      no Wonder Woman      Combat       Aris
As the output shows, the two data frames are merged on the name column. The rows are already ordered by name because sort defaults to TRUE. Let’s see how to set the sort argument explicitly.
Applying sort = TRUE to the merge() function
To make the sort order explicit, pass sort = TRUE to the merge() function.
merge(heroes, powers, by.x = "name", sort = TRUE)
It returns the same output as the default call, sorted by name. Pass sort = FALSE if you do not need the rows sorted.
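The effect of the sort argument can be checked with a minimal sketch like the one below. It uses trimmed-down copies of the two data frames (fewer columns, an assumption made only to keep the example short) and compares the default call with sort = FALSE.

```r
# Trimmed-down copies of the article's data frames (fewer columns, for brevity).
heroes <- data.frame(
  name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
  nationality = c("US", "Kryptonite", "Amazonian", "US"),
  stringsAsFactors = FALSE
)
powers <- data.frame(
  name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
  superpowers = c("Rich", "Flying", "Combat", "Electric"),
  stringsAsFactors = FALSE
)

# sort = TRUE is the default, so this result is ordered by name.
sorted <- merge(heroes, powers, by = "name")

# With sort = FALSE the rows are not sorted on the key column.
unsorted <- merge(heroes, powers, by = "name", sort = FALSE)

# Both results hold the same rows; only the row order may differ.
print(sorted$name)
print(unsorted$name)
```

Since the row order with sort = FALSE is not guaranteed, code that depends on a particular order is safer sorting the result explicitly, for example with order().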
That is it for combining two data frames in R.