In practice, data rarely lives on a single server. It is distributed across different servers, and to analyze it you need to combine it. As a data scientist, you often need to merge data frames on one or more common key variables before an analysis of the dataset makes any sense.
Combine Two Data Frames in R
To combine two data frames in R, use the merge() function. merge() is a built-in R function that merges two data frames by common columns or row names.
Syntax
merge(x, y, by = intersect(names(x), names(y)),
by.x = by, by.y = by,
sort = TRUE, …)
Arguments
x, y: They are data frames or objects to be coerced to one.
by, by.x, by.y: They are the specifications of the columns used for merging.
sort: It is an optional logical argument. Should the result be sorted on the by columns? The default is TRUE.
…: The other optional arguments for merge() function.
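As a minimal sketch of how by.x and by.y work when the key columns have different names, assume two small, hypothetical data frames (employees and salaries are made up for illustration and are not part of this article's example):

```r
# Hypothetical data frames for illustration only.
employees <- data.frame(
  emp_id = c(3, 1, 2),
  name   = c("Cara", "Ann", "Bob"),
  stringsAsFactors = FALSE)
salaries <- data.frame(
  id     = c(1, 2, 3),
  salary = c(50000, 60000, 55000))

# by.x names the key column in x; by.y names the key column in y.
merged <- merge(employees, salaries, by.x = "emp_id", by.y = "id")

# The key column in the result keeps its name from x (emp_id).
print(merged)
```

When both data frames use the same key name, a single by = "..." argument is enough and by.x and by.y can be left out.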
Return Value
It returns a data frame. By default, the rows are sorted lexicographically on the common columns; with sort = FALSE, they come back in an unspecified order.
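The return value can be inspected with a short sketch like the one below. The data frames left and right are hypothetical and exist only to show the shape of the result: the key column moves to the front, followed by the remaining columns of x and then those of y.

```r
# Hypothetical inputs; note that key is not the first column of left.
left <- data.frame(
  x   = 1:3,
  key = c("b", "a", "c"),
  stringsAsFactors = FALSE)
right <- data.frame(
  key = c("c", "b", "a"),
  y   = c(30, 20, 10),
  stringsAsFactors = FALSE)

result <- merge(left, right, by = "key")

is.data.frame(result)  # the result is an ordinary data frame
names(result)          # key column first, then columns from left, then right
result$key             # keys sorted lexicographically by default
```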
Example
To create a data frame in R, use the data.frame() function.
Let’s create the first data frame with the following columns.
- name
- nationality
- retired
Create a second data frame with the following columns.
- heronames
- superpowers
- name
- villians
You can see that both data frames have one common column, name.
We will merge these two data frames based on the name column.
heroes <- data.frame(
name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
nationality = c("US", "Kryptonite", "Amazonian", "US"),
retired = c("no", "no", "no", "no"),
stringsAsFactors = FALSE)
powers <- data.frame(
heronames = c("Batman", "Superman", "Wonder Woman", "Shazam"),
superpowers = c("Rich", "Flying", "Combat", "Electric"),
name = c("Bruce Wayne", "Clark Kent", "Diana", "Billy Batson"),
villians = c("Joker", "Lex Luther", "Aris", "Dr. Sivana"),
stringsAsFactors = FALSE)
merge(heroes, powers, by = "name")
Output
name nationality retired heronames superpowers villians
1 Billy Batson US no Shazam Electric Dr. Sivana
2 Bruce Wayne US no Batman Rich Joker
3 Clark Kent Kryptonite no Superman Flying Lex Luther
4 Diana Amazonian no Wonder Woman Combat Aris
As you can see from the output, the two data frames are merged based on the name column. The rows are already sorted by name, because sort defaults to TRUE. Let’s see how the sort argument controls this.
Applying sort = TRUE to the merge() function
Passing sort = TRUE explicitly makes the intent visible and returns the same sorted output as the default call.
merge(heroes, powers, by = "name", sort = TRUE)
To skip sorting, pass sort = FALSE instead; the rows then come back in an unspecified order.
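The effect of the sort argument can be checked with a small sketch. The data frames x and y below are hypothetical and kept tiny on purpose; the comparison assumes nothing about the row order that sort = FALSE produces, since that order is unspecified.

```r
# Hypothetical inputs with keys in different orders.
x <- data.frame(id = c(3, 1, 2), a = c("c", "a", "b"),
                stringsAsFactors = FALSE)
y <- data.frame(id = c(2, 3, 1), b = c("B", "C", "A"),
                stringsAsFactors = FALSE)

sorted_default  <- merge(x, y, by = "id")                # sort = TRUE is the default
sorted_explicit <- merge(x, y, by = "id", sort = TRUE)   # same call, spelled out
unsorted        <- merge(x, y, by = "id", sort = FALSE)  # row order unspecified

# The default and the explicit call give the same data frame.
identical(sorted_default, sorted_explicit)

# The unsorted result holds the same rows; ordering it by id recovers
# the sorted key sequence.
unsorted[order(unsorted$id), ]
```

Because the unsorted order is not guaranteed, code that depends on row order should either keep the default or sort the result explicitly afterwards.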
That is it for combining two data frames in R.

Krunal Lathiya is an Information Technology Engineer by education and a web developer by profession. He has worked with many back-end platforms, including Node.js, PHP, and Python. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in the R language. Krunal has written many programming blogs, which showcase his expertise in this field.