The R setdiff() function finds the elements of one set that are not present in another. Let’s take a deep dive into the setdiff() function.
setdiff() in R
setdiff() is a built-in R function that computes the set difference of two vectors. It shows which elements of a vector or data frame x do not exist in a vector or data frame y. The description "set difference of subsets of a probability space" applies to the version of setdiff() supplied by the prob package, which is covered later in this tutorial.
The elements of setdiff(x,y) are those elements in x but not in y.
x: It is either a vector or data frame.
y: It is either a vector or data frame.
It returns a vector (or data frame) of the same mode as x, containing the elements of x that are not in y, with duplicates removed. The order of the arguments matters: if x and y are interchanged, you will generally get a different result.
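To make the return value concrete, here is a minimal sketch that compares setdiff() with a hand-written filter. The vectors x and y are made-up illustration data, and the filter is only an approximation of what setdiff() does for plain vectors, not its actual implementation.

```r
# Illustration data (not from the tutorial): x contains a duplicate 3.
x <- c(3, 1, 3, 2, 5)
y <- c(2, 4)

via_setdiff <- setdiff(x, y)

# Manual equivalent for plain vectors:
# keep the elements of x that are not found in y, then drop duplicates.
via_filter <- unique(x[!x %in% y])

print(via_setdiff)
print(identical(via_setdiff, via_filter))
```

Note that the duplicated 3 appears only once in the result, and that the elements keep the order in which they first occur in x.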
Applying setdiff() method to numeric vectors in R
A vector is a fundamental data structure in R that holds a sequence of items sharing the same data type. To create a vector in R, use the c() function. For example, let’s create two vectors and then pass them to the setdiff() function.
rv <- c(19, 21, 11, 18, 22)
rv2 <- c(11, 18, 20, 22, 46)
setdiff(rv, rv2)
[1] 19 21
In this example, the first vector (rv) has two values, 19 and 21, that do not exist in the second vector (rv2); that’s why the setdiff() function returns these two values. In short, the output values appear in x but not in y.
Now let’s swap the arguments and pass rv2 first. Note that rv is redefined here with 21 as its last element instead of 22, so 22 now appears only in rv2.
rv <- c(19, 21, 11, 18, 21)
rv2 <- c(11, 18, 20, 22, 46)
setdiff(rv2, rv)
[1] 20 22 46
This time the output contains the values of rv2 that do not exist in rv.
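Because each call reports only one direction, a common follow-up is to combine both directions to get the values that appear in exactly one of the two vectors. The sketch below does this with base R's union(); the variable names only_in_rv, only_in_rv2, and in_exactly_one are illustrative, and the vectors are the ones from the first example above.

```r
rv  <- c(19, 21, 11, 18, 22)
rv2 <- c(11, 18, 20, 22, 46)

only_in_rv  <- setdiff(rv, rv2)   # values unique to rv
only_in_rv2 <- setdiff(rv2, rv)   # values unique to rv2

# Values found in exactly one of the two vectors.
in_exactly_one <- union(only_in_rv, only_in_rv2)
print(in_exactly_one)
```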
Using setdiff() function on character vectors
A character vector in R is a vector whose elements are strings; text in R is represented by character vectors. The setdiff() function works on them the same way it does on numeric vectors.
rv <- c("Shiba Inu", "Doge", "Bitcoin Cash")
rv2 <- c("Polkadot", "Bitcoin", "Bitcoin Cash")
setdiff(rv, rv2)
[1] "Shiba Inu" "Doge"
In this example, the output consists of character values that exist in the rv vector but not in the rv2 vector.
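One thing to keep in mind with character vectors is that the comparison is exact, so strings that differ only in letter case count as different elements. The sketch below, using made-up vectors named held and listed, shows one possible way to compare without regard to case by lowercasing both sides first with tolower(). Whether that normalization is appropriate depends on your data.

```r
# Illustration data (not from the tutorial).
held   <- c("Bitcoin", "doge", "Shiba Inu")
listed <- c("bitcoin", "Doge")

# Exact comparison: "Bitcoin" and "bitcoin" are treated as different strings.
exact <- setdiff(held, listed)

# Case-insensitive comparison: normalize both vectors before comparing.
folded <- setdiff(tolower(held), tolower(listed))

print(exact)
print(folded)
```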
Applying setdiff() to data frames
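A data frame is a tabular data structure in R that consists of rows and columns. You can also pass two data frames to setdiff(). Be aware that base R generally treats a data frame as a list of columns, so the base setdiff() compares whole columns rather than rows; packages such as dplyr provide a setdiff() method that compares rows instead. In the example below, no column (and no row) of x matches one in y, so all of x is returned.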
x <- data.frame(
  x1 = c(11, 21, 19, 46),
  x2 = c(51, 15, 11, 14),
  x3 = c(19, 21, 13, 41)
)
y <- data.frame(
  x1 = c(11, 14, 8, 1),
  x2 = c(51, 15, 1, 41),
  x3 = c(12, 42, 43, 4)
)
setdiff(x, y)
  x1 x2 x3
1 11 51 19
2 21 15 21
3 19 11 13
4 46 14 41
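If what you want is the rows of one data frame that do not appear in another, one base R approach is to build a string key for each row and filter on it. The sketch below is a minimal illustration under the assumption that both data frames have the same columns in the same order; the helper row_key() and the sample data are hypothetical, not part of any package.

```r
# Illustration data (not from the tutorial): the row (2, 20) appears in both.
x <- data.frame(id = c(1, 2, 3), score = c(10, 20, 30))
y <- data.frame(id = c(2, 4), score = c(20, 40))

# Hypothetical helper: paste the values of each row into a single string,
# so that whole rows can be compared as units.
row_key <- function(df) do.call(paste, c(df, sep = "\r"))

# Keep the rows of x whose key does not occur among the keys of y.
rows_only_in_x <- x[!row_key(x) %in% row_key(y), , drop = FALSE]
print(rows_only_in_x)
```

Pasting values into strings can blur type differences (for example, the number 1 and the string "1" produce the same key), so treat this as a sketch rather than a general-purpose solution.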
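Using setdiff() from the prob package
To use the cards() function in R, first install the prob package in RStudio or your R environment, for example with install.packages("prob").
After installing it, load it at the top of your script with library(). Loading prob masks base R's intersect(), setdiff(), and union() with its own versions, which operate on subsets of a probability space.
We will apply the setdiff() function to two subsets of the cards() data: the aces and the diamonds.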
library("prob")
kads <- cards()
a <- subset(kads, suit == "Diamond")
v <- subset(kads, rank == "A")
setdiff(v, a)
Loading required package: combinat
Attaching package: ‘combinat’
The following object is masked from ‘package:utils’:
    combn
Loading required package: fAsianOptions
Loading required package: timeDate
Loading required package: timeSeries
Loading required package: fBasics
Loading required package: fOptions
Attaching package: ‘prob’
The following objects are masked from ‘package:base’:
    intersect, setdiff, union
   rank  suit
13    A  Club
39    A Heart
52    A Spade
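The startup messages show that an attached package can mask base R's setdiff(). If you want to be sure you are calling the base version regardless of what is attached, you can qualify the call with the base:: prefix. The sketch below uses made-up character vectors, so it runs without the prob package installed.

```r
# Illustration data (not from the tutorial).
aces     <- c("A Club", "A Diamond", "A Heart", "A Spade")
diamonds <- c("A Diamond", "K Diamond")

# The base:: prefix selects base R's setdiff() explicitly,
# even if another attached package defines a function with the same name.
non_diamond_aces <- base::setdiff(aces, diamonds)
print(non_diamond_aces)
```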
That’s it for this tutorial.
Krunal Lathiya is an Information Technology Engineer by education and a web developer by profession. He has worked with many back-end platforms, including Node.js, PHP, and Python. In addition, Krunal has excellent knowledge of Data Science and Machine Learning, and he is an expert in the R language. Krunal has written many programming blogs, which showcase his vast expertise in this field.