How to Use Scatterplot in R

A scatterplot displays each observation as a point whose horizontal and vertical positions are given by two variables. In R, you create one with the plot() function: the first argument is the x-axis variable, and the second argument is the y-axis variable.

Syntax

plot(x, y, main, xlab, ylab, xlim, ylim, axes)

Parameters

  1. x: The values used as the horizontal coordinates.
  2. y: The values used as the vertical coordinates.
  3. main: The title of the graph.
  4. xlab: The label on the horizontal axis.
  5. ylab: The label on the vertical axis.
  6. xlim: The range of x values to display, given as c(min, max).
  7. ylim: The range of y values to display, given as c(min, max).
  8. axes: A logical value indicating whether both axes should be drawn on the plot.
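
As a minimal sketch of how these arguments fit together, the call below sets each of them explicitly. The two vectors are made-up values used only for illustration, and pdf(NULL) is there only so the snippet runs without opening a plot window.

```r
# Illustrative data only: these values are made up for the sketch.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Draw to a null device so the example also runs non-interactively;
# remove the pdf(NULL) and dev.off() lines to see the plot on screen.
pdf(NULL)

plot(x, y,
     main = "Demo scatterplot",  # title of the graph
     xlab = "x values",          # horizontal axis label
     ylab = "y values",          # vertical axis label
     xlim = c(0, 6),             # show x from 0 to 6
     ylim = c(0, 12),            # show y from 0 to 12
     axes = TRUE)                # draw both axes (FALSE hides them)

# par("usr") reports the plotted region as c(x1, x2, y1, y2);
# by default it extends slightly beyond xlim and ylim.
plot_region <- par("usr")
print(plot_region)

invisible(dev.off())
```

Setting xlim and ylim a little wider than the data, as done here, keeps the outermost points away from the plot border.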

Example 1: Simple scatterplot

To create a scatterplot, we will use the shows_data.csv file.

From that CSV file, we will use the Year and IMDb columns to draw the scatterplot.

To read CSV data, use the read.csv() function. The head() function then returns the first six rows.

data <- read.csv("shows_data.csv")
df <- head(data)
print(df)

Output

[Image: the first six rows of the data frame]
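
Since shows_data.csv is not reproduced here, the following sketch writes a tiny stand-in file and reads it back with read.csv(). The rows and the Title column are hypothetical; only the Year and IMDb column names come from this post. It also shows one way to check the columns before plotting.

```r
# Hypothetical rows for illustration; the real shows_data.csv is not reproduced here.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Title,Year,IMDb",
  "Show A,2015,8.1",
  "Show B,2018,7.4",
  "Show C,2020,6.9"
), csv_path)

data <- read.csv(csv_path)

# Check that the columns needed for the plot exist
stopifnot(all(c("Year", "IMDb") %in% names(data)))

# Show the type of each column; both should be numeric for plotting
str(data[, c("Year", "IMDb")])

# Drop rows where either value is missing before plotting
clean <- data[complete.cases(data[, c("Year", "IMDb")]), ]
print(clean)
```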

We will extract the Year and IMDb columns to create the scatterplot.

Let’s plot the first 30 rows.

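# Read the file and keep the first 30 rows
data <- read.csv("shows_data.csv")
df <- head(data, 30)
print(df)

# Year goes on the x-axis, IMDb rating on the y-axis
x <- df$Year
y <- df$IMDb

# pch = 19 draws solid circles
plot(x, y, main = "IMDb vs Year",
 xlab = "Year", ylab = "IMDb Ratings",
 pch = 19)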

Output

[Image: scatterplot of IMDb ratings against year]
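
The same kind of plot can also be written with R's formula interface, which avoids creating separate x and y vectors. This is a sketch on a small made-up data frame that only mimics the Year and IMDb columns; the values are not taken from shows_data.csv.

```r
# Made-up values standing in for rows of shows_data.csv
df <- data.frame(
  Year = c(2012, 2015, 2017, 2019, 2021),
  IMDb = c(8.3, 7.9, 6.5, 8.8, 7.2)
)

pdf(NULL)  # null device so the sketch runs without a display

# IMDb ~ Year reads as "IMDb (y-axis) against Year (x-axis)"
plot(IMDb ~ Year, data = df,
     main = "IMDb vs Year",
     xlab = "Year", ylab = "IMDb Ratings",
     pch = 19)  # solid circles

invisible(dev.off())
```

Note that the formula puts the y variable first, which is the reverse of the plot(x, y) argument order.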

Example 2: Use a built-in dataset to create a scatterplot

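We will use the built-in faithful dataset, which records the eruption duration and the waiting time until the next eruption, both in minutes, for the Old Faithful geyser.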

df <- head(faithful)
print(df)

Output

  eruptions waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55
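
head() shows only the first six rows. A quick sketch for inspecting the whole dataset before plotting, using only base R functions:

```r
# faithful ships with base R (datasets package), so no file is needed
str(faithful)            # column names, types, and a preview of the values

n_obs <- nrow(faithful)  # number of observations (rows)
print(n_obs)

summary(faithful)        # min, quartiles, mean, and max for each column
```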

In the faithful dataset, we pair the eruptions and waiting values from each observation as (x, y) coordinates. Then we plot those points in the Cartesian plane.

df <- head(faithful)
print(df)

duration <- faithful$eruptions
waiting <- faithful$waiting

plot(duration, waiting,
 xlab = "Eruption duration",
 ylab = "Time waited",
 main = "Duration vs Time waited"
)

Output

[Image: scatterplot of waiting time against eruption duration]
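
To back up what the plot suggests visually, one option is to compute the correlation between the two variables. This is a small sketch using the same variable names as the code above:

```r
duration <- faithful$eruptions
waiting  <- faithful$waiting

# Both vectors come from the same rows, so element i of each forms one point
stopifnot(length(duration) == length(waiting))

# Pearson correlation: values near 1 indicate a strong positive linear relationship
r <- cor(duration, waiting)
print(r)
```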

Enhanced Solution

We can fit a linear regression model of the two variables with the lm() function and then draw its trend line on the existing plot with abline().

abline(lm(waiting ~ duration))

The complete code is shown below.

df <- head(faithful)
print(df)

duration <- faithful$eruptions
waiting <- faithful$waiting

plot(duration, waiting,
 xlab = "Eruption duration",
 ylab = "Time waited",
 main = "Duration vs Time waited"
 )

abline(lm(waiting ~ duration))

Output

[Image: the same scatterplot with the fitted regression line]
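
As a possible variation, the model can be stored in a variable first, so its intercept and slope can be inspected and the line styled. The name fit and the red color are arbitrary choices for this sketch.

```r
# Fit waiting time as a linear function of eruption duration
fit <- lm(waiting ~ eruptions, data = faithful)

# coef() returns the intercept and slope that abline() uses for the line
line_coefs <- coef(fit)
print(line_coefs)

pdf(NULL)  # null device so the sketch runs without a display

plot(faithful$eruptions, faithful$waiting,
     xlab = "Eruption duration",
     ylab = "Time waited",
     main = "Duration vs Time waited")

# col sets the line color and lwd its width
abline(fit, col = "red", lwd = 2)

invisible(dev.off())
```

A positive slope here means that longer eruptions tend to be followed by longer waits.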

That’s it.
