read.xlsx() Function in R with Example

Microsoft Excel is a spreadsheet application used for creating, organizing, analyzing, and saving data in tables and has widespread use in many different applications. R is statistical data analysis language and data is its core. So to bridge the gap between Excel sheets and R language, R has implemented functions that we can use to write and manipulate Excel files.

In this tutorial, we will see how to read excel files in R using the read.xlsx() function. There are lots of third-party packages available that you can use to work with excel sheets like xlsx package, openxlsx, XLConnect, and gdata. For this example, we will use the openxlsx package which provides the read.xlsx() function.

read.xlsx() Function in R

To read the data from an Excel file in R, use the read.xlsx() function. The read.xlsx() is a built-in R function that reads the data from an Excel file or Workbook object into a data.frame.

Syntax

read.xlsx(
   xlsxFile,
   sheet = 1,
   startRow = 1,
   colNames = TRUE,
   rowNames = FALSE,
   detectDates = FALSE,
   skipEmptyRows = TRUE,
   skipEmptyCols = TRUE,
   rows = NULL,
   cols = NULL,
   check.names = FALSE,
   sep.names = ".",
   namedRegion = NULL,
   na.strings = "NA",
   fillMergedCells = FALSE
)

Parameters

xlsxFile: It is an xlsx file or URL to xlsx file.

sheet: It is the name or index of the sheet to read the data from.

startRow: It is the first row to begin looking for data. If it starts with empty rows, then the rows at the top of a file are always skipped, despite the value of the startRow.

colNames: If it is TRUE, the first row of data will be used as column names.

rowNames: If it is TRUE, the first column of data will be used as row names.

detectDates: If it is TRUE, attempt to understand dates and perform the conversion.

skipEmptyRows: If it is TRUE, then empty rows are skipped; otherwise, empty rows after the first row containing data will return a row of NAs(Not Available).

skipEmptyCols: If it is TRUE, empty columns are skipped.

rows: It is a numeric vector defining which rows in the Excel file to read. If NULL, all rows are read.

cols: It is a numeric vector defining which columns in the Excel file to read. If NULL, all columns are read.

check.names: It is a logical argument. If TRUE, then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names.

sep.names: It is a character that substitutes blanks in column names. By default, “.”

namedRegion: It is a named region in the Workbook. If not NULL startRow, rows and cols parameters are ignored.

na.strings: It is a character vector of strings that are to be interpreted as NA. Blank cells will be returned as NA.

fillMergedCells: If it is TRUE, the value in a merged cell is given to all cells within the merge.

Example

To work with excel files in R, you need to have a demo excel file which contains some sample data.

For this example, I am using the demo_data.xlsx file.

To use the read.xlsx() function in R, install the openxlsx package from the package manager. If you are working on RStudio, then that package must be installed with your preinstallation; otherwise, you need to download it.

To import the openxlsx package in R, use the following code.

library("openxlsx")

Now, you can use the read.xlsx() function and pass the external excel file path to read the file in xlsx format.

So your code looks like the following.

library("openxlsx")

df1 <- read.xlsx(xlsxFile = "demo_data.xlsx", sheet = 1, skipEmptyRows = FALSE)
df1

After running the above code, if you face the following error,

could not find function read.xlsx in r that means the package is not installed in your machine. To solve this error, you need to install the openxlsx package.

You can also face the following error.

Error in loadWorkbook(file, password = password) : argument “file” is missing, with no default Calls: read.xlsx -> loadWorkbook -> file.exists -> path.expand Execution halted

If you face this kind of error, then maybe it is a problem with your installation of the openxlsx package. To solve this error, you need to reinstall the openxlsx package with a new R session.

Now, let’s see the output of the above code.

   0 First.Name Last.Name  Gender  Country       Age  Date      Id
1  1    Dulce      Abril    Female United States 32 15/10/2017 1562
2  2    Mara      Hashimoto Female Great Britain 25 16/08/2016 1582
3  3    Philip    Gent      Male   France        36 21/05/2015 2587
4  4    Kathleen  Hanner    Female United States 25 15/10/2017 3549
5  5    Nereida   Magwood   Female United States 58 16/08/2016 2468
6  6    Gaston    Brumm     Male   United States 24 21/05/2015 2554
7  7    Etta      Hurn      Female Great Britain 56 15/10/2017 3598
8  8    Earlean   Melgar    Female United States 27 16/08/2016 2456
9  9    Vincenza  Weiland   Female United States 40 21/05/2015 6548

There are lots of arguments you can pass to the function according to your requirements.

That is it for read.xlsx() function in R.

See also

read.csv() function in R

write.csv() function in R

Leave a Comment