Reading Excel Spreadsheets with Python and xlrd

Last month we looked at how to create Microsoft Excel (i.e. *.xls) files using the xlwt package. Today we will be looking at how we can read an *.xls/*.xlsx file using a package called xlrd. The xlrd package can be run on Linux and Mac as well as Windows. This is great when you need to process an Excel file on a Linux server.

We will start out by reading the first Excel file we created in our previous article.

Let’s get started!


Reading an Excel Spreadsheet

In this section, we will look at a function that demonstrates different ways of reading an Excel file. Here’s the code example:

import xlrd

#----------------------------------------------------------------------
def open_file(path):
    """
    Open and read an Excel file
    """
    book = xlrd.open_workbook(path)
    
    # print number of sheets
    print book.nsheets
    
    # print sheet names
    print book.sheet_names()
    
    # get the first worksheet
    first_sheet = book.sheet_by_index(0)
    
    # read a row
    print first_sheet.row_values(0)
    
    # read a cell
    cell = first_sheet.cell(0,0)
    print cell
    print cell.value
    
    # read a row slice
    print first_sheet.row_slice(rowx=0,
                                start_colx=0,
                                end_colx=2)
    
#----------------------------------------------------------------------
if __name__ == "__main__":
    path = "test.xls"
    open_file(path)

Let’s break this down a bit. First we import xlrd and then in our function, we open the Excel workbook that was passed in. The next couple of lines show how to introspect the book. We find out how many worksheets there are in the workbook and we print out their names. Next we extract the first worksheet via the sheet_by_index method. We can read an entire row from the worksheet using the row_values method. If we want to get a particular cell’s value, we can call the cell method and pass it the row and column indexes. Finally we use xlrd’s row_slice method to read a portion of the row. As you can see, this last method accepts a row index and the starting and ending column indexes to determine what to return. The row_slice method returns a list of cell instances.

This makes it very easy to iterate over a group of cells. Here’s a small snippet to demonstrate:

cells = first_sheet.row_slice(rowx=0,
                              start_colx=0,
                              end_colx=2)
for cell in cells:
    print cell.value

The xlrd package supports the following types of cells: text, number (i.e. float), dates (any number format that “looks” like a date), Boolean, error and empty/blank. The package also supports extracting data from named cells, although the project doesn’t support all types of named cells. The reference text is a bit vague on what exactly it does not support though.

If you need to copy cell formatting you will need to download the xlutils package.


Wrapping Up

At this point you should know enough to read most Excel files that were built using Microsoft’s XLS format. There is another package that also supports reading xls/xlsx files called the openpyxl project. You might want to check it out as an alternative.


Related Reading

9 thoughts on “Reading Excel Spreadsheets with Python and xlrd”

  1. I want to extract comments and hyperlinks of cell in .xls file using xlrd library.

    Plz let me know the functions and sample code

    Thanks in advance.

  2. Hi, Is it possible to run iterations of data in excel using this library? I want to be able to read the content of all the columns in row1 and use it in the script and then in the next without the script stopping do a second execution using the data inside the next row2. In this way i will be able to run a script multiple times using different data that will be stored in excel rows. thanks

  3. Pingback: Mike Driscoll: Top Ten Articles of 2016 | Adrian Tudor Web Designer and Programmer

  4. i want to extract all data from a sheet by name and write it over another sheet by name is it possible?

  5. I’m pretty sure you can use xlrd to check the sheet name before extracting its contents. Then you would use xlwt to write it to another sheet or file.

  6. Pingback: Top 10 Most Read Mouse vs Python Articles of 2019 - The Mouse Vs. The Python

Comments are closed.