How to Access Excel File In Python

Sharing is caring!

Last Updated on July 14, 2022 by Jay

Sometimes we want to access just certain data in an Excel file in Python instead of reading the whole spreadsheet/table into pandas. This tutorial will show you how to use the openpyxl library to access Excel data in Python.

Library

openpyxl is a third-party Python library we can download by using pip. Open up a command prompt, then type:

pip install openpyxl

We’ll use a simple Excel file for this demonstration, so there’s no need to download anything. Below screenshots show what’s inside the example Excel spreadsheet with 2 tabs:

  • Sheet1 contains some date, String, and number values
  • calc contains just one value 3.14
excel_sheet1_example
excel_sheet2_example

Access Sheets of An Excel File Using Python openpyxl

We can use Python to read an Excel file using the openpyxl.load_workbook() method. wb is the variable name we chose to represent the Excel workbook. Upon checking wb.sheetnames, we can see there are two sheets in the file.

from openpyxl import load_workbook
wb = load_workbook('book1.xlsx')

wb.sheetnames
['Sheet1', 'calc']

wb.active will always give the first sheet (index = 0). We can also access other sheets by passing individual sheet names into the wb, think about the wb is like a dictionary of worksheets, and we can use the sheet names (key) to access each sheet (value).

ws = wb.active
ws
<Worksheet "Sheet1">

wb['calc']
<Worksheet "calc">

Another way to access individual sheets is by calling the worksheets attribute, then using the index to pick a sheet.

wb.worksheets
[<Worksheet "Sheet1">, <Worksheet "calc">]

wb.worksheets[1]
<Worksheet "calc">

Access Data of an Excel Sheet

Accessing One Cell

Let’s access some data points inside the Sheet1 (or the variable ws in Python). We can use the same Excel column-row notation to specify which cell we want to get. For example, ws[‘B3’] returns a Cell object that represents cell B3 of the Sheet1. To get data inside that cell, we call the .value attribute. Conveniently, we can also use the .data_type attribute to check what data is inside a cell.

excel_sheet1_example
ws['B3']
<Cell 'Sheet1'.B3>

ws['B3'].value
datetime.datetime(2022, 3, 1, 0, 0)

ws['B3'].data_type
'd'

ws['C6'].value
'Watermelon'

ws['D8'].value
600

You can also access cells using the cell(row, column) method. Although this approach seems rather redundant compared to the Excel’s column-row notation, it provides an easier way to access cells programmatically.

Also note that cell (3,3) refers to cell C3, where the index starts from 1 instead of 0 in this expression.

ws.cell(row=3, column=3).value
'Apple'

Accessing Many Cells

We can reference columns, rows, or ranges using the same notations as in Excel. For example:

  • ws[‘B’] – returns column B
  • ws[3] – returns row 3, again the index starts from 1
  • Then, ws[‘B2:D8’] – returns range B2 to D8

Note the below code returns a tuple of Cell objects, we need to call the .value attribute to get the actual values inside those cells.

ws['B']
(<Cell 'Sheet1'.B1>,
 <Cell 'Sheet1'.B2>,
 <Cell 'Sheet1'.B3>,
 <Cell 'Sheet1'.B4>,
 <Cell 'Sheet1'.B5>,
 <Cell 'Sheet1'.B6>,
 <Cell 'Sheet1'.B7>,
 <Cell 'Sheet1'.B8>)

ws[3]
(<Cell 'Sheet1'.A3>,
 <Cell 'Sheet1'.B3>,
 <Cell 'Sheet1'.C3>,
 <Cell 'Sheet1'.D3>)

for row in ws['B2:D8']:
    for cell in row:
        print(cell.value)
Date
Item
Sales
2022-03-01 00:00:00
Apple
100
2022-03-02 00:00:00
Banana
200
2022-03-03 00:00:00
Orange
300
2022-03-04 00:00:00
Watermelon
400
2022-03-05 00:00:00
Cherry
500
2022-03-06 00:00:00
Peach
600

Additional Resources

How to Use Python to Read Excel Formula

How to Work with Excel Named Range in Python

Write Data to Excel using Python

3 comments

  1. Accessing Many Cells: what about using Defined Names (range of cells) of Excel to access for instance a table identified with Defined Names?

Leave a Reply

Your email address will not be published. Required fields are marked *