How To Read Google Sheet Into Pandas

Sharing is caring!

Last Updated on July 21, 2022 by Jay

Sometimes we might need to read data from a Google Sheet into pandas. Let’s take a look at how to do that.

It’s surprisingly simple, but first we need to “publish” the Google Sheet.

Share And Publish A Google Sheet

First, let’s publish the Google Sheet to the web. On the Google Sheet window, head to File -> Share -> Publish to web.

Then, the following window will pop up, and we’ll select either Excel or CSV from the dropdown under “Web page.” We also have the option to publish either the entire document or just publish specific tabs within the document.

google_sheet_publish
google sheet publish file type
google sheet publish document
google sheet publish document

I will publish the entire workbook, so I’ll keep the Entire Document option. Then click on the Publish button.

Google then will generate a link to our spreadsheet and will display that link in the next window. Here is the link if you want to try it:

Published google sheet link
Published google sheet link
https://docs.google.com/spreadsheets/d/e/2PACX-1vQzorMfVQKIcsBiBTC3DdCf36PYnwWsYSb0tJl_GcbbiWPFbwrwt_Cg8XraVqaRdFRt0MLvVDfM3DyM/pub?output=xlsx

Note how the above link ends with “output=xlsx”. If we change that from xlsx to csv, it will also work and download the .csv file instead of the .xlsx file.

Technically this is the link to the Excel/CSV file. If we simply enter this link into a browser, the file will be downloaded to our computer. Since we have published this workbook, anyone with the URL can access this file.

Read Google Sheet Into Pandas

So far we have done the file preparation, it’s time to bring the data into pandas, and it’s surprisingly simple! One thing to note is that we have to match the pandas method with the file type, see the below example:

import pandas as pd


excel = pd.read_excel('https://docs.google.com/spreadsheets/d/e/2PACX-1vQzorMfVQKIcsBiBTC3DdCf36PYnwWsYSb0tJl_GcbbiWPFbwrwt_Cg8XraVqaRdFRt0MLvVDfM3DyM/pub?output=xlsx')

csv = pd.read_csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vQzorMfVQKIcsBiBTC3DdCf36PYnwWsYSb0tJl_GcbbiWPFbwrwt_Cg8XraVqaRdFRt0MLvVDfM3DyM/pub?output=csv')

The tables we retrieved from the two methods are different. This is because the spreadsheet uses a formula to randomly generate numbers (=RAND()) in the some_data column. So this is expected!

Additional Resources

Python pandas read Excel files

Leave a Reply

Your email address will not be published. Required fields are marked *