Add New Data To Master Excel File Using Python

Sharing is caring!

Last Updated on October 22, 2022 by Jay

This tutorial will walk through the best way I know of to add new data to a master Excel file using Python. There are several ways to add data to a master Excel file, I think this is the best way because it can preserve the master data formats and everything inside the file.

Install Library

For this project, we’ll use the xlwings Python library. In my opinion, xlwings is the best Python library for manipulating Excel files.

pip install xlwings

Sample Files

Feel free to download the sample Excel files for this project.

The master file will have some pre-set formats and cell background colors, and we would like to keep those.

The new data files are plain data file, can be either Excel or CSV format.

Working With The Master Excel File

We can use xlwings to open Excel files. As soon as we execute the following code, our Excel should open with the master file.

import xlwings as xw

master_wb = xw.Book('master_file.xlsx')
Master Excel File

Three observations from the master file:

  1. There are two tabs, both with data
  2. Each tab has its own formatting and background colors
  3. We want to add new data to the last row of the existing data for each tab. In the above screenshot, we want to add new data starting row 5 for the “Canada” tab

Getting Excel Sheets

Getting all Excel sheets is as easy as calling the sheets attribute as shown below. Note it returns a Sheets object, which is a collection of Excel sheets, we can use indexing to access each individual sheet. To get the sheet names, simply call the .name attribute.

master_sheets = master_wb.sheets
master_sheets
Sheets([<Sheet [master_file.xlsx]Canada>, <Sheet [master_file.xlsx]USA>])

master_sheets[0].name
'Canada'

Figuring Out Where To Put New Data

We will put new data starting from the last (empty) row of the master file. By eyeballing it, we know it should be row 5 (for Canada tab). So how to find the last row of data in Excel? It’s by pressing Ctrl + down arrow. If you select cell A1 and press ctrl + down arrow, your cursor will move down to A4. The next row (row 5) is our destination.

To get A4 programmatically:

master_sheets[0].range('A1').end('down').address
'$A$4'

master_sheets[0].range('A1').end('down').row
4

Open & Read New Data File

We’ll open the new data file like we did with the master file. To get data from all the (non-empty) rows and columns. Use the .expand() method to expand the range selection. Note we expand from cell A2, as row 1 is the header row which we don’t need.

newdata_wb = xw.Book(r"new_data.xlsx")

data_range = newdata_wb.sheets[0].range('A2').expand()
<Range [new_data.xlsx]Sheet1!$A$2:$G$6>

new_data_raw = newdata_wb.sheets[0].range('A2').expand().value
new_data_raw 
[['10/15/2022', 10.0, 'Waterloo', 'Canada', 'Event 10', 10.1, 1.0],
 ['10/16/2022', 11.0, 'St Louis', 'USA', 'Event 11', 10.2, 1.01],
 ['10/17/2022', 12.0, 'Boston', 'USA', 'Event 12', 10.3, 1.02],
 ['10/19/2022', 13.0, 'Ottawa', 'Canada', 'Event 13', 10.4, 111.11],
 ['4/1/2022', 5.0, 'Austin', 'USA', 'Event 5', 11.11, 1.23]]

So we got a list of lists. Both countries are inside the lists, let’s separate them and remove the country from each sub-lists. Using Canada as an example. Here we are using a list comprehension so we can avoid long loops. If you need help understanding list comprehension, read the following:

[i[0:3] + i[4:] for i in new_data_raw if i[3] == 'Canada']

[['10/15/2022', 10.0, 'Waterloo', 'Event 10', 10.1, 1.0],
 ['10/19/2022', 13.0, 'Ottawa', 'Event 13', 10.4, 111.11]]

Transfer Data To Master File

Let’s write the complete code for the steps up to this point:

import xlwings as xw

master_wb = xw.Book('master_file.xlsx')
newdata_wb = xw.Book("new_data.xlsx")

master_sheets = master_wb.sheets
new_data_raw = newdata_wb.sheets[0].range('A2').expand().value

## loop through each sheet in master file, move data over to the appropriate tab
for s in master_sheets:
    temp_data = [i[0:3] + i[4:] for i in new_data_raw if i[3] == s.name]
    newrow = master_wb.sheets[s.name].range('A1').end('down').row + 1
    master_wb.sheets[s.name].range(newrow, 1).value = temp_data

After running the above code, our master file will look like the following:

Add New Data To Master File

The new data that we added doesn’t have consistent formatting! Let’s fix that.

Transfer Format To New Data Range

We’ll copy the cell format from row 2 and paste formats only to the rest of the data range. We’ll do so by modifying the above loop to the following:

for s in master_sheets:
    temp_data = [i[0:3] + i[4:] for i in new_data_raw if i[3] == s.name]
    newrow = master_wb.sheets[s.name].range('A1').end('down').row + 1
    master_wb.sheets[s.name].range(newrow, 1).value = temp_data
    master_wb.sheets[s.name].range('A2').expand('right').copy()
    master_wb.sheets[s.name].range('A2').expand().paste(paste='formats')
Formatting The Master File

As we see the copy/paste will also copy the cell background color. However the Date isn’t exactly how we want it to look like. We recognize the integer number in A5 and A6 are basically Excel’s date values, so we can change their format like the below:

master_wb.sheets[0].range('A5:A6').number_format = 'mm/dd/yyyy'

Additional Resources

2 comments

  1. Hello, Jay. Thanks for sharing all of this helpful information. I would like to make a request. Could you please post on how to use Python to create something like this: VBA-Multiple-Find-and-Replace by DVN-Anakin on Github. Preferably, the one without the backlogs, which can actually replace every word in the Word document. And it also works while taking into account
    that the Excel file is big in size.

Leave a Reply

Your email address will not be published. Required fields are marked *