Last Updated on July 14, 2022 by Jay
In this tutorial, we’ll use Python to create a high-fidelity Excel spreadsheet. “High-fidelity” means that the Python-generated Excel spreadsheet will look like a real Excel file created by people – contains values, formulas, different formats, as well as charts.
There is quite a lot to cover so this is part 1 of the series, we’ll use Python to create an Excel spreadsheet that contains formulas.
You might be already familiar that the easier way to dump some data into an Excel file is using pandas library,
pd.to_excel(). While that’s true, we only get a value-only Excel file, no formulas, no format, etc.
xlsxwriter library allows us to create such a high-fidelity Excel file. It’s a powerful library for writing Excel files. However, as the name suggests, it writes Excel files but doesn’t read existing files.
xlsxwriter is also one of the Excel writer engines adopted by pandas. It’s probably safe to say, if pandas rely on this library, then we are in good hands using it.
As always, we’ll install the library.
pip install xlsxwriter
Getting familiar with xlsxwriter
Let’s start by setting up an Excel file, and the spreadsheets inside.
import xlsxwriter wb = xlsxwriter.Workbook(r'C:\Users\JZ\Desktop\PythonInOffice\high_fidelity_excel\hf_excel.xlsx') wb.add_worksheet() wb.add_worksheet('input') wb.add_worksheet() wb.close()
Workbook() constructor will create an Excel file at the folder location specified by the argument. The method
add_worksheet() creates a new sheet/tab inside the Excel file. By default, if no argument is passed into
add_worksheet(), the sheets will be named “Sheet1”, “Sheet2”, etc. Just like would you would expect Excel to behave.
We can specify sheet names by passing a string value into the method. Note although there’s no tab named “Sheet2”, when we execute the 3rd
add_worksheet() method, it creates a tab named “Sheet3”.
.close() method is important! If we don’t include this line, our Excel file will not be saved.
Referencing cells and ranges
We can use either “A1” or (row, column) notation to reference the cells in Excel. Since Python index starts from 0 – (0, 0) refers to “A1”, and (1,1) actually refers to “B2”.
The “A1” notation is easier for humans to read, plus this is what we need when using Excel formulas, like C10 = A1 * 2. However, the (row, column) notation is easier to program with. Conveniently,
xlsxwriter provides a method
xl_rowcol_to_cell() to convert the (row, column) notation to the “A1” notation with ease. Note the below code on how to import the method.
from xlsxwriter.utility import xl_rowcol_to_cell >>> xl_rowcol_to_cell(0,0) 'A1' >>> xl_rowcol_to_cell(1,1) 'B2' >>> xl_rowcol_to_cell(10,25) 'Z11'
And you probably guessed it already – there are several other functions for similar purposes.
xl_cell_to_rowcol() does the opposite – it converts “A1” notation to (0,0).
xl_col_to_name() converts an integer column number to a column letter. Again, note the 0 starting index.
>>> from xlsxwriter.utility import xl_col_to_name >>> xl_col_to_name(25) 'Z' >>> xl_col_to_name(0) 'A'
xl_range() converts (row, column) notation to a range notation such as “A1:C10”. It takes 4 arguments: (start_row, start_column, end_row, end_column). Only integer values are valid arguments.
xl_range_abs() is similar to the above method but instead, it returns the absolute reference, i.e. when we need the “$” sign to fix the cells.
from xlsxwriter.utility import xl_range, xl_range_abs >>> xl_range(0,0,10,25) 'A1:Z11' >>> xl_range_abs(0,0,0,0) '$A$1' >>> xl_range_abs(0,0,10,25) '$A$1:$Z$11'
Wait… how do we get worksheets?
I left this out on purpose, as you’ll see soon. Because
xlsxwriter can not read an existing Excel file, it’s kind of pointless to have a
.get_worksheet() method, isn’t it? That said, such a method actually exists and it’s
Workbook.get_worksheet_by_name(). Pass in a tab name as an argument and you’ll have the worksheet object.
Writing raw data (hard-code values) into Excel
Now we are familiar with our “Excel” environment, let’s create our file. We’ll use the same file name, so the previous example file will be overwritten. Note that we won’t get a message/warning when the Excel file is being overwritten, so be careful with the name if you want to keep a file.
import xlsxwriter wb = xlsxwriter.Workbook(r'C:\Users\JZ\Desktop\PythonInOffice\high_fidelity_excel\hf_excel.xlsx') ws_1 = wb.add_worksheet('input') ws_1.write(0,0,"hello excel") wb.close()
In the above code block, see how we can assign
Workbook.add_worksheet() to a variable? The
add_worksheet() method returns a Worksheet object
ws_1, which is the first actual tab inside the Excel file. Again, because we are only writing an Excel file, we can assign the tabs to a variable/object and reference it anytime later. Therefore, making the
Workbook.get_worksheet_by_name() not so useful.
Worksheet.write() first takes a set of two integer values, which is the (row, column) notation of the cells, in the example above, we are writing the message “hello excel” into cell A1.
Remember to save & close the workbook. Open it in Excel, you’ll see the “hello excel” in cell A1 of the “input” tab.
Creating formulas and link to cells
It’s surprisingly easy to write Excel formulas with the
xlsxwriter library. We literally just write the exact same Excel formula in a string, then
.write() to the file.
import xlsxwriter wb = xlsxwriter.Workbook(r'C:\Users\JZ\Desktop\PythonInOffice\high_fidelity_excel\hf_excel.xlsx') ws_1 = wb.add_worksheet('input') ws_1.write(0,0,1) # write value 1 to cell A1 ws_1.write(1,0,2) # write value 2 to cell A2 ws_1.write(2,0,3) # write value 3 to cell A3 ws_1.write(3,0,"=SUM(A1:A3)") #write the Excel formula =SUM(A1:A3) in cell A4
Now we know the basics, let’s have some fun with it and use Python to create an Excel cashflow projection model that isn’t just hard-code values!