Last Updated on July 14, 2022 by Jay
Password encrypted Excel files are annoying, in this tutorial I’ll show you how to use Python to remove Excel password.
My clients always send password-encrypted Excel files that contain sensitive personal data. Although I appreciate their effort to protect the data, it’s annoying to enter the password every time I try to open the files. So I have to get rid of the password for all the files once I receive them. For a lazy person like me, there’s no way I would do that manually for hundreds of files.
Library
To remove the password for Excel files, we’ll use the msoffcrypto library. We can use pip to get install the library, note the name used for installation is msoffcrypto-tool.
pip install msoffcrypto-tool
Getting File Names
We’ll use the pathlib library to get the file path for all the Excel files. Note the .iterdir() will iterate through everything inside the folder, including both files and directories. We just need to add a check to grab just the .xlsx files.
The pathlib.glob(‘*.xlsx’) method returns all path that matches the pattern ‘*.xlsx’, which means any path that ends with ‘.xlsx’, or the Excel files.
import pathlib
url = pathlib.Path(r'C:\Users\jay\Desktop\PythonInOffice\remove_excel_password')
excel_files = list(url.glob('*.xlsx'))
Remove Excel Password Using Python
The msoffcrypto library has two key methods: load_key and decrypt:
- load_key(): prepare password
- decrypt(): unlock the file by removing the password, then save the file to disk
Let’s continue with our list of file paths. Note that all my Excel files use the same password ‘1234’, make sure to replace it with your password when testing the code.
Note the below function will open each password-encrypted Excel file in Python using the binary mode ‘rb’. Then it will remove the password (by decrypting) and save the file into a new location output_folder.
def unlock(filename, passwd, output_folder):
temp = open(filename, 'rb')
excel = msoffcrypto.OfficeFile(temp)
excel.load_key(passwd)
out_path = pathlib.Path(output_folder)
if not out_path.exists():
out_path.mkdir()
with open(str(out_path/filename.name), 'wb') as f:
excel.decrypt(f)
temp.close()
Because we are using pathlib, each filename is actually a pathlib object, to get just the file name from the object, we can call the .name attribute, which will return the string value of the file name.
Note the .load_key() method only prepares the password, and it doesn’t unlock the file. .decrypt() unlocks the file, then saves the file to disk.
>>> excel_files[0]
WindowsPath('C:/Users/jay/Desktop/PythonInOffice/remove_excel_password/secret_1.xlsx')
>>> type(excel_files[0])
<class 'pathlib.WindowsPath'>
>>> excel_files[0].name
'secret_1.xlsx'
Read Password-Encrypted Excel Files Directly
If you are trying to just read Excel files and don’t want to bother with re-saving them. Follow this short tutorial on how to read password-encrypted Excel files directly into pandas without saving them.
Putting It Together
import msoffcrypto
import pathlib
url = pathlib.Path(r'C:\Users\jay\Desktop\PythonInOffice\remove_excel_password')
excel_files = list(url.glob('*.xlsx'))
def unlock(filename, passwd, output_folder):
temp = open(filename, 'rb')
excel = msoffcrypto.OfficeFile(temp)
excel.load_key(passwd)
out_path = pathlib.Path(output_folder)
if not out_path.exists():
out_path.mkdir()
with open(str(out_path/filename.name), 'wb') as f:
excel.decrypt(f)
temp.close()
for i in excel_files:
unlock(i, '1234', r'C:\Users\jay\Desktop\PythonInOffice\remove_excel_password\nopass')
Hi I get this error:
PS C:\Users\rob\Downloads> python .\excel.py
Traceback (most recent call last):
File “C:\Users\rob\Downloads\excel.py”, line 21, in
unlock(i, ‘1234’, r’C:\test\nopass’)
File “C:\Users\rob\Downloads\excel.py”, line 17, in unlock
excel.decrypt(p)
File “C:\Users\rob\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\msoffcrypto\format\ooxml.py”, line 201, in decrypt
raise exceptions.InvalidKeyError(“The file could not be decrypted with this password”)
msoffcrypto.exceptions.InvalidKeyError: The file could not be decrypted with this password
excel.py:
————–
import msoffcrypto
import pathlib
url = pathlib.Path(r’C:\test’)
excel_files = list(url.glob(‘*.xlsx’))
def unlock(filename, passwd, output_folder):
temp = open(filename, ‘rb’)
excel = msoffcrypto.OfficeFile(temp)
excel.load_key(passwd)
out_path = pathlib.Path(output_folder)
if not out_path.exists():
out_path.mkdir()
with open(str(out_path/filename.name), ‘wb’) as p:
excel.decrypt(p)
temp.close()
for i in excel_files:
unlock(i, ‘1234’, r’C:\test\nopass’)