msoffcrypto-tool (formerly ms-offcrypto-tool) is a Python tool and library for decrypting encrypted MS Office files with a password, an intermediate key, or the private key that generated the file's escrow key.
- Installation
- Examples
- Supported encryption methods
- Tests
- Todo
- Resources
- Use cases and mentions
- Contributors
- Credits
pip install msoffcrypto-tool
msoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd
You are prompted for the password if you omit the value of the password argument:
$ msoffcrypto-tool encrypted.docx decrypted.docx -p
Password:
Test whether the file is encrypted (exit code 0 or 1 is returned):
msoffcrypto-tool document.doc --test -v
Passwords and other key types are supported through the library functions.
Basic usage:
import msoffcrypto
with open("encrypted.docx", "rb") as encrypted:
    file = msoffcrypto.OfficeFile(encrypted)
    file.load_key(password="Passw0rd")  # Use password
    with open("decrypted.docx", "wb") as f:
        file.decrypt(f)
Basic usage (in-memory):
import msoffcrypto
import io
import pandas as pd
decrypted = io.BytesIO()
with open("encrypted.xlsx", "rb") as f:
    file = msoffcrypto.OfficeFile(f)
    file.load_key(password="Passw0rd")  # Use password
    file.decrypt(decrypted)
df = pd.read_excel(decrypted)
print(df)
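The encryption test shown for the command line can also be done from the library before decrypting. The following is a minimal sketch, assuming `OfficeFile.is_encrypted()` reports the same status as `--test`; the helper name `read_office_bytes` is hypothetical and not part of msoffcrypto-tool.

```python
import io


def read_office_bytes(path, password):
    """Return the document bytes, decrypting only when the file is encrypted.

    Hypothetical helper for illustration; not part of msoffcrypto-tool.
    """
    # Imported inside the function so this sketch can be defined
    # even where the package is not installed.
    import msoffcrypto

    with open(path, "rb") as f:
        office_file = msoffcrypto.OfficeFile(f)
        if not office_file.is_encrypted():
            # Plain file: hand back the original bytes unchanged.
            f.seek(0)
            return f.read()
        office_file.load_key(password=password)
        decrypted = io.BytesIO()
        office_file.decrypt(decrypted)
        return decrypted.getvalue()
```

The returned bytes can be wrapped in `io.BytesIO` and passed to `pd.read_excel` in the same way as in the in-memory example.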
Advanced usage:
# Verify password before decryption (default: False)
# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file
# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption
file.load_key(password="Passw0rd", verify_password=True)
# Use private key
with open("priv.pem", "rb") as private_key:
    file.load_key(private_key=private_key)
# Use intermediate key (secretKey)
import binascii
file.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))
# Check the HMAC of the data payload before decryption (default: False)
# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption
with open("decrypted.docx", "wb") as f:
    file.decrypt(f, verify_integrity=True)
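With `verify_password=True`, a wrong password can be rejected before any decryption is attempted. The sketch below shows one way to handle that case. It assumes a failed verification raises `InvalidKeyError` from `msoffcrypto.exceptions`; check this against the installed version. The helper name `decrypt_checked` is hypothetical.

```python
import io


def decrypt_checked(path, password):
    """Return the decrypted bytes, or None if the password is rejected.

    Hypothetical helper for illustration; not part of msoffcrypto-tool.
    """
    # Imported inside the function so this sketch can be defined
    # even where the package is not installed.
    import msoffcrypto
    # Assumption: a failed password verification raises InvalidKeyError.
    from msoffcrypto.exceptions import InvalidKeyError

    with open(path, "rb") as f:
        office_file = msoffcrypto.OfficeFile(f)
        try:
            office_file.load_key(password=password, verify_password=True)
        except InvalidKeyError:
            return None
        decrypted = io.BytesIO()
        office_file.decrypt(decrypted)
        return decrypted.getvalue()
```

As noted above, password verification is only meaningful for ECMA-376 Agile/Standard Encryption, so for other formats a wrong password may only surface as a failure or garbage output during decryption.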
- ECMA-376 (Agile Encryption/Standard Encryption)
  - MS-DOCX (OOXML) (Word 2007-)
  - MS-XLSX (OOXML) (Excel 2007-)
  - MS-PPTX (OOXML) (PowerPoint 2007-)
- Office Binary Document RC4 CryptoAPI
  - MS-DOC (Word 2002, 2003, 2004)
  - MS-XLS (Excel 2002, 2003, 2007, 2010) (experimental)
  - MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)
- Office Binary Document RC4
  - MS-DOC (Word 97, 98, 2000)
  - MS-XLS (Excel 97, 98, 2000) (experimental)
- ECMA-376 (Extensible Encryption)
- XOR Obfuscation
  - MS-XLS (Excel 2002, 2003) (experimental)
  - MS-DOC (Word 2002, 2003, 2004?)
- Word 95 Encryption (Word 95 and prior)
- Excel 95 Encryption (Excel 95 and prior)
- PowerPoint 95 Encryption (PowerPoint 95 and prior)
PRs are welcome!
poetry install
poetry run coverage run -m pytest -v
- Add tests
- Support decryption with passwords
- Support older encryption schemes
- Add function-level tests
- Add API documents
- Publish to PyPI
- Add decryption tests for various file formats
- Integrate with more comprehensive projects handling MS Office files (such as oletools?) if possible
- Add the password prompt mode for CLI
- Improve error types (v4.12.0)
- Redesign APIs (v6.0.0)
- Introduce something like ctypes.Structure
- Support encryption
- Isolate parser
- "Backdooring MS Office documents with secret master keys" https://secuinside.com/archive/2015/2015-1-9.pdf
- Technical Documents https://msdn.microsoft.com/en-us/library/cc313105.aspx
- [MS-OFFCRYPTO] Agile Encryption https://msdn.microsoft.com/en-us/library/dd949735(v=office.12).aspx
- [MS-OFFDI] Microsoft Office File Format Documentation Introduction https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offdi/24ed256c-eb5b-494e-b4f6-fb696ad2b4dc
- LibreOffice/core https://github.com/LibreOffice/core
- LibreOffice/mso-dumper https://github.com/LibreOffice/mso-dumper
- wvDecrypt https://www.skynet.ie/~caolan/Packages/wvDecrypt.html
- Microsoft Office password protection - Wikipedia https://en.wikipedia.org/wiki/Microsoft_Office_password_protection#History_of_Microsoft_Encryption_password
- office2john.py https://github.com/magnumripper/JohnTheRipper/blob/bleeding-jumbo/run/office2john.py
- herumi/msoffice https://github.com/herumi/msoffice
- DocRecrypt https://blogs.technet.microsoft.com/office_resource_kit/2013/01/23/now-you-can-reset-or-remove-a-password-from-a-word-excel-or-powerpoint-filewith-office-2013/
- Apache POI - the Java API for Microsoft Documents https://poi.apache.org/
- https://repology.org/project/python:msoffcrypto-tool/versions (kudos to maintainers!)
- https://checkroth.com/unlocking-password-protected-files.html
- https://github.com/jbremer/sflock/commit/3f6a96abe1dbb4405e4fb7fd0d16863f634b09fb
- https://isc.sans.edu/forums/diary/Video+Analyzing+Encrypted+Malicious+Office+Documents/24572/
- https://github.com/shombo/cyberstakes-writeps-2018/tree/master/word_up
- https://github.com/willi123yao/Cyberthon2020_Writeups/blob/master/csit/Lost_Magic
- https://github.com/dtjohnson/xlsx-populate
- https://github.com/opendocument-app/OpenDocument.core/blob/233663b039/src/internal/ooxml/ooxml_crypto.h
- https://github.com/jaydadhania08/PHPDecryptXLSXWithPassword
- https://github.com/epicentre-msf/rpxl
- Excel、データ整理&分析、画像処理の自動化ワザを完全網羅! 超速Python仕事術大全 (伊沢剛, 2022)
- "Analyse de documents malveillants en 2021", MISC Hors-série N° 24, "Reverse engineering : apprenez à analyser des binaires" (Lagadec Philippe, 2021)
- シゴトがはかどる Python自動処理の教科書 (クジラ飛行机, 2020)
- The sample file for XOR Obfuscation is from: https://github.com/openwall/john-samples/tree/main/Office/Office_Secrets