These are scripts that explain basic usage of PyMuPDF using jupyter notebook features. Just click on one of the .ipynb
files to see its fully rendered session!
Over time this script collection will be extended. Your contribution is very welcome!
1page.pdf
- 1-pager PDF used as a test file by several notebooksblacked.pdf
- 1-pager PDF with three words covered by black rectangles. Used bydetect-hidden.ipynb
which demonstrates how badly done "redactions" can be detected - detects hidden text.partial_ocr.pdf
- 1-pager PDF containing normal text and two images that overlap each other.
dehyphenate-flag.ipynb
- shows the effect of flagTEXT_DEHYPHENATE
on text search and extraction.detect-hidden.ipynb
- shows how to detect text which is hidden by objects "drawn above" it.journalling1.ipynb
- introduction to PDF Journallingjournalling2.ipynb
- chapter 2 of PDF Journallingjournalling3.ipynb
- chapter 3 of PDF Journallingnew-circle-annot.ipynb
- simple example for adding an annotation with desired propertiesocr-illegible.ipynb
- OCR: how to dynamically make unrecognized characters readablepartial-ocr.ipynb
- OCRs a page in full and in partial mode and explain the difference. Requires PyMuPDF v1.19.1.testpage-performance.ipynb
- compare performance of text extraction and search methods, with and without a separately preparedTextPage
object.object-algebra.ipynb
- explains details on how points, rectangles quads can be added and multiplied as if they were ordinary numbers. This is an extension to the respective chapter of the documentation.