Name		Name	Last commit message	Last commit date
parent directory ..
1page-snap.log		1page-snap.log
1page-snap.pdf		1page-snap.pdf
1page.pdf		1page.pdf
README-OCR.md		README-OCR.md
README.md		README.md
blacked.pdf		blacked.pdf
dehyphenate-flag.ipynb		dehyphenate-flag.ipynb
detect-hidden.ipynb		detect-hidden.ipynb
input.pdf		input.pdf
input.pdf-status.log		input.pdf-status.log
journalling1.ipynb		journalling1.ipynb
journalling2.ipynb		journalling2.ipynb
new_circle_annot.ipynb		new_circle_annot.ipynb
object-algebra.ipynb		object-algebra.ipynb
ocr-illegible.ipynb		ocr-illegible.ipynb
optional-content.ipynb		optional-content.ipynb
page-rectangles.ipynb		page-rectangles.ipynb
partial-ocr.ipynb		partial-ocr.ipynb
partial-ocr.pdf		partial-ocr.pdf
show_image.py		show_image.py
testpage-performance.ipynb		testpage-performance.ipynb

README.md

PyMuPDF JUPYTER Notebooks

These are scripts that explain basic usage of PyMuPDF using jupyter notebook features. Just click on one of the .ipynb files to see its fully rendered session!

Over time this script collection will be extended. Your contribution is very welcome!

Example Files

1page.pdf - 1-pager PDF used as a test file by several notebooks
blacked.pdf - 1-pager PDF with three words covered by black rectangles. Used by detect-hidden.ipynb which demonstrates how badly done "redactions" can be detected - detects hidden text.
partial_ocr.pdf- 1-pager PDF containing normal text and two images that overlap each other.

Notebooks

dehyphenate-flag.ipynb - shows the effect of flag TEXT_DEHYPHENATE on text search and extraction.
detect-hidden.ipynb - shows how to detect text which is hidden by objects "drawn above" it.
journalling1.ipynb - introduction to PDF Journalling
journalling2.ipynb - chapter 2 of PDF Journalling
journalling3.ipynb - chapter 3 of PDF Journalling
new-circle-annot.ipynb - simple example for adding an annotation with desired properties
ocr-illegible.ipynb - OCR: how to dynamically make unrecognized characters readable
partial-ocr.ipynb - OCRs a page in full and in partial mode and explain the difference. Requires PyMuPDF v1.19.1.
testpage-performance.ipynb - compare performance of text extraction and search methods, with and without a separately prepared TextPage object.
object-algebra.ipynb - explains details on how points, rectangles quads can be added and multiplied as if they were ordinary numbers. This is an extension to the respective chapter of the documentation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

jupyter-notebooks

jupyter-notebooks

README.md

PyMuPDF JUPYTER Notebooks

Example Files

Notebooks

Files

jupyter-notebooks

Directory actions

More options

Directory actions

More options

Latest commit

History

jupyter-notebooks

Folders and files

parent directory

README.md

PyMuPDF JUPYTER Notebooks

Example Files

Notebooks