A lightweight and efficient text content extractor, mainly for OOXML files (docx, xlsx, and pptx).
Updated Dec 11, 2023 - Go
A Java library that detects the top-level selector on an HTML page.