Tika-Python is a Python binding to the Apache Tika™ REST services, allowing Tika to be called natively from Python.
Updated Apr 14, 2024 - Python
Extract and Visualize location from any file
📄🚀 Unleash a powerful Document Search Engine with Apache NiFi for lightning-fast, comprehensive text indexing and search.
Apache Tika Server as Debian GNU/Linux and Ubuntu Linux package
Text extraction from scanned PDF documents in Java
Configurable Tika Server docker image. https://hub.docker.com/repository/docker/kujira/tika
Tesseract OCR wrapper for Apache Tika and/or Open Semantic ETL that caches OCR results, so Tika-Server or Open Semantic ETL does not have to rerun slow and expensive OCR on the same images
PHP application to test loading of PDF files, using docker-compose and Apache Tika.
A dockerized image of Apache Tika Server - https://tika.apache.org/
A document search tool for files on the local host, based on Tika with OCR, Elasticsearch, and Kibana
Polymer 3.0 app for Apache Tika.
A Windows Installer (MSI) for the Windows service wrapper of the Tika JSR 311 network server.
Containerized (Docker) GeoTopicParser-enabled Apache Tika Server with Lucene Geo Gazetteer.
Our project is a testament to this need, offering a comprehensive solution that combines modern technologies and architectures to create a powerful document search engine. This engine is not just a tool but a sophisticated ecosystem designed to handle complex data processing and retrieval tasks.
A Windows service wrapper for the Tika JSR 311 network server.
Web crawler with search indexing
Generates word art and keywords for when you are too lazy to read the whole document.