- France
- maxhalford.github.io
- @halford_max
- in/maxhalford
- https://scholar.google.com/citations?user=erRNNi0AAAAJ
🔬 Document processing
Transforms PDF, Documents and Images into Enriched Structured Data
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
🪼 A Python library for approximate and phonetic matching of strings.
A tool for handwritten text (straight and skewed) line segmentation based on a statistical approach.
A Unified Toolkit for Deep Learning Based Document Image Analysis
A command-line tool and Rust library with Python bindings for generating regular expressions from user-provided test cases
Community maintained fork of pdfminer - we fathom PDF
Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
Receipt scanner prototype using AngularJS, Flask (Python) and OpenCV. University full-stack SPA web app course project, 2014.
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
A synthetic data generator for text recognition
A Python module to convert natural language numerics into ints and floats.
Custom recipe and utilities for document processing
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
竜 TatSu generates Python parsers from grammars in a variation of EBNF
🔖 A toolkit for making domain-specific probabilistic parsers
A Python library for reading and writing PDF, powered by QPDF