Max Halford MaxHalford

🌱 Head of Data @carbonfact 🍥 Doing machine learning on streams in my spare time

1k followers · 219 following

@carbonfact
France
maxhalford.github.io
@halford_max
in/maxhalford
https://scholar.google.com/citations?user=erRNNi0AAAAJ

Stars

🔬 Document processing

20 repositories

axa-group / Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

JavaScript · 5,823 stars · 310 forks · Updated Dec 3, 2023

mindee / doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Python · 3,842 stars · 442 forks · Updated Nov 13, 2024

qurator-spk / dinglehopper

An OCR evaluation tool

Python · 64 stars · 14 forks · Updated Oct 11, 2024

jamesturk / jellyfish

🪼 a python library for doing approximate and phonetic matching of strings.

Jupyter Notebook · 2,067 stars · 160 forks · Updated Oct 29, 2024

Samir55 / Image2Lines

A tool for handwritten text (straight and skewed) line segmentation based on a statistical approach.

C++ · 39 stars · 21 forks · Updated Jun 29, 2018

Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

Python · 4,905 stars · 470 forks · Updated Aug 15, 2024

pemistahl / grex

A command-line tool and Rust library with Python bindings for generating regular expressions from user-provided test cases

Rust · 7,298 stars · 173 forks · Updated Nov 8, 2024

pdfminer / pdfminer.six

Community maintained fork of pdfminer - we fathom PDF

Python · 5,947 stars · 930 forks · Updated Aug 2, 2024

facebook / duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

Haskell · 4,083 stars · 726 forks · Updated Oct 3, 2024

jasalt / kuittiskanneri

Receipt Scanner Prototype using AngularJS, (PYTHON) Flask & OpenCV. University full-stack SPA web app course project 2014.

JavaScript · 118 stars · 47 forks · Updated Dec 7, 2022

mrabarnett / mrab-regex

C · 446 stars · 49 forks · Updated Nov 7, 2024

lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

Python · 4,901 stars · 415 forks · Updated Oct 26, 2024

Calamari-OCR / calamari

Line based ATR Engine based on OCRopy

Python · 1,048 stars · 209 forks · Updated Nov 12, 2024

Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition

Python · 3,283 stars · 977 forks · Updated Jul 18, 2024

jaidevd / numerizer

A Python module to convert natural language numerics into ints and floats.

Python · 224 stars · 23 forks · Updated Sep 26, 2024

ljvmiranda921 / prodigy-pdf-custom-recipe

Custom recipe and utilities for document processing

Python · 198 stars · 20 forks · Updated Jun 19, 2022

JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

Python · 24,477 stars · 3,158 forks · Updated Sep 24, 2024

neogeny / TatSu

竜 TatSu generates Python parsers from grammars in a variation of EBNF

Python · 408 stars · 48 forks · Updated Nov 6, 2024

datamade / parserator

🔖 A toolkit for making domain-specific probabilistic parsers

Python · 797 stars · 82 forks · Updated Sep 26, 2024

pikepdf / pikepdf

A Python library for reading and writing PDF, powered by QPDF

Python · 2,182 stars · 191 forks · Updated Nov 13, 2024
