- Co-Founder of Xeol
- NYC
- https://www.xeol.io/
- in/shihanwan
- @shihanwan
Stars
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
Tesseract Open Source OCR Engine (main repository)
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Composable building blocks to build Llama Apps
A Python wrapper around the MediaInfo library
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Datasets, Transforms and Models specific to Computer Vision
Robust Speech Recognition via Large-Scale Weak Supervision
A type-safe typescript SQL query builder
Augment AI agents with long-term memory through knowledge graph 🧠
The official Python library for the OpenAI API
💫 Industrial-strength Natural Language Processing (NLP) in Python
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …
Python packaging and dependency management made easy
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
The Open Source Memory Layer For Autonomous Agents
Stipple Effect is a pixel art editor that supports animation and scripting (available on Windows, macOS and Linux)
A guide for researching ways of funding open source projects.
Letta (formerly MemGPT) is a framework for creating LLM services with memory.