- Shanghai, China
- https://github.com/fsspec/tosfs
Starred repositories
Marks issues and pull requests that have not had recent interaction
Processing engine and React components for constructing configuration-based data transformation and processing pipelines.
An open-source RAG-based tool for chatting with your documents.
Supercharge Your LLM Application Evaluations 🚀
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…
Pythonic file-system interface for TOS (Tinder Object Storage). https://tosfs.readthedocs.io/en/latest/
Open source project for data preparation of LLM application builders
Streaming WARC/ARC library for fast web archive IO
A specification that python filesystems should adhere to.
10 Weeks, 20 Lessons, Data Science for All!
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
This is my personal template collection. Here you'll find templates and configurations for various tools and technologies.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
OCR, layout analysis, reading order, table recognition in 90+ languages
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
A modular graph-based Retrieval-Augmented Generation (RAG) system
🧑‍🚀 Summary of the world's best LLM resources.
A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
A next-generation crawling and spidering framework.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Golang.
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Open, Multi-modal Catalog for Data & AI
A Data Streaming Library for Efficient Neural Network Training