Stars
OCR, layout analysis, reading order, line detection in 90+ languages
Math OCR model that outputs LaTeX and markdown
🚀CodiumAI PR-Agent: An AI-Powered 🤖 Tool for Automated Pull Request Analysis, Feedback, Suggestions and More! 💻🔍
A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
SuperCLUE: A comprehensive benchmark for general-purpose Chinese foundation models
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Question and Answer based on Anything.
Forward-Looking Active REtrieval-augmented generation (FLARE)
GPT based autonomous agent that does online comprehensive research on any given topic
A minimal GPU design in Verilog to learn how GPUs work from the ground up
Retrieval and Retrieval-augmented LLMs
Ace interviews with AI practice. Our agent role-plays a personalized interview tailored to your background, listening and replying like a real interviewer. Train across personas for any situation.
Accessible large language models via k-bit quantization for PyTorch.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
A summary of existing representative LLM text datasets.
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
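Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus that targets the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.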