Starred repositories
Fcitx5 input method framework and engines ported to Android
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
An automated pipeline for evaluating LLMs for role-playing.
A natural language interface for computers
A Comprehensive Toolkit for High-Quality PDF Content Extraction
[NeurIPS 2023] We use large language models as a commonsense world model and heuristic policy within Monte Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.
Zotero is a free, easy-to-use tool to help you collect, organize, annotate, cite, and share your research sources.
Ongoing research training transformer models at scale
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
RewardBench: the first evaluation tool for reward models.
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization (a toy training sketch follows this list).
A unified evaluation framework for large language models
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
A prototype repo for hybrid training combining pipeline parallelism with distributed data parallelism, with comments on the core code snippets. Feel free to copy the code and start discussions about the problems you have.
nanobind: tiny and efficient C++/Python bindings
Reference implementation for DPO (Direct Preference Optimization) (a sketch of the loss function appears after this list).
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
LlamaIndex is a data framework for your LLM applications
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
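As context for the BPE entry above, here is a toy sketch of byte-level BPE training in the spirit of such minimal implementations; the helper names (get_pair_counts, merge, train_bpe) are illustrative assumptions, not that repository's API.

    from collections import Counter

    def get_pair_counts(ids):
        # Count occurrences of each adjacent pair of token ids.
        return Counter(zip(ids, ids[1:]))

    def merge(ids, pair, new_id):
        # Replace every occurrence of `pair` in `ids` with `new_id`.
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(new_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        return out

    def train_bpe(text, num_merges):
        # Start from raw UTF-8 bytes (ids 0..255) and learn merges greedily.
        ids = list(text.encode("utf-8"))
        merges = {}
        for new_id in range(256, 256 + num_merges):
            counts = get_pair_counts(ids)
            if not counts:
                break
            pair = counts.most_common(1)[0][0]  # most frequent adjacent pair
            merges[pair] = new_id
            ids = merge(ids, pair, new_id)
        return merges

    # Example: learn a few merges on a tiny corpus.
    print(train_bpe("low lower lowest", 3))

Each merge step replaces the currently most frequent adjacent pair of ids with a fresh id; that greedy loop is the entirety of BPE training.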
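Similarly, for the DPO entry: the published DPO objective reduces to a logistic loss over implicit reward margins against a frozen reference model. Below is a minimal PyTorch sketch of that loss, assuming the summed per-response log-probabilities have already been computed; the function and argument names are illustrative, not the reference implementation's API.

    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # Implicit rewards are beta-scaled log-ratios against the frozen reference.
        chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
        rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
        # Logistic loss on the reward margin: push chosen above rejected.
        return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

    # Toy usage with made-up log-probabilities for a batch of two preference pairs.
    loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                    torch.tensor([-12.5, -10.0]), torch.tensor([-13.5, -10.5]))
    print(loss.item())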