Starred repositories
SGLang is yet another fast serving framework for large language models and vision language models.
Materials for the Hugging Face Diffusion Models Course
A curated list of reinforcement learning with human feedback resources (continually updated)
Custom data types and layouts for training and inference
Self-Explore to Avoid the Pit! Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards
A Native-PyTorch Library for LLM Fine-tuning
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Firefly: a training toolkit for large language models, supporting Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other models
A framework for few-shot evaluation of language models.
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Train transformer language models with reinforcement learning.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
The official implementation of Self-Play Fine-Tuning (SPIN)
A Toolkit for Distributional Control of Generative Models
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Reference implementation for DPO (Direct Preference Optimization)
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
A framework for prompt tuning using Intent-based Prompt Calibration
Scenic: A Jax Library for Computer Vision Research and Beyond
Mixture-of-Experts for Large Vision-Language Models
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
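Several of the repositories above (the DPO reference implementation, the HALOs library, TRL) revolve around the Direct Preference Optimization objective. As a minimal illustrative sketch (not taken from any of those repos), the DPO loss for a single preference pair can be computed from scalar log-probabilities of the chosen and rejected responses under the policy and a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin is how much more the policy prefers the chosen
    response over the rejected one, relative to the reference model.
    Scalar inputs are a simplifying assumption for illustration; real
    implementations operate on batched per-token log-prob sums."""
    margin = (policy_chosen_logp - policy_rejected_logp) \
           - (ref_chosen_logp - ref_rejected_logp)
    x = beta * margin
    # Numerically stable -log(sigmoid(x)), i.e. softplus(-x).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

When the policy and reference agree (zero margin), the loss is log 2; it shrinks as the policy widens its preference for the chosen response beyond the reference's. The `beta` default of 0.1 mirrors a common choice but is a tunable temperature.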