Stars
Implementation of the Simple Noise Scale from the OpenAI paper, on fastai v2
An approximate implementation of the OpenAI paper "An Empirical Model of Large-Batch Training" for MNIST
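Both entries implement the paper's gradient noise scale. As a reference point, a minimal sketch of the two-batch-size estimator from the paper's appendix (function and variable names here are illustrative, not taken from either repo):

```python
def simple_noise_scale(g_small_sq, g_big_sq, b_small, b_big):
    """Estimate B_simple = tr(Sigma) / |G|^2 from squared gradient
    norms measured at two batch sizes b_small < b_big."""
    # Unbiased estimate of the true squared gradient norm |G|^2.
    g_sq = (b_big * g_big_sq - b_small * g_small_sq) / (b_big - b_small)
    # Unbiased estimate of the per-example gradient variance tr(Sigma).
    trace_sigma = (g_small_sq - g_big_sq) / (1.0 / b_small - 1.0 / b_big)
    # B_simple approximates the critical batch size: the point past
    # which larger batches give diminishing returns in steps saved.
    return trace_sigma / g_sq

print(simple_noise_scale(g_small_sq=2.5, g_big_sq=0.7, b_small=32, b_big=1024))
```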
[ICSE 2024 Industry Challenge Track] Official implementation of "ReposVul: A Repository-Level High-Quality Vulnerability Dataset".
Longitudinal Evaluation of LLMs via Data Compression
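Under the compression view this kind of evaluation uses, a model's code length for a text is its summed negative log-likelihood. A minimal sketch of the standard bits-per-byte metric (the repo's exact protocol may differ):

```python
import math

def bits_per_byte(token_logprobs_nats, n_bytes):
    """Code length of a text under a language model, in bits per byte;
    lower means better compression of the evaluation data."""
    total_bits = -sum(token_logprobs_nats) / math.log(2)  # nats -> bits
    return total_bits / n_bytes
```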
Codebase for Merging Language Models (ICML 2024)
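The simplest merging baseline is a weighted average of parameters; the codebase implements more refined schemes, but the basic operation looks like this sketch (assumes architecturally identical checkpoints):

```python
import torch

def average_merge(state_dicts, weights):
    """Weighted parameter averaging (the "model soup" baseline),
    assuming all checkpoints share one architecture and key set."""
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1"
    return {key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
            for key in state_dicts[0]}
```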
RewardBench: the first evaluation tool for reward models.
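At its core, the per-pair metric such a tool reports reduces to accuracy on preference pairs, as in this simplified sketch (RewardBench itself handles many model types and dataset subsets):

```python
def pairwise_accuracy(chosen_scores, rejected_scores):
    """Fraction of preference pairs where the reward model scores the
    chosen response above the rejected one."""
    wins = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return wins / len(chosen_scores)

# pairwise_accuracy([1.2, 0.4, 2.0], [0.3, 0.9, 1.1]) -> 0.666...
```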
Robust recipes to align language models with human and AI preferences
Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and support for PPO and DPO algorithms with various c…
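Of the algorithms named above, DPO is compact enough to sketch directly. A standard formulation (Rafailov et al., 2023), not this toolkit's exact code; inputs are per-sequence log-probabilities as tensors:

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """-log sigmoid(beta * (chosen log-ratio - rejected log-ratio)),
    where each log-ratio is log pi(y|x) - log pi_ref(y|x)."""
    chosen = pi_chosen_logps - ref_chosen_logps
    rejected = pi_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen - rejected)).mean()
```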
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Languag…
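The feedforward-expert half of such an architecture is the familiar routed MoE layer. A generic top-k sketch for illustration only; ModuleFormer's actual stick-breaking routing and attention experts differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKFeedForwardMoE(nn.Module):
    """Each token is routed to its top-k feedforward experts and their
    outputs are combined with renormalized gate weights."""
    def __init__(self, d_model, d_ff, n_experts, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts))
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)
        topv, topi = gates.topk(self.k, dim=-1)
        topv = topv / topv.sum(dim=-1, keepdim=True)  # renormalize top-k
        out = torch.zeros_like(x)
        for rank in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, rank] == e
                if mask.any():
                    out[mask] += topv[mask, rank].unsqueeze(-1) * expert(x[mask])
        return out
```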
Reaching LLaMA2 Performance with 0.1M Dollars
Retrieval and Retrieval-augmented LLMs
Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP
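A hedged usage sketch, assuming a locally running server that exposes an OpenAI-compatible /embeddings route; the URL, port, and model id below are illustrative, so check the repo's README for the exact launch command and interface:

```python
import requests

resp = requests.post(
    "http://localhost:7997/embeddings",  # illustrative host/port
    json={"model": "BAAI/bge-small-en-v1.5",  # illustrative model id
          "input": ["a sentence to embed", "another sentence"]},
)
vectors = [d["embedding"] for d in resp.json()["data"]]
print(len(vectors), len(vectors[0]))
```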
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
An application providing a RESTful API similar to the OpenAI embeddings API, supporting BERT, SBERT, and CoSENT models for generating text embedding vectors.
Code for the paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'
Datasets, tools, and benchmarks for representation learning of code.
Enhancing Code Pre-trained Models by Contrastive Learning
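The workhorse objective in contrastive pre-training of code models is in-batch InfoNCE; a standard sketch (this repo's exact objective and augmentations may differ):

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.05):
    """Row i of `anchor` should match row i of `positive`; every other
    row in the batch serves as an in-batch negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.T / temperature  # (B, B) scaled cosine similarities
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```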
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, Llama2, Qwen, GLM, Claude, etc.) across 100+ datasets.
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Ongoing research training transformer models at scale