mergoo

mergoo is a library for easily merging multiple LLM experts and efficiently training the merged LLM. With mergoo, you can integrate the knowledge of different generic or domain-specific LLM experts.

🚀 Features

  • Supports recent merging methods, including Mixture-of-Experts and layer-wise merging
  • Flexible merging choice for each layer
  • Base models supported: Llama and Mistral
  • Trainers supported: 🤗 Trainer, SFTTrainer
  • Devices supported: CPU, MPS, GPU
  • Training choices: fine-tune only the routers of MoE layers, or fully fine-tune the merged LLM

If you like the project, consider leaving a ⭐️

Installation

Install with pip:

pip install mergoo

Install the latest (unstable) version from GitHub:

pip install git+https://github.com/Leeroo-AI/mergoo

Install from source:

git clone https://github.com/Leeroo-AI/mergoo
cd mergoo
pip install -e .

Quick Start

Merging Models
A sample usage: define the merge config and create the merged model

import torch
from mergoo.compose_experts import ComposeExperts

model_id = "data/mistral-math-code-moe"
config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "experts": [
        {"expert_name": "base_expert", "model_id": "mistralai/Mistral-7B-v0.1"},
        {"expert_name": "expert_1", "model_id": "meta-math/MetaMath-Mistral-7B"},
        {"expert_name": "expert_2", "model_id": "ajibawa-2023/Code-Mistral-7B"}
    ],
    "router_layers": ["gate_proj", "up_proj", "down_proj"]
}

# create checkpoint
expertmerger = ComposeExperts(config, torch_dtype=torch.float16)
expertmerger.compose()
expertmerger.save_checkpoint(model_id)

Loading / Fine-tuning Merged Models

from transformers import Trainer
from mergoo.models.modeling_mistral import MistralForCausalLM

model = MistralForCausalLM.from_pretrained("data/mistral-math-code-moe")
# NOTE: the 'gate' / router layers are untrained, so a weight-loading warning will appear for them

trainer = Trainer( ... )
trainer.train()
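For the "router only" training choice listed under Features, a minimal sketch is to freeze everything except the router layers before handing the model to the trainer. This assumes the router modules carry "gate" in their parameter names; adjust the filter to match your checkpoint.

import torch
from mergoo.models.modeling_mistral import MistralForCausalLM

model = MistralForCausalLM.from_pretrained("data/mistral-math-code-moe", torch_dtype=torch.float16)

# Keep gradients only for the router ('gate') layers; freeze everything else.
# Assumption: router parameters contain "gate" in their names.
for name, param in model.named_parameters():
    param.requires_grad_("gate" in name)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable router parameters: {trainable:,}")

The partially frozen model can then be passed to 🤗 Trainer or SFTTrainer as in the snippet above.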

📚 Learn More:

After finishing the Quick Start guide, you can explore the tutorials below to further familiarize yourself with mergoo.

Notebook: Unified MoE with Domain Experts
Details: Build a unified Mixture-of-Experts model with domain-based LLM experts, inspired by the BTX research.

Mergoo Roadmap and Contributing

As an open-source library in a fast-evolving domain, we welcome contributions, whether they introduce new features, enhance infrastructure, or improve documentation.

Here is the mergoo roadmap:

  • Support MoE for Transformer Block
  • Compatibility with Hugging Face 🤗
  • Support Trainer, SFTTrainer
  • Loading Unified Checkpoint in BTX
  • Feature: Convertible QKV linear layers
  • Feature: Convertible FF linear layers
  • Feature: Routers only for a list of decoder layer indexes
  • Sharded Safetensor Saving
  • Support experts based on Llama and Mistral
  • Router load-balancing loss
  • Lazy loading of tensors for low memory usage during merging
  • Support Mixture of LoRA Experts (base model with multiple trained LoRAs)
  • Support Layer-wise merging, including Mergekit
  • Support experts based on Gemma and Mamba
  • Support flash-attention
  • Support Mixture-of-Depths Transformer

Feel free to suggest new features and/or contribute to the mergoo roadmap!

Join our community!

🚀 We would love to hear your feedback! Please join the Leeroo community:

Have a question not listed here? Open a GitHub Issue or send us an email!
