
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts

License: MIT

Code for "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts". Our implementation is based on LLMs-Finetuning-Safety and llama-recipes.

Overview

Mixture-of-Experts (MoE) has gained increasing popularity as a promising framework for scaling up large language models (LLMs). However, the reliability assessment of MoE lags behind its surging applications. Moreover, when transferred to new domains, such as during fine-tuning, MoE models sometimes underperform their dense counterparts. Motivated by this research gap and counter-intuitive phenomenon, we propose MoE-RBench, the first comprehensive assessment of SMoE reliability from three aspects: (i) safety and hallucination, (ii) resilience to adversarial attacks, and (iii) out-of-distribution robustness. Extensive models and datasets are tested to compare MoE with dense networks along these reliability dimensions. Our empirical observations suggest that with appropriate hyperparameters, training recipes, and inference techniques, we can build the MoE model more reliably than the dense LLM. In particular, we find that the robustness of SMoE is sensitive to the basic training settings. We hope that this study can provide deeper insights into how to adapt the pre-trained MoE model to other tasks with higher generation security, quality, and stability.

Setup

# CUDA 11.8 is strongly recommended
conda create -n moe_rbench python=3.9 -y && conda activate moe_rbench
git clone https://github.com/UNITES-Lab/MoE-RBench && cd MoE-RBench
pip install -r requirements.txt
wandb login
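
A quick way to confirm the environment before launching any run is sketched below; this check is not part of the official setup and only verifies that PyTorch imports and sees the GPUs.

```bash
# Optional sanity check (sketch only, not part of the official setup):
# confirm that PyTorch imports correctly and CUDA devices are visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"
```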

Datasets

| Tasks | Datasets |
| --- | --- |
| Safety | MaliciousInstructions, CoNa, Controversial, Alpaca |
| Hallucination | TruthfulQA, NQ |
| OOD Robustness | Style-OOD, BOSS |
| Adv Robustness | SNLI, SNLI-hard, ANLI |
  1. All datasets except BOSS and SNLI-hard can be downloaded and processed automatically.
  2. To evaluate models with BOSS, follow the pipeline in the [BOSS repo](https://github.com/lifan-yuan/OOD_NLP) and place the processed data in the appropriate folder under MoE-RBench/robustness-evaluation/ft_datasets.
  3. To evaluate models with SNLI-hard, download the raw data from here, follow the instructions for the NLI task in BOSS to process the dataset, and place it under MoE-RBench/robustness-evaluation/ft_datasets/snli_dataset/data (see the sketch after this list).
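
As a rough illustration of step 3, the sketch below assumes the BOSS pipeline has already produced processed SNLI-hard files in some local directory; the source path and file names are placeholders, not guaranteed by this repository.

```bash
# Hedged sketch: copy the BOSS-processed SNLI-hard data into the folder the
# data loader expects. The source path below is a placeholder; keep whatever
# file names the BOSS NLI processing pipeline actually emits.
mkdir -p MoE-RBench/robustness-evaluation/ft_datasets/snli_dataset/data
cp /path/to/boss_processed/snli_hard/* \
   MoE-RBench/robustness-evaluation/ft_datasets/snli_dataset/data/
```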

Robustness Evaluation

  1. Scripts for robustness evaluation on all models and tasks are listed in MoE-RBench/robustness-evaluation/scripts.
  2. One example is given as follows:
# --freeze_router: for MoE models only; if set, routers are frozen
# --model_type: one of switch, t5, pythia, molm, open_llama, llama_moe
# --run_validation --run_ood: for OOD evaluation only
# --pure_bf16: or use --pure_fp16 instead
# --dataset: see details in scripts
torchrun --nnodes 1 --nproc_per_node gpu_num --master-port xxx \
finetuning_ood_style.py \
--freeze_router \
--model_type switch \
--project_name wandb_project --expname wandb_exp \
--run_validation --run_ood \
--batch_size_training 64 --val_batch_size 128 --weight_decay 0 \
--dropout_rate 1e-1 --expert_dropout_rate 2e-1 --lr 5e-5 \
--num_epochs 2 --enable_fsdp --pure_bf16 \
--dataset dataset \
--model_name PATH/TO/MODEL \
--dist_checkpoint_root_folder PATH/TO/DIR \
--dist_checkpoint_folder NAME_OF_CKPT
  3. The checkpoints sharded by FSDP can be merged into a single full checkpoint by running convert_model.sh.

Safety Evaluation

  1. Scripts for instruction fine-tuning of all models are listed in MoE-RBench/safety-evaluation/scripts. An example is given as follows:
torchrun --nnodes 1 --nproc_per_node gpu_num --master-port xxx finetuning.py \
--batch_size_training 64 --lr 2e-5 \
--gradient_accumulation_steps 1 --weight_decay 0 \
--num_epochs 1 \
--dataset alpaca_dataset \
--enable_fsdp \
--report \
--dist_checkpoint_root_folder PATH/TO/DIR \
--model_name PATH/TO/MODEL --pure_bf16 \
--dist_checkpoint_folder NAME_OF_CKPT
  2. The checkpoints sharded by FSDP can be merged into a single full checkpoint by running convert_model.sh.

  3. For evaluation, please refer to safety-tuned-llamas.

Truthfulness Evaluation

Please refer to DoLa.

Citation

@inproceedings{
  chen2024moerbench,
  title={$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts},
  author={Guanjie Chen and Xinyu Zhao and Tianlong Chen and Yu Cheng},
  booktitle={The Forty-first International Conference on Machine Learning},
  year={2024},
  url={https://openreview.net/forum?id=LyJ85kgHFe}
}
