This repository is aimed at speeding up LLM inference.
- Python 3.10
- transformers 4.31.0
- PyTorch 2.1.2
- datasets 2.13.1
- huggingface_hub 0.16.4
- fschat 0.2.28
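The pinned dependencies can be installed with pip, for example (a sketch, assuming the standard PyPI package names; PyTorch is published on PyPI as `torch`):

# install the pinned dependencies inside a Python 3.10 environment
pip install transformers==4.31.0 torch==2.1.2 datasets==2.13.1 huggingface_hub==0.16.4 fschat==0.2.28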
| Base Model | Chimera on Hugging Face |
|---|---|
| Vicuna-7b-v1.3 | anonymous6690/Chimera-Vicuna-7b-v1.3 |
| Vicuna-13b-v1.3 | anonymous6690/Chimera-Vicuna-13b-v1.3 |
| LLaMA-2-7b | anonymous6690/Chimera-LlaMA-2-7b |
| LLaMA-2-13b | anonymous6690/Chimera-LlaMA-2-13b |
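The Chimera checkpoints in the table can be downloaded from Hugging Face in the usual way, for example (a sketch; the repo id is taken from the table above and git-lfs is assumed to be installed):

# example: fetch the Chimera checkpoint built on Vicuna-7b-v1.3
git lfs install
git clone https://huggingface.co/anonymous6690/Chimera-Vicuna-7b-v1.3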
Chimera supports Vicuna, LLaMA-2, and Mistral base models.
git lfs install
git clone https://huggingface.co/lmsys/vicuna-13b-v1.5
The commands above are an example of downloading a base model.
If you want to use wandb to track prediction accuracy during training, please use your own wandb key and change the `wandb.init("your key")` call in train.py accordingly.
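As an alternative to editing train.py, you can authenticate with Weights & Biases from the command line before launching training (a sketch; YOUR_WANDB_API_KEY is a placeholder for your own key):

# log in to wandb so that training metrics are uploaded to your account
wandb login YOUR_WANDB_API_KEY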
cd ./chimera
torchrun --nproc_per_node=1 ./train.py --model_name_or_path ../model/vicuna-7b-v1.3 \
--data_path ../data/ShareGPT_Vicuna_unfiltered/train.json \
--eval_data_path "../data/ShareGPT_Vicuna_unfiltered/small_test.json" \
--output_dir chimera_0125 \
--num_train_epochs 2 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "steps" \
--eval_steps 50 \
--save_strategy "steps" \
--save_steps 200 \
--save_total_limit 2 \
--learning_rate 2e-4 \
--weight_decay 0.0 \
--warmup_ratio 0.01 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--model_max_length 1024 \
--lazy_preprocess True \
--chimera_num_heads 4 \
--chimera_num_layers 1
cd ./chimera
python test.py model_path
Our project is based on a lot of excellent work, such as Medusa and Vicuna.