# Models

## Instruction-following Models and Chat Models

The fine-tuned 🦙 Vigogne models come in two flavors: instruction-following models and chat models. Instruction-following models are optimized to generate concise, helpful responses to a single user instruction, similar to text-davinci-003. Chat models are designed for multi-turn dialogue, but they also perform well on instruction-following tasks, similar to gpt-3.5-turbo.

You can access the weights for these models on the 🤗 Hugging Face Hub. For further details on the training data, see vigogne/data.
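
For example, the checkpoints can be loaded with the transformers library. This is a minimal sketch; the repository id below is an assumption, so substitute the Vigogne checkpoint you actually want to use:

```python
# Minimal loading sketch; the repository id is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bofenghuang/vigogne-2-7b-chat"  # assumed Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to fit a single consumer GPU
    device_map="auto",          # requires the `accelerate` package
)
```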

## Recommended Models

Here is a list of recommended models for this project. These models have been trained on more diverse, higher-quality data with an optimized training process, and should be your first choice. For alternatives, please refer to the Legacy Models section below.

| Model | Type | Foundation model | Data | Description |
|---|---|---|---|---|
| Vigostral-7B-Chat | Chat | Mistral-7B-v0.1 | | |
| Vigogne-2-7B-Chat-V2.0 | Chat | Llama-2-7B | 520K chat data | Check out our blog for more details. |
| Vigogne-2-13B-Chat | Chat | Llama-2-13B | 520K chat data | Check out our blog for more details. |
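
Continuing the loading sketch above, a multi-turn conversation can be formatted with the tokenizer's chat template. This is a sketch under the assumption that the chat checkpoints ship a chat template in their tokenizer config:

```python
# Multi-turn sketch reusing `model` and `tokenizer` from the loading example
# above; assumes the checkpoint ships a chat template.
conversation = [
    {"role": "user", "content": "Expliquez la différence entre un lama et une vigogne."},
]
input_ids = tokenizer.apply_chat_template(
    conversation, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids, max_new_tokens=256, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```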

## Legacy Models

Due to performance and licensing concerns, the models below are no longer recommended for general use. However, they could still be useful in specific scenarios.

| Model | Type | Foundation model | Data | Description |
|---|---|---|---|---|
| Vigogne-2-7B-Chat-V1.0 | Chat | Llama-2-7B | 420K chat data | |
| Vigogne-7B-Chat | Chat | LLaMA-7B | 420K chat data | Research use only |
| Vigogne-13B-Chat | Chat | LLaMA-13B | 420K chat data | Research use only |
| Vigogne-falcon-7B-Chat | Chat | Falcon-7B | 420K chat data | |
| Vigogne-2-7B-Instruct | Instruction-following | Llama-2-7B | 260K instruct data | |
| Vigogne-2-13B-Instruct | Instruction-following | Llama-2-13B | 260K instruct data | |
| Vigogne-7B-Instruct | Instruction-following | LLaMA-7B | 260K instruct data | Research use only |
| Vigogne-13B-Instruct | Instruction-following | LLaMA-13B | 260K instruct data | Research use only |
| Vigogne-33B-Instruct | Instruction-following | LLaMA-33B | 260K instruct data | Research use only |
| Vigogne-Falcon-7B-Instruct | Instruction-following | Falcon-7B | 260K instruct data | |
| Vigogne-MPT-7B-Instruct | Instruction-following | MPT-7B | 260K instruct data | |
| Vigogne-Bloom-7B1-Instruct | Instruction-following | BLOOM-7B1 | 260K instruct data | |

## Pretrained Models

The majority of the corpus used to train the original LLaMA model is in English. To address this, we have gathered a substantial French corpus and used it to continue the pretraining process. This language-adaptive pretraining improves the model's performance on French text.

The training process is still ongoing, as it is computationally expensive and requires significant resources.
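
To illustrate what language-adaptive pretraining looks like in practice, here is a minimal sketch using the Hugging Face Trainer. The base checkpoint, dataset id, and hyperparameters are illustrative assumptions, not the project's actual training setup:

```python
# Minimal continued-pretraining sketch; base checkpoint, dataset id, and
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LLaMA tokenizers have no pad token by default; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Any large French corpus works here; this dataset id is an assumption.
dataset = load_dataset("oscar", "unshuffled_deduplicated_fr", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="vigogne-fr-pretraining",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=2e-5,
        num_train_epochs=1,
        fp16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    # mlm=False gives the causal language modeling objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```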