
[TUTORIAL] Create a synthetic dataset with Mistral and distilabel #35

Merged (11 commits) on Jun 12, 2024

Conversation

@sdiazlor (Contributor) commented May 22, 2024

In this tutorial, we generate instructions with the self-instruct approach and then produce two candidate answers for each instruction with Mistral AI models. A stronger model (mistral-large) judges the two answers. Finally, we use the argilla package to analyze the dataset and push it to the Hugging Face Hub. A simplified sketch of the generate-and-judge step is included at the end of this description.

  • Runnable in Colab
  • Added to README

Thanks in advance @sophiamyang
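
As a rough illustration of the generate-and-judge idea, here is a minimal sketch. It is not the notebook's code: the notebook builds this with distilabel's pipeline abstractions, while the sketch calls the Mistral API directly. The client version (mistralai >= 1.0), model names, and judge prompt are assumptions for illustration only.

```python
# Minimal sketch, not the notebook's code: two smaller Mistral models answer an
# instruction, then mistral-large rates both answers. Assumes the mistralai >= 1.0
# Python client; model names and the judge prompt are illustrative choices.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

def ask(model: str, prompt: str) -> str:
    response = client.chat.complete(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

instruction = "Explain the self-instruct approach in one paragraph."

# Two candidate answers from smaller models.
answer_a = ask("open-mistral-7b", instruction)
answer_b = ask("open-mixtral-8x7b", instruction)

# A stronger model judges the candidates (a hand-written rubric, not the notebook's).
judge_prompt = (
    "Rate each answer to the instruction from 1 to 10 and explain briefly.\n\n"
    f"Instruction: {instruction}\n\nAnswer A: {answer_a}\n\nAnswer B: {answer_b}"
)
print(ask("mistral-large-latest", judge_prompt))
```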

@gabrielmbmb left a comment

LGTM! Thanks for pulling this @sdiazlor

distilabel_synthetic_dpo_dataset.ipynb (review thread, outdated and resolved)
@sophiamyang (Collaborator) commented Jun 10, 2024

Hi @sdiazlor, thanks for the PR! Could you add your notebook to the third-party folder https://github.com/mistralai/cookbook/tree/main/third_party?

@pandora-s-git (Collaborator)

Hi there! Honest question: what exactly is the advantage of using two smaller models and judging the answers with Mistral Large? Is it really cheaper and/or better than generating a dataset directly with Mistral Large, without a judge? I understand that DPO requires two candidate answers and picking the better one, but will this dataset be better than one generated directly by Large?

@sdiazlor (Contributor, Author) commented Jun 11, 2024

@pandora-s-git Thanks for your question. The main point is that DPO strongly helps align the model's outputs with people's preferences. A basic QA dataset gives the model no signal about which answers are better or worse, and diversity is reduced. That said, using two small models is usually cheaper (sometimes free if they are open source), and their answers are usually of good quality. The larger model helps not only to build the DPO dataset but also to support the annotators (whose review is highly recommended to obtain a high-quality dataset) in making their decisions. Other alignment approaches that need less data, such as KTO or DOVE, have also appeared. If you are interested, this blog may be worth a read.
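
To make the DPO framing concrete, here is a minimal sketch (again, not the notebook's code) of how a chosen/rejected preference record can be assembled once a judge has rated two candidate answers; the field names, example texts, and ratings are purely illustrative.

```python
# Illustrative only: build a DPO-style preference record from two judged answers.
# The notebook uses distilabel's pipeline for this; the field names here are arbitrary.

def build_dpo_record(instruction, answer_a, answer_b, rating_a, rating_b):
    """Return a chosen/rejected pair based on the judge's ratings."""
    if rating_a >= rating_b:
        chosen, rejected = answer_a, answer_b
    else:
        chosen, rejected = answer_b, answer_a
    return {
        "prompt": instruction,
        "chosen": chosen,
        "rejected": rejected,
    }

record = build_dpo_record(
    "Explain what a synthetic dataset is.",
    "A synthetic dataset is generated by a model rather than collected from humans.",
    "It's data.",
    rating_a=9,
    rating_b=3,
)
print(record["chosen"])  # the higher-rated answer becomes the preferred response
```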

@sdiazlor (Contributor, Author)

@sophiamyang Notebook moved to the third_party folder.

@pandora-s-git (Collaborator)

Yeah, after some thinking on my own I figured it actually makes sense given how DPO usually works. Thanks for the blog though, will check it out!!

@sophiamyang merged commit fc21916 into mistralai:main on Jun 12, 2024