mlx-whatsapp

This is an experimental project that converts your WhatsApp chat backups into training data and fine-tunes Mistral using MLX. The lora.py, models.py, and convert.py files and the models directory come from https://github.com/ml-explore/mlx-examples

How to back up your chats

Go to WhatsApp -> Settings -> Export Chat -> select the group conversation -> Without Media
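
The exported file is plain text, typically one message per line, roughly in this shape (the exact timestamp format varies by platform and locale):

[1/2/24, 10:15:03 AM] Mickey Mouse: Hey Minnie, are we going to the fair?
[1/2/24, 10:15:41 AM] Minnie: Of course!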

Download Mistral and convert it to a quantized version

Install the dependencies:

pip install -r requirements.txt

Next, download and convert the model. The following command downloads Mistral from Hugging Face and converts it to a quantized version, saved to the mlx_model directory that the commands below point at:

python convert.py --hf-path mistralai/Mistral-7B-v0.1 -q  

Converting the files

Save the file exported from WhatsApp as chat.txt, then create the training files:

python whatsapp.py --input_file chat.txt --output_file chat.jsonl --test_file data/test.jsonl --train_file data/train.jsonl --valid_file data/valid.jsonl

By default, the test and validation files each receive 30 samples; you can adjust this.
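
For reference, here is a minimal sketch of the kind of conversion whatsapp.py performs, assuming the {"text": ...} one-JSON-object-per-line format used by the MLX LoRA example; the actual parsing and sample construction in whatsapp.py may differ:

import json
import os

def parse_chat(path):
    # Collect "Sender: message" lines from the WhatsApp export,
    # stripping the leading "[timestamp]" when present.
    messages = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line.startswith("[") and "]" in line:
                line = line.split("]", 1)[1].strip()
            if ": " in line:
                messages.append(line)
    return messages

def write_jsonl(samples, path):
    with open(path, "w", encoding="utf-8") as f:
        for text in samples:
            f.write(json.dumps({"text": text}) + "\n")

os.makedirs("data", exist_ok=True)
messages = parse_chat("chat.txt")

# 30 samples each for test and validation, the rest for training;
# the real script may group several messages into one sample.
write_jsonl(messages[:30], "data/test.jsonl")
write_jsonl(messages[30:60], "data/valid.jsonl")
write_jsonl(messages[60:], "data/train.jsonl")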

Training

python lora.py --model mlx_model --train --iters 600 --data ./data --batch-size 2 --adapter-file whatsapp.npz

Inference

python lora.py --model ./mlx_model \
               --adapter-file ./whatsapp.npz \
               --max-tokens 500 \
               --prompt \
               "Mickey Mouse: Hey Minnie, are we going to the fair?
               Minnie: "

Combine your adapter and model

python fuse.py --model mlx_model --adapter-file whatsapp.npz --save-path fused

The fused folder now contains safetensors weights that can be loaded directly with Hugging Face transformers.
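
For example, a minimal sketch of loading the fused model with transformers, assuming the fused weights load as a standard Mistral checkpoint:

from transformers import AutoModelForCausalLM, AutoTokenizer

# "fused" is the --save-path used above
tokenizer = AutoTokenizer.from_pretrained("fused")
model = AutoModelForCausalLM.from_pretrained("fused")

prompt = "Mickey Mouse: Hey Minnie, are we going to the fair?\nMinnie:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))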

Warning

A word of caution: don't upload your fused model to public sites such as Hugging Face, as it can leak the personal data it was trained on.
