Commit adb3856 ("update doc") by Judd, May 9, 2024 — 1 parent fb8690c. Changed: docs/fun.md (44 additions, 15 deletions).

## Layer Shuffling

With layer shuffling, one can duplicate/delete/rearrange one or more layers on the fly.
Before shuffling a model's layers, use `--show` to view basic information about it.

1. Self merging

Llama-3 70B has been self-merged into a [120B](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct) model,
which sounds quite interesting.
Now, using `--layer_spec`, we can do the same thing on the fly. For example, here we create a Llama-3 13.3B model from the 8B model and chat with it:

The layers of new model are 0-1-2-3-4-5-6-7, 4-5-6-7-8-9-10-11, ... layers from the original model.
```sh
./bin/main -i -m path/to/llama3-8b.bin --layer_spec 0:8,4:12,8:16,12:20,16:24,20:28,24:32

________ __ __ __ __ ___
/ ____/ /_ ____ _/ /_/ / / / / |/ /_________ ____
/ / / __ \/ __ `/ __/ / / / / /|_/ // ___/ __ \/ __ \
/ /___/ / / / /_/ / /_/ /___/ /___/ / / // /__/ /_/ / /_/ /
\____/_/ /_/\__,_/\__/_____/_____/_/ /_(_)___/ .___/ .___/
You are served by LlaMa3, /_/ /_/
with 13264949248 (13.3B) parameters.
```

The new model's layers are layers 0-1-2-3-4-5-6-7, 4-5-6-7-8-9-10-11, ... of the original model: each `start:stop` range selects layers `start` through `stop - 1`, and the ranges may overlap.
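The expansion of a `--layer_spec` string into an explicit layer list can be sketched as follows. This is a minimal illustration only, assuming Python-slice-like `start:stop[:step]` semantics as suggested by the examples in this document; `expand_layer_spec` is a hypothetical helper, not the project's actual parser:

```python
def expand_layer_spec(spec: str, n_layers: int) -> list[int]:
    """Expand a comma-separated layer spec such as "0:8,4:12" into an
    explicit list of source-layer indices (slice-like semantics assumed)."""
    layers = []
    for part in spec.split(","):
        fields = part.split(":")
        start = int(fields[0]) if fields[0] else 0
        stop = int(fields[1]) if len(fields) > 1 and fields[1] else n_layers
        step = int(fields[2]) if len(fields) > 2 and fields[2] else 1
        layers.extend(range(start, stop, step))
    return layers

# The self-merge spec above: consecutive 8-layer windows overlapping by 4.
spec = "0:8,4:12,8:16,12:20,16:24,20:28,24:32"
print(expand_layer_spec(spec, 32)[:12])  # → [0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7]
```

Under these assumptions the spec yields 56 layers in place of the original 32, consistent with the jump from 8B to 13.3B parameters.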

2. Layer pruning

Now, let's remove some layers from Llama-3 8B and ask it to write a poem:

```sh
./bin/main -i -m path/to/llama3-8b.bin --layer_spec 0:10,10:20:2,20:
________ __ __ __ __ ___
/ ____/ /_ ____ _/ /_/ / / / / |/ /_________ ____
/ / / __ \/ __ `/ __/ / / / / /|_/ // ___/ __ \/ __ \
/ /___/ / / / /_/ / /_/ /___/ /___/ / / // /__/ /_/ / /_/ /
\____/_/ /_/\__,_/\__/_____/_____/_/ /_(_)___/ .___/ .___/
You are served by LlaMa3, /_/ /_/
with 6939701248 (6.9B) parameters.

You > write a poem about Llama, the LLM released by Meta
A.I. > In the land of Meta, where the digital realm
Was once so vast, and now it's explored
The LLM, a model, so refined,
Was released to the world, a marvel
To learn and understand, it's not just
A myth, but a marvel, a Llama, a dream
...
```
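Assuming the same slice-like semantics, the pruning spec `0:10,10:20:2,20:` keeps layers 0-9, every second layer from 10-19 (10, 12, 14, 16, 18), and layers 20 to the end — 27 of the original 32 layers. A quick sanity check on the banner's parameter count (hypothetical arithmetic, assuming roughly 218M parameters per transformer block and roughly 1.05B in the embedding and output head of Llama-3 8B):

```python
# Rough parameter accounting for the pruned Llama-3 8B (assumed figures).
embed_params = 1.05e9   # token embedding + LM head (approximate)
per_layer    = 0.218e9  # one transformer block (approximate)

kept = 10 + 5 + 12      # 0:10 -> 10 layers, 10:20:2 -> 5, 20: -> 12
total = embed_params + kept * per_layer
print(f"{kept} layers, ~{total / 1e9:.1f}B parameters")  # ~6.9B, matching the banner
```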

