gguf : document Mixtral changes in spec (#646)
* add new tensor names

* add new keys

* fix tensor names

* gguf : change wording a bit

---------

Co-authored-by: Georgi Gerganov <[email protected]>
ariez-xyz and ggerganov committed Dec 13, 2023
1 parent 95cdaf9 commit a027a92
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/gguf.md
@@ -285,6 +285,8 @@ In the following, `[llm]` is used to fill in for the name of a specific LLM arch
- `[llm].tensor_data_layout: string`: When a model is converted to GGUF, tensors may be rearranged to improve performance. This key describes the layout of the tensor data. This is not required; if not present, it is assumed to be `reference`.
- `reference`: tensors are laid out in the same order as the original model
- further options can be found for each architecture in their respective sections
- `[llm].expert_count: uint32`: Number of experts in MoE models (optional for non-MoE arches).
- `[llm].expert_used_count: uint32`: Number of experts used during each token evaluation (optional for non-MoE arches).
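
As a rough illustration of how a reader might consume these two keys, the sketch below looks them up in an already-parsed metadata mapping. The function name `read_expert_counts`, the plain-dict representation of the metadata, and both validation rules (the keys appear together, and `expert_used_count` does not exceed `expert_count`) are assumptions made for this example; the spec text above only states that the keys are optional for non-MoE architectures.

```python
from typing import Mapping, Optional, Tuple


def read_expert_counts(
    metadata: Mapping[str, int], arch: str
) -> Optional[Tuple[int, int]]:
    """Return (expert_count, expert_used_count), or None if neither key is set.

    `metadata` is a hypothetical, already-parsed key/value mapping;
    `arch` fills in for `[llm]`, e.g. "llama".
    """
    count = metadata.get(f"{arch}.expert_count")
    used = metadata.get(f"{arch}.expert_used_count")

    if count is None and used is None:
        # Both keys are optional for non-MoE architectures.
        return None
    if count is None or used is None:
        # Assumption for this sketch: the keys are expected to appear together.
        raise ValueError("expected both expert keys or neither")
    if not 0 < used <= count:
        # Assumption for this sketch: a token cannot use more experts than exist.
        raise ValueError(f"expert_used_count={used} not in 1..{count}")
    return count, used
```

For a Mixtral-style model with 8 experts of which 2 are used per token, `read_expert_counts({"llama.expert_count": 8, "llama.expert_used_count": 2}, "llama")` would return `(8, 2)`.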
#### Attention
@@ -341,6 +343,8 @@ The following sections describe the metadata for each model architecture. Each k
.swapaxes(1, 2)
.reshape(weights.shape))
```
- `llama.expert_count`
- `llama.expert_used_count`

##### MPT

@@ -553,6 +557,10 @@ where N signifies the block number a layer belongs to, and where `BB` could be:
- `ffn_up`: Feed-forward network "up" layer
- `ffn_gate`: Feed-forward network "gate" layer
- `ffn_down`: Feed-forward network "down" layer
- `ffn_gate_inp`: Expert-routing layer for the Feed-forward network in MoE models
- `ffn_gate_exp`: Feed-forward network "gate" layer per expert in MoE models
- `ffn_down_exp`: Feed-forward network "down" layer per expert in MoE models
- `ffn_up_exp`: Feed-forward network "up" layer per expert in MoE models
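
To make the role of the routing layer more concrete, here is a minimal sketch of how the output of a layer like `ffn_gate_inp` could be turned into a choice of experts for one token. It assumes a common MoE scheme (softmax over the per-expert router scores, keep the top `expert_used_count`, renormalize the kept weights); the spec does not prescribe this procedure, and `route_token` and its parameters are hypothetical names used only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def route_token(
    router_logits: Sequence[float], expert_used_count: int
) -> List[Tuple[int, float]]:
    """Pick the top-k experts for one token.

    `router_logits` holds one score per expert (the routing layer's output
    for this token). Returns (expert index, mixing weight) pairs.
    """
    # Softmax over all expert scores; subtract the max for numerical stability.
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the `expert_used_count` highest-probability experts.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = order[:expert_used_count]

    # Renormalize so the kept weights sum to 1.
    kept = sum(probs[i] for i in top)
    return [(i, probs[i] / kept) for i in top]
```

In this sketch, each selected index would identify which expert's `ffn_gate_exp`, `ffn_up_exp`, and `ffn_down_exp` tensors are applied to the token, with the expert outputs combined using the returned weights.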

## Version History
