diff --git a/docs/gguf.md b/docs/gguf.md
index f4e86d22e..d07ad276f 100644
--- a/docs/gguf.md
+++ b/docs/gguf.md
@@ -20,13 +20,16 @@ The key difference between GGJT and GGUF is the use of a key-value structure for
 
 ### GGUF Naming Convention
 
-GGUF follow a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>.gguf`
+GGUF files follow a naming convention of `<Model>(-<Version>)-(<ExpertsCount>x)<Parameters>-<EncodingScheme>(-<Shard>).gguf`
 
 The components are:
 1. **Model**: A descriptive name for the model type or architecture.
+    - This can be derived from the gguf metadata key `general.name`, replacing spaces with dashes.
 2. **Version**: (Optional) Denotes the model version number, formatted as `v<Major>.<Minor>`
     - If model is missing a version number then assume `v0.0` (Prerelease)
-3. **ExpertsCount**: Indicates the number of experts found in a Mixture of Experts based model.
+    - This can be derived from the gguf metadata key `general.version`.
+3. **ExpertsCount**: (Optional) Indicates the number of experts found in a Mixture of Experts based model.
+    - This can be derived from the gguf metadata key `llama.expert_count`.
 4. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
     - `Q`: Quadrillion parameters.
     - `T`: Trillion parameters.
@@ -34,6 +37,10 @@ The components are:
     - `M`: Million parameters.
     - `K`: Thousand parameters.
 5. **EncodingScheme**: Indicates the weights encoding scheme that was applied to the model. Content, type mixture and arrangement however are determined by user code and can vary depending on project needs.
+6. **Shard**: (Optional) Indicates that the model has been split into multiple shards, formatted as `<ShardNum>-of-<ShardTotal>`.
+    - *ShardNum*: Shard position in this model. Must be 5 digits, padded with zeros.
+      - Shard numbers always start from `00001` (e.g. the first shard is always `00001-of-XXXXX`, never `00000-of-XXXXX`).
+    - *ShardTotal*: Total number of shards in this model. Must be 5 digits, padded with zeros.
 
 #### Parsing Above Naming Convention
 
@@ -41,19 +48,63 @@ To correctly parse a well formed naming convention based gguf filename, it is re
 
 For example:
 
-  * `mixtral-v0.1-8x7B-KQ2.gguf`:
+  * `Mixtral-v0.1-8x7B-KQ2.gguf`:
     - Model Name: Mixtral
     - Version Number: v0.1
     - Expert Count: 8
     - Parameter Count: 7B
     - Weight Encoding Scheme: KQ2
+    - Shard: N/A
 
   * `Hermes-2-Pro-Llama-3-8B-F16.gguf`:
     - Model Name: Hermes 2 Pro Llama 3
-    - Version Number: v0.0 (`<Version>-` missing)
-    - Expert Count: 0 (`<ExpertsCount>x` missing)
+    - Version Number: v0.0
+    - Expert Count: 0
     - Parameter Count: 8B
     - Weight Encoding Scheme: F16
+    - Shard: N/A
+
+  * `Grok-v1.0-100B-Q4_0-00003-of-00009.gguf`:
+    - Model Name: Grok
+    - Version Number: v1.0
+    - Expert Count: 0
+    - Parameter Count: 100B
+    - Weight Encoding Scheme: Q4_0
+    - Shard: 3 out of 9 total shards
+
+You can also use the regular expression `/^(?<model_name>[A-Za-z0-9\s-]+)(?:-v(?<major>\d+)\.(?<minor>\d+))?-(?:(?<experts_count>\d+)x)?(?<parameters>\d+[A-Za-z]?)-(?<encoding_scheme>[\w_]+)(?:-(?<shard>\d{5})-of-(?<shardTotal>\d{5}))?\.gguf$/` to extract all of the values above. Remember to convert `-` to ` ` in the model name.
+
+<details><summary>Example Node.js Regex Function</summary>
+
+```js
+#!/usr/bin/env node
+const ggufRegex = /^(?<model_name>[A-Za-z0-9\s-]+)(?:-v(?<major>\d+)\.(?<minor>\d+))?-(?:(?<experts_count>\d+)x)?(?<parameters>\d+[A-Za-z]?)-(?<encoding_scheme>[\w_]+)(?:-(?<shard>\d{5})-of-(?<shardTotal>\d{5}))?\.gguf$/;
+
+function parseGGUFFilename(filename) {
+  const match = ggufRegex.exec(filename);
+  if (!match)
+    return null;
+  const {model_name, major = '0', minor = '0', experts_count = null, parameters, encoding_scheme, shard = null, shardTotal = null} = match.groups;
+  return {modelName: model_name.trim().replace(/-/g, ' '), version: `v${major}.${minor}`, expertsCount: experts_count ? +experts_count : null, parameters, encodingScheme: encoding_scheme, shard: shard ? +shard : null, shardTotal: shardTotal ? +shardTotal : null};
+}
+
+const testCases = [
+  {filename: 'Mixtral-v0.1-8x7B-KQ2.gguf', expected: { modelName: 'Mixtral', version: 'v0.1', expertsCount: 8, parameters: '7B', encodingScheme: 'KQ2', shard: null, shardTotal: null }},
+  {filename: 'Grok-v1.0-100B-Q4_0-00003-of-00009.gguf', expected: { modelName: 'Grok', version: 'v1.0', expertsCount: null, parameters: '100B', encodingScheme: 'Q4_0', shard: 3, shardTotal: 9 }},
+  {filename: 'Hermes-2-Pro-Llama-3-8B-F16.gguf', expected: { modelName: 'Hermes 2 Pro Llama 3', version: 'v0.0', expertsCount: null, parameters: '8B', encodingScheme: 'F16', shard: null, shardTotal: null }},
+  {filename: 'Hermes-2-Pro-Llama-3-v32.33-8Q-F16.gguf', expected: { modelName: 'Hermes 2 Pro Llama 3', version: 'v32.33', expertsCount: null, parameters: '8Q', encodingScheme: 'F16', shard: null, shardTotal: null }},
+  {filename: 'not-a-known-arrangement.gguf', expected: null},
+];
+
+testCases.forEach(({ filename, expected }) => {
+  const result = parseGGUFFilename(filename);
+  const passed = JSON.stringify(result) === JSON.stringify(expected);
+  console.log(`${filename}: ${passed ? "PASS" : "FAIL"}`);
+});
+```
+
+</details>
+
 
 ### File Structure