Add ChatGLM Model Support #516

Qubitium · 2024-06-08T02:50:28Z

Add ChatGLM Model loading. Code adapted from vllm main.

Notable Changes:

model_config.get_num_kv_heads() and .get_total_num_kv_heads() were ported from vllm to correctly calculate the KV head count from the model's config JSON. With the sglang main code, ChatGLM retrieved the wrong KV head count from the config, which gave the KV cache the wrong shape.
A new EntryClassRemapping property was added to the model entry definition to help with future compatibility. ChatGLM sets ChatGLMModel in config.json, whereas the model loader needs ChatGLMForCausalLM.
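
To illustrate the KV head fix, here is a minimal standalone sketch of how the head count could be resolved from a config object. It is not the code from this PR: the free functions, the `hf_config` parameter, the attribute list, and the sample values are assumptions chosen for illustration, loosely following the idea of checking several model-specific attribute names before falling back to the attention head count.

```python
from types import SimpleNamespace

# Candidate config attribute names that may hold the KV head count.
# Assumption: illustrative list, not necessarily the one used in the PR.
KV_HEAD_ATTRS = (
    "num_key_value_heads",
    "num_kv_heads",
    "n_head_kv",
    "multi_query_group_num",
)


def get_total_num_kv_heads(hf_config):
    """Return the model-wide KV head count (hypothetical helper)."""
    for name in KV_HEAD_ATTRS:
        value = getattr(hf_config, name, None)
        if value is not None:
            return value
    # No dedicated attribute: assume one KV head per attention head.
    return hf_config.num_attention_heads


def get_num_kv_heads(hf_config, tp_size):
    """Return the KV head count per tensor-parallel rank (hypothetical helper)."""
    # When there are fewer KV heads than ranks, each rank keeps at least one.
    return max(1, get_total_num_kv_heads(hf_config) // tp_size)


# Stand-in configs with made-up values, not loaded from real checkpoints.
chatglm_like = SimpleNamespace(num_attention_heads=32, multi_query_group_num=2)
plain_mha = SimpleNamespace(num_attention_heads=32)
```

Under these assumptions, a loader that reads only `num_attention_heads` would size the KV cache for 32 heads on the ChatGLM-like config, whereas the lookup above returns 2, which is the kind of shape mismatch the PR describes.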

Remapping code:

            # compat: some models, such as chatglm, have an incorrect class set in config.json
            # usage: [("From_Entry_Class_Name", EntryClass)]
            if hasattr(module, "EntryClassRemapping") and isinstance(module.EntryClassRemapping, list):
                for remap in module.EntryClassRemapping:
                    if isinstance(remap, tuple) and len(remap) == 2:
                        model_arch_name_to_cls[remap[0]] = remap[1]

Usage:

EntryClass = ChatGLMForCausalLM
# compat: glm model.config class == ChatGLMModel
EntryClassRemapping = [("ChatGLMModel", ChatGLMForCausalLM)]
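
The remapping can be exercised in isolation with a small self-contained sketch. Here `build_registry`, the stub class, and the fake module object are hypothetical stand-ins, not sglang's actual loader; only the `EntryClass` and `EntryClassRemapping` names and the inner remapping loop come from the snippets above.

```python
from types import SimpleNamespace


class ChatGLMForCausalLM:
    """Stub standing in for the real model class."""


def build_registry(modules):
    """Map architecture names to model classes (hypothetical helper)."""
    model_arch_name_to_cls = {}
    for module in modules:
        entry = getattr(module, "EntryClass", None)
        if entry is None:
            continue
        # Register the class under its own name first.
        model_arch_name_to_cls[entry.__name__] = entry
        # Then register any aliases declared by the module.
        remapping = getattr(module, "EntryClassRemapping", None)
        if isinstance(remapping, list):
            for remap in remapping:
                if isinstance(remap, tuple) and len(remap) == 2:
                    model_arch_name_to_cls[remap[0]] = remap[1]
    return model_arch_name_to_cls


# A fake "module" carrying the two attributes shown in the usage snippet.
fake_module = SimpleNamespace(
    EntryClass=ChatGLMForCausalLM,
    EntryClassRemapping=[("ChatGLMModel", ChatGLMForCausalLM)],
)
registry = build_registry([fake_module])
```

With this setup, looking up either "ChatGLMModel" (the name found in config.json) or "ChatGLMForCausalLM" yields the same class, so the loader does not need model-specific special cases.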

TESTS:

PASSED ChatGLM
PASSED Non-ChatGLM: regression test for new get_num_kv_heads
PASSED TP=2
PASSED DP=2

Qubitium and others added 3 commits June 8, 2024 01:50

add chatgml skeleton

df4be5b

Fix headcount for kv cache

d018591

refractor chatglm class remapping

d69ef31

Qubitium marked this pull request as ready for review June 11, 2024 03:11

Qubitium changed the title ~~WIP: Add ChatGLM Model Support~~ Add ChatGLM Model Support Jun 11, 2024

merrymercy merged commit a8c787d into sgl-project:main Jun 11, 2024
