Mistral model no longer loads following PR#101 #107

Closed
johndun opened this issue Jan 26, 2024 · 2 comments · Fixed by #108
Comments

johndun commented Jan 26, 2024

The get_model_cls_by_arch_name function introduced in the "Dynamic model class loading" PR removes the hard-coded mapping from MistralForCausalLM to LlamaForCausalLM. As a result, the Mistral-7B model can no longer be hosted locally as of sglang version 0.1.9. I have tested that adding the following simple models/mistral.py file allows hosting the Mistral-7B model again.

from sglang.srt.models.llama2 import LlamaForCausalLM


class MistralForCausalLM(LlamaForCausalLM):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


EntryClass = MistralForCausalLM

johndun changed the title from "Mistral model no longer loads following CR#101" to "Mistral model no longer loads following PR#101" on Jan 26, 2024
comaniac commented

Thanks for pointing out this issue and the workaround. I'll take a look today.

comaniac commented

OK, it turns out that we should do exactly what you proposed. The Mistral config does use MistralForCausalLM, so we should look up this class by name instead of relying on a hard-coded mapping. I'll file a PR for it now and make you a co-author. Thanks!
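
For readers following along, here is a minimal sketch of why the thin subclass fixes the lookup. It assumes model classes are resolved by matching the architecture name from the model config against the name of each module's EntryClass. The helpers make_module, build_registry, and resolve_model_cls are hypothetical names used only for illustration, and the stand-in classes replace the real sglang ones; this is not sglang's actual implementation.

```python
import types


# Stand-in classes (assumption: the real ones live in sglang.srt.models).
class LlamaForCausalLM:
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class MistralForCausalLM(LlamaForCausalLM):
    """Thin subclass whose only job is to carry the Mistral class name."""


def make_module(name, entry_class):
    """Build a fake model module that exposes an EntryClass attribute."""
    module = types.ModuleType(name)
    module.EntryClass = entry_class
    return module


def build_registry(modules):
    """Map each module's EntryClass.__name__ to the class itself."""
    registry = {}
    for module in modules:
        entry = getattr(module, "EntryClass", None)
        if entry is not None:
            registry[entry.__name__] = entry
    return registry


def resolve_model_cls(arch_name, registry):
    """Look up a model class by the architecture name found in the config."""
    if arch_name not in registry:
        raise ValueError(f"Unsupported architecture: {arch_name}")
    return registry[arch_name]


# Without a mistral module, only the Llama name is known.
llama_only = build_registry([make_module("llama2", LlamaForCausalLM)])

# With the extra module, the Mistral name resolves to a Llama subclass.
with_mistral = build_registry([
    make_module("llama2", LlamaForCausalLM),
    make_module("mistral", MistralForCausalLM),
])
```

Under these assumptions, resolve_model_cls("MistralForCausalLM", llama_only) raises ValueError, while the same call against with_mistral returns a class that inherits all of its behavior from LlamaForCausalLM.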
