How to convert a model parallel model to hugging face model? #880
Comments
Hi! I can look into this. You should be able to convert this to Hugging Face without issues regardless of model-parallel size with our current scripts, so this is surprising and concerning. What conversion script are you running, and which commit of this repository and which DeepSpeed version are you using?
The commit is
Hi! Thanks for sharing. A couple things:
I modified the Hugging Face code to support RMSNorm in GPT-NeoX. I trained the model with pipeline parallelism (PP) instead of model parallelism (MP), and loaded the parameters into the Hugging Face model by referring to
I see. I think I'm a bit confused about which cases work and which do not work for you, and what code you're using for conversion. If I'm understanding correctly, you're experiencing the following:
Could you try the following?
For your model with PP>1, you'll want to try using (an edited RMSNorm version of)
@guozhiyao Hey, following up on this.
Closing this due to inactivity. Feel free to reopen if you'd like to continue investigating @guozhiyao
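For readers who land here with the same symptom (weights load, but generation is random when the model-parallel size is greater than 1), the sketch below illustrates why the merge step matters. It assumes a generic Megatron-style layout in which one linear layer is split across ranks along its output features and the next along its input features. It is not the code in tools/convert_to_hf.py. All function names and the toy weights are hypothetical, and plain Python lists stand in for tensors.

```python
# Toy illustration of merging tensor-parallel shards (hypothetical names,
# integer weights, no real checkpoint format). Weights are [out, in].

def linear(x, w):
    """Compute x @ w.T for list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, w_row)) for w_row in w] for row in x]

def cat_rows(shards):
    """Concatenate shards along dim 0 (output features)."""
    return [list(row) for shard in shards for row in shard]

def cat_cols(shards):
    """Concatenate shards along dim 1 (input features)."""
    return [[v for row in rows for v in row] for rows in zip(*shards)]

def parallel_forward(x, w1_shards, w2_shards):
    """What the ranks compute together: per-rank matmuls, then a sum."""
    partials = [linear(linear(x, w1), w2) for w1, w2 in zip(w1_shards, w2_shards)]
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

def merged_forward(x, w1, w2):
    """What a single-device model computes with merged weights."""
    return linear(linear(x, w1), w2)

# Two ranks, hidden size 2, intermediate size 4 (2 per rank).
w1_shards = [[[1, 0], [0, 1]], [[1, 1], [2, 0]]]  # split along dim 0
w2_shards = [[[1, 2], [0, 1]], [[3, 0], [1, 1]]]  # split along dim 1
x = [[1, 2]]

reference = parallel_forward(x, w1_shards, w2_shards)

# Correct merge: first layer along dim 0, second layer along dim 1.
good = merged_forward(x, cat_rows(w1_shards), cat_cols(w2_shards))

# A merge with the right shapes but the first layer's shards in the wrong
# order: it loads without error, yet the output no longer matches.
bad = merged_forward(x, cat_rows(w1_shards[::-1]), cat_cols(w2_shards))

print(reference, good, bad)
```

The point of the last case is that a shape check alone cannot catch this kind of mistake, so comparing outputs of the converted model against the original on a fixed input is a more reliable test than confirming that the state dict loads.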
Is your feature request related to a problem? Please describe.
I trained a model with "model-parallel-size": 2 and tried to convert it to a Hugging Face model. I referred to tools/convert_to_hf.py for the conversion. The model loads the parameters normally, but the generated results are random. When I train a model with "model-parallel-size": 1 and use the same code to convert and generate, the results are normal. So I suspect the conversion code is the cause.
Describe the solution you'd like
A clear and concise description of what you want to happen.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.