Running on a single GPU #612
Comments
@StellaAthena it does not. I had to uncomment those lines or it won't even merge the layer checkpoints.
Before running merge20b.py, the error was `RuntimeError: Error(s) in loading state_dict for EmbeddingPipe:`
I would suggest redownloading the slim weights into a new directory, to be sure that you're starting from a known point.
I've been trying to reproduce your issue and failing… I concur with @HughPH that it's probably worth deleting everything and starting again.
@huey2531 Did you make any progress with this?
OK, I will delete everything and start from scratch.
@huey2531 I can confirm that another individual got this running last week without hitting that error.
@huey2531 Did you get it working? |
This looks like a tokenizer mismatch to me:
Do you have the right tokenizer for the 20B model configured when running generate.py? It should be something like this:
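For reference, the tokenizer settings in the published 20B config look roughly like this (the exact vocab-file path is an assumption; it depends on where the weights were downloaded):

```
"tokenizer-type": "HFTokenizer",
"vocab-file": "./20B_checkpoints/20B_tokenizer.json",
```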
@huey2531 Are you still trying to get this going? @StellaAthena It's been 3 weeks; I'd suggest this could probably be closed if another week passes without activity.
Closing due to inactivity. |
I'm still working on this. Stuck on some dependency issues and need to reinstall the OS…
Yes, I have the correct tokenizer.
Do you have …
Probably not. I simply changed the path in my original config to 20B_checkpoints_merged.
Now I'm stuck at #628 during installation. I did not encounter this issue a few weeks ago.
@huey2531 This seems to be something that broke recently inside Triton. I can't install on a fresh machine, but my previously existing installations (from a couple of weeks ago) work fine lol
@StellaAthena What version is your Triton? `pip show triton`
It says …
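If you would rather check from Python than via pip, the standard-library `importlib.metadata` module reports the same installed version; this is generic packaging metadata, not a Triton-specific API:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package):
    # Look up the installed distribution's version from its package
    # metadata; return None when the package is not installed at all.
    try:
        return version(package)
    except PackageNotFoundError:
        return None

print(installed_version("triton"))
```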
Closing due to inactivity |
I tried merging the checkpoints as described for a single GPU:

```
python tools/merge20b.py --input_dir ./20B_checkpoints --output_dir ./20B_checkpoints_merged
```

However, I'm getting this error when generating:

```
RuntimeError: Error(s) in loading state_dict for EmbeddingPipe:
size mismatch for word_embeddings.weight: copying a param with shape torch.Size([50432, 6144]) from checkpoint, the shape in current model is torch.Size([50304, 6144]).
```

How can I adjust the current model to match size 50432? Or is it the other way around?
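For context on where the two sizes come from: GPT-NeoX (following Megatron) pads the word-embedding rows so the vocab size is a multiple of the `make-vocab-size-divisible-by` setting times the model-parallel size. A minimal sketch of that padding rule (the specific divisor and model-parallel values below are assumptions for illustration, not read from any config):

```python
import math

def padded_vocab_size(vocab_size, divisible_by=128, model_parallel_size=1):
    # Round the raw tokenizer vocab up to a multiple of
    # divisible_by * model_parallel_size, Megatron-style embedding padding.
    multiple = divisible_by * model_parallel_size
    return math.ceil(vocab_size / multiple) * multiple

# GPT-2's 50257-token vocab pads to 50304 (the "current model" size in the
# error), while a ~50277-token vocab padded across 2 model-parallel ranks
# gives 50432 (the checkpoint size).
print(padded_vocab_size(50257))          # 50304
print(padded_vocab_size(50277, 128, 2))  # 50432
```

If that reading is right, the model reporting 50304 where the checkpoint has 50432 would point at GPT-2-style vocab/tokenizer settings still being active in the running config, which is consistent with the tokenizer-mismatch suggestion above.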