
mario_lm = MarioLM(lm=BASE, tokenizer=BASE): this call raises an error #20

Open
54457616 opened this issue Jun 30, 2023 · 9 comments

Comments

@54457616

  1. TrainingConfig and MarioGPTTrainer cannot be used.
  2. The call mario_lm = MarioLM(lm=BASE, tokenizer=BASE) raises an error. Is there a dedicated script for this training step?

Can you offer some specific suggestions?
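The error is consistent with the constructor's keyword names: later in this thread the working call uses lm_path= / tokenizer_path=, not lm= / tokenizer=. The stub below (MarioLMStub is hypothetical, standing in for the real MarioLM class) is a minimal sketch of why the wrong keyword names raise a TypeError:

```python
# Hypothetical stub mirroring MarioLM's constructor keywords
# (lm_path / tokenizer_path, as shown later in this thread).
class MarioLMStub:
    def __init__(self, lm_path=None, tokenizer_path=None):
        self.lm_path = lm_path
        self.tokenizer_path = tokenizer_path

BASE = "distilgpt2"

try:
    MarioLMStub(lm=BASE, tokenizer=BASE)   # wrong keyword names
except TypeError as e:
    print("TypeError:", e)

model = MarioLMStub(lm_path=BASE, tokenizer_path=BASE)  # correct keywords
print(model.lm_path)
```

Python rejects unknown keyword arguments at call time, so mismatched parameter names fail immediately rather than being silently ignored.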

Screenshot 2023-06-30 232140
Screenshot 2023-06-30 232151

@shyamsn97
Owner

Hey! What version of mario-gpt are you running? Can you try a pip install --upgrade mario-gpt?
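To answer the version question, one quick way to report the installed package version (a generic stdlib sketch, not part of mario-gpt) is via importlib.metadata:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# e.g. installed_version("mario-gpt") returns the version string if installed
print(installed_version("mario-gpt"))
```

This is equivalent to checking `pip show mario-gpt`, but usable from inside a script or notebook.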

@54457616
Author

Screenshot 2023-07-01 000046
Screenshot 2023-07-01 000108
Screenshot 2023-07-01 000116
Screenshot 2023-07-01 000133
The system is running Python 3.10, and the package is the latest download from the repository. The first step ran without any problems, as shown in the screenshots; the error above appears at the second step.

@shyamsn97
Owner

Can I see the full stack trace? From what I see above, the error seems to come from:

mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)

But below it looks like it's working? It doesn't really look like an issue with the trainer / training config.

I ran the code in a new clean workspace:

>>> import torch
>>> from mario_gpt import MarioDataset, MarioLM, TrainingConfig, MarioGPTTrainer
>>> BASE = "distilgpt2"
>>> mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
Using distilgpt2 lm
/home/shyam/miniconda3/envs/py39/lib/python3.9/site-packages/transformers/models/auto/modeling_auto.py:1352: FutureWarning: The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.
  warnings.warn(
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at distilgpt2 and are newly initialized: ['transformer.h.0.crossattention.c_attn.weight', 'transformer.h.3.crossattention.c_attn.weight', 'transformer.h.4.crossattention.bias', 'transformer.h.5.crossattention.bias', 'transformer.h.2.crossattention.q_attn.weight', 'transformer.h.3.ln_cross_attn.weight', 'transformer.h.2.crossattention.c_proj.weight', 'transformer.h.2.crossattention.c_proj.bias', 'transformer.h.2.ln_cross_attn.weight', 'transformer.h.5.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.bias', 'transformer.h.0.crossattention.c_proj.bias', 'transformer.h.5.crossattention.c_proj.weight', 'transformer.h.5.ln_cross_attn.weight', 'transformer.h.3.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.weight', 'transformer.h.5.crossattention.c_attn.weight', 'transformer.h.1.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.weight', 'transformer.h.0.ln_cross_attn.weight', 'transformer.h.1.crossattention.bias', 'transformer.h.3.crossattention.bias', 'transformer.h.5.crossattention.masked_bias', 'transformer.h.5.crossattention.q_attn.weight', 'transformer.h.1.crossattention.q_attn.weight', 'transformer.h.1.crossattention.c_attn.weight', 'transformer.h.4.crossattention.q_attn.weight', 'transformer.h.0.crossattention.bias', 'transformer.h.3.crossattention.q_attn.weight', 'transformer.h.0.crossattention.masked_bias', 'transformer.h.4.crossattention.c_proj.bias', 'transformer.h.4.crossattention.c_attn.weight', 'transformer.h.2.crossattention.bias', 'transformer.h.0.crossattention.c_proj.weight', 'transformer.h.4.crossattention.c_proj.weight', 'transformer.h.2.crossattention.masked_bias', 'transformer.h.1.ln_cross_attn.weight', 'transformer.h.0.crossattention.q_attn.weight', 'transformer.h.4.ln_cross_attn.weight', 'transformer.h.4.crossattention.masked_bias', 
'transformer.h.2.crossattention.c_attn.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using distilgpt2 tokenizer

Can you try doing a pip uninstall mario-gpt and then re-running the python setup.py install step?

@54457616
Author

54457616 commented Jul 1, 2023

I will redeploy following your suggestion. The main problem before was mario_lm = MarioLM(lm=BASE, tokenizer=BASE); the second issue was with TrainingConfig and MarioGPTTrainer.

@54457616
Author

54457616 commented Jul 1, 2023

Screenshot 2023-07-01 215806

  1. The package has been uninstalled

Screenshot 2023-07-01 223251

  2. The package installed successfully

Screenshot 2023-07-01 224932

  3. Sampling runs correctly

Screenshot 2023-07-01 225520

  4. The steps before training run normally

Screenshot 2023-07-01 230040

  5. An error occurs once training starts

Screenshot 2023-07-01 230120
Screenshot 2023-07-01 230143

@shyamsn97
Owner

Ah, looks like accelerate changed their API. I'll update it!

@54457616
Author

Has the fix been completed? I look forward to your revision and hope to continue testing with your changes. Thank you.

@shyamsn97
Owner

Should be fixed now! Let me know if you still have errors

@54457616
Author

54457616 commented Aug 3, 2023

With your revision, my local run now completes successfully. Could you give a general description of the files you modified? And how can the content generated by training be used to better fine-tune the model? Also, before your fix, I found I could run the original code normally with accelerate==0.16.0 or an earlier version.
Screenshot 2023-08-03 231650
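The workaround mentioned above, pinning accelerate to 0.16.0 or earlier, can also be guarded programmatically. The helper below is a generic sketch (not part of mario-gpt) for comparing simple X.Y.Z version strings before running code that depends on the older API:

```python
def version_tuple(v):
    """Parse a plain 'X.Y.Z' version string into a comparable int tuple."""
    return tuple(int(part) for part in v.split(".")[:3])

# Per the thread, the pre-fix code works with accelerate <= 0.16.0,
# so a guard like this could warn before running the old trainer code.
installed = "0.16.0"  # e.g. the value reported by `pip show accelerate`
compatible = version_tuple(installed) <= version_tuple("0.16.0")
print(compatible)  # → True
```

For real projects, the `packaging.version` module handles pre-releases and other edge cases that this simple tuple comparison does not.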
