Restrict model options in transformers examples #2189

Closed
KickItLikeShika opened this issue Sep 9, 2021 · 7 comments · Fixed by #2190
Comments

@KickItLikeShika
Contributor

KickItLikeShika commented Sep 9, 2021

In the transformers example https://github.com/pytorch/ignite/tree/master/examples/contrib/transformers, it's up to the user to override the default model, which is bert-base-uncased. Many models take inputs and produce outputs similar to BERT's, but models like distilbert-base-uncased, distilroberta-base, and bart-base (and many others) will not work here, because they handle inputs and outputs in a slightly different way. Check here for more info: https://huggingface.co/transformers/model_doc/distilbert.html#distilbertmodel

So we get an error similar to this while using DistilBERT:

KeyError: 'token_type_ids'
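A minimal sketch of the failure mode (the dicts and the `build_batch` helper are illustrative, not the example's exact code): the example's dataset builds a batch by indexing fixed keys, but DistilBERT's tokenizer never emits `token_type_ids`, so the lookup raises.

```python
# Hypothetical encodings: BERT-style tokenizers emit token_type_ids,
# DistilBERT-style tokenizers do not.
bert_encoding = {"input_ids": [101, 2023, 102], "attention_mask": [1, 1, 1],
                 "token_type_ids": [0, 0, 0]}
distilbert_encoding = {"input_ids": [101, 2023, 102], "attention_mask": [1, 1, 1]}

def build_batch(encoding):
    # mirrors dataset.py's approach of indexing the keys unconditionally
    return {k: encoding[k] for k in ("input_ids", "attention_mask", "token_type_ids")}

build_batch(bert_encoding)        # works
try:
    build_batch(distilbert_encoding)
except KeyError as e:
    print(e)                      # 'token_type_ids'
```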

Also, something like XLNet won't work well either: we get a dimension issue, because XLNet doesn't return a pooler_output. Check here: https://huggingface.co/transformers/model_doc/xlnet.html#transformers.XLNetModel

So what we should do is one of the following:

  • Drop the option of choosing a model and keep only BERT; if users want to change the model, they can do that manually.
  • Keep it as is and mention in the docs which models can be used here: BERT, RoBERTa (and any others that work).

EDIT: After deciding to go with the first option, what we need to do now is to remove model from the arguments, set config['model'] = 'bert-base-uncased' manually inside the run method, and mention in the docs that we use BERT by default, perhaps with a note that users who want to try another model should do that themselves.
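The change described above could look roughly like this (a sketch; the argument names and `run` signature are assumptions, not the example's exact code):

```python
# Before: "model" was a configurable argument. After: it is pinned inside run().
def run(batch_size=32, max_length=256, **other_config):
    config = {"batch_size": batch_size, "max_length": max_length, **other_config}
    # BERT is hard-coded; users who want another model must ensure it matches
    # BERT's inputs (token_type_ids) and outputs (pooler_output), then edit this line.
    config["model"] = "bert-base-uncased"
    return config
```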

@vfdev-5
Collaborator

vfdev-5 commented Sep 9, 2021

Thanks for reporting that @KickItLikeShika, I think the first option is good enough.

Could you please provide a few links to the code and explain in detail how to do that, so that someone new could try to tackle the issue? Thanks

@KickItLikeShika
Contributor Author

@vfdev-5 Updated the description.

@Ishan-Kumar2
Contributor

@KickItLikeShika Another way I think we can fix this is by passing the dictionary that the tokenizer returns instead of building our own dictionary in dataset.py. Models that don't require token_type_ids have tokenizers that won't produce them either. Have a look at this example of classification from the HF repo.
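A sketch of this suggestion (the class and its signature are illustrative, not ignite's actual dataset.py): the dataset returns the tokenizer's own dict unchanged, so the model only ever receives the keys its tokenizer actually produced.

```python
class TransformerDataset:
    """Hypothetical dataset that forwards the tokenizer's output as-is."""

    def __init__(self, texts, tokenizer, max_length=256):
        self.texts = texts
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __getitem__(self, idx):
        # Whatever keys the tokenizer emits (with or without token_type_ids)
        # are passed through untouched -- no hand-picked key list to go stale.
        return self.tokenizer(self.texts[idx],
                              truncation=True, max_length=self.max_length)

    def __len__(self):
        return len(self.texts)
```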

@KickItLikeShika
Contributor Author

@Ishan-Kumar2 That seems like a good idea for resolving the input issues, but we still have the same issue with the transformer output: most other transformers don't have a pooler_output. I don't see how your suggestion solves that, what do you think?

@Ishan-Kumar2
Contributor

Ishan-Kumar2 commented Sep 9, 2021

We can replace the model with AutoModelForSequenceClassification and then have the loss computed like this:

outputs = model(**inputs, labels=labels)
loss = outputs.loss
logits = outputs.logits

This should work for both, looking at these examples for XLNet and BERT, right?
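To make the proposed training step concrete, here is a self-contained sketch; the `model` function below is a stand-in for AutoModelForSequenceClassification, which (when labels are passed) returns an object carrying `.loss` and `.logits`. The placeholder values are assumptions for illustration only.

```python
from types import SimpleNamespace

def model(input_ids, attention_mask, labels=None, **unused):
    # Stand-in for AutoModelForSequenceClassification.forward:
    # one logits row per input sequence, loss only when labels are given.
    logits = [[0.1, 0.9] for _ in input_ids]
    loss = 0.5 if labels is not None else None  # placeholder loss value
    return SimpleNamespace(loss=loss, logits=logits)

def train_step(batch, labels):
    # The pattern from the comment above: unpack the tokenizer's dict,
    # pass labels, and read loss/logits off the returned object.
    outputs = model(**batch, labels=labels)
    return outputs.loss, outputs.logits
```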

@KickItLikeShika
Contributor Author

Initially it looks good, what do you think @vfdev-5? If that's OK, can you try this out please, @Ishan-Kumar2?

@vfdev-5
Collaborator

vfdev-5 commented Sep 9, 2021

OK for me, if you think it would be understandable to other users.
