Instantiate model from automodel #601
Conversation
…eTrainedModel instance
@@ -60,8 +60,8 @@ def __init__(
                trust_remote_code=trust_remote_code,
            )

-        else:
+        elif isinstance(pretrained, str):
Something I realized last minute: this is a bit cleaner than using an assertion. The else block now raises a TypeError, which is more descriptive.
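The pattern described here can be sketched in isolation. This is a simplified illustration of the diff above, not the PR's actual code: the class and return values are stand-ins, since transformers is not imported here.

```python
# Sketch of the pattern discussed above: prefer an explicit TypeError
# over an assertion when the argument type is unsupported.
# FakePretrainedModel is a stand-in for transformers.PreTrainedModel.

class FakePretrainedModel:
    """Illustrative stand-in, not the real transformers class."""


def resolve_pretrained(pretrained):
    if isinstance(pretrained, FakePretrainedModel):
        # Already-instantiated model: use it directly.
        return pretrained
    elif isinstance(pretrained, str):
        # The real code would call AutoModel.from_pretrained(pretrained) here.
        return f"loaded:{pretrained}"
    else:
        # More descriptive than `assert`, and not stripped under `python -O`.
        raise TypeError(
            f"pretrained must be a str or a PreTrainedModel, "
            f"got {type(pretrained).__name__}"
        )
```

An assertion would only report `AssertionError`; the TypeError names the expected types and the type actually received.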
lm_eval/models/gpt2.py
Outdated
            198,
            198,
            31373,
        ], self.tokenizer.encode("hello\n\nhello")
This tokenizer assert is no longer required! It's a holdover from earlier commits, where this model type was assumed to be gpt2 if using a GPT2Tokenizer type.
        # Initialize model
        if isinstance(pretrained, transformers.PreTrainedModel):
Minor nit: it'd be nice if we could confirm this is of type AutoModelForCausalLM or a related subclass, since this LM subclass assumes a causal, decoder-only model type.
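One wrinkle with this suggestion: AutoModelForCausalLM is a factory, so an isinstance check against it won't match the concrete class that from_pretrained returns. A hedged sketch of an alternative is to check membership against a collection of accepted causal-LM classes. With transformers installed, that collection could plausibly be built from MODEL_FOR_CAUSAL_LM_MAPPING, but the helper below takes it as a parameter so the sketch stays self-contained; all names here are illustrative, not the PR's code.

```python
def ensure_causal_lm(model, causal_classes):
    """Raise TypeError unless `model` is an instance of an accepted class.

    `causal_classes` is any collection of accepted model classes. In a real
    setup it might (an assumption, not something this PR does) be built from
    transformers.MODEL_FOR_CAUSAL_LM_MAPPING.values().
    """
    if not isinstance(model, tuple(causal_classes)):
        raise TypeError(
            f"expected a causal decoder-only model, got {type(model).__name__}"
        )
    return model


# Illustrative stand-ins for concrete model classes:
class DummyCausalLM:
    pass


class DummySeq2SeqLM:
    pass
```

Checking by class membership avoids relying on class-name conventions, which would break for models like GPT2LMHeadModel whose name does not end in "ForCausalLM".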
@@ -72,6 +76,11 @@ def simple_evaluate(
        lm = lm_eval.models.get_model(model).create_from_arg_string(
            model_args, {"batch_size": batch_size, "max_batch_size": max_batch_size, "device": device}
        )
+    elif isinstance(model, transformers.PreTrainedModel):
+        lm = HFLM(
+            pretrained=model,
We also want to pass batch_size=batch_size to this, I believe. Agreed that we should assume the user has already placed their model onto the correct device, though!
Preferably, use lm = lm_eval.models.get_model("hf-causal") here too, instead of instantiating HFLM directly.
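Taken together, the two review comments above suggest a dispatch along these lines. This is a pure-Python sketch with stand-in classes, since neither lm_eval nor transformers is imported here; the real get_model / HFLM APIs are mimicked, not reproduced.

```python
class StubPreTrainedModel:
    """Stand-in for transformers.PreTrainedModel."""


class StubHFLM:
    """Stand-in for lm_eval's HF model wrapper."""

    def __init__(self, pretrained, batch_size=1):
        self.pretrained = pretrained
        # Per the review: forward batch_size when wrapping a passed-in model.
        self.batch_size = batch_size


def build_lm(model, batch_size=1):
    if isinstance(model, str):
        # The real code resolves the registered model class via
        # lm_eval.models.get_model(model).create_from_arg_string(...).
        return StubHFLM(pretrained=model, batch_size=batch_size)
    elif isinstance(model, StubPreTrainedModel):
        # Per the review: pass batch_size along, but assume the user has
        # already placed the model on the correct device.
        return StubHFLM(pretrained=model, batch_size=batch_size)
    else:
        raise TypeError(f"unsupported model argument: {type(model).__name__}")
```

The string branch and the model-instance branch end up building the same wrapper; the reviewers' point is that routing both through the model registry ("hf-causal") keeps that symmetry in one place.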
Thanks so much for this PR, and I apologize for the slow review! It looks great, but I left a couple of minor nitpicks. Happy to return to these later today to fix and merge. EDIT: Testing this now.
PRed changes to your branch here: svenhendrikx#1. Once these are merged, LGTM!
Fixes for passing AutoModel
Thanks again for the contribution!!
…-from-Automodel Instantiate model from automodel
This pull request implements #521, adding functionality that allows users to run lm-eval tasks directly on transformers.PreTrainedModel instances using simple_evaluate. It contains three commits:

1. Add logic to the HFLM class such that a transformers.PreTrainedModel instance can be passed as the pretrained argument, as @haileyschoelkopf suggested here: Add a way to instantiate from HF.AutoModel #521
2. Add an __init__.py file to the bigbench_resources directory, so that it is included in the build. Without it, you get an error when trying to import the tasks after installing the package via pip and the GitHub link.
3. Add logic to simple_evaluate so that you can pass it a transformers.PreTrainedModel instance as well. I chose to directly instantiate the object, whereas when a string is passed, the models.get_model function is used. Directly instantiating it seemed like the simplest solution, but the functionality could also be added to the get_model function. Let me know what you think.

This is my first contribution to lm-eval, so feel free to share tips.