Add a way to instantiate from HF.AutoModel #521

Closed
svenhendrikx opened this issue May 23, 2023 · 15 comments

@svenhendrikx commented May 23, 2023

Hi everyone,

I'm looking to use some of the tasks in my own code, which already instantiates a transformers.AutoModelForXyz object. I know there are wrappers available for Hugging Face models, such as the HFLM class; however, it seems I can only pass a name string into the constructor, after which the object instantiates a Hugging Face model under the hood. It would be great if there were a way to instantiate an HFLM object (or something equivalent that I can pass to simple_evaluate) using an existing HF instance.
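
For concreteness, here is a rough sketch of the usage I have in mind. Passing live objects via pretrained= and tokenizer= is hypothetical (today the constructor only accepts a name string), and the HFLM import path may differ between versions:

```python
# Hypothetical sketch of the desired API: wrap an already-loaded
# transformers model instead of passing a model name string.
from transformers import AutoModelForCausalLM, AutoTokenizer

from lm_eval.evaluator import simple_evaluate
from lm_eval.models.gpt2 import HFLM  # import path may differ by version

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Hypothetical keyword arguments: HFLM accepting live objects directly.
lm = HFLM(pretrained=model, tokenizer=tokenizer)
results = simple_evaluate(model=lm, tasks=["hellaswag"])
```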

My question is therefore twofold:

  1. Have I overlooked some method which allows me to do what I just described?
  2. If not, would you agree that this would be a worthwhile contribution to the package? This issue is currently a bit of a showstopper for my use case, so if you do, I'd like to pick up the issue and contribute to a solution.

Let me know what you think, and of course, thanks a lot for sharing this awesome project with the community.

Kind regards,

Sven

@StellaAthena
Member

In general, yes, this sounds like a great contribution. I'm a little unclear on what the blocker is, though: what is your use case where it's important to pre-instantiate the model? That sounds like you have a hosted inference service you're looking to query; we currently support the OpenAI API, and if you're using something custom you can add support for it based on that.

@svenhendrikx
Author

We're looking to use lm-eval as the backend for a benchmarks component, which is part of a larger framework. The idea is that a user can load a model once and use it in different components of the framework without having to reload it. Being able to initialize an HFLM instance from the already-loaded model solves that problem for our benchmarks module.

Perhaps there's a different solution to the problem, but it also seems like a convenient option for lm-eval to include.

@haileyschoelkopf
Contributor

This seems like a great contribution! I see two potential paths forward:

  1. Allowing pretrained= to take either a string or a pre-instantiated HF model, and adding logic to HFLM's __init__() method so it skips model instantiation when a model object is passed (see the sketch after this list).
  2. Creating an LMWrapper LM subclass that is meant to take, more generally, any LM object with generate() and __call__() methods implemented, and wrap it so that the three BaseLM request types are usable.
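
A rough sketch of what option 1 could look like; the body below is a simplification of the real __init__, just to show the branching, and the import paths may differ by version:

```python
# Illustrative sketch of option 1: `pretrained` accepts either a model
# name string or an already-instantiated transformers model.
from typing import Optional, Union

import transformers
from lm_eval.base import BaseLM  # import path may differ by version


class HFLM(BaseLM):
    def __init__(
        self,
        pretrained: Union[str, transformers.PreTrainedModel] = "gpt2",
        tokenizer: Optional[transformers.PreTrainedTokenizer] = None,
        **kwargs,
    ):
        super().__init__()
        if isinstance(pretrained, transformers.PreTrainedModel):
            # Pre-instantiated model: skip loading and store it directly.
            self.model = pretrained
        else:
            # Existing behavior: treat `pretrained` as a model name string.
            self.model = transformers.AutoModelForCausalLM.from_pretrained(
                pretrained, **kwargs
            )
        self.tokenizer = tokenizer
```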

@svenhendrikx
Author

To me, option 1 seems like the clearest way to provide this. I don't think adding a layer of abstraction for this necessarily helps with understanding the code, unless there are benefits I'm missing.

@haileyschoelkopf
Contributor

Agreed, I think that makes sense! Option 2 would be for supporting models that may not be HF models (to interface more easily with other libraries for evals during training), but it probably makes sense to just do option 1.

@StellaAthena
Member

I agree this makes sense, though for clarity a "--hosted" flag or something like that might be a good idea? Happy to have this added to the library though.

@svenhendrikx
Author

I was conceptualizing this more as a feature for those who are looking to use the HFLM class as an import, so I haven't really thought about command-line arguments for that use case. Which use case do you see for such a flag? Perhaps providing a path to a serialized model object or something?

@StellaAthena
Member

> I was conceptualizing this more as a feature for those who are looking to use the HFLM class as an import, so I haven't really thought about command-line arguments for that use case. Which use case do you see for such a flag? Perhaps providing a path to a serialized model object or something?

I have personally never typed import lm_eval in my life, so it didn't occur to me that that's how your code would be set up. I think you're right that people would mostly be using it as a package, not via the CLI.

@gakada
Contributor

gakada commented Jun 5, 2023

I did the CLI part in #550; it casts the base class to one of its children using the model config. It should be easy to also accept an initialized transformers model, with a change in the evaluator as well (since that is what is going to be imported).
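
Roughly like the following; the wrapper class names are placeholders standing in for the framework's own classes, not the actual ones from #550:

```python
# Illustrative sketch of dispatching on the model config. The wrapper
# classes below are placeholders, not the real framework classes.
import transformers


class CausalLMWrapper:
    def __init__(self, model: transformers.PreTrainedModel):
        self.model = model


class Seq2SeqLMWrapper:
    def __init__(self, model: transformers.PreTrainedModel):
        self.model = model


def wrap_model(model: transformers.PreTrainedModel):
    # Choose the child class based on the model's config.
    if model.config.is_encoder_decoder:
        return Seq2SeqLMWrapper(model)
    return CausalLMWrapper(model)
```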

@svenhendrikx
Author

Thanks @gakada, I'll have a look. I'm not sure whether either simple_evaluate or evaluate needs any changes, since I assume HFLM is supposed to function as a facade, and those methods simply check whether the lm argument is an instance of base.LM. I can see how this would be needed if you did something like specifying a .pt file, as @haileyschoelkopf mentioned, since evaluate would then implement the logic for instantiating an HFLM object, but that's not what I'm currently trying to do. Of course, correct me if I missed something.

I've managed to find some time to work on this, and I expect to push something the coming days.

@gakada
Contributor

gakada commented Jun 13, 2023

I mean that simple_evaluate doesn't support passing something like a transformers.AutoModelForCausalLM, so it will need a few lines to check whether the argument is a transformers model and pass it to the appropriate class constructor in the framework (which also needs to be changed to accept initialized transformers models). That is simply more concise than constructing a framework model from a transformers model and then passing it to simple_evaluate: you can pass the transformers model directly and let simple_evaluate do the mapping.
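
Sketched out, the few lines would sit at the top of simple_evaluate; the HFLM call assumes the option-1 change discussed above, and the body is heavily simplified:

```python
# Illustrative sketch: map an initialized transformers model to the
# framework's wrapper inside simple_evaluate itself.
import transformers

import lm_eval.base
from lm_eval.models.gpt2 import HFLM  # import path may differ by version


def simple_evaluate(model, tasks=None, **kwargs):
    if isinstance(model, transformers.PreTrainedModel):
        # Assumes HFLM can accept a pre-instantiated model (option 1).
        model = HFLM(pretrained=model)
    assert isinstance(model, lm_eval.base.LM)
    ...  # the existing evaluation logic continues from here
```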

@svenhendrikx
Author

I think I understand what you mean; I've added the logic. It seems I'm not authorized to push the branch, so if someone could authorize me, I'll create a PR for review.

@StellaAthena
Member

> I think I understand what you mean; I've added the logic. It seems I'm not authorized to push the branch, so if someone could authorize me, I'll create a PR for review.

You should be able to make a fork and open a PR, but I’ll send you an invite to the EAI org.

@svenhendrikx
Author

Hi everyone,

I created a PR last week; I'll link it here too for convenience:
#601

@haileyschoelkopf
Contributor

Thanks for the PR!! Will review it this morning :)
