TGI support - API evaluation of HF models #869

Open
ManuelFay opened this issue Sep 19, 2023 · 10 comments
Labels
feature request: A feature that isn't implemented yet.
help wanted: Contributors and extra help welcome.

Comments

@ManuelFay
Contributor

Since HF TGI's PR was merged, it should be possible to integrate TGI endpoints into the lm-evaluation-harness supported APIs.

Any plans to do so? This would decouple the evaluation machine from the served model and make both evaluation and hosting much easier!

Thanks a lot for the great work!

@haileyschoelkopf added the "help wanted" and "feature request" labels Sep 19, 2023
@haileyschoelkopf
Contributor

Hi! We'd love to move toward hosting models as endpoints to make evaluation faster and more lightweight than using HF models locally.

Adding vLLM, TGI, and support for inference on a separate machine / in a subprocess is on the roadmap long-term, but we don't have an ETA on it--if you are interested in helping contribute such a feature, let us know!

@ManuelFay
Contributor Author

I am, but I won't have time over the next couple of weeks, so I'll probably resort to using the lm-eval-harness as is (or add a few tasks)! Thanks again for the great work!

@sfriedowitz

sfriedowitz commented Oct 23, 2023

> Adding vLLM, TGI, and support for inference on a separate machine / in a subprocess is on the roadmap long-term

@haileyschoelkopf I've been looking into this idea a bit, as it's something that would be incredibly useful for my organization. One thing I'm curious about is whether it is clear what API protocol an external model would need to satisfy to be compatible with lm-eval-harness. For instance, HELM recently introduced support for externally hosted models for the NeurIPS challenge, where encoding/decoding of tokens is handled externally by the service. That protocol involves three POST endpoints: /encode, /decode, and /process.

Is there a single protocol that a vLLM or TGI powered service would have to satisfy to be queryable by lm-eval-harness?

Cheers,
Sean
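
For context, a TGI server exposes its own small HTTP generation API, so any harness integration would need to translate its requests into that format (or go through a proxy that does). A minimal sketch of querying TGI's documented /generate route, with the host and port as placeholders:

```python
# Minimal sketch: query a running TGI server's /generate endpoint.
# The host/port are placeholders; point them at your own deployment.
import requests

payload = {
    "inputs": "The capital of France is",
    "parameters": {"max_new_tokens": 20, "temperature": 0.7},
}
resp = requests.post("http://localhost:8080/generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["generated_text"])  # the completion returned by TGI
```

Loglikelihood-style tasks would additionally need per-token logprobs, which TGI can return when details is enabled in the request parameters.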

@ishaan-jaff

ishaan-jaff commented Nov 1, 2023

I believe LiteLLM can help with this - we allow you to call TGI LLMs using the OpenAI Completion input/output format.
Thanks @Vinno97 cc @ManuelFay @haileyschoelkopf
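
For illustration, a minimal sketch of what that looks like through LiteLLM's Python SDK; the model name and api_base below are placeholders for an actual TGI deployment, not values from this thread:

```python
# Minimal sketch: call a TGI-hosted model through LiteLLM's
# OpenAI-style completion interface. Model name and api_base
# are placeholders for your own deployment.
from litellm import completion

response = completion(
    model="huggingface/bigcode/starcoder",
    messages=[{"role": "user", "content": "def fibonacci(n):"}],
    api_base="https://my-tgi-endpoint.example.com",  # hypothetical endpoint
)
print(response.choices[0].message.content)
```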

@ishaan-jaff

Here's a tutorial on using our OpenAI proxy server to call HF TGI models with the lm-evaluation-harness.
docs: https://docs.litellm.ai/docs/tutorials/lm_evaluation_harness

Usage

Step 1: Start the local proxy

litellm --model huggingface/bigcode/starcoder

OpenAI-compatible endpoint at http://0.0.0.0:8000/

Step 2: Set OpenAI API Base

$ export OPENAI_API_BASE="http://0.0.0.0:8000"

Step 3: Run LM-Eval-Harness

$ python3 main.py \
  --model gpt3 \
  --model_args engine=huggingface/bigcode/starcoder \
  --tasks hellaswag
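
As a quick sanity check before running the harness, you can send one OpenAI-style completion request straight to the proxy. A sketch, assuming the proxy serves the standard /v1/completions route on port 8000:

```python
# Sanity-check sketch: send one OpenAI-style completion request to the
# local LiteLLM proxy before pointing lm-eval-harness at it.
# Assumes the proxy is up on 0.0.0.0:8000 with a /v1/completions route.
import requests

resp = requests.post(
    "http://0.0.0.0:8000/v1/completions",
    json={
        "model": "huggingface/bigcode/starcoder",
        "prompt": "def hello_world():",
        "max_tokens": 16,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```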

@ManuelFay
Contributor Author

That's very cool, thanks!

@ManuelFay
Contributor Author

I have a problem with your code snippet @ishaan-jaff:

KeyError: 'Could not automatically map huggingface/my_model to a tokeniser. Please use tiktoken.get_encoding to explicitly get the tokeniser you expect.'
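
For anyone hitting the same thing: that message comes from tiktoken, which suggests the OpenAI-style backend is trying to map the Hugging Face model id to an OpenAI tokenizer. A minimal reproduction, with an illustrative (not harness-provided) workaround of picking an encoding explicitly:

```python
# Sketch reproducing the failure: tiktoken.encoding_for_model only knows
# OpenAI model names, so a Hugging Face model id raises a KeyError.
import tiktoken

try:
    enc = tiktoken.encoding_for_model("huggingface/my_model")
except KeyError as err:
    print(err)
    # Illustrative workaround: choose an encoding explicitly by name.
    enc = tiktoken.get_encoding("gpt2")

print(enc.encode("hello world"))
```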

@ishaan-jaff

@ManuelFay are you on the big-refactor branch?

Can I see your code?

  • how you start the litellm proxy
  • the command you're using to call the lm harness

@ManuelFay
Contributor Author

Yup big-refactor branch:

  • Start proxy: litellm --model "huggingface/manu/llama-oscar-fr"
  • Command to start: python main.py --model openai-completions --model_args engine=huggingface/manu/llama-oscar-fr --tasks hellaswag
    (not sure we should continue this discussion here though, as it does not relate to the issue)

@ishaan-jaff

Agreed - I sent you a LinkedIn request @ManuelFay - you can also DM me on Discord about this: https://discord.com/invite/wuPM9dRgDw
