
Test New LLMs (Llama2, CodeLlama, etc.) on Chat-UI? #1

Open · wants to merge 35 commits into main

Conversation

krrishdholakia

Hi @psinger,

I noticed you forked chat-ui. If you're trying to test other LLMs (CodeLlama, WizardCoder, etc.) with it, I just wrote a 1-click proxy that translates OpenAI calls into Hugging Face, Anthropic, TogetherAI, etc. API calls.

code

$ pip install litellm

$ litellm --model huggingface/bigcode/starcoder

#INFO:     Uvicorn running on http://0.0.0.0:8000

>>> openai.api_base = "http://0.0.0.0:8000"
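For example, here's a minimal sketch of calling the proxy with the pre-1.0 `openai` Python package (the prompt and API key are placeholders):

```python
import openai

# Point the pre-1.0 OpenAI SDK at the local litellm proxy instead of api.openai.com.
openai.api_base = "http://0.0.0.0:8000"
openai.api_key = "anything"  # placeholder; the proxy holds the real provider keys

# The proxy translates this OpenAI-style call into a Hugging Face API call.
response = openai.ChatCompletion.create(
    model="huggingface/bigcode/starcoder",
    messages=[{"role": "user", "content": "def fib(n):"}],
)
print(response["choices"][0]["message"]["content"])
```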

Here's the PR adding OpenAI support to chat-ui: huggingface#452

I'd love to know if this solves a problem for you.

eltociear and others added 30 commits June 9, 2023 09:35
* Update README.md

* Update package-lock.json
…ce#298)

* Moved all huggingchat branding behind an env variable

* Refactored branding to use multiple env variables

* pr review

* prettier

* move the ethics modal behind the flag PUBLIC_APP_DISCLAIMER

* inline chat ui logo so it would take the color

* flex-none
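chat-ui itself is Svelte/TypeScript; as a rough Python sketch of the pattern, branding and the ethics modal are driven by environment variables rather than hard-coded (`PUBLIC_APP_DISCLAIMER` is the flag named above, `PUBLIC_APP_NAME` is an assumed companion variable):

```python
import os

# PUBLIC_APP_DISCLAIMER is the flag from the commit above; PUBLIC_APP_NAME
# is an assumed companion variable. Defaults keep the app unbranded.
APP_NAME = os.environ.get("PUBLIC_APP_NAME", "ChatUI")
SHOW_DISCLAIMER = os.environ.get("PUBLIC_APP_DISCLAIMER", "") == "1"

def render_header() -> str:
    # Branding comes from configuration instead of being hard-coded.
    return f"Welcome to {APP_NAME}"

print(render_header())
if SHOW_DISCLAIMER:
    print("Show the ethics/disclaimer modal before the first message.")
```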
* Update README.md

Added endpoints variable information to default model conf in readme

* Update .env

Added endpoints variable and commented out

* Update .env

Removed .env change - Moving information to readme per feedback

* Update README.md

Updated the readme to include custom endpoint parameters.

endpoints url, authorization & weight are now defined

* Update README.md

Adjusted the endpoints information to refer to adding parameters rather than adjusting them, since they do not exist in the default .env provided.

* Update README.md

Formatting
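Based on the README description above, each custom endpoint entry defines a url, an authorization header, and a weight. A hypothetical sketch of the shape of such an entry, written as a Python dict for illustration (chat-ui reads this as JSON from its env config):

```python
import json

# Hypothetical model entry; the endpoint fields (url, authorization, weight)
# follow the parameters described in the README changes above.
model = {
    "name": "my-model",
    "endpoints": [
        {
            "url": "https://HOST/generate_stream",  # placeholder host
            "authorization": "Basic VVNFUjpQQVNT",  # placeholder credentials
            "weight": 1,  # relative share of requests routed here
        }
    ],
}
print(json.dumps(model, indent=2))
```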
* web search retries

* remove test error lol
…ce#319)

* Add support for HF endpoint for summary

* add fail-safe for summarization
* add optional timestamp field to messages

* Add a `hashConv` function that only uses a subset of the message for hashing
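A sketch of the idea behind `hashConv`: hash only the fields that identify a message, so that optional fields such as the new timestamp don't change existing hashes. The field names and hash choice here are assumptions, not chat-ui's exact code:

```python
import hashlib
import json

def hash_conv(messages: list[dict]) -> str:
    # Hash only a stable subset of each message, so optional fields
    # (such as the new timestamp) don't invalidate existing hashes.
    subset = [{"from": m["from"], "content": m["content"]} for m in messages]
    payload = json.dumps(subset, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

print(hash_conv([{"from": "user", "content": "hi", "createdAt": "2023-06-09"}]))
```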
* Add ability to define custom model/dataset URLs

* lint

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
* Update README.md

* Update README.md

Co-authored-by: Julien Chaumond <[email protected]>

* Align with header

* lint

* fixed markdown table of content

---------

Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Nathan Sarrazin <[email protected]>
* disable login on first message

* update banner here too

* modal wording tweaks

* prevent NaN

---------

Co-authored-by: Victor Mustar <[email protected]>
* disable login on first message

* update banner here too

* modal wording tweaks

* prevent NaN

* fix login wall

* fix flicker

* lint

* put modal text behind login check

* fix bug with sending messages without login

* fix misalignment between ui and api

* fix data update on disable login

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
* Update README.md

* Update README.md

Co-authored-by: Julien Chaumond <[email protected]>

* Update README.md

---------

Co-authored-by: Julien Chaumond <[email protected]>
The userMessageToken, assistantMessageToken, messageEndToken, and
parameters.stop settings in `MODELS` do not have to be actual tokens. They
can be any string.
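As a hedged sketch of why any string works, these settings are simply concatenated into the prompt as delimiters (helper and field names assumed):

```python
def build_prompt(messages, user_token="<|prompter|>",
                 assistant_token="<|assistant|>", end_token="</s>"):
    # Each "token" here is just a string delimiter; any string works,
    # whether or not it maps to a single vocabulary token.
    parts = []
    for m in messages:
        role_token = user_token if m["from"] == "user" else assistant_token
        parts.append(role_token + m["content"] + end_token)
    parts.append(assistant_token)  # cue the model to respond
    return "".join(parts)

print(build_prompt([{"from": "user", "content": "Hello!"}]))
```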
* rm open assistant branding

* Update SettingsModal.svelte

* make settings work with a dynamic list of models

* fixed types

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
AndreasMadsen and others added 5 commits August 2, 2023 14:17
The chat generation removes parameters.stop and <|endoftext|>
from the generated text, and additionally trims trailing whitespace.

This PR copies that behavior to the summarize functionality, when the
summary is produced by the chat model.
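A minimal sketch of the described cleanup, with the stop strings as parameters (the function name is assumed):

```python
def clean_generated_text(text: str, stop_sequences: list[str]) -> str:
    # Strip trailing whitespace, then any stop sequence the model emitted
    # (parameters.stop entries or the literal "<|endoftext|>").
    text = text.rstrip()
    for stop in stop_sequences + ["<|endoftext|>"]:
        if text.endswith(stop):
            text = text[: -len(stop)]
    return text.rstrip()

print(clean_generated_text("A short summary.<|endoftext|>", ["</s>"]))
```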
* allow different user and assistant end-token

For models like Llama2, the EndToken is not the same for a userMessage
and an assistantMessage. This implements `userMessageEndToken` and
`assistantMessageEndToken`, which override the messageEndToken
behavior.

This PR also allows empty strings as userMessageToken and
assistantMessageToken and makes this the default. This adds additional
flexibility, which is required in the case of Llama2 where the first
userMessage is effectively different because of the system message.

Note that because `userMessageEndToken` and `assistantMessageToken` are
nearly always concatenated, it is almost redundant to have both. The
exception is `generateQuery` for web search, which has several
consecutive user messages.
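A sketch of how separate end tokens support the Llama 2 format; the exact delimiter values below are illustrative of Llama 2's `[INST]` convention, and the user/assistant start tokens are left empty as the PR makes them by default:

```python
def build_prompt_llama2(messages,
                        user_message_end_token=" [/INST] ",
                        assistant_message_end_token=" </s><s>[INST] "):
    # User and assistant turns close with different strings, which a
    # single messageEndToken cannot express.
    parts = ["<s>[INST] "]
    for m in messages:
        end = (user_message_end_token if m["from"] == "user"
               else assistant_message_end_token)
        parts.append(m["content"] + end)
    return "".join(parts)

print(build_prompt_llama2([
    {"from": "user", "content": "Hi"},
    {"from": "assistant", "content": "Hello!"},
    {"from": "user", "content": "Tell me a joke."},
]))
```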

* Make model branding customizable based on env var (huggingface#345)

* rm open assistant branding

* Update SettingsModal.svelte

* make settings work with a dynamic list of models

* fixed types

---------

Co-authored-by: Nathan Sarrazin <[email protected]>

* trim and remove stop-suffixes from summary (huggingface#369)

The chat generation removes parameters.stop and <|endoftext|>
from the generated text, and additionally trims trailing whitespace.

This PR copies that behavior to the summarize functionality, when the
summary is produced by the chat model.

* add a login button when users are logged out (huggingface#381)

* add fallback to the message end token if there are no specified end tokens for user & assistant

---------

Co-authored-by: Florian Zimmermeister <[email protected]>
Co-authored-by: Nathan Sarrazin <[email protected]>
* Use modelUrl instead of building it from model name

* Preserve compatibility with optional modelUrl config

Use modelUrl if provided; otherwise fall back to the previous behavior.
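A sketch of that fallback, assuming the previous behavior built a Hugging Face Hub URL from the model name:

```python
def resolve_model_url(model: dict) -> str:
    # Prefer an explicit modelUrl; otherwise build one from the model name
    # (assuming the Hugging Face Hub URL scheme used previously).
    return model.get("modelUrl") or f"https://huggingface.co/{model['name']}"

print(resolve_model_url({"name": "bigcode/starcoder"}))
print(resolve_model_url({"name": "x", "modelUrl": "https://example.com/my-model"}))
```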