Test New LLMs (Llama2, CodeLlama, etc.) on Chat-UI? #5

krrishdholakia · 2023-09-29T18:53:01Z

Notice you forked chat-ui. if you're trying to test other LLMs (codellama, wizardcoder, etc.) with it, I just wrote a 1-click proxy to translate openai calls to huggingface, anthropic, togetherai, etc. api calls.

code

$ pip install litellm

$ litellm --model huggingface/bigcode/starcoder

#INFO:     Uvicorn running on http:https://0.0.0.0:8000

>> openai.api_base = "http:https://0.0.0.0:8000"

Here's the PR on adding openai to chat-ui: huggingface#452

I'd love to know if this solves a problem for you

* fix stop btn * use media query

Co-authored-by: coyotte508 <[email protected]>

…gface#259) * Add incremental build + multi layer docker for size reduction * Update .dockerignore * fix cache miss * pr comment .gitignore --------- Co-authored-by: Eliott C <[email protected]>

* Added a few sections to the readme and reorganized it * Applied feedback on readme

…uggingface#270) (huggingface#271) * Shared convos can be opened without being logged in * Fix creating conversation from shared conv Closes huggingface#270

* basic poc for web search * * Hide feature when serpapi_key is not defined * handle error case where serpapi failed * only use user queries for generating the query * Update src/lib/buildPrompt.ts Co-authored-by: Eliott C. <[email protected]> * Update src/routes/+layout.server.ts Co-authored-by: Eliott C. <[email protected]> * refactored getQueryFromPrompt * add jsdom to package.json * begin work on fetching webpage content * Update .env Co-authored-by: Eliott C. <[email protected]> * prettier fix * Add feature for scraping webpages * refactored search functionality - now gets triggered from a separate endpoint - results are stored in db - results can be displayed in their own endpoint * Added a stream to send updates from backend on web-search endpoint * made stream more reliable * Add front-end to web search feature * made sure the web results button appears on newly posted messages * close modal when message is done generating * removed log statements * Add button to open modal on loading messages too * replace modal by collapsable menu * make sure shared conversations also show search details * Use spinner for collapse menu * Fix alignment of "stop generating" button * Fix loading indicators - spinner only shows when web search is searching - text loader shows after the web search is done * fix loading icon when web search is disabled * Update search messages & clean up summary string * Fix alignment of timeline * Use existing switch * Add a background to tooltip & center it * fix like making search messages disappear * use correct spinner * fix state issues * lint * fix bug with empty search messages * fix like bug ? * fix modal bug * error handling * fix like bug * slice scraped text so it fits in context * misc UI * bottom buttons simplify and fix * made sure snap scrolling also works on web search updates * loader * margin * remove unused function * linter * quickfix duplicate websearch --------- Co-authored-by: Eliott C. <[email protected]> Co-authored-by: Victor Mustar <[email protected]>

* update ui misc improvements * scale icon * padding

* bump version to 0.3 * lint

* broke up websearch into multiple endpoints * Refactored loading to use the load function instead of client side fetching * lint * Chat Logo Home Screen Bookmark icons for iOS (huggingface#279) * prettier fix for huggingface#279 * fix eslint --------- Co-authored-by: Carolyn Marie <[email protected]>

explictly -> explicitly

* Update README.md * Update package-lock.json

…ce#298) * Moved all huggingchat branding behind an env variable * Refactored branding to use multiple env variables * pr review * prettier * move the ethics modal behind the flag PUBLIC_APP_DISCLAIMER * inline chat ui logo so it would take the color * flex-none

* Update README.md Added endpoints variable information to default model conf in readme * Update .env Added endpoints variable and commented out * Update .env Removed .env change - Moving information to readme per feedback * Update README.md Updated the readme to include custom endpoint parameters. endpoints url, authorization & weight are now defined * Update README.md Adjusted endpoints information to refer to adding parameters instead of adjusting parameters as they do not exist in the default .env being provided. * Update README.md Formatting

…ce#302)

* web search retries * remove test error lol

…uggingface#332)

Should fix the docker container

The default case should be to use the model preprompt which wasn't being done. Closes huggingface#414

This reverts commit 87c6937.

* Fix reuqest body * update webSearchQueryPromptTemplate * update generate google query parser * Add today's date to google search query creator * crawl top stories if exts; remove answer_box & knowledgeGraph * Create paragraph chunks from top articles * flattened paragprah chunks * update status texts * add gradio client * call gradio app for RAG * Web scrape only "p, li, span" els * add MAX_N_CHUNKS * gradio result typing * parse only <p> elements * rm dev change * update typing WebSearch * buld RAG prompt * Rm dev change * change websearch context msg from user to assisntat type * use hosted gradio app * fix lint * prompt engineering * more prompt engineering * MAX_N_PAGES_SCRAPE = 10 * better error msg * more prompt engineering * revert websearch prompt to previous * rm `top_stories` from websearch as the results are not good * Stop using gradio client, use regular fetch * chore * Rm websearchsummary references as it is no longer used * update readme * Apply suggestions from code review Co-authored-by: Julien Chaumond <[email protected]> * Use tfjs to do embeddings in server node * fix websearch component disapperar after finishing generation * Show sources of closest embeddings used in RAG * fix prompting and also add current date * add comment * comment for search query * sources * hide www * using hostname direclty * Show successful web pages instead of failed ones * rm noisy messages * google query generation using previous messaages as context * handle falcon generation * bring back Browsing webpage msg --------- Co-authored-by: Julien Chaumond <[email protected]> Co-authored-by: Victor Mustar <[email protected]>

* Update README.md * add description of websearch on readme * Apply suggestions from code review Co-authored-by: Victor Muštar <[email protected]> * Update README.md --------- Co-authored-by: Mishig Davaadorj <[email protected]> Co-authored-by: Mishig <[email protected]>

* adjustments and mobile modal * use dvh unit * margin

* Add latex support with marked-katex-extension * Add renderer * Fix marked default option problem * Fix linting error * Fix lock error

* Bump mongodb from 5.3.0 to 5.8.0 Bumps [mongodb](https://github.com/mongodb/node-mongodb-native) from 5.3.0 to 5.8.0. - [Release notes](https://github.com/mongodb/node-mongodb-native/releases) - [Changelog](https://github.com/mongodb/node-mongodb-native/blob/v5.8.0/HISTORY.md) - [Commits](mongodb/node-mongodb-native@v5.3.0...v5.8.0) --- updated-dependencies: - dependency-name: mongodb dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Store IP in messageEvents * IP based rate limit * Revert "IP based rate limit" This reverts commit 87c6937. * ip rate limit * move rate limit event to top * Add rate limiting to websearch and title summary (huggingface#433) * [Websearch] update (huggingface#427) * Fix reuqest body * update webSearchQueryPromptTemplate * update generate google query parser * Add today's date to google search query creator * crawl top stories if exts; remove answer_box & knowledgeGraph * Create paragraph chunks from top articles * flattened paragprah chunks * update status texts * add gradio client * call gradio app for RAG * Web scrape only "p, li, span" els * add MAX_N_CHUNKS * gradio result typing * parse only <p> elements * rm dev change * update typing WebSearch * buld RAG prompt * Rm dev change * change websearch context msg from user to assisntat type * use hosted gradio app * fix lint * prompt engineering * more prompt engineering * MAX_N_PAGES_SCRAPE = 10 * better error msg * more prompt engineering * revert websearch prompt to previous * rm `top_stories` from websearch as the results are not good * Stop using gradio client, use regular fetch * chore * Rm websearchsummary references as it is no longer used * update readme * Apply suggestions from code review Co-authored-by: Julien Chaumond <[email protected]> * Use tfjs to do embeddings in server node * fix websearch component disapperar after finishing generation * Show sources of closest embeddings used in RAG * fix prompting and also add current date * add comment * comment for search query * sources * hide www * using hostname direclty * Show successful web pages instead of failed ones * rm noisy messages * google query generation using previous messaages as context * handle falcon generation * bring back Browsing webpage msg --------- Co-authored-by: Julien Chaumond <[email protected]> Co-authored-by: Victor Mustar <[email protected]> * bump to 0.6.0 (huggingface#434) * Update README.md (huggingface#435) * Update README.md * add description of websearch on readme * Apply suggestions from code review Co-authored-by: Victor Muštar <[email protected]> * Update README.md --------- Co-authored-by: Mishig Davaadorj <[email protected]> Co-authored-by: Mishig <[email protected]> * Mobile: fix model selection (huggingface#448) * adjustments and mobile modal * use dvh unit * margin * fix lint on main * Add latex support with marked-katex-extension (huggingface#450) * Add latex support with marked-katex-extension * Add renderer * Fix marked default option problem * Fix linting error * Fix lock error --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Nathan Sarrazin <[email protected]> Co-authored-by: Mishig <[email protected]> Co-authored-by: Julien Chaumond <[email protected]> Co-authored-by: Victor Mustar <[email protected]> Co-authored-by: Mishig Davaadorj <[email protected]> Co-authored-by: Blanchon <[email protected]>

…gingface#451) * feat: Improve error handling and parsing of MODELS environment variable * Add more verbose parsing error * Lint * improve message * lint * refactor error handling and default values in models * improve * format --------- Co-authored-by: Nathan Sarrazin <[email protected]>

* Use `gte-base` as the emebdding model * use `bge-small-en-v1.5` * Revert "use `bge-small-en-v1.5`" This reverts commit 8cfe084. * Use `gte-small`

This reverts commit f88542b.

…ted (huggingface#451)" This reverts commit 8ce8b63.

This reverts commit 1061bc2.

* wip: complete refactor of streaming backend * working refactoring * fix missing first token & perf regression in output quality * lint * Fix websearch loading from db * fix loading * fix invalidate * remove logs * fix SSR error * typo: paragraphs * fixed save on abort * lint * lint * remove debug log in console * lint for real

* Refactor summarization * get rid of debug log * remove old todo

* fix JSON.parse for summerize When serving with TGI, summerize calls this function and it errors with `SyntaxError: Unexpected token d in JSON at position 0` This PR fixes the problem and keeps existing behaviour. * fix types --------- Co-authored-by: Nathan Sarrazin <[email protected]>

* add-copytoclipboardbtn for the all message * fix padding * fix padding * Fix styling * Move before like and dislike button * position and spacing * mobile fix --------- Co-authored-by: Victor Mustar <[email protected]>

TP-O and others added 30 commits May 23, 2023 13:16

Fix stop generating button (huggingface#244)

293ff91

* fix stop btn * use media query

⚡️ Improve docker incremental build time (huggingface#142)

ff2db2e

🔧 Add "directConnection" option to MongoDB (huggingface#260)

1b9697f

🥅 Display OIDC error properly (huggingface#261)

101f9ef

feat openid login with google (huggingface#250)

fa3b3b4

Co-authored-by: coyotte508 <[email protected]>

Export to parquet: also export score (huggingface#265)

9658717

Add incremental build + multi layer docker for size reduction (huggin…

74532d2

…gface#259) * Add incremental build + multi layer docker for size reduction * Update .dockerignore * fix cache miss * pr comment .gitignore --------- Co-authored-by: Eliott C <[email protected]>

🐛 Fix export of convos (huggingface#267)

aa125df

🩹 Make preferred_username optional

002a2a0

Added a few sections to the readme and reorganized it (huggingface#264)

fae93d9

* Added a few sections to the readme and reorganized it * Applied feedback on readme

Shared convos can be opened without being logged in (huggingface#266, h…

002f606

…uggingface#270) (huggingface#271) * Shared convos can be opened without being logged in * Fix creating conversation from shared conv Closes huggingface#270

Web search details: ui update (huggingface#277)

9d7a9c3

* update ui misc improvements * scale icon * padding

Chat Logo Home Screen Bookmark icons for iOS (huggingface#279)

36022cf

Bump version to v0.3 (huggingface#283)

f854cbb

* bump version to 0.3 * lint

fix details arrow (huggingface#285)

aa18b4d

Fix typo in Settings.ts (huggingface#286)

df5a2eb

explictly -> explicitly

Fixed grammar (huggingface#291)

abe7804

* Update README.md * Update package-lock.json

Fix code example preview (huggingface#300)

f567f41

Fix README linting & add details about auth

e34af36

add a readme section about theming

7457e8c

Added Serper.dev API as an alternative web search provider (huggingfa…

6f7b315

…ce#302)

add details about websearch to README

b46dc11

very basic rate limiter (huggingface#320)

922b1b2

Add support for websearch retries (huggingface#318)

0aa57de

* web search retries * remove test error lol

loader dots fix

fb55900

feat: factor out HF_API_ROOT to allow different inference endpoints (h…

3baa389

…uggingface#332)

nsarrazin and others added 30 commits August 22, 2023 19:37

fix auth on websearch (huggingface#410)

e91b76c

🐛 Fix - Remove the min length validation on preprompt

7533ab7

Should fix the docker container

🐛 Fix - preprompt not being passed correctly

7560449

The default case should be to use the model preprompt which wasn't being done. Closes huggingface#414

Store IP in messageEvents

b8c0a1d

IP based rate limit

87c6937

Revert "IP based rate limit"

2e8d14d

This reverts commit 87c6937.

ip rate limit

6ee13bf

move rate limit event to top

ba93cf8

Add rate limiting to websearch and title summary (huggingface#433)

0953d85

bump to 0.6.0 (huggingface#434)

e5afba2

Mobile: fix model selection (huggingface#448)

c867764

* adjustments and mobile modal * use dvh unit * margin

fix lint on main

77df078

Add latex support with marked-katex-extension (huggingface#450)

15bf16f

* Add latex support with marked-katex-extension * Add renderer * Fix marked default option problem * Fix linting error * Fix lock error

Update embedding model for WebSearch (huggingface#437)

f88542b

* Use `gte-base` as the emebdding model * use `bge-small-en-v1.5` * Revert "use `bge-small-en-v1.5`" This reverts commit 8cfe084. * Use `gte-small`

Revert "Update embedding model for WebSearch (huggingface#437)"

1061bc2

This reverts commit f88542b.

Revert "Improve error message when the .env MODELS is not well format…

7ddda31

…ted (huggingface#451)" This reverts commit 8ce8b63.

Revert "Revert "Update embedding model for WebSearch (huggingface#437)""

aa07e29

This reverts commit 1061bc2.

Update README.md (huggingface#455)

afbf680

Refactor summarization so it gets called from backend (huggingface#456)

9960338

* Refactor summarization * get rid of debug log * remove old todo

Make embedding model settings more future-proof (huggingface#454)

3acc11d

error console instead of crashing

5b07906

fix types

0134fe1

Add a message wide copy button (huggingface#453)

a7dc1aa

* add-copytoclipboardbtn for the all message * fix padding * fix padding * Fix styling * Move before like and dislike button * position and spacing * mobile fix --------- Co-authored-by: Victor Mustar <[email protected]>

Update README.md

af76417

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test New LLMs (Llama2, CodeLlama, etc.) on Chat-UI? #5

Test New LLMs (Llama2, CodeLlama, etc.) on Chat-UI? #5

krrishdholakia commented Sep 29, 2023 •

edited

Loading

Test New LLMs (Llama2, CodeLlama, etc.) on Chat-UI? #5

Are you sure you want to change the base?

Test New LLMs (Llama2, CodeLlama, etc.) on Chat-UI? #5

Conversation

krrishdholakia commented Sep 29, 2023 • edited Loading

krrishdholakia commented Sep 29, 2023 •

edited

Loading