allow different user and assistant end-token #375
Conversation
For models like Llama 2, the end token is not the same for a user message and an assistant message. This PR implements `userMessageEndToken` and `assistantMessageEndToken`, which override the `messageEndToken` behavior. It also allows empty strings as `userMessageToken` and `assistantMessageToken`, and makes this the default. This adds flexibility that is required in the case of Llama 2, where the first user message is effectively different because of the system message. Note that because `userMessageEndToken` and `assistantMessageToken` are nearly always concatenated, it is almost redundant to have both. The exception is `generateQuery` for web search, which has several consecutive user messages.
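For illustration, here is a minimal TypeScript sketch of the fallback behavior described above, with per-role end tokens taking precedence over the shared `messageEndToken`. The interface and function here are hypothetical, not chat-ui's actual code:

```ts
// Hypothetical sketch: per-role end tokens override the shared messageEndToken.
interface PromptTokens {
  userMessageToken?: string;          // may be "" (now the default)
  userMessageEndToken?: string;       // overrides messageEndToken for user turns
  assistantMessageToken?: string;     // may be "" (now the default)
  assistantMessageEndToken?: string;  // overrides messageEndToken for assistant turns
  messageEndToken?: string;           // shared fallback
}

type Message = { from: "user" | "assistant"; content: string };

function buildPrompt(messages: Message[], t: PromptTokens): string {
  return messages
    .map((m) => {
      const start = m.from === "user" ? t.userMessageToken : t.assistantMessageToken;
      const end =
        m.from === "user"
          ? t.userMessageEndToken ?? t.messageEndToken
          : t.assistantMessageEndToken ?? t.messageEndToken;
      return (start ?? "") + m.content + (end ?? "");
    })
    .join("");
}
```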
Thanks for this PR! I'll check it with our llama 2 endpoint and see if it improves things a bit. This could be super useful 😄
The chat generation removes `parameters.stop` and `<|endoftext|>` from the generated text, and additionally trims trailing whitespace. This PR copies that behavior to the summarize functionality, for when the summary is produced by the chat model.
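A sketch of that post-processing, with illustrative names (the real implementation lives in chat-ui's generation code):

```ts
// Strip configured stop sequences and <|endoftext|> from the end of the
// generated text, then trim trailing whitespace.
function cleanGeneration(text: string, stop: string[] = []): string {
  let out = text;
  for (const token of [...stop, "<|endoftext|>"]) {
    if (token && out.endsWith(token)) {
      out = out.slice(0, -token.length);
    }
  }
  return out.trimEnd();
}
```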
Seems to work great with llama 2 on prod! I added a fallback, and I'm going to merge this PR. Thanks again for the great work.
@nsarrazin Thanks for merging this. Please note that the fallback was already implemented in the config parsing. Having spent some time on this code, I think the project would generally benefit from moving more logic into the config parsing. Another case is the …
Oh my bad, good catch! I agree overall; I think the logic in some places is a bit all over the place... I also don't really like the …
@nsarrazin I think the … My immediate suggestion would be to store all non-secrets that are reasonable to check into git in a separate JSON file. You could have an optional environment variable that points to the JSON file. Because some properties contain newlines, my preference would be a YAML file instead. That would also allow you to document each property with comments and use some of the DRY features that YAML offers (like anchors). A developer can then copy that default config file and override the properties they wish.

My second suggestion would be to avoid clever defaults. If you are going to make breaking changes anyway, then don't, for example, make …

Finally, a disclaimer: while I've done Node.js for a fair number of years, I'm really old school. I haven't been keeping up with the "new" reactive isomorphic trends, so if there is some Vite or Svelte system for managing this, I for sure wouldn't know about it. 😅
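A minimal sketch of that suggestion, assuming the `yaml` npm package and a hypothetical `CHAT_UI_CONFIG` environment variable (neither is part of chat-ui today):

```ts
import { readFileSync } from "node:fs";
import { parse } from "yaml";

// CHAT_UI_CONFIG is a hypothetical env var pointing at a YAML config file;
// fall back to a checked-in default when it is unset.
const configPath = process.env.CHAT_UI_CONFIG ?? "config.default.yaml";
export const config = parse(readFileSync(configPath, "utf8"));
```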
I think the …
Unless chat-ui removes the trailing space behind the scenes |
@loubnabnl It looks like you are partially correct. There should be a space after … This requires some conditional logic; #400 adds support for that.
Documentation for the correct prompting:
With the existing config provided in #361 (comment), the prompt becomes incorrect.
However, with this PR, the following config is possible:
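As a sketch, such a config could look like the following for Llama 2, shown here as a TypeScript object. The field names come from this PR; the token values follow the commonly documented Llama 2 prompt format and may differ from the exact config in the linked comment:

```ts
const llama2Tokens = {
  userMessageToken: "",                        // empty strings are now allowed and the default
  userMessageEndToken: " [/INST] ",            // closes a user turn
  assistantMessageToken: "",
  assistantMessageEndToken: " </s><s>[INST] ", // closes an assistant turn, opens the next user turn
};
```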
With that config, the prompt has the correct formatting:
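Roughly, the resulting multi-turn prompt then follows the documented Llama 2 shape (placeholders in braces are illustrative; the leading system block comes from the special handling of the first user message mentioned in the description):

```ts
// Sketch of the expected Llama 2 multi-turn prompt.
const expectedPrompt =
  "<s>[INST] <<SYS>>\n{system prompt}\n<</SYS>>\n\n" +
  "{first user message} [/INST] {assistant reply} </s>" +
  "<s>[INST] {next user message} [/INST]";
```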