starcoder : example for using scratch buffers to reduce memory usage #176
Tested santacoder with a prompt of length 2003 and tried to generate 45 tokens and it worked 🎉
Thank you for taking care of this
Cool!
Seems like it still doesn't work for
Bumped the buffers to 256 MB. Still not sure if that is enough; need to figure out a better way to do this.
Starcoder works now with 8k context! 🎉
Btw, I saw you are using
I didn't compare 8 to 48 threads, but 48 was definitely faster than the default 4.
Hey, I am facing the issue with Groovy 1.3 on GPT4all, and I don't know why. Can you tell me about it?
Not an ideal solution, but probably a good starting point.
Needs some testing.