Pure local mode lets you use sketch without sending any of your data over the network.

Do this by setting 3 environment variables:

- `os.environ['SKETCH_USE_REMOTE_LAMBDAPROMPT'] = 'False'` — this turns off talking to us.
- `os.environ['LAMBDAPROMPT_BACKEND'] = 'StarCoder'` — this sets the StarCoder model as your backend, rather than relying on OpenAI. If you want to be sure it's set, check `lambdaprompt.backends.backends`; you can also call `lambdaprompt.backends.set_backend('StarCoder')`.
- `os.environ['HF_ACCESS_TOKEN'] = 'your_hugging_face_token'` — this is necessary for pulling StarCoder, which has an OpenRAIL license. If you use MPT as your backend, you will not need the `HF_ACCESS_TOKEN`.

Notes!
- The first install is pretty slow (downloading the full weights). I timed it at ~9 minutes.
- Second runs (e.g. in a fresh kernel, where the weights are already downloaded) still take ~2 minutes for the first request.
- After that, requests are pretty quick (on an A100 80GB; it should also work on an A100 40GB).
Overall, the more I've been testing, the more I think StarCoder is actually weaker than OpenAI davinci, but it definitely still works.
Here are some screenshots:

- (Note: it took 1.8 seconds to complete this one.)
- (Note: this HTML list + description took 7.4 seconds.)
- (When outputting a lot of code, it grows to ~13 seconds.)
Overall, these outputs are reasonable, though when I pay close attention they're not as good as GPT-3's answers. But for a purely open, entirely local model, it seems pretty great!