mchennupati/cppproj

A DIY browser interface for interacting with llama locally.

This is a noob repo that displays responses from llama.cpp in the browser. It runs a C++ web server built with cpp-httplib; a POST request from the browser triggers the llama.cpp call with the message. It should give you an idea of how Ollama and others are built on top of llama.cpp's quantization and inference.
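The flow, roughly: the browser POSTs the user's message, the server shells out to the llama.cpp binary with that message as the prompt, and whatever the model prints comes back as the response. Below is a minimal sketch of that idea using cpp-httplib and popen; the /chat endpoint, the model path, and the plain-text response are illustrative assumptions, not the repo's actual server.cpp.

```cpp
// Minimal sketch: browser POST -> shell out to llama.cpp -> return its output.
// Assumptions: cpp-httplib is on the include path, llama.cpp lives one directory up.
#include <array>
#include <cstdio>
#include <memory>
#include <string>
#include "httplib.h"

// Run a shell command and capture its stdout (here: the llama.cpp binary).
static std::string run_command(const std::string& cmd) {
    std::array<char, 4096> buf{};
    std::string out;
    std::unique_ptr<FILE, decltype(&pclose)> pipe(popen(cmd.c_str(), "r"), pclose);
    if (!pipe) return "error: popen failed";
    while (fgets(buf.data(), buf.size(), pipe.get()) != nullptr) out += buf.data();
    return out;
}

int main() {
    httplib::Server svr;

    // The browser sends the user's message in the POST body; we pass it to
    // llama.cpp as the prompt. No escaping is done here, so keep this on localhost.
    svr.Post("/chat", [](const httplib::Request& req, httplib::Response& res) {
        // Adjust this path to your llama.cpp checkout and model file
        // (this is what "change the path in server.cpp" refers to).
        std::string cmd =
            "../llama.cpp/main -m ../llama.cpp/models/llama-2-7b-chat/ggml-model-q4_0.gguf"
            " -n 1024 --prompt \"" + req.body + "\" 2>/dev/null";
        res.set_content(run_command(cmd), "text/plain");
    });

    svr.listen("0.0.0.0", 8000);
}
```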

What might the future look like?
Fine-tuning to update a base model daily over a month probably costs a few hundred thousand bucks (funded jointly, depending on the nature of the fine-tuning, by interested parties, governments, NGOs, etc.). Your connected device would then pull a daily update, a bit like Windows updates ;) and no personal information needs to be handed over just to find out how to bake a red velvet cake...

Anyway... these are the instructions...

  1. Make sure llama.cpp runs on your machine, i.e. this command works: ../llama.cpp/main -m ../llama.cpp/models/llama-2-7b-chat/ggml-model-q4_0.gguf -n 1024 --prompt "". If it does, move on to the next steps.

This was useful: https://vladiliescu.net/running-llama2-on-apple-silicon/

  2. git clone this repo and change line 23 in server.cpp so that it reflects your path.

  3. Hit make and run ./webserver.

  4. You can then access llama in your browser at http://localhost:8000 (a quick command-line check is sketched below).
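If you want to check the server without opening the browser page, here is a hypothetical smoke test, again written against cpp-httplib; the /chat path and plain-text body are assumptions and should be matched to whatever route server.cpp actually registers.

```cpp
// Hypothetical smoke test: POST a prompt to the local server and print the reply.
#include <iostream>
#include "httplib.h"

int main() {
    httplib::Client cli("http://localhost:8000");
    // Send a prompt the same way the browser page does, as a plain-text POST body.
    auto res = cli.Post("/chat", "How do I bake a red velvet cake?", "text/plain");
    if (res && res->status == 200) {
        std::cout << res->body << std::endl;   // llama.cpp's generated text
    } else {
        std::cerr << "request failed" << std::endl;
    }
    return 0;
}
```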

