# Bindings

## Precondition

Build target `libchatllm`:

### Windows:

MSVC is assumed.

1. Build target `libchatllm`:

   ```sh
   cmake --build build --config Release --target libchatllm
   ```

1. Copy `libchatllm.dll`, `libchatllm.lib` and `ggml.dll` to `bindings`.

### Linux/MacOS:

1. Build target `libchatllm`:

   ```sh
   cmake --build build --target libchatllm
   ```

## Python

### Command line

Run [chatllm.py](../bindings/chatllm.py) with exactly the same command line options. For example:

* Linux: `python3 chatllm.py -i -m path/to/model`
* Windows: `python chatllm.py -i -m path/to/model`

### Web demo

There is also a [Chatbot](../scripts/chatllm_st.py) powered by [Streamlit](https://streamlit.io/):

![](chatbot_st.png)

To start it:

```sh
streamlit run chatllm_st.py -- -i -m path/to/model
```

Note: the "STOP" function is not implemented yet.

### OpenAI Compatible API

[Here](../scripts/openai_api.py) is a server providing an OpenAI-compatible API. Note that most of the parameters are ignored.

With this, one can start two servers, one for chatting and one for code completion (a base model supporting fill-in-the-middle is required), and set up a fully functional local copilot in Visual Studio Code with the help of tools like [twinny](https://github.com/rjmacarthy/twinny).

`openai_api.py` takes three arguments specifying the models for chatting, code completion and text embedding, respectively. For example, to use the instruction-tuned DeepSeekCoder for chatting and its base model for code completion:

```sh
python openai_api.py path/to/deepseekcoder-1.3b.bin path/to/deepseekcoder-1.3b-base.bin
```

Additional arguments for each model can be specified too. For example:

```sh
python openai_api.py path/to/chat/model path/to/fim/model * --temp 0 --top_k 2 --- --temp 0.8
```

Here, `--temp 0 --top_k 2` is passed to the chatting model, while `--temp 0.8` is passed to the code completion model.

`openai_api.py` uses the `model` name and the API path to select between the chatting and code completion models: when the model name starts or ends with `fim`, or the API path ends with `/generate`, the code completion model is selected; otherwise, the chatting model is selected.

Here is a reference configuration in `twinny`:

![](twinny_cfg.png)

Note that `openai_api.py` is tested to be compatible with the provider `litellm`.

Some models that can be used for code completion:

* DeepSeekCoder: [Coder-Base-1.3B](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base)
* CodeGemma v1.1: [Base-2B](https://huggingface.co/google/codegemma-1.1-2b), [Base-7B](https://huggingface.co/google/codegemma-1.1-7b)
* StarCoder2: [Base-3B](https://huggingface.co/bigcode/starcoder2-3b), [Base-7B](https://huggingface.co/bigcode/starcoder2-7b), [Base-15B](https://huggingface.co/bigcode/starcoder2-15b) (not recommended)

## JavaScript/TypeScript

### Command line

Run [chatllm.ts](../bindings/chatllm.ts) with exactly the same command line options using [Bun](https://bun.sh/):

```shell
bun run chatllm.ts -i -m path/to/model
```

WARNING: Bun [looks buggy on Linux](https://github.com/oven-sh/bun/issues/10242).

## Other Languages

`libchatllm` can be utilized by all languages that can call into dynamic libraries. Take C as an example (a minimal Python FFI sketch is also given after these steps):

* Linux:

  1. Build `bindings/main.c`:

     ```sh
     export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
     gcc main.c libchatllm.so
     ```

  1. Test `a.out` with exactly the same command line options.

* Windows:

  1. Build `bindings\main.c`:

     ```shell
     cl main.c libchatllm.lib
     ```

  1. Test `main.exe` with exactly the same command line options.
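Any language with a foreign-function interface can do the same. Below is a minimal, hypothetical sketch that only loads the shared library from Python via `ctypes`; the file names and locations are assumptions about your build output, and it stops short of calling any exported functions. See [chatllm.py](../bindings/chatllm.py) for the complete, working Python binding.

```python
# Minimal FFI sketch (not the project's official example): it only loads
# libchatllm to show that the shared library is reachable from a host language.
# The file names below are assumptions; adjust them to your build output.
import ctypes
import sys

if sys.platform == "win32":
    lib_path = "./libchatllm.dll"    # copied to `bindings` in the precondition step
elif sys.platform == "darwin":
    lib_path = "./libchatllm.dylib"  # assumed name on macOS
else:
    lib_path = "./libchatllm.so"

lib = ctypes.CDLL(lib_path)          # raises OSError if the library cannot be loaded
print("libchatllm loaded:", lib)
```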