Chat with MLX is a high-performance macOS application that connects your local documents to a personalized large language model (LLM). By leveraging retrieval-augmented generation (RAG), open-source LLMs, and MLX for accelerated machine learning on Apple silicon, you can efficiently search, query, and interact with your documents without any information ever leaving your device.
Our high-level features include:
- Query: load and search with document-specific prompts
- Converse: switch model interaction modes (converse vs. assist) in real time
- Instruct: provide personalization and response tuning
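
To make the RAG flow above concrete, here is a minimal, hedged sketch of the retrieval step using only the Python standard library. It is not the app's actual pipeline: Chat with MLX uses learned embeddings and an MLX-served LLM, whereas this toy stands in with bag-of-words cosine similarity, and the chunk texts and function names are illustrative only.

```python
# Toy sketch of RAG retrieval: rank document chunks by similarity to a
# query, then splice the best chunk into the LLM prompt as context.
# Bag-of-words cosine similarity stands in for a real embedding model.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Bag-of-words term counts (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(qv, vectorize(c)), reverse=True)
    return ranked[:k]

# Hypothetical document chunks; in the app these come from your loaded files.
chunks = [
    "Invoices are archived under finance/2023.",
    "MLX accelerates machine learning on Apple silicon.",
    "The cafeteria menu changes every Monday.",
]
context = retrieve("How does MLX speed up ML on Apple hardware?", chunks)
prompt = f"Context: {context[0]}\n\nQuestion: How does MLX speed up ML?"
```

The retrieved chunk is prepended to the prompt, so the model answers from your documents rather than from its training data alone; everything here runs locally.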
First, set up a Hugging Face access token to download models (request access to google/gemma-7b-it), then log in:

```shell
huggingface-cli login
```
Then install the npm and Python dependencies:

```shell
cd app && npm install
pip install -r server/requirements.txt
```
Finally, start the application:

```shell
cd app && npm run dev
```
All contributions are welcome. Please take a look at the contributing guide.