Working URL: https://chatdemo.talhaanwar.com/
Welcome to the Streamlit Chatbot with Memory using Llama-2-7B-Chat (Quantized GGML) repository! This project aims to provide a simple yet efficient chatbot that can run on a CPU-only, low-resource Virtual Private Server (VPS). The chatbot is powered by the Llama-2-7B-Chat model, which has been quantized for better performance in resource-constrained environments. The web application is built with Streamlit, making it user-friendly and easy to interact with.
The main inspiration for this project comes from the need to have a chatbot capable of maintaining some context or memory during conversations, making the chat experience more natural and engaging for users. With the quantized GGML version of the Llama-2-7B-Chat model, we can leverage powerful language generation capabilities without the need for specialized hardware.
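How that memory works in practice is straightforward: Streamlit reruns the script on every interaction, so the conversation must be stored in `st.session_state` to survive reruns. Below is a minimal sketch of that pattern (illustrative only; it assumes a recent Streamlit with `st.chat_input`/`st.chat_message` and is not necessarily the exact code in `app.py`):

```python
import streamlit as st

# Streamlit reruns this script on every interaction, so keep the
# conversation in session_state to make it survive reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay the stored history so the full conversation stays visible.
for msg in st.session_state.messages:
    st.chat_message(msg["role"]).write(msg["content"])

# Append each new user message to the shared history.
if prompt := st.chat_input("Say something"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    st.chat_message("user").write(prompt)
```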
The repository contains all the necessary code and files to set up and run the Streamlit Chatbot with Memory using the Llama-2-7B-Chat model. Here's a brief overview of the key components:
- `app.py`: The Streamlit web application code that allows users to interact with the chatbot through a simple user interface.
- `llama-2-7b-chat.ggmlv3.q2_K.bin`: Quantized model weights provided by TheBloke. They can be downloaded manually or via wget (a loading sketch follows this list):

  ```bash
  wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q2_K.bin
  ```

- `Dockerfile`: The Dockerfile used to containerize the application for easy deployment and management.
- `docker-compose.yml`: A Docker Compose file that simplifies deploying the chatbot with memory as a Docker container.
- `requirements.txt`: A list of the Python dependencies required to run the application.
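For reference, here is one way the quantized GGML weights can be loaded and queried on a CPU. This is a sketch using the `ctransformers` package; the repository itself may load the model differently:

```python
from ctransformers import AutoModelForCausalLM

# Load the quantized GGML weights on CPU; model_type="llama" tells
# ctransformers which architecture the file contains. The path assumes
# the weights were downloaded into the working directory.
llm = AutoModelForCausalLM.from_pretrained(
    "llama-2-7b-chat.ggmlv3.q2_K.bin",
    model_type="llama",
)

# Generate a short completion to verify the model loads correctly.
print(llm("Q: Why quantize a language model? A:", max_new_tokens=64))
```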
Before running the Streamlit Chatbot with Memory, you need to have the following installed:
- Python (version 3.6 or higher)
- Docker (if you plan to use the Docker Compose setup)
To set up the chatbot locally, follow these steps:
- Clone this repository to your local machine using Git:

  ```bash
  git clone https://github.com/talhaanwarch/streamlit-llama.git
  cd streamlit-llama
  ```

- Create a virtual environment (optional but recommended) and activate it:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

- Install the required Python packages:

  ```bash
  pip install -r requirements.txt
  ```
To run the Streamlit Chatbot with Memory, execute the following command:
```bash
streamlit run app.py
```

This will start the Streamlit server, and you can access the chatbot interface by opening your web browser and navigating to http://localhost:8501.
Simply type your messages in the input box and press "Enter" to send them to the chatbot. The chatbot will respond based on the context of the conversation, thanks to its memory capabilities.
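To give the model that context, the stored history has to be folded into the prompt on each turn. A hypothetical helper (the repository's actual prompt template may differ) could look like this:

```python
def build_prompt(messages, max_turns=5):
    """Fold the most recent turns into one prompt string so the model
    sees prior context. Illustrative only: build_prompt is a
    hypothetical helper, not part of the repository."""
    history = messages[-2 * max_turns:]  # cap context to bound CPU cost
    lines = [f"{m['role'].capitalize()}: {m['content']}" for m in history]
    lines.append("Assistant:")  # cue the model to answer next
    return "\n".join(lines)
```

Capping the history keeps the prompt short, which matters on a CPU-only VPS where generation time grows with prompt length.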
If you prefer to deploy the chatbot using Docker Compose, follow these steps:
- Make sure you have Docker and Docker Compose installed on your system.
- Start the container using Docker Compose:

  ```bash
  docker-compose up -d
  ```

The chatbot will be accessible at http://localhost:8501 in your web browser.
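For orientation, a minimal compose file for this kind of setup might look like the following (an illustrative sketch; the service name, build context, and volume path are assumptions, and the repository's actual docker-compose.yml may differ):

```yaml
version: "3.8"
services:
  chatbot:
    build: .          # build the image from the repository's Dockerfile
    ports:
      - "8501:8501"   # expose Streamlit's default port
    volumes:
      # Mount the downloaded weights instead of baking them into the image.
      - ./llama-2-7b-chat.ggmlv3.q2_K.bin:/app/llama-2-7b-chat.ggmlv3.q2_K.bin
```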
- Streamlit Chatbot code credits go to Data Professor. Their original repository can be found here.
- Quantized GGML version of Llama-2-7B-Chat credits go to TheBloke.
Thank you for your interest in this project. Feel free to contribute, report issues, or provide feedback. Happy chatting!