
1. Getting Started

Brian Dashore edited this page Apr 28, 2024 · 13 revisions

1a. Prerequisites

To get started, make sure you have the following installed on your system:

  • Python 3.x (preferably 3.11) with pip

    • Do NOT install Python from the Microsoft Store! Store installs are known to cause issues with pip.

    • Alternatively, you can use Miniconda if it's already present on your system.

Note

You can install Miniconda3 on your system, which gives you both Python and conda!
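To confirm the prerequisites are in place, you can probe the interpreter and pip from a terminal (a quick sketch; it assumes `python3` is on your PATH — on Windows the command may be `python` or `py` instead):

```shell
# Verify that a Python 3.x interpreter and pip are available
python3 --version
python3 -m pip --version
```

If either command fails, revisit the Python install before continuing.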

Warning

CUDA and ROCm aren't prerequisites because the torch wheels bundle the runtime libraries for you. However, if this doesn't work (e.g. a DLL load failed error), install the CUDA toolkit or ROCm on your system.
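After torch is installed, a quick way to check whether the GPU runtime loaded is to ask torch directly (a hedged sketch; the probe prints a notice instead of failing if torch isn't installed yet):

```shell
# Probe whether torch can see a CUDA/ROCm device; safe to run even
# before torch is installed (it just reports the missing import)
python3 - <<'EOF'
try:
    import torch
    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
except ImportError:
    print("torch is not installed in this environment yet")
EOF
```

A `False` here (or a DLL load error) is the signal to install the CUDA toolkit or ROCm system-wide.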

1b. Installing

For Beginners

  1. Clone this repository to your machine: git clone https://github.com/theroyallab/tabbyAPI

  2. Navigate to the project directory: cd tabbyAPI

  3. Run the appropriate start script (start.bat for Windows and start.sh for Linux).

    1. Follow the on-screen instructions and select the correct GPU library.

    2. Assuming that the prerequisites are installed and can be located, a virtual environment will be created for you and dependencies will be installed.

  4. The API should start with no model loaded.

For Advanced Users

Note

TabbyAPI recently switched to a pyproject.toml-based install. These instructions may look different than before.

  1. Follow steps 1-2 in the For Beginners section

  2. Create a Python environment through venv:

    1. python -m venv venv

    2. Activate the venv

      1. On Windows: .\venv\Scripts\activate

      2. On Linux: source venv/bin/activate

  3. Install the requirements file based on your system:

    1. CUDA 12.x: pip install -U .[cu121]

    2. CUDA 11.8: pip install -U .[cu118]

    3. ROCm 5.6: pip install -U .[amd]

  4. Start the API in one of two ways:

    1. Run start.bat/start.sh. The script will check if you're in a conda environment and skip the venv checks.

    2. Run python main.py. This won't automatically upgrade your dependencies.
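The venv steps above can be sketched end to end (Linux shell shown; the install line is guarded so it only runs from a checkout containing pyproject.toml, cu121 is just one example extra, and quoting the extra keeps zsh from globbing the brackets):

```shell
# Create and activate a virtual environment for TabbyAPI
python3 -m venv venv
. venv/bin/activate                  # Windows: .\venv\Scripts\activate

# Install with the extra matching your hardware: cu121, cu118, or amd.
# Guarded so it only runs from inside the tabbyAPI checkout.
if [ -f pyproject.toml ]; then
    pip install -U ".[cu121]"
fi
```

Re-activate the venv in every new terminal session before running python main.py.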

1c. Configuration

Launching the bare API may not match your use case. Therefore, a config.yml exists to tune initial launch parameters and other configuration options.

A config.yml file is required for overriding project defaults. If you are okay with the defaults, you don't need a config file!

If you do want a config file, copy over config_sample.yml to config.yml. All the fields are commented, so make sure to read the descriptions and comment out or remove fields that you don't need.
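If you only override a couple of defaults, the file can stay short. The fragment below is illustrative only — the field names are hypothetical placeholders, so check the comments in config_sample.yml for the real option names and defaults:

```yml
# Hypothetical excerpt: keep only the fields you actually want to override
network:
  host: 127.0.0.1
  port: 5000
model:
  model_name: my-favorite-model
```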

You can also access the configuration parameters under 2. Configuration in this wiki!

1d. Where next?

  1. Take a look at the API documentation

  2. Get started with a loader: SillyTavern, Gradio WebUI

1e. Updating

There are a couple ways to update TabbyAPI:

  1. Start scripts - Use start.bat or start.sh. The dependencies will automatically be updated.

    1. To ignore automatic dependency updates, pass the --ignore-upgrade flag when invoking the script.
  2. Manual - Install the requirements file and update dependencies depending on your GPU:

    1. CUDA 12.x: pip install -U .[cu121]

    2. CUDA 11.8: pip install -U .[cu118]

    3. ROCm 5.6: pip install -U .[amd]

If you don't want to update dependencies that come from wheels (torch, exllamav2, and flash attention 2), use requirements-nowheel.txt or pass the --nowheel flag when invoking the start scripts.

Update Exllamav2

Warning

These instructions are meant for advanced users.

Important

If you're installing a custom Exllamav2 wheel, make sure to use pip install . when updating! Otherwise, each update will overwrite your custom exllamav2 version.

Note

  • TabbyAPI enforces the latest Exllamav2 version for compatibility purposes.

  • Any upgrades using a requirements file will result in overwriting your installed wheel.

    • To fix this, change the feature in pyproject.toml locally, create an issue or PR, or install your version of exllamav2 after upgrades.
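Before and after an upgrade, it can be worth confirming which exllamav2 build is actually installed, so you notice if an update silently replaced your custom wheel (a small sketch; it prints a notice if the package is absent):

```shell
# Show the installed exllamav2 version, or note that it's missing
python3 -m pip show exllamav2 2>/dev/null | grep -i '^version' \
    || echo "exllamav2 is not installed"
```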

Here are ways to install exllamav2:

  1. From a wheel/release (Recommended)

    1. Find the version that corresponds to your CUDA and Python versions. For example, a wheel tagged cu121 and cp311 corresponds to CUDA 12.1 and Python 3.11.
  2. From pip: pip install exllamav2

    1. This is a JIT-compiled extension, which means that the initial launch of TabbyAPI will take some time. The build may also fail due to an improper environment configuration.
  3. From source

1f. Other installation methods

These are short-form instructions for other methods that users can use to install TabbyAPI.

Warning

Using methods other than venv may not play nice with startup scripts. Using these methods indicates that you're an advanced user and know what you're doing.

Conda

  1. Install Miniconda3 with Python 3.11 as your base Python

  2. Create a new conda environment: conda create -n tabbyAPI python=3.11

  3. Activate the conda environment: conda activate tabbyAPI

  4. Install optional dependencies if they aren't present

    1. CUDA via

      1. CUDA 12 - conda install -c "nvidia/label/cuda-12.2.2" cuda

      2. CUDA 11.8 - conda install -c "nvidia/label/cuda-11.8.0" cuda

    2. Git via conda install -k git

  5. Clone TabbyAPI via git clone https://github.com/theroyallab/tabbyAPI

  6. Continue installation steps from:

    1. For Beginners - Step 3. The start scripts detect if you're in a conda environment and skip the venv check.

    2. For Advanced Users - Step 3

Docker

  1. Install Docker and Docker Compose from the docs

  2. Install the Nvidia container compatibility layer

    1. For Linux: Nvidia container toolkit

    2. For Windows: CUDA Toolkit on WSL

  3. Clone TabbyAPI via git clone https://github.com/theroyallab/tabbyAPI

  4. Enter the tabbyAPI directory: cd tabbyAPI

  5. Run docker compose up