batmen-lab/BioMANIA

BioMANIA

Welcome to BioMANIA! This guide provides detailed instructions on how to set up, run, and interact with the BioMANIA chatbot interface, which connects with various APIs to deliver information across numerous libraries and frameworks.

Project Overview:

🌟 We warmly invite you to share your trained models and datasets in our issues section, making it easier for others to utilize and extend your work, thus amplifying its impact. Feel free to explore and provide feedback on tools shared by other contributors as well! 🚀🔍

🤗 If you encounter any problems during your exploration, please refer to the Q&A section, and feel free to open issues for discussion! 🧐 👨‍💻

Video demo

Our demonstration shows how to use the chatbot to work with scanpy and squidpy in a single conversation, including loading data, invoking functions for analysis, and presenting outputs in the form of code, images, and tables.

We also offer a command-line interface (CLI) demo through the terminal.

We also offer a GPTs demo (under development).

Web access online demo

We provide a Colab demo and an online demo hosted on our server!

Quick start

We provide several ways to run the service: terminal CLI, Docker, Railway, Python script, and Colab demo. Among these, the terminal CLI is the easiest way to start.

Setup dataset and models

# setup the environment
pip install git+https://github.com/batmen-lab/BioMANIA.git  --index-url https://pypi.org/simple
# setup OPENAI_API_KEY
echo 'OPENAI_API_KEY="sk-proj-xxxx"' >> .env
# (optional) setup github token
echo "GITHUB_TOKEN=your_github_token" >> .env
# download the data, retriever, and resources from Google Drive, and put them in
# - data/standard_process/{LIB},
# - hugging_models/retriever_model_finetuned/{LIB}, and
# - ../../resources/
pip install gdown==5.1.0
gdown https://drive.google.com/uc?id=1nT28pIJ_dsdvi2yD8ffWt_aePXsSWdqI
sh download_data_model.sh
# setup the PYTHONPATH
export PYTHONPATH=$PYTHONPATH:$(pwd)

Run with terminal CLI or gradio app

# CLI service quick start!
python -m BioMANIA.deploy.cli_demo
# or the Gradio app (TODO 240509: image display is still under development)
python -m BioMANIA.deploy.cli_gradio

Run with Docker

For ease of use, we provide Docker images for several tools. See Docker Hub for the detailed list of tools.

# Pull back-end service and front-end UI service with:
docker pull chatbotuibiomania/biomania-together:v1.1.9-${LIB}-cuda12.1-ubuntu22.04

Start the service with:

# run on gpu
docker run -e LIB=${LIB} -e OPENAI_API_KEY=[your_openai_api_key] --gpus all -d -p 3000:3000 chatbotuibiomania/biomania-together:v1.1.9-${LIB}-cuda12.1-ubuntu22.04
# or on cpu
docker run -e LIB=${LIB} -e OPENAI_API_KEY=[your_openai_api_key] -d -p 3000:3000 chatbotuibiomania/biomania-together:v1.1.9-${LIB}-cuda12.1-ubuntu22.04

Then check the UI service at http://localhost:3000/en.
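
The GPU and CPU commands above differ only in the --gpus all flag. As a minimal sketch, the command can be assembled in Python so that the library name and key are filled in consistently. The helper name docker_run_command is hypothetical and not part of BioMANIA, and the example assumes that an image exists for the library you pass in (here scanpy) under the tag pattern shown above.

```python
import shlex


def docker_run_command(lib, api_key, gpu=True, version="v1.1.9"):
    """Build the `docker run` argument list shown above for one library."""
    image = (
        f"chatbotuibiomania/biomania-together:"
        f"{version}-{lib}-cuda12.1-ubuntu22.04"
    )
    cmd = ["docker", "run", "-e", f"LIB={lib}", "-e", f"OPENAI_API_KEY={api_key}"]
    if gpu:
        cmd += ["--gpus", "all"]  # leave this out on CPU-only hosts
    cmd += ["-d", "-p", "3000:3000", image]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_command("scanpy", "sk-proj-xxxx", gpu=False)))
```

The returned list can be passed to subprocess.run once you are satisfied with the printed command.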

Important Tips for Running Docker Without Bugs:

  • To run Docker on a GPU, install nvidia-docker and the NVIDIA Container Toolkit. Run docker info | grep "Default Runtime" to check whether your machine can run Docker with GPU support.
  • Feel free to adjust the CUDA image version inside the Dockerfile so that it is compatible with your device.

You may want to run the service on a server and view it locally. To do so, start the ngrok service by running this command on your server:

ngrok http 3000

Then open the resulting URL, which looks like https://[ngrok_id].ngrok-free.app, in your browser to start!

Run with Railway

Deploy on Railway

To use Railway, fill in the OpenAI_API_KEY on the Variables page of the biomania-backend service. Then manually enable Public Domain in the Settings/Networking section for both the front-end and back-end services. Copy the back-end URL, write it as https://[copied url], and paste it into BACKEND_URL on the front-end Variables page. Finally, open the front-end URL in your browser to access the front end.

Run with script

This section is for users who want more flexibility and prefer to set things up themselves.

We take scanpy as an example. Detailed library support information can be found in the Q&A.

Setting up the environment

To prepare your environment for the BioMANIA project, follow these steps:

  1. Clone the repository and install dependencies:
git clone https://github.com/batmen-lab/BioMANIA.git
cd BioMANIA
conda create -n biomania python=3.10
conda activate biomania
pip install -r requirements.txt --index-url https://pypi.org/simple
export PYTHONPATH=$PYTHONPATH:$(pwd)
  2. Set up your OpenAI API key in the BioMANIA/.env file.
echo 'OPENAI_API_KEY="sk-proj-xxxx"' >> .env
  • For inference purposes, a standard OpenAI API key is sufficient.
  • Instruction generation and GPT API predictions may hit the rate limit, so a paid OpenAI account is required if you intend to use them.
  • To use a different model, switch to model_name='gpt-3.5-turbo-0125' or 'gpt-4-0125-preview' in src/models/model.py.
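
Before starting the services, it can help to confirm that the key actually landed in the .env file. The following is a minimal sketch, not part of BioMANIA: read_env and has_openai_key are hypothetical helpers that parse simple KEY=VALUE lines with optional surrounding quotes, which is the form the echo command above writes.

```python
from pathlib import Path


def read_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Blank lines and comment lines are skipped, and one pair of
    surrounding quotes is stripped from each value.
    """
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def has_openai_key(path=".env"):
    """Return True if the file exists and holds a non-empty OPENAI_API_KEY."""
    if not Path(path).exists():
        return False
    return bool(read_env(path).get("OPENAI_API_KEY"))
```

Calling has_openai_key() from the BioMANIA/ folder returns False when the file or the key is missing.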

Prepare for Data and Model

Download the necessary data and models from our Google Drive link. You only need to download the data for the libraries you plan to use.

We provide a script that downloads the models and data from Google Drive, using scanpy as an example. It requires access to Google.

gdown https://drive.google.com/uc?id=1nT28pIJ_dsdvi2yD8ffWt_aePXsSWdqI
sh download_data_model.sh

Organize the downloaded files under BioMANIA/data and BioMANIA/hugging_models as follows (the base folder is required):

data
├── conversations
├── others-data
└── standard_process
    ├── base
    │   ├── API_composite.json
    │   └── ...
    ├── scanpy
    │   ├── API_composite.json
    │   └── ...
    ├── {LIB}
    │   ├── API_composite.json
    │   └── ...
    └── ...

hugging_models
└── retriever_model_finetuned
    ├── {LIB}
    └── ...

../../resources

Once these steps are complete, all the data and models required by the project are in place.
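
The layout above can be checked programmatically before launching anything. This is a minimal sketch under the assumption that the tree shown is exactly what is required for one library. The function name missing_paths is hypothetical, and only the entries that appear explicitly in the tree are checked.

```python
from pathlib import Path


def missing_paths(root, lib):
    """Return the expected entries for `lib` that are absent under `root`."""
    root = Path(root)
    expected = [
        # the base folder is always required
        root / "data" / "standard_process" / "base" / "API_composite.json",
        root / "data" / "standard_process" / lib / "API_composite.json",
        root / "hugging_models" / "retriever_model_finetuned" / lib,
    ]
    return [str(p.relative_to(root)) for p in expected if not p.exists()]


if __name__ == "__main__":
    # Run from the BioMANIA/ folder; an empty list means nothing is missing.
    print(missing_paths(".", "scanpy"))
```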

We also offer some demo chats, which you can find in ./examples. Note that these demo chats are converted from the Read the Docs tutorials of the PyPI packages. You can find the original tutorial links in tutorial_links.txt.

Prepare for front-end UI service

The front-end UI is compatible with Node.js version 19.

# Under folder BioMANIA/chatbot_ui_biomania
npm install && npm run build

Inference with pretrained models

Start both services for back-end and front-end UI with:

# Under folder `BioMANIA/`
# backend, in one terminal
python -m src.deploy.inference_dialog_server
# frontend, in another terminal
cd chatbot_ui_biomania/
npm run dev 

Your chatbot server is now running at http://localhost:3000/en, ready to process user queries.

When you select a different library on the UI page, the retriever's path is changed automatically based on the library selected.
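
The front end can take a moment to come up after npm run dev. As a minimal sketch using only the Python standard library, the following polls the URL given above until it answers. The function name wait_for_service and its default attempt count and delay are assumptions chosen for illustration, not part of BioMANIA.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url="http://localhost:3000/en", attempts=10, delay=1.0):
    """Poll `url` until the server answers, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError as err:
            if err.code < 500:
                return True  # the server is up even if this page is an error page
        except (urllib.error.URLError, OSError):
            pass  # nothing is listening yet
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    print("UI reachable:", wait_for_service())
```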

DIY

For users who wish to customize functionality more deeply, we provide an example script that interacts with the BioMANIA library directly from Python. In this example, users can

  • switch the initially loaded library
  • change the LLM type to either an Ollama-supported model, e.g. llama3, or an OpenAI-supported model, e.g. gpt-3.5-turbo
  • manage the conversation state, either continuing a previously saved session or starting a new conversation

This method is particularly suited for developers and researchers who want to quickly adjust and test different data processing strategies based on specific research needs.
# under BioMANIA/
from src.deploy.model import Model
conversation_started = True
model = Model(logger=None, device='cpu', model_llm_type='llama3')
user_input = "Could you load the built in dataset?"
library = "scanpy"
# for the first turn of a dialog, use conversation_started=True, then use conversation_started=False for the following dialogs
# if you want to use previous session, use the same session_id as before and conversation_started = False
model.run_pipeline(user_input, library, top_k=1, files=[], conversation_started=conversation_started, session_id="")
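
For multi-turn use, the conversation_started flag has to flip from True to False after the first call. The following is a minimal sketch of one way to track that. DialogSession is a hypothetical wrapper, not part of BioMANIA, and it assumes only that the model object exposes run_pipeline with the keyword arguments used in the script above.

```python
class DialogSession:
    """Track first-turn state across repeated calls to `run_pipeline`."""

    def __init__(self, model, library, session_id="", resume=False):
        self.model = model
        self.library = library
        self.session_id = session_id
        # When resuming a saved session, no turn starts a new conversation.
        self._first_turn = not resume

    def ask(self, user_input, files=None, top_k=1):
        # Only the first turn of a fresh dialog sets conversation_started=True.
        started = self._first_turn
        self._first_turn = False
        return self.model.run_pipeline(
            user_input,
            self.library,
            top_k=top_k,
            files=files or [],
            conversation_started=started,
            session_id=self.session_id,
        )
```

With the model object from the script above, DialogSession(model, "scanpy").ask("Could you load the built in dataset?") would make the first call, and later ask calls on the same object would continue the dialog.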

Build your APP!

Please refer to the separate README for tutorials on converting different coding tools into our APP.

Share your APP!

If you want to share your pretrained APP with others, there are two ways.

Share docker

You can build a Docker image, push it to Docker Hub, and share the image URL in our issues. To set up the environment for your tool, refer to BioMANIA/docker_utils/{LIB}/ to add the env files, or modify the Dockerfile to build your environment.

# cd BioMANIA
docker build --build-arg LIB=[your_tool_name] -t [docker_image_name] -f Dockerfile ./
# (optional)push to docker
docker push [your_docker_repo]/[docker_image_name]:[tag]

Note that if you want to include data inside the Docker image, you need to modify the Dockerfile carefully to copy the folders to /app. Also add the PyPI or Git pip install URL of your tool to requirements.txt before building the image.

Share data/models

Alternatively, you can share your data and hugging_models folders and logo image via a drive link in our issues.

Reference and Acknowledgments

We extend our gratitude to the following references:

Thank you for choosing BioMANIA. We hope this guide assists you in navigating through our project with ease.

Version History

  • v1.1.10 (2024-04-21)
    • Add git installation, basic API documentation, and PyPI packaging support.
    • Add basic pytest cases.
    • Add terminal CLI and Colab demo, with their video demos.
    • Set up and simplify the process through PyPI installation!

See version_history for more details!

Citation

Please cite our paper if you find our data, model, or code useful.

@article{dong2023biomania,
  title={BioMANIA: Simplifying bioinformatics data analysis through conversation},
  author={Dong, Zhengyuan and Zhong, Victor and Lu, Yang},
  journal={bioRxiv},
  pages={2023--10},
  year={2023},
  publisher={Cold Spring Harbor Laboratory}
}