forked from pepkit/pephub

A web API and database for biological sample metadata

nleroy917/pephub

pephub

pephub is a biological metadata server that lets you view, store, and share your sample metadata in the form of PEPs. It acts as a database to store PEPs, an API to programmatically read and write PEPs, and a user interface to view and manage the PEPs in the database.

Setup

Already have everything set up? Skip to running pephub. Two things are required to run pephub: 1) a pephub database, and 2) the pephub server.

1. Database Setup

pephub is backed by a postgres database to store PEPs. You can easily create a new pephub-compatible postgres instance locally:

sh setup_db.sh
docker pull postgres
docker build -t pephub_db postgres/
docker run -p 5432:5432 pephub_db

You should now have a pephub-compatible postgres instance running at http://localhost:5432.
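To confirm that the database container is actually accepting connections, a quick TCP check can help. The following is a minimal sketch using only the Python standard library; the helper name `port_is_open` is made up for illustration and is not part of pephub. It only verifies that something is listening on the port, not that it is a correctly initialized pephub database.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumes the container from the step above publishes port 5432 on localhost.
    print("postgres reachable:", port_is_open("localhost", 5432))
```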

Have PEPs you want to load? We have provided a convenient script to load a directory of PEPs into the database.

2. pephub Server Setup

Install dependencies using pip (we suggest using a virtual environment):

python -m venv venv && source venv/bin/activate
pip install -r requirements/requirements-all.txt

3. (Optional) GitHub Authentication Client Setup

pephub uses GitHub for namespacing and authentication. As such, a GitHub application capable of logging in users is required. We've included instructions for setting this up locally using your own GitHub account.

4. (Optional) Vector Database Setup

We've added semantic-search capabilities to pephub. Optionally, you may host an instance of the qdrant vector database to store embeddings computed using a sentence transformer that has mined and processed any relevant metadata from PEPs. If no qdrant connection settings are supplied, pephub will default to SQL search. Read more here. To run qdrant locally, simply run the following:

docker pull qdrant/qdrant
docker run -p 6333:6333 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant
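The fallback described above (SQL search when no qdrant connection settings are supplied) can be pictured roughly as follows. This is only an illustrative sketch: the variable name `QDRANT_HOST` and the function `choose_search_backend` are assumptions made for the example, not names confirmed by this README, so check the server configuration docs for the real settings.

```python
import os


def choose_search_backend(env=None):
    """Pick "qdrant" when a qdrant host is configured, otherwise "sql"."""
    env = os.environ if env is None else env
    # QDRANT_HOST is a hypothetical setting name used for illustration only.
    return "qdrant" if env.get("QDRANT_HOST") else "sql"


if __name__ == "__main__":
    print("search backend:", choose_search_backend())
```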

Running

pephub can be run in a few different ways. Regardless of how you run it, however, pephub requires a number of configuration parameters to function. Configuration settings are supplied to pephub through environment variables to allow for flexible development and deployment. The following settings are required to run pephub. While pephub has built-in defaults for these settings, you should provide them yourself to ensure compatibility:

  • POSTGRES_HOST: The hostname of the PEPhub database server
  • POSTGRES_DB: The name of the database inside the postgres server
  • POSTGRES_USER: Username for the database
  • POSTGRES_PASSWORD: Password for the user
  • POSTGRES_PORT: Port for postgres database
  • GH_CLIENT_ID: Client ID for the GitHub application that authenticates users
  • GH_CLIENT_SECRET: Client secret for the GitHub application that authenticates users
  • BASE_URI: The base URI of the PEPhub server (e.g. localhost:8000)
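Before starting the server, it can be handy to check that every setting in the list above is present in your environment. Here is a small sketch of such a check; the setting names come from the list, while the function `missing_settings` is a hypothetical helper and not something pephub ships.

```python
import os

# Setting names taken from the list above.
REQUIRED_SETTINGS = (
    "POSTGRES_HOST",
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_PORT",
    "GH_CLIENT_ID",
    "GH_CLIENT_SECRET",
    "BASE_URI",
)


def missing_settings(env=None):
    """Return the required names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```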

You must set these environment variables prior to running PEPhub. We've provided env files inside the environment/ directory, which you may source to load your environment. Alternatively, you may store them locally in a .env file. This file is loaded and exported to your environment when the server starts up. We've included an example .env file with this repository.
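If you want to inspect what a .env file would contribute before launching anything, a simplified parser is enough. The sketch below is not how pephub itself loads the file (that mechanism is not shown here); it assumes plain `KEY=VALUE` lines, optionally prefixed with `export`, and ignores the more advanced syntax that dedicated dotenv libraries support.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Tolerate shell-style "export KEY=VALUE" lines.
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not an assignment; ignore it.
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


if __name__ == "__main__":
    sample = ["# database", "POSTGRES_HOST=localhost", "export POSTGRES_PORT=5432"]
    print(parse_env_lines(sample))
```

To use it on a real file, pass an open file handle, for example `parse_env_lines(open(".env"))`.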

You can read more about server settings and configuration here.

Development:

PEPhub consists of a FastAPI backend and a React frontend. To get started with development, there are three things you need to do:

1. Ensure database is set up and running.
See here if you have not set up a database.

2. Start pephub.
You can run pephub natively using the following:

uvicorn pephub.main:app --reload

pephub should now be running at http://localhost:8000.

3. Start the React development server:

Important: To make the development server work, you must include a .env.local file inside web/ with the following contents:

VITE_API_HOST=http://localhost:8000

This ensures that the frontend development server will proxy requests to the backend server. You can now run the frontend development server:

cd web
npm install # yarn install
npm start # yarn dev

The pephub frontend development server should now be running at http://localhost:5173/.

Running with docker:

Option 1. Standalone docker:

If you already have a public database instance running, you can choose to build and run the server container only. A note to Apple Silicon (M1/M2) users: If you have issues running, try setting your default docker platform with export DOCKER_DEFAULT_PLATFORM=linux/amd64 to get the container to build and run properly. See this issue for more information.

1. Environment: Ensure that you have your environment properly configured. To manage secrets in your environment, we leverage pass and curated .env files. You can use our launch_docker.sh script to start your container with these .env files.

2. Build and start container:

docker build -t pephub .
./launch_docker.sh

Alternatively, you can inject your environment variables one-by-one:


docker run -p 8000:8000 \
 -e POSTGRES_HOST=localhost \
 -e POSTGRES_DB=pep-db \
 ...
pephub
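Typing many `-e` flags by hand is error-prone, so the command above can also be assembled programmatically. The sketch below builds the argument list from a dictionary of settings; the helper `docker_run_command` is hypothetical, and the image name `pephub` and port 8000 are simply taken from the commands in this section.

```python
import shlex


def docker_run_command(env, image="pephub", port=8000):
    """Build a `docker run` argument list with one -e flag per setting."""
    args = ["docker", "run", "-p", f"{port}:{port}"]
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    example = {"POSTGRES_HOST": "localhost", "POSTGRES_DB": "pep-db"}
    # shlex.join quotes values that contain spaces or shell metacharacters.
    print(shlex.join(docker_run_command(example)))
```

Returning a list rather than a string means the result can be passed straight to `subprocess.run` without shell quoting concerns.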

Or, provide your own .env file:


docker run -p 8000:8000 \
 --env-file path/to/.env \
 pephub

Option 2. docker compose:

The server has been Dockerized and packaged with a postgres image to be run with docker compose. This lets you run everything at once and develop without having to manage database instances. The docker-compose.yaml file is written such that it mounts the database storage info to a folder called postgres/ at the root of the repository. This lets you load the database once and have it persist its state after restarting the container.

You can start a development environment in a few steps:

1. Obtain the latest database schema:

sh setup_db.sh

2. Curate your environment: Since we are running in docker, we need to supply environment variables to the container. The docker-compose.yaml file is written such that you can supply a .env file at the root with your configurations. See the example env file for reference. See here for a detailed explanation of all configurable server settings. For now, you can simply copy the env file:

cp environment/template.env .env

3. Build and start the containers: If you are running on Apple Silicon (M1/M2), you will need to set the following env variable prior to running docker compose:

export DOCKER_DEFAULT_PLATFORM=linux/amd64
docker compose up --build

pephub now runs/listens on http://localhost:8000
postgres now runs/listens on http://localhost:5432

4. Use the load_db script to populate the database with examples/:

cd scripts
python load_db.py \
--username docker \
--password password \
--database pephub \
../examples

Note: If you wish to run the development environment with a public database, curate your .env file as such.
