![GitHub](https://img.shields.io/github/license/heartexlabs/label-studio?logo=heartex) ![label-studio:build](https://github.com/heartexlabs/label-studio/workflows/label-studio:build/badge.svg) ![GitHub release](https://img.shields.io/github/v/release/heartexlabs/label-studio?include_prereleases)

[Website](https://labelstud.io/) • [Docs](https://labelstud.io/guide/) • [Twitter](https://twitter.com/heartexlabs) • [Join Slack Community](http://slack.labelstud.io.s3-website-us-east-1.amazonaws.com?source=github-1)

## What is Label Studio?

Label Studio is an open source data labeling tool. It lets you label data such as audio, text, images, videos, and time series in a simple, straightforward UI and export the results to various model formats. You can use it to prepare raw data or to improve existing training data and get more accurate ML models.

- [Try out Label Studio](#try-out-label-studio)
- [What you get from Label Studio](#what-you-get-from-label-studio)
- [Included templates for labeling data in Label Studio](#included-templates-for-labeling-data-in-label-studio)
- [Set up machine learning models with Label Studio](#set-up-machine-learning-models-with-label-studio)
- [Integrate Label Studio with your existing tools](#integrate-label-studio-with-your-existing-tools)

![Gif of Label Studio annotating different types of data](https://raw.githubusercontent.com/heartexlabs/label-studio/master/images/annotation_examples.gif)

Have a custom dataset? You can customize Label Studio to fit your needs. Read an [introductory blog post](https://towardsdatascience.com/introducing-label-studio-a-swiss-army-knife-of-data-labeling-140c1be92881) to learn more.

## Try out Label Studio

Try out Label Studio in a **[running app](https://app.labelstud.io)**, install it locally, or deploy it in a cloud instance.

- [Install locally with Docker](#install-locally-with-docker)
- [Run with Docker Compose (Label Studio + Nginx + PostgreSQL)](#run-with-docker-compose)
- [Install locally with pip](#install-locally-with-pip)
- [Install locally with Anaconda](#install-locally-with-anaconda)
- [Install for local development](#install-for-local-development)
- [Deploy in a cloud instance](#deploy-in-a-cloud-instance)

### Install locally with Docker

Run Label Studio in a Docker container and access it at `http://localhost:8080`.

```bash
docker run -it -p 8080:8080 -v `pwd`/mydata:/label-studio/data heartexlabs/label-studio:latest
```

You can find all the generated assets, including the SQLite3 database file `label_studio.sqlite3` and uploaded files, in the `./mydata` directory.

#### Override default Docker install

You can override the default launch command by appending new arguments:

```bash
docker run -it -p 8080:8080 -v `pwd`/mydata:/label-studio/data heartexlabs/label-studio:latest label-studio --log-level DEBUG
```

#### Build a local image with Docker

If you want to build a local image, run:

```bash
docker build -t heartexlabs/label-studio:latest .
```

### Run with Docker Compose

The Docker Compose script provides a production-ready stack consisting of the following components:

- Label Studio
- [Nginx](https://www.nginx.com/) - proxy web server used to serve static data, including uploaded audio, images, and so on.
- [PostgreSQL](https://www.postgresql.org/) - production-ready database that replaces the less performant SQLite3.


To start the app and use it at `http://localhost`, run this command:

```bash
docker-compose up
```

### Install locally with pip

```bash
# Requires Python >=3.6, <3.9
pip install label-studio

# Start the server at http://localhost:8080
label-studio
```

### Install locally with Anaconda

```bash
conda create --name label-studio python=3.8
conda activate label-studio
pip install label-studio
```

### Install for local development

You can run the latest Label Studio version locally without installing the package with pip.

```bash
# Install all package dependencies
pip install -e .
# Run database migrations
python label_studio/manage.py migrate
# Start the server in development mode at http://localhost:8080
python label_studio/manage.py runserver
```

### Deploy in a cloud instance

You can deploy Label Studio with one click in Heroku, Microsoft Azure, or Google Cloud Platform:

[Deploy to Heroku](https://heroku.com/deploy?template=https://github.com/heartexlabs/label-studio/tree/master) • [Deploy to Azure](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fheartexlabs%2Flabel-studio%2Fmaster%2Fazuredeploy.json) • [Deploy to Google Cloud](https://deploy.cloud.run)

#### Apply frontend changes

The frontend of the Label Studio app lives in the `label_studio/frontend/` folder and is written in React JSX. If you've made changes there, run the following commands before building or starting the instance:

```bash
cd label_studio/frontend/
npm ci
npx webpack
cd ../..
python label_studio/manage.py collectstatic --no-input
```

### Troubleshoot installation

If you see any errors during installation, try rerunning the installation:

```bash
pip install --ignore-installed label-studio
```

#### Install dependencies on Windows

To run Label Studio on Windows, download and install the following wheel packages from [Gohlke builds](https://www.lfd.uci.edu/~gohlke/pythonlibs), making sure each wheel matches your version of Python:

- [lxml](https://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml)

```bash
# Upgrade pip
pip install -U pip

# If you're running Win64 with Python 3.8, install the packages downloaded from Gohlke:
pip install lxml-4.5.0-cp38-cp38-win_amd64.whl

# Install Label Studio
pip install label-studio
```

## What you get from Label Studio

![Screenshot of Label Studio data manager grid view with images](https://raw.githubusercontent.com/heartexlabs/label-studio/master/images/labelstudio-ui.gif)

- **Multi-user labeling** with sign-up and login. Each annotation you create is tied to your account.
- **Multiple projects** to work on all your datasets in one instance.
- **Streamlined design** helps you focus on your task, not how to use the software.
- **Configurable label formats** let you customize the visual interface to meet your specific labeling needs.
- **Support for multiple data types** including images, audio, text, HTML, time-series, and video.
- **Import from files or from cloud storage** in Amazon AWS S3, Google Cloud Storage, or JSON, CSV, TSV, RAR, and ZIP archives.
- **Integration with machine learning models** so that you can visualize and compare predictions from different models and perform pre-labeling.
- **Embed it in your data pipeline** using the REST API.

## Included templates for labeling data in Label Studio

Label Studio includes a variety of templates to help you label your data, or you can create your own using a purpose-built configuration language.
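
As a rough illustration of what a custom configuration can look like, the sketch below keeps a small image-classification config in a Python string and checks that it is well-formed XML. `View`, `Image`, `Choices`, and `Choice` are Label Studio tags; the `image` and `choice` names, the label values, and the `control_targets` helper are assumptions made up for this example. Passing this check only means the XML parses, not that Label Studio accepts the config.

```python
import xml.etree.ElementTree as ET

# Hypothetical labeling config: show an image and ask for one class label.
# "$image" refers to the "image" field of each imported task.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image">
    <Choice value="Cat"/>
    <Choice value="Dog"/>
  </Choices>
</View>
"""


def control_targets(config: str) -> dict:
    """Map each tag that has a toName attribute to the object tag it labels.

    Raises xml.etree.ElementTree.ParseError if the config is not well-formed XML.
    """
    root = ET.fromstring(config.strip())
    return {
        element.get("name"): element.get("toName")
        for element in root.iter()
        if element.get("toName") is not None
    }


if __name__ == "__main__":
    print(control_targets(LABEL_CONFIG))
```

A check like this can catch an unclosed tag or a mistyped attribute before you paste the config into the project settings.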
The most common templates and use cases for labeling include the following:

## Set up machine learning models with Label Studio

Connect your favorite machine learning model using the Label Studio Machine Learning SDK. Follow these steps:

1. Start your own machine learning backend server. See [more detailed instructions](https://github.com/heartexlabs/label-studio-ml-backend).
2. Connect Label Studio to the server on the model page found in project settings.

This lets you:

- **Pre-label** your data using model predictions.
- Do **online learning** and retrain your model while new annotations are being created.
- Do **active learning** by labeling only the most complex examples in your data.

## Integrate Label Studio with your existing tools

You can use Label Studio as an independent part of your machine learning workflow or integrate the frontend or backend into your existing tools.

* Use the [Label Studio Frontend](https://github.com/heartexlabs/label-studio-frontend) as a separate React library. See more in the [Frontend Library documentation](https://labelstud.io/guide/frontend.html).

## Ecosystem

| Project | Description |
|-|-|
| label-studio | Server, distributed as a pip package |
| [label-studio-frontend](https://github.com/heartexlabs/label-studio-frontend) | React and JavaScript frontend that can run standalone in a web browser or be embedded into your application. |
| [data-manager](https://github.com/heartexlabs/dm2) | React and JavaScript frontend for managing data. Includes the Label Studio Frontend. Relies on the label-studio server or a custom backend with the expected API methods. |
| [label-studio-converter](https://github.com/heartexlabs/label-studio-converter) | Encode labels in the format of your favorite machine learning library |
| [label-studio-transformers](https://github.com/heartexlabs/label-studio-transformers) | Transformers library connected and configured for use with Label Studio |

## Roadmap

Want to use **The Coolest Feature X** but Label Studio doesn't support it? Check out [our public roadmap](roadmap.md)!

## Citation

```tex
@misc{Label Studio,
  title={{Label Studio}: Data labeling software},
  url={https://github.com/heartexlabs/label-studio},
  note={Open source software available from https://github.com/heartexlabs/label-studio},
  author={
    Maxim Tkachenko and
    Mikhail Malyuk and
    Nikita Shevchenko and
    Andrey Holmanyuk and
    Nikolai Liubimov},
  year={2020-2021},
}
```

## License

This software is licensed under the [Apache 2.0 LICENSE](/LICENSE) © [Heartex](https://www.heartex.ai/). 2020-2021