TSB-UAD is a new open, end-to-end benchmark suite to ease the evaluation of univariate time-series anomaly detection methods. Overall, TSB-UAD contains 12686 time series with labeled anomalies spanning different domains with high variability of anomaly types, ratios, and sizes. Specifically, TSB-UAD includes 10 previously proposed datasets containing 900 time series from real-world data science applications. Motivated by flaws in certain datasets and evaluation strategies in the literature, we study anomaly types and data transformations to contribute two collections of datasets. Specifically, we generate 958 time series using a principled methodology for transforming 126 time-series classification datasets into time series with labeled anomalies. In addition, we present a set of data transformations with which we introduce new anomalies in the public datasets, resulting in 10828 time series (92 datasets) with varying difficulty for anomaly detection.
If you use TSB-UAD in your project or research, please cite our paper:
John Paparrizos, Yuhao Kang, Paul Boniol, Ruey S. Tsay, Themis Palpanas, and Michael J. Franklin. TSB-UAD: An End-to-End Benchmark Suite for Univariate Time-Series Anomaly Detection. PVLDB, 15(8): 1697 - 1711, 2022. doi:10.14778/3529337.3529354
Due to GitHub's upload-size limitations, we host the datasets at a different location:
Public: https://chaos.cs.uchicago.edu/tsb-uad/public.zip
Synthetic: https://chaos.cs.uchicago.edu/tsb-uad/synthetic.zip
Artificial: https://chaos.cs.uchicago.edu/tsb-uad/artificial.zip
The UCR classification datasets used to generate the Artificial datasets: https://chaos.cs.uchicago.edu/tsb-uad/UCR2018-NEW.zip
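The snippet below is a minimal sketch (not part of the repository) for fetching and extracting one of the archives with the Python standard library; the target directory ./data is an assumption and can be changed to wherever you keep the benchmark.

```python
# Sketch: download and unpack the public benchmark archive.
# The destination folder "./data" is an assumption, not mandated by the repo.
import urllib.request
import zipfile

url = "https://chaos.cs.uchicago.edu/tsb-uad/public.zip"
archive = "public.zip"

urllib.request.urlretrieve(url, archive)   # download the archive
with zipfile.ZipFile(archive) as zf:
    zf.extractall("./data")                # unpack into the data directory
```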
- John Paparrizos (University of Chicago)
- Yuhao Kang (University of Chicago)
- Alex Wu (University of Chicago)
- Teja Bogireddy (University of Chicago)
The following tools are required to install TSB-UAD from source:
- git
- conda (anaconda or miniconda)
- Clone this repository using git and change into its root directory.
git clone https://github.com/johnpaparrizos/TSB-UAD.git
cd TSB-UAD/
- Create and activate a conda environment named TSB:
conda env create --file environment.yml
conda activate TSB
- Install TSB-UAD using setup.py:
python setup.py install
- Install the dependencies from requirements.txt:
pip install -r requirements.txt
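After installation, a quick import check confirms the package is usable. This is only a sanity-check sketch; the module path TSB_UAD.models.iforest and the IForest class name are assumptions based on the repository layout and may differ in your version.

```python
# Sanity check after installation (module path is an assumption; adjust if
# your version of the repository organizes the detectors differently).
from TSB_UAD.models.iforest import IForest

detector = IForest()
print(detector)  # if this runs, TSB-UAD and its dependencies are installed
```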
- test_anomaly_detectors.ipynb: evaluates the performance of 11 popular anomaly detectors.
- test_artificialConstruction.ipynb: builds the synthesized datasets based on anomaly construction.
- test_transformer.ipynb: demonstrates the effects of 11 transformations.
The ./data directory contains four folders:
- benchmark/ contains the ten public datasets. Typical outliers in these datasets are shown below (a minimal loading sketch follows this list).
- UCR2018-NEW/ contains 128 subfolders with the UCR classification datasets
- artificial/ contains the data constructed from UCR2018-NEW
- synthetic/ contains the data synthesized by local and global transformations
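As a rough sketch, a benchmark series can be loaded with pandas. The example file name and the two-column layout (value, label) are assumptions about the distributed CSV format; adjust them to the files you actually downloaded.

```python
# Sketch: load one benchmark series (file name and column layout are
# assumptions about the distributed files; adapt to your local copy).
import pandas as pd

df = pd.read_csv("./data/benchmark/ECG/MBA_ECG805_data.out",
                 header=None).dropna()
series = df[0].to_numpy(dtype=float)   # raw time-series values
labels = df[1].to_numpy(dtype=int)     # 1 marks anomalous points, 0 normal
print(series.shape, labels.sum(), "anomalous points")
```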
We test eleven anomaly detection algorithms in the module.
Below is an example result based on an Autoencoder.
For each output figure, the left panel shows the real time series with outliers highlighted in red, the anomaly score produced by the detector, and the corresponding TP/FP/TN/FN classification.
The right panel shows the ROC curve; AUC denotes the area under the ROC curve, and a larger AUC indicates better performance.
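For reference, the AUC reported in these figures can be reproduced from a point-wise anomaly score and the ground-truth labels with scikit-learn. In this sketch the score array is only a placeholder standing in for the output of any of the eleven detectors.

```python
# Sketch of the AUC computation used to compare detectors; `score` is a
# placeholder for the point-wise anomaly score produced by a detector.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
labels = np.zeros(1000, dtype=int)
labels[400:420] = 1                          # a synthetic anomalous segment
score = rng.normal(size=1000) + 3 * labels   # placeholder anomaly score

auc = roc_auc_score(labels, score)           # area under the ROC curve
fpr, tpr, _ = roc_curve(labels, score)       # points on the ROC curve
print(f"AUC = {auc:.3f}")
```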