ALiPy: Active Learning in Python

Authors: Ying-Peng Tang, Guo-Xiang Li, Sheng-Jun Huang

Online document: https://parnec.nuaa.edu.cn/huangsj/alipy/

Introduction

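ALiPy is an active learning toolkit implemented in Python. It ships with more than 20 active learning algorithms and provides tools for tasks such as data processing and result visualization. ALiPy offers a set of independent tool classes, one for each component of the active learning framework. This makes it easy to support different active learning scenarios, and it lets users organize their projects freely: they can implement their own algorithms and replace components of a project without inheriting any interface. ALiPy also supports a variety of active learning settings, such as cost-sensitive labeling, noisy oracles, and multi-label querying. For a detailed introduction and documentation, please refer to the toolkit's official website.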

ALiPy provides a module-based implementation of the active learning framework, which allows users to conveniently evaluate, compare, and analyze the performance of active learning methods. It implements more than 20 algorithms and also makes it easy for users to implement their own approaches under different settings.

Features of alipy include:

  • Model independent

    • There is no restriction on the classification model. You can use an SVM from sklearn or a deep model in tensorflow, as needed.
  • Module independent

    • You can freely modify one or more modules of the toolbox without affecting the others.
  • Implement your own algorithm without inheriting anything

    • There are few restrictions on user-defined functions, such as their parameters or names.
  • Variant settings supported

    • Noisy oracles, multi-label, cost-effective, feature querying, etc.
  • Powerful tools

    • Save the intermediate results of each iteration and recover the program from any breakpoint.
    • Parallelize k-fold experiments.
    • Gather, process, and visualize experiment results.
    • Provide 25 algorithms.
    • Support 7 different settings.

For a more detailed introduction and tutorials, please refer to the alipy website.
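As a rough illustration of the "model independent" and "no inheritance" ideas in the feature list, the sketch below runs a tiny pool-based active learning loop in plain Python. It does not use alipy's API: the toy nearest-centroid model, the `margin_strategy` function, and `run_active_learning` are all hypothetical names invented for this example. The point is only that the query strategy is an ordinary function that could be swapped for any other callable.

```python
import random

def fit_centroids(X, y, labeled):
    """Train a toy nearest-centroid model on the labeled indexes only."""
    sums, counts = {}, {}
    for i in labeled:
        sums[y[i]] = sums.get(y[i], 0.0) + X[i]
        counts[y[i]] = counts.get(y[i], 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

def predict(centroids, x):
    """Predict the class whose centroid is nearest to x."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))

def margin_strategy(centroids, X, unlabeled):
    """A plain function, no base class: pick the unlabeled instance whose
    two nearest centroids are closest to equidistant (smallest margin)."""
    def margin(i):
        d = sorted(abs(X[i] - m) for m in centroids.values())
        return d[1] - d[0]
    return min(unlabeled, key=margin)

def run_active_learning(X, y, labeled, unlabeled, strategy, n_queries):
    """Query one instance per iteration and record accuracy on the pool."""
    labeled, unlabeled = set(labeled), set(unlabeled)
    history = []
    for _ in range(n_queries):
        model = fit_centroids(X, y, labeled)
        picked = strategy(model, X, sorted(unlabeled))
        labeled.add(picked)        # the "oracle" reveals y[picked]
        unlabeled.remove(picked)
        model = fit_centroids(X, y, labeled)
        # Toy evaluation on the whole pool; a real experiment uses a test split.
        acc = sum(predict(model, x) == t for x, t in zip(X, y)) / len(X)
        history.append((picked, acc))
    return labeled, unlabeled, history

# Synthetic 1-D data: two Gaussian blobs, one per class.
rng = random.Random(0)
X = [rng.gauss(-1.0, 0.5) for _ in range(20)] + [rng.gauss(1.0, 0.5) for _ in range(20)]
y = [0] * 20 + [1] * 20

labeled, unlabeled, history = run_active_learning(
    X, y,
    labeled=[0, 20],  # one seed label per class
    unlabeled=[i for i in range(40) if i not in (0, 20)],
    strategy=margin_strategy,
    n_queries=5,
)
for picked, acc in history:
    print(f"queried index {picked:2d}, pool accuracy {acc:.2f}")
```

Replacing `margin_strategy` with any function that takes `(model, X, unlabeled)` and returns an index changes the selection rule without touching the rest of the loop.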

Setup

You can get alipy simply by:

pip install alipy

Or clone alipy source code to your local directory and build from source:

cd ALiPy
python setup.py install

The dependencies of alipy are:

  1. Python dependency
python >= 3.4
  2. Basic dependencies
numpy
scipy
scikit-learn
matplotlib
prettytable
  3. Optional dependencies
cvxpy

Note that the basic dependencies must be installed, while the optional dependencies are required only if you need to invoke the KDD'13 BMDR or AAAI'19 SPAL methods in alipy. (cvxpy is not installed by pip install alipy.)
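Because cvxpy is not pulled in automatically, it can help to check which dependencies are present before running an experiment. The following is a minimal sketch using only the standard library; the helper names `has_module` and `missing_modules` are hypothetical, and the module names are the usual import names of the packages listed above (scikit-learn is imported as `sklearn`).

```python
import importlib.util

def has_module(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None

def missing_modules(names):
    """Return the subset of names that cannot be located."""
    return [n for n in names if not has_module(n)]

# Import names of the dependencies listed above.
BASIC = ["numpy", "scipy", "sklearn", "matplotlib", "prettytable"]
OPTIONAL = ["cvxpy"]  # needed only for the BMDR and SPAL methods

print("missing basic dependencies:", missing_modules(BASIC))
print("missing optional dependencies:", missing_modules(OPTIONAL))
```

An empty list for the basic dependencies means the required packages can be found; a missing cvxpy only matters if BMDR or SPAL will be used.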

Tools in alipy

The tool classes provided by alipy cover as many components of active learning as possible, and aim to support experiment implementation with miscellaneous tool functions. These tools are loosely coupled so that users can structure their experiment projects however they prefer.

  • Use alipy.data_manipulate to preprocess and split your data sets for experiments.

  • Use alipy.query_strategy to invoke traditional and state-of-the-art methods.

  • Use alipy.index.IndexCollection to manage your labeled and unlabeled indexes.

  • Use alipy.metric to calculate your model's performance.

  • Use alipy.experiment.state and alipy.experiment.state_io to save the intermediate results after each query and recover the program from breakpoints.

  • Use alipy.experiment.stopping_criteria to get some example stopping criteria.

  • Use alipy.experiment.experiment_analyser to gather, process, and visualize your experiment results.

  • Use alipy.oracle to implement clean, noisy, and cost-sensitive oracles.

  • Use alipy.utils.multi_thread to parallelize your k-fold experiment.
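To make the idea of saving intermediate results and recovering from a breakpoint concrete, here is a small standard-library sketch. It does not use alipy.experiment.state or alipy.index.IndexCollection; the functions `save_state`, `load_state`, and `run`, the `stop_after` parameter, and the file name are hypothetical, and the "strategy" is a placeholder that simply takes the first unlabeled index.

```python
import os
import pickle
import tempfile

def save_state(path, state):
    """Write to a temp file, then rename, so a crash never leaves a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)

def load_state(path, default):
    """Return the saved state if a checkpoint exists, else the default."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)

def run(path, pool_size, n_queries, stop_after=None):
    """Run (or resume) the query loop, checkpointing after every query."""
    state = load_state(path, {
        "labeled": [0],                          # one seed label
        "unlabeled": list(range(1, pool_size)),
        "queries": [],                           # indexes queried so far
    })
    done = 0
    while len(state["queries"]) < n_queries and state["unlabeled"]:
        if stop_after is not None and done == stop_after:
            break                                # simulate an interruption
        picked = state["unlabeled"].pop(0)       # placeholder query strategy
        state["labeled"].append(picked)
        state["queries"].append(picked)
        save_state(path, state)                  # checkpoint after each query
        done += 1
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "al_state.pkl")
partial = run(ckpt, pool_size=10, n_queries=5, stop_after=2)  # interrupted run
resumed = run(ckpt, pool_size=10, n_queries=5)                # picks up where it left off
print("after interruption:", partial["queries"])
print("after resuming:    ", resumed["queries"])
```

The second call does not repeat the first two queries because it starts from the checkpoint rather than from the initial split.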

The implemented query strategies

ALiPy currently provides several commonly used strategies, and new algorithms will continue to be added in subsequent updates.

  • AL with Instance Selection: Uncertainty (SIGIR 1994), Graph Density (CVPR 2012), QUIRE (TPAMI 2014), SPAL (AAAI 2019), Query By Committee (ICML 1998), Random, BMDR (KDD 2013), LAL (NIPS 2017), Expected Error Reduction (ICML 2001)

  • AL for Multi-Label Data: AUDI (ICDM 2013), QUIRE (TPAMI 2014), Random, MMC (KDD 2009), Adaptive (IJCAI 2013)

  • AL by Querying Features: AFASMC (KDD 2018), Stability (ICDM 2013), Random

  • AL with Different Costs: HALC (IJCAI 2018), Random, Cost performance

  • AL with Noisy Oracles: CEAL (IJCAI 2017), IEthresh (KDD 2009), All, Random

  • AL with Novel Query Types: AURO (IJCAI 2015)

  • AL for Large Scale Tasks: Subsampling
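For readers unfamiliar with the strategies listed above, the sketch below shows the general idea behind Query By Committee using vote entropy as the disagreement measure. It is an illustration in plain Python, not alipy's implementation: `vote_entropy`, `select_by_committee`, and the three threshold "models" are hypothetical and chosen only to keep the example short.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy of the committee's label votes for a single instance."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())

def select_by_committee(committee, X, unlabeled, batch_size=1):
    """Return the unlabeled indexes on which the committee disagrees most.
    Ties are broken by the smaller index to keep the result deterministic."""
    scored = [(vote_entropy([member(X[i]) for member in committee]), i)
              for i in unlabeled]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:batch_size]]

# A toy committee: three threshold classifiers on a 1-D feature.
committee = [
    lambda x: int(x > 0.3),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.7),
]
X = [0.1, 0.4, 0.6, 0.9]

picked = select_by_committee(committee, X, unlabeled=[0, 1, 2, 3], batch_size=2)
print("indexes to query:", picked)
```

The committee is unanimous on the two extreme instances and split on the two in the middle, so the middle instances are the ones selected for labeling.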

Implement your own algorithm