Ensemble based positive unlabeled learning for time series classification

Authors:

Minh Nhut Nguyen,

See-Kiong Ng

DASFAA'12: Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I

Pages 243 - 257

https://doi.org/10.1007/978-3-642-29038-1_19

Published: 15 April 2012

Abstract

Many real-world applications in time series classification fall into the class of positive and unlabeled (PU) learning. Furthermore, in many of these applications, not only are the negative examples absent, but the positive examples available for learning can also be rather limited. As such, several PU learning algorithms for time series classification have recently been developed to learn from a small set P of labeled seed positive examples augmented with a set U of unlabeled examples. The key to these algorithms is to accurately identify the likely positive and negative examples in U, which remains a challenge, especially for uncertain examples located near the class boundary. This paper presents a novel ensemble-based approach that restarts the detection phase several times to probabilistically label these uncertain examples more robustly, so that a reliable classifier can be built from the limited positive training examples. Experimental results on time series data from different domains demonstrate that the new method significantly outperforms existing state-of-the-art methods.
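
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of restarting a detection phase and averaging the resulting labels. It is not the authors' method. Everything in it is an assumption made for illustration: the function names `detect_once` and `ensemble_scores`, the Euclidean distance between equal-length series, the greedy nearest-neighbor growth from the seed set P, and the random stopping threshold `tau` drawn anew on each restart.

```python
import random
from math import dist  # Euclidean distance between equal-length series (assumed measure)


def detect_once(P, U, tau):
    """One hypothetical detection pass.

    Grows the positive set from the seeds P by repeatedly absorbing the
    unlabeled series nearest to it, and stops once that nearest distance
    exceeds tau. Returns the indices of U labeled positive in this pass.
    """
    labeled = list(P)
    remaining = set(range(len(U)))
    positives = set()
    while remaining:
        # Distance from each remaining series to the current positive set.
        d, i = min((min(dist(U[j], p) for p in labeled), j) for j in remaining)
        if d > tau:
            break
        labeled.append(U[i])
        positives.add(i)
        remaining.remove(i)
    return positives


def ensemble_scores(P, U, restarts=50, tau_range=(1.0, 3.0), seed=0):
    """Restart detection several times with a randomly drawn threshold and
    return, for each series in U, the fraction of restarts that labeled it
    positive (a soft label in [0, 1])."""
    rng = random.Random(seed)
    votes = [0] * len(U)
    for _ in range(restarts):
        tau = rng.uniform(*tau_range)
        for i in detect_once(P, U, tau):
            votes[i] += 1
    return [v / restarts for v in votes]


# Toy data: two seed positives, two unlabeled series close to them,
# one series near the boundary, and two far-away series.
P = [(0.0, 0.0), (0.5, 0.5)]
U = [(0.2, 0.1), (0.4, 0.3), (2.4, 0.3), (10.0, 10.0), (9.5, 10.2)]
scores = ensemble_scores(P, U)
```

On this toy data the two nearby series are labeled positive in every restart and the two distant ones in none, while the boundary series receives a fractional score because only some thresholds reach it. Such soft labels could then weight the training examples of a final classifier; how the paper itself combines the restarts is not stated in the abstract.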

Ensemble based positive unlabeled learning for time series classification
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
      2. Unsupervised learning
    2. Machine learning approaches

Published In

DASFAA'12: Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I

April 2012

592 pages

ISBN:9783642290374

Editors:
Sang-goo Lee, School of Computer Science and Engineering, Seoul National University, Gwanak-ro, Gwanak-gu, Seoul, South Korea
Zhiyong Peng, Computer School, Wuhan University, Luo-jia-shan, Wuchang, Wuhan, Hubei Province, China
Xiaofang Zhou, School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Australia
Yang-Sae Moon, Department of Computer Science, Kangwon National University, 192-1, Hyoja2-Dong, Chuncheon, Kangwon, South Korea
Rainer Unland, Institute for Computer Science and Business Information, University of Duisburg-Essen, Schützenbahn 70, Essen, Germany

Sponsors

Pusan National University
Onion Software
BBMC
KIISE Database Society of Korea
Consortium of Cloud Computing Research

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

Liang, S., Zhang, Y., Ma, J. (2019). PU-Shapelets: Towards Pattern-Based Positive Unlabeled Classification of Time Series. Database Systems for Advanced Applications, pp. 87-103. Online publication date: 22-Apr-2019.
https://dl.acm.org/doi/10.1007/978-3-030-18576-3_6

He, G., Li, Y., Zhao, W. (2017). An uncertainty and density based active semi-supervised learning scheme for positive unlabeled multivariate time series classification. Knowledge-Based Systems 124:C, pp. 80-92. Online publication date: 15-May-2017.
https://dl.acm.org/doi/10.1016/j.knosys.2017.03.004

Vinh, V., Anh, D. (2016). Two Novel Techniques to Improve MDL-Based Semi-Supervised Classification of Time Series. Transactions on Computational Collective Intelligence XXV - Volume 9990, pp. 127-147. Online publication date: 1-Sep-2016.
https://dl.acm.org/doi/10.1007/978-3-662-53580-6_8

Li, Y., He, G., Xia, X., Li, Y. (2016). A Reverse Nearest Neighbor Based Active Semi-supervised Learning Method for Multivariate Time Series Classification. Proceedings, Part I, 27th International Conference on Database and Expert Systems Applications - Volume 9827, pp. 272-286. Online publication date: 5-Sep-2016.
https://dl.acm.org/doi/10.1007/978-3-319-44403-1_17
