Self-labeled techniques for semi-supervised learning: taxonomy, software and empirical study

Authors:

Isaac Triguero,

Salvador García,

Francisco Herrera

Knowledge and Information Systems, Volume 42, Issue 2

Pages 245 - 284

https://doi.org/10.1007/s10115-013-0706-y

Published: 01 February 2015

Abstract

Semi-supervised classification methods are well suited to training sets that contain a large amount of unlabeled data and only a small quantity of labeled data. This problem has been addressed by several approaches that make different assumptions about the characteristics of the input data. Among them, self-labeled techniques follow an iterative procedure that aims to enlarge the labeled data set, on the assumption that their own predictions tend to be correct. In this paper, we provide a survey of self-labeled methods for semi-supervised classification. From a theoretical point of view, we propose a taxonomy based on their main characteristics. Empirically, we conduct an exhaustive study involving a large number of data sets with different ratios of labeled data, measuring the performance of the methods in terms of their transductive and inductive classification capabilities. The results are checked with nonparametric statistical tests, and we then identify the best-performing self-labeled models. Moreover, a semi-supervised learning module has been developed for the Knowledge Extraction based on Evolutionary Learning (KEEL) software, integrating the analyzed methods and data sets.
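The iterative procedure described in the abstract can be illustrated with a small sketch of self-training, one of the simplest self-labeled schemes. This is not the paper's implementation or the KEEL module: the 1-NN base classifier, the use of distance to the nearest labeled point as a stand-in for prediction confidence, and the names `nearest_label`, `self_train`, `predict`, `per_iter` and `max_iter` are all assumptions made for illustration.

```python
import math


def nearest_label(x, labeled):
    """1-NN: return (label, distance) of the labeled point closest to x."""
    point, label = min(labeled, key=lambda pair: math.dist(x, pair[0]))
    return label, math.dist(x, point)


def self_train(labeled, unlabeled, per_iter=1, max_iter=100):
    """Enlarge the labeled set by repeatedly trusting the classifier's own predictions.

    labeled:   list of (point, label) pairs, points given as tuples of floats
    unlabeled: list of points
    Assumption: a smaller distance to the nearest labeled point means a more
    confident prediction, so the closest `per_iter` points are added each round.
    """
    enlarged = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_iter):
        if not pool:
            break
        # Predict a label for every pooled point and rank by confidence proxy.
        scored = [(*nearest_label(x, enlarged), x) for x in pool]
        scored.sort(key=lambda s: s[1])
        # Move the most confident predictions into the labeled set.
        for label, _dist, x in scored[:per_iter]:
            enlarged.append((x, label))
            pool.remove(x)
    return enlarged


def predict(x, enlarged):
    """Inductive use: classify a point that was not seen during training."""
    return nearest_label(x, enlarged)[0]


# Toy one-dimensional data (hypothetical, for illustration only).
labeled = [((0.0,), "a"), ((10.0,), "b")]
unlabeled = [(1.0,), (2.0,), (3.0,), (4.0,), (9.0,), (8.0,), (7.0,)]

enlarged = self_train(labeled, unlabeled)
# Transductive result: the labels assigned to the original unlabeled points.
transductive = dict(enlarged[len(labeled):])
# Inductive result: a label for a new, unseen point.
unseen_label = predict((3.5,), enlarged)
```

In this sketch the transductive capability corresponds to the labels given to the unlabeled training points, while the inductive capability corresponds to `predict` applied to points outside the training set. A real study would replace the distance heuristic with a proper confidence measure and a stopping criterion.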

Index Terms
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
2. Information systems
  1. Information systems applications


Published In

Knowledge and Information Systems Volume 42, Issue 2

February 2015

243 pages

ISSN: 0219-1377

Copyright © 2015 Springer-Verlag London.

Publisher

Springer-Verlag

Berlin, Heidelberg
