Unsupervised feature selection for multi-cluster data

Published: 25 July 2010

Abstract

In many data analysis tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find the relevant subset of the original features, which can facilitate clustering, classification and retrieval. In this paper, we consider the feature selection problem in the unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. The feature selection problem is essentially a combinatorial optimization problem, which is computationally expensive. Traditional unsupervised feature selection methods address this issue by selecting the top-ranked features based on certain scores computed independently for each feature. These approaches neglect the possible correlation between different features and thus cannot produce an optimal feature subset. Inspired by recent developments in manifold learning and L1-regularized models for subset selection, we propose in this paper a new approach, called Multi-Cluster Feature Selection (MCFS), for unsupervised feature selection. Specifically, we select those features such that the multi-cluster structure of the data can be best preserved. The corresponding optimization problem can be efficiently solved since it only involves a sparse eigen-problem and an L1-regularized least squares problem. Extensive experimental results over various real-life data sets have demonstrated the superiority of the proposed algorithm.
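
The two-stage structure named in the abstract (an eigen-problem on a graph, followed by L1-regularized least squares) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not state: the k-nearest-neighbor graph with heat-kernel weights, the feature score (the largest absolute regression coefficient across embedding dimensions), standardizing features before regression, and the parameter names `n_neighbors` and `alpha_ratio`. For brevity it uses a dense eigendecomposition and a plain coordinate-descent Lasso in place of a sparse eigen-solver and a dedicated L1 solver.

```python
import numpy as np


def knn_heat_graph(X, n_neighbors=5):
    """Symmetric k-nearest-neighbor affinity matrix with heat-kernel weights."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(dist2, np.inf)                    # a point is not its own neighbor
    idx = np.argsort(dist2, axis=1)[:, :n_neighbors]
    knn_d2 = np.take_along_axis(dist2, idx, axis=1)
    sigma2 = knn_d2.mean() + 1e-12                     # assumed bandwidth heuristic
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_neighbors)
    W[rows, idx.ravel()] = np.exp(-knn_d2.ravel() / (2.0 * sigma2))
    return np.maximum(W, W.T)                          # symmetrize


def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for min_a (1/2n)||y - Xa||^2 + alpha * ||a||_1."""
    n, p = X.shape
    a = np.zeros(p)
    col_sq = np.sum(X ** 2, axis=0) / n
    r = y.copy()                                       # residual y - Xa
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r / n + col_sq[j] * a[j]
            new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            r += X[:, j] * (a[j] - new)
            a[j] = new
    return a


def mcfs(X, n_clusters, n_selected, n_neighbors=5, alpha_ratio=0.05):
    """Return (indices of selected features, per-feature scores)."""
    n = X.shape[0]
    # Stage 1: spectral embedding from the generalized problem L y = lambda D y,
    # solved through the symmetric normalized Laplacian.
    W = knn_heat_graph(X, n_neighbors)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, U = np.linalg.eigh(L_sym)                       # eigenvalues in ascending order
    Y = d_inv_sqrt[:, None] * U[:, :n_clusters]
    Y = Y - Y.mean(axis=0)                             # removes the constant direction

    # Stage 2: one L1-regularized regression per embedding dimension.
    std = X.std(axis=0)
    Xs = (X - X.mean(axis=0)) / np.where(std > 0, std, 1.0)
    scores = np.zeros(X.shape[1])
    for k in range(n_clusters):
        alpha_max = np.max(np.abs(Xs.T @ Y[:, k])) / n  # smallest alpha giving all zeros
        coef = lasso_cd(Xs, Y[:, k], alpha_ratio * alpha_max)
        scores = np.maximum(scores, np.abs(coef))
    return np.argsort(scores)[::-1][:n_selected], scores
```

Because every feature enters the same regression, correlated features compete for weight, in contrast to the per-feature scores of the traditional methods described above. A call such as `mcfs(X, n_clusters=3, n_selected=2)` on an array `X` of shape `(n_samples, n_features)` returns the indices of the two highest-scoring features together with all scores.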

Supplementary Material

JPG File (kdd2010_cai_ufsm_01.jpg)
MOV File (kdd2010_cai_ufsm_01.mov)

Cited By

  • (2025) Unsupervised feature selection using sparse manifold learning: Auto-encoder approach. Information Processing & Management 62:1 (103923). DOI: 10.1016/j.ipm.2024.103923. Online publication date: Jan-2025
  • (2025) D3WC: Deep three-way clustering with granular evidence fusion. Information Fusion 114 (102699). DOI: 10.1016/j.inffus.2024.102699. Online publication date: Feb-2025
  • (2024) CoCoder: Concrete Autoencoder using Covariance for Unsupervised Feature Selection. Journal of Broadcast Engineering 29:3 (242-251). DOI: 10.5909/JBE.2024.29.3.242. Online publication date: 31-May-2024

    Published In

    KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
    July 2010
    1240 pages
    ISBN:9781450300551
    DOI:10.1145/1835804
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. clustering
    2. feature selection
    3. unsupervised

    Qualifiers

    • Research-article

    Conference

    KDD '10
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Article Metrics

    • Downloads (Last 12 months): 207
    • Downloads (Last 6 weeks): 16
    Reflects downloads up to 14 Nov 2024

    Cited By

    • (2024) A New Fast Filter-based Unsupervised Feature Selection Algorithm Using Cumulative and Shannon Entropy. Journal of Soft Computing and Artificial Intelligence 5:1 (11-23). DOI: 10.55195/jscai.1464638. Online publication date: 15-Jun-2024
    • (2024) Fundamental Components and Principles of Supervised Machine Learning Workflows with Numerical and Categorical Data. Eng 5:1 (384-416). DOI: 10.3390/eng5010021. Online publication date: 29-Feb-2024
    • (2024) Sparse PCA via ℓ2,0-Norm Constrained Optimization for Unsupervised Feature Selection. 2024 43rd Chinese Control Conference (CCC) (7375-7379). DOI: 10.23919/CCC63176.2024.10661810. Online publication date: 28-Jul-2024
    • (2024) A Riemannian Augmented Lagrangian Method for Structured Sparse PCA. 2024 43rd Chinese Control Conference (CCC) (3410-3415). DOI: 10.23919/CCC63176.2024.10661785. Online publication date: 28-Jul-2024
    • (2024) Unsupervised Feature Selection to Identify Important ICD-10 and ATC Codes for Machine Learning on a Cohort of Patients With Coronary Heart Disease: Retrospective Study. JMIR Medical Informatics 12 (e52896-e52896). DOI: 10.2196/52896. Online publication date: 26-Jul-2024
    • (2024) Refined composite multivariate multiscale weighted permutation entropy and multicluster feature selection-based fault detection of gearbox. Transactions of the Institute of Measurement and Control. DOI: 10.1177/01423312241257143. Online publication date: 25-Jul-2024
    • (2024) Double-Structured Sparsity Guided Flexible Embedding Learning for Unsupervised Feature Selection. IEEE Transactions on Neural Networks and Learning Systems 35:10 (13354-13367). DOI: 10.1109/TNNLS.2023.3267184. Online publication date: Oct-2024