
L1-LSMI

L1-LSMI is a Matlab package for non-linear supervised feature selection with L1-penalized squared-loss mutual information. See this page on my web site for more details.
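
At a high level, L1-LSMI maximizes an estimator of squared-loss mutual information (SMI) over a vector W of per-feature weights constrained to an l1-ball of radius z. A rough sketch of the optimization (the precise formulation is in the paper; W and z here match the quantities reported by the demo below):

    \max_{W} \ \widehat{\mathrm{SMI}}(W \circ X,\; Y) \quad \text{subject to} \quad \|W\|_1 \le z

where W \circ X weights each feature of X, and z is tuned internally so that about k entries of W are nonzero.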

The source tree in this git repository includes all files I used for experiments, as well as some third-party packages. If you just need the minimal L1-LSMI package, grab this zip instead.

License

GPL v3

L1-LSMI is a Matlab package for non-linear supervised feature selection
with L1-penalized squared-loss mutual information.
Copyright (C) 2012 Wittawat Jitkrittum

L1-LSMI is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see https://www.gnu.org/licenses/.

Usage Demo

%
% Demonstrate how to use pglsmi (L1-LSMI).
% 'pg' is an internal code name for 'plain gradient'.
%
rng(1)
%%%%% Generate a toy dataset
% X is #dim x #sample
% Y is 1 x #sample
[X, Y] = gen_plus(400);

%%%%% Some settings
% Number of features to select (a required option).
o.k = 2;

% How many restarts? More restarts tend to give better features, but
% take longer.
o.ztuner_repeat = 1;

% Number of folds for the internal LSMI cross-validation
o.fold = 5; 

% Max iterations for one value of z (l1-ball radius)
o.maxIter = 100;

% Internally, pglsmi tries to find a z that yields exactly k features.
% When k is large, however, it can be difficult to obtain exactly k
% features, so some slack may be allowed:
%
% if #features_found is between k and k+high_k_radius, treat it as found.
% We put 0 (exact k) here since this is just a demo.
o.high_k_radius = 0; 

% There are many other options; see the relevant files in the pglsmi
% folder. In particular, look for the lines calling "myProcessOptions".
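%
% For example, a typical myProcessOptions call (a hypothetical line for
% illustration; the actual option names are in the pglsmi sources) reads
% one option from the struct o with a fallback default:
%
%   maxIter = myProcessOptions(o, 'maxIter', 100);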

%%%%% Run pglsmi
S = fs_pglsmi(X, Y, o); % return a struct 
S

% S.F is a logical vector indicating which features are selected.
% Hopefully we get [1 1 0 0 0 0 0 0 0 0] here (the first two features
% should be selected).
S.F

% S.ZT.W holds the actual weights W found using z = S.ZT.z. The final
% objective function value is in S.ZT.f.
S.ZT.W

%%%%%%%%%%%%%%%%%%%%

function [X, Y] = gen_plus(n)
% Simple binary classification problem with 10 features.
% Y = sign(2*X1 - 3*X2 + noise); only the first two features are relevant.
    X = randn(10, n);
    Y = sign(2*X(1,:) - 3*X(2,:) + randn(1,n)/2);
    Y(Y==0) = 1; % map the rare sign(0) = 0 case to class +1
end

Demo Output

S = 

        timetictoc: 4.2043
           timecpu: 8.8500
                 F: [1 1 0 0 0 0 0 0 0 0]
                ZT: [1x1 struct]
            ZTLogs: {[1x1 struct]}
        redun_rate: 6.0813e-04
    abs_redun_rate: 6.0813e-04


ans =

     1     1     0     0     0     0     0     0     0     0


ans =

    0.5290
    0.7510
         0
         0
         0
         0
         0
         0
         0
         0

Only the first two of the 10 features (dimensions) are selected as relevant for explaining the output Y. This makes sense, since Y in our toy dataset depends only on the first two features.
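
Once selection is done, the logical vector S.F can index the data directly. A minimal follow-up sketch (assuming only X and S.F from the demo above):

X_selected = X(S.F, :);  % keep only the selected feature rows; 2 x 400 here
size(X_selected)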