MI_Feature_Selection

Feature selection is an essential aspect of machine learning. When used properly it can make classifiers more efficient and more accurate. Filter feature selection is a category of common methods. These are classifier independent methods that use approximations of mutual information to select a subset of features. However, these approximations often rely on poor assumptions that are not usually true. These assumptions can be avoided by using lower bound of mutual information instead.

This bound is calculated using an arbitrary measurable Q distribution. The closer this Q distribution is to the data’s actual distribution, the better the bound becomes. In this project I explore a weighted pairwise Q distribution that incorporates prior knowledge of the data’s distribution. My experiments show a small improvement over using un-weighted pairwise Q distributions.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
data		data
InfoFeatureSelection - Pairwise.sln		InfoFeatureSelection - Pairwise.sln
MI Lower Bound for Feature Selection.pdf		MI Lower Bound for Feature Selection.pdf
README.md		README.md
VMI_pairwise_extended.cpp		VMI_pairwise_extended.cpp
sample_generation.py		sample_generation.py
synthdata.txt		synthdata.txt
synthdata_labels.txt		synthdata_labels.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MI_Feature_Selection

About

Releases

Packages

Languages

wszamotula/MI_Feature_Selection

Folders and files

Latest commit

History

Repository files navigation

MI_Feature_Selection

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages