Empirical Evaluation Of Elastic Net In Cancer Gene Analysis

101x4/pancancer

 
 

STAT 844 Statistical Learning - Advanced Regression

Empirical Evaluation Of Elastic Net In Cancer Gene Analysis

This is an applied statistics project on modeling high-dimensional cancer gene data. Variable selection is essential in gene data analysis because the number of features is often far larger than the sample size. Among variable selection techniques, the Elastic Net stands out for its parameter estimation, its balance between fitting and penalization, and its grouping effect, which tends to keep or drop strongly correlated predictors together.
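For reference, one common way to write the Elastic Net penalized logistic regression objective is given below. The notation is an assumption made for illustration (a glmnet-style parameterization with overall strength $\lambda$ and mixing weight $\alpha$); the code in this repository may use a different but equivalent form.

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta} \; -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta) - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big] + \lambda\Big[\, \alpha\,\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \Big]
$$

Setting $\alpha = 1$ gives the Lasso penalty, $\alpha = 0$ gives the Ridge penalty, and values in between mix the two. The $\ell_1$ term produces sparsity, while the $\ell_2$ term is what encourages the grouping effect among correlated genes.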

In this project, we trained several classifiers for TP53 gene inactivation and compared Lasso, Ridge, and Elastic Net regularization in sparse logistic regression and support vector machines. We found that logistic regression is the better model for this task and that the Elastic Net outperforms Lasso and Ridge regularization, with a higher test AUC (93.6%) and a reasonable proportion of selected variables. We also carried out further analysis to check that the Elastic Net model is biologically meaningful and interpretable.
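The two comparison criteria mentioned above, test AUC and the proportion of selected variables, can be sketched in plain Python as follows. This is a minimal illustration and not the project's evaluation code: the function names `roc_auc` and `selected_fraction`, the tolerance `tol`, and the toy numbers are all hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def selected_fraction(coefs, tol=1e-8):
    """Proportion of coefficients that the penalty left non-zero."""
    return sum(1 for c in coefs if abs(c) > tol) / len(coefs)


# Toy values for illustration only (not results from this project).
labels = [1, 0, 1, 1, 0, 0]               # 1 = inactivated, 0 = not
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1]   # predicted probabilities
coefs = [0.0, 1.3, 0.0, -0.4, 0.0]        # fitted gene coefficients

print(roc_auc(labels, scores))       # 8 of 9 pairs are ordered correctly
print(selected_fraction(coefs))      # 2 of 5 coefficients are non-zero
```

A Ridge fit would typically give a selected fraction of 1.0, since it shrinks coefficients without zeroing them, which is why the two metrics are worth reading together when comparing penalties.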
