AWID Cyber threat detection using feature engineering (SSAE, PCA, UMAP, PHATE, SA, LLE, FA, LDA), Recursive Feature Elimination, Logistic Regression and Linear SVM classification (with Bayesian optimisation)
- Download and Install Libraries
- Mount Google Drive
- Load Data From CSV and Rename Class Variable
- Pre-Processing Data
- Shuffle the Pandas Dataframe
- Pandas Dataset Split to NumPy (and empty the generated features dataframe)
- Stacked Sparse Autoencoder for Dimensionality Reduction and Feature Generation
- Stacked Sparse Autoencoder Optimisation using GridSearchCV
- PCA for Feature Generation
- UMAP for Feature Generation
- PHATE for Feature Generation
- Spectral Embedding for Feature Generation
- LocallyLinearEmbedding for Feature Generation
- Factor Analysis for Feature Generation
- LDA for Feature Generation
- Concatenate the generated features onto the original dataset
- Drop Features by Name (delete from generated features dataframe)
- Summary of Generated Feature Dataframe
- Reselect all generated features (undo Feature Selection)
- Feature Selection by Wrapper Method with RFE using LinearSVC / LogisticRegression / LinearDiscriminantAnalysis / RidgeClassifier
- Compare Classifier Models (unoptimised)
- LinearSVM Classifier with BayesSearchCV Hyperparameter Optimisation
- Logistic Regression Classifier with BayesSearchCV Hyperparameter Optimisation
- Logistic Regression Classifier (Optimised)
- LinearSVM Classifier (Optimised)
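
The outline above can be illustrated with a minimal scikit-learn sketch of the core flow: generate features, concatenate them onto the original features, select a subset with RFE, then classify. This is not the notebook's actual code. The synthetic data from `make_classification` stands in for the AWID CSV, only PCA is shown as the feature generator, and every variable name and hyperparameter value (`n_components=3`, `n_features_to_select=10`, and so on) is an assumption chosen for illustration. The Bayesian optimisation step is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the AWID data (assumption: the real notebook loads a CSV).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Pre-processing: scale using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Feature generation: PCA components fitted on the training split.
pca = PCA(n_components=3, random_state=0).fit(X_train_s)

# Concatenate the generated features onto the original features.
X_train_aug = np.hstack([X_train_s, pca.transform(X_train_s)])
X_test_aug = np.hstack([X_test_s, pca.transform(X_test_s)])

# Wrapper feature selection: RFE driven by a linear SVM's coefficients.
selector = RFE(
    LinearSVC(C=1.0, dual=False, max_iter=5000),
    n_features_to_select=10,
).fit(X_train_aug, y_train)
X_train_sel = selector.transform(X_train_aug)
X_test_sel = selector.transform(X_test_aug)

# Final classifier on the selected features (default hyperparameters).
clf = LogisticRegression(max_iter=1000).fit(X_train_sel, y_train)
accuracy = clf.score(X_test_sel, y_test)
print(f"Selected {int(selector.support_.sum())} of {X_train_aug.shape[1]} features")
print(f"Test accuracy: {accuracy:.3f}")
```

Fitting the scaler, PCA, and RFE on the training split only keeps information from the test split out of the generated and selected features. The other generators in the outline could be appended to the `np.hstack` call in the same way, provided they expose a `transform` method for unseen data.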