cmsc23360-project3

Team repo for Advanced Networks project 3

extractor.py

This is a script provided to us to extract features from the raw data. In the context of this project, it should be run as follows:

python3 extractor.py path/to/raw_data

This will create results/raw_data as output.

package_data.py

This script takes the output of extractor.py, parses it into a pandas DataFrame, and writes it out as a CSV that the machine learning script can use. It requires Python 3 and the pandas library; pickle, which it also uses, is part of the Python standard library. It can be run as follows:

python3 package_data.py

This will create clean_data.csv as output.
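
As a rough illustration of this packaging step, the sketch below loads a pickled list of per-sample feature records and writes them to a CSV with pandas. The pickle file name (features.pkl), the record layout, and the feature names are assumptions made for the example; the actual output format of extractor.py is not described here, so package_data.py may well differ.

```python
import os
import pickle
import tempfile

import pandas as pd


def package(pickle_path, csv_path):
    """Load pickled feature records and write them out as a CSV."""
    with open(pickle_path, "rb") as fh:
        records = pickle.load(fh)  # assumed layout: a list of dicts, one per sample
    frame = pd.DataFrame(records)
    frame.to_csv(csv_path, index=False)
    return frame


# Tiny made-up input so the sketch runs on its own.
workdir = tempfile.mkdtemp()
demo_pickle = os.path.join(workdir, "features.pkl")  # hypothetical file name
demo_csv = os.path.join(workdir, "clean_data.csv")
with open(demo_pickle, "wb") as fh:
    pickle.dump(
        [
            {"label": "site_a", "pkt_count": 120, "total_bytes": 90500},
            {"label": "site_b", "pkt_count": 45, "total_bytes": 12800},
        ],
        fh,
    )

frame = package(demo_pickle, demo_csv)
print(frame)
```

Passing index=False keeps the DataFrame's row index out of the CSV, so a later pandas.read_csv call gets back only the feature columns.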

train_ml.py

This script trains a decision tree on the CSV produced by package_data.py and outputs a CSV listing every feature with its corresponding median accuracy. It requires Python 3 and the pandas, numpy, and scikit-learn (imported as sklearn) libraries. It can be run as follows:

python3 train_ml.py

This will create ml_ranks.csv as output.
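
One way to read "median accuracy per feature" is sketched below: fit a decision tree on a single feature over several train/test splits and take the median test accuracy. This is only an interpretation under stated assumptions, not the actual logic of train_ml.py; the column names, the synthetic data, the 70/30 split, and the number of runs are all made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def median_accuracy(frame, feature, label_col="label", n_runs=5):
    """Median test accuracy of a one-feature decision tree over several splits."""
    scores = []
    for seed in range(n_runs):
        x_train, x_test, y_train, y_test = train_test_split(
            frame[[feature]], frame[label_col], test_size=0.3, random_state=seed
        )
        tree = DecisionTreeClassifier(random_state=seed)
        tree.fit(x_train, y_train)
        scores.append(tree.score(x_test, y_test))
    return float(np.median(scores))


# Synthetic data (assumption): "useful" tracks the label, "noise" does not.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
demo = pd.DataFrame(
    {
        "label": labels,
        "useful": labels * 10 + rng.normal(0, 1, size=100),
        "noise": rng.normal(0, 1, size=100),
    }
)

features = ["useful", "noise"]
ranks = pd.DataFrame(
    {
        "feature": features,
        "median_accuracy": [median_accuracy(demo, f) for f in features],
    }
)
print(ranks)
```

Using the median rather than the mean keeps a single unlucky split from dominating a feature's score.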

pivot_data.R

An R script that pivots and ranks the CSV created by train_ml.py. It should be run on R 3.6.1. Running it is optional, but it pivots the data into a friendlier format for further analysis, including sorting.

This will create ml_ranks_cleaned.csv as output.

Instructions

To run this project, run the scripts above in order:

python3 extractor.py path/to/raw_data
python3 package_data.py
python3 train_ml.py
Rscript pivot_data.R (optional)

Alternatively, run pivot_data.R in RStudio.
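
The Python steps can also be chained from a small wrapper. The sketch below is a hypothetical helper (it is not part of this repository) that builds the three commands in order and runs them with subprocess, stopping at the first failure. It assumes it is called from the repository root and that python3 is on the PATH.

```python
import subprocess


def build_commands(raw_data_dir, python="python3"):
    """Return the three pipeline commands in the order they must run."""
    return [
        [python, "extractor.py", raw_data_dir],
        [python, "package_data.py"],
        [python, "train_ml.py"],
    ]


def run_pipeline(raw_data_dir):
    """Run each step in turn; check=True raises if a step exits non-zero."""
    for cmd in build_commands(raw_data_dir):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Show the commands without executing them.
for cmd in build_commands("path/to/raw_data"):
    print(" ".join(cmd))
```

Calling run_pipeline("path/to/raw_data") executes the steps for real. The optional R step is left out because it needs a separate R installation.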
