HierarchicalClusteringTaxonomies

Step 1: Data Preparation

The dataPreparation.ipynb notebook assumes the following file structure for the dataset: a folder called data\All with subfolders [Crime], each containing image files from security camera footage whose filenames start with [Crime][video_num]. Running the code creates random folds of the dataset, keeping all images that belong to the same video in the same fold and ensuring an approximately equal distribution of crime types across the folds.
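The fold assignment described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the filename pattern (crime name followed directly by a video number, such as `Arson003_0.png`), the function name `make_folds`, and the round-robin balancing strategy are all assumptions made for the example.

```python
import random
import re
from collections import defaultdict

# Assumed filename pattern: crime name, then video number, e.g. "Arson003_0.png".
FILENAME_RE = re.compile(r"^([A-Za-z]+)(\d+)")


def make_folds(filenames, n_folds=5, seed=0):
    """Assign images to folds so that all frames of one video share a fold."""
    # Group images by (crime, video number).
    videos = defaultdict(list)
    for name in filenames:
        match = FILENAME_RE.match(name)
        if match is None:
            raise ValueError(f"unexpected filename: {name}")
        videos[(match.group(1), int(match.group(2)))].append(name)

    # Group the video keys by crime type.
    by_crime = defaultdict(list)
    for crime, video in videos:
        by_crime[crime].append((crime, video))

    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for crime in sorted(by_crime):
        keys = sorted(by_crime[crime])
        rng.shuffle(keys)
        # Deal the shuffled videos of this crime round-robin over the folds,
        # starting at a random fold, so each fold gets a similar share.
        offset = rng.randrange(n_folds)
        for i, key in enumerate(keys):
            folds[(i + offset) % n_folds].extend(videos[key])
    return folds


# Hypothetical dataset: 2 crimes, 6 videos each, 3 frames per video.
filenames = [
    f"{crime}{video:03d}_{frame}.png"
    for crime in ("Arson", "Burglary")
    for video in range(6)
    for frame in range(3)
]
folds = make_folds(filenames, n_folds=3)
```

This sketch balances the number of videos per crime type, not the number of images, so the split is only approximately even when videos differ in length. scikit-learn's `StratifiedGroupKFold` provides a comparable grouped, stratified split.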

Step 2: Clustering

The clustering.ipynb notebook contains code that loads the whole dataset and uses a pretrained ResNet50 model to extract informative features by taking the output of the last layer before the softmax layer. It then uses the mean feature vector of each class as a data point and performs agglomerative clustering on these points. The silhouette score is calculated at each step of the clustering, which is useful for making illustrations. The notebook also contains the code for building the ontologies. It finds the places in the dendrogram with the largest difference in silhouette score between consecutive merges. These places represent "cuts" in the dendrogram of the clustering. It then constructs ontologies for the dataset in JSON format to be used later.
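As a rough illustration of the cut selection, the sketch below takes a list of merges and one silhouette score per merge, picks the merge after which the score changes most, and writes the resulting partition as JSON. Everything here is an assumption made for the example: the class names, the score values, the SciPy-style merge numbering, the use of the absolute difference, the choice to cut just before the jump, and the JSON layout. The notebook's own format may differ.

```python
import json


def largest_jumps(scores, n_cuts):
    """Return merge indices k where |scores[k + 1] - scores[k]| is largest."""
    diffs = [abs(scores[k + 1] - scores[k]) for k in range(len(scores) - 1)]
    ranked = sorted(range(len(diffs)), key=lambda k: diffs[k], reverse=True)
    return sorted(ranked[:n_cuts])


def partition_after(merges, labels, n_merges):
    """Apply the first n_merges merges and return the resulting clusters."""
    # Leaves are numbered 0..n-1; merge number s creates cluster n + s.
    clusters = {i: [label] for i, label in enumerate(labels)}
    for step, (a, b) in enumerate(merges[:n_merges]):
        clusters[len(labels) + step] = clusters.pop(a) + clusters.pop(b)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical class names and merge order for five classes.
labels = ["class_a", "class_b", "class_c", "class_d", "class_e"]
merges = [(1, 2), (3, 4), (5, 6), (0, 7)]
# Hypothetical silhouette scores after merges 0, 1 and 2. The final merge
# leaves a single cluster, for which the silhouette score is undefined.
scores = [0.30, 0.55, 0.20]

cuts = largest_jumps(scores, n_cuts=1)
ontology = {
    "levels": [
        {"after_merge": k, "clusters": partition_after(merges, labels, k + 1)}
        for k in cuts
    ]
}
ontology_json = json.dumps(ontology, indent=2)
```

In practice the scores could come from a function such as scikit-learn's `silhouette_score`, evaluated on the partition that remains after each merge.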

Step 3: Training

The trainAndTestAll.py file contains all the code to train and test the classifiers using k-fold cross-validation. It first creates some data structures needed to calculate the hF1 score later on. It then loops over all the test folds. For each test fold, it loops over all the ontologies and trains the classifiers on the remaining folds, which requires some data preprocessing to train the different classifiers represented by the nodes of the ontology. It then loops over all the ontologies again to calculate the test metrics.
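For readers unfamiliar with the hF1 score, the sketch below shows one common definition of hierarchical F1, in which each label is expanded to the set containing itself and its ancestors (root excluded) and precision and recall are computed over these sets. It is an assumption that trainAndTestAll.py uses this exact definition. The toy hierarchy and the names `ancestor_sets` and `hierarchical_f1` are hypothetical.

```python
def ancestor_sets(parent):
    """Map each node to the set of itself and its ancestors, root excluded."""
    sets = {}
    for node in parent:
        chain = set()
        current = node
        while parent[current] is not None:
            chain.add(current)
            current = parent[current]
        sets[node] = chain
    return sets


def hierarchical_f1(y_true, y_pred, anc):
    """Compute hF1 from the overlap between true and predicted ancestor sets."""
    overlap = sum(len(anc[t] & anc[p]) for t, p in zip(y_true, y_pred))
    if overlap == 0:
        return 0.0
    h_precision = overlap / sum(len(anc[p]) for p in y_pred)
    h_recall = overlap / sum(len(anc[t]) for t in y_true)
    return 2 * h_precision * h_recall / (h_precision + h_recall)


# Hypothetical two-level ontology, given as child -> parent.
parent = {
    "root": None,
    "violent": "root",
    "property": "root",
    "assault": "violent",
    "robbery": "violent",
    "burglary": "property",
}
anc = ancestor_sets(parent)

y_true = ["assault", "burglary"]
y_pred = ["robbery", "burglary"]
score = hierarchical_f1(y_true, y_pred, anc)
```

Under this definition, predicting "robbery" for an "assault" sample still earns partial credit, because both labels share the "violent" ancestor.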

WJ44/HierarchicalClusteringTaxonomies
