Scripts/notebooks for The Nature Conservancy's fish classification competition
- Current best score before blending: 0.886 log loss on the LB, from ensemble_model.ipynb using the vgg16bn architecture.
- Current best score after blending: 0.822 log loss on the LB, from blending ensemble_model.ipynb results with conv_model.ipynb results.
Fits a group of CNNs of (optionally) varying architectures on the image data; predicts on augmented test data
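A minimal sketch of the ensemble + test-time-augmentation idea, assuming a Keras setup with pretrained conv bases; the architectures, augmentation parameters, and variable names (e.g. `x_test`) are illustrative, not the notebook's exact code:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16, ResNet50
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def build_member(base_fn, n_classes=8):
    """One ensemble member: pretrained conv base + small dense head."""
    base = base_fn(include_top=False, weights="imagenet",
                   input_shape=(224, 224, 3), pooling="avg")
    x = layers.Dense(256, activation="relu")(base.output)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(base.input, out)

ensemble = [build_member(VGG16), build_member(ResNet50)]
# ... compile and fine-tune each member on the training images here ...

# Test-time augmentation: predict on several randomly augmented copies of the
# test set and average the probabilities, then average across the ensemble.
aug = ImageDataGenerator(rotation_range=10, width_shift_range=0.1,
                         height_shift_range=0.1, horizontal_flip=True)

def predict_tta(model, x_test, n_aug=5, batch_size=32):
    preds = []
    for _ in range(n_aug):
        flow = aug.flow(x_test, shuffle=False, batch_size=batch_size)
        preds.append(model.predict(flow))
    return np.mean(preds, axis=0)

# probs = np.mean([predict_tta(m, x_test) for m in ensemble], axis=0)
```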
Computes and saves the convolutional features from a pretrained VGG model w/ batch normalization; uses these features to train a new model on the image data
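A rough sketch of the precompute-then-train pattern. Keras ships no VGG16 variant with batch normalization, so plain VGG16 stands in for vgg16bn here; `x_train` / `y_train` (one-hot labels) are assumed to be preprocessed arrays already in memory:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

conv_base = VGG16(include_top=False, weights="imagenet", input_shape=(224, 224, 3))

# Run the frozen conv base over the images once and cache the features, so the
# small dense head can be retrained / iterated on cheaply afterwards.
train_feats = conv_base.predict(x_train, batch_size=32)
np.save("train_conv_feats.npy", train_feats)

head = models.Sequential([
    layers.Flatten(input_shape=train_feats.shape[1:]),   # (7, 7, 512) -> 25088
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(8, activation="softmax"),                # 8 classes in this competition
])
head.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
head.fit(train_feats, y_train, epochs=10, batch_size=64, validation_split=0.1)
```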
Uses pretrained PyTorch models with both train-time and test-time augmentation. Only the resnet architecture has been tested so far, but it currently produces the third-best single model (after "ensemble_model" and "conv_model").
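A rough PyTorch sketch of that approach, assuming a torchvision resnet and a non-shuffling test DataLoader; hyperparameters and the fine-tuning loop are placeholders:

```python
import torch
import torch.nn as nn
from torchvision import transforms, models

# The same random transforms are used at train time and (repeatedly) at test time.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

model = models.resnet34(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 8)   # replace the head for the 8 classes
# ... fine-tune with CrossEntropyLoss / SGD over an ImageFolder using `augment` ...

def predict_tta(model, test_loader, n_aug=5):
    """Average softmax outputs over several augmented passes of the test set.

    `test_loader` must not shuffle, so that rows stay aligned across passes.
    """
    model.eval()
    runs = []
    with torch.no_grad():
        for _ in range(n_aug):
            probs = [torch.softmax(model(x), dim=1) for x, _ in test_loader]
            runs.append(torch.cat(probs))
    return torch.stack(runs).mean(dim=0)
```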
Trains a model to distinguish images that contain fish from those that don't
Similar to ensemble model, but uses the "fish detector" model to segment out the "NoF" class first
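One plausible way to fold the detector's output into the final probabilities (my reading of the idea, not necessarily the notebook's exact scheme): treat 1 - p(fish) as the NoF probability and rescale the species probabilities by p(fish).

```python
import numpy as np

# p_fish:    (n,)   detector probability that each test image contains a fish
# p_species: (n, 7) species probabilities from the main classifier (no NoF column)
def merge_detector(p_fish, p_species):
    p_nof = (1.0 - p_fish)[:, None]          # probability mass for the "NoF" class
    p_rest = p_species * p_fish[:, None]     # rescale species probs so rows still sum to 1
    probs = np.hstack([p_rest, p_nof])
    # NOTE: columns must still be reordered to match the submission header.
    return probs
```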
Uses annotations to crop training images down to a bounding box surrounding the fish
Trains a model on the cropped data to predict the coordinates of the bounding box in test images
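A minimal sketch of what such a regressor can look like, assuming a Keras conv base with a 4-unit linear head trained on annotation-derived box coordinates (layer sizes and loss are placeholders):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(include_top=False, weights="imagenet",
             input_shape=(224, 224, 3), pooling="avg")
x = layers.Dense(256, activation="relu")(base.output)
bbox = layers.Dense(4, activation="linear", name="bbox")(x)   # x, y, width, height

regressor = models.Model(base.input, bbox)
regressor.compile(optimizer="adam", loss="mse")
# regressor.fit(x_train, bbox_train, ...)   # bbox_train comes from the annotations
```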
Similar to end_to_end, but uses the "bb regressor" model to try to crop test images first
Multi-input CNN using the Keras functional API to incorporate bounding box coordinates as a feature in the model; predicts both coordinates and class probabilities
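A sketch of the multi-input / multi-output wiring with the functional API; the conv base, layer sizes, and loss weights are assumptions, not the notebook's exact configuration:

```python
from tensorflow.keras import Input, layers, models
from tensorflow.keras.applications import VGG16

img_in = Input(shape=(224, 224, 3), name="image")
bb_in = Input(shape=(4,), name="bbox_in")        # x, y, w, h fed in as an extra feature

conv = VGG16(include_top=False, weights="imagenet", pooling="avg")(img_in)
hidden = layers.Dense(256, activation="relu")(layers.concatenate([conv, bb_in]))

class_out = layers.Dense(8, activation="softmax", name="class")(hidden)
bb_out = layers.Dense(4, activation="linear", name="bbox_out")(hidden)

model = models.Model([img_in, bb_in], [class_out, bb_out])
model.compile(optimizer="adam",
              loss={"class": "categorical_crossentropy", "bbox_out": "mse"},
              loss_weights={"class": 1.0, "bbox_out": 0.01})
# model.fit({"image": x_train, "bbox_in": bb_train},
#           {"class": y_train, "bbox_out": bb_train}, ...)
```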
Uses conv features of the image data as input to an SVM model
Same as above, but using gradient boosting instead of SVM
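A sketch of both variants on top of the cached conv features, assuming scikit-learn stand-ins (the notebooks' exact SVM / boosting implementations may differ) and integer labels `y_train_labels`:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier

# Conv features cached to disk by the feature-extraction step; all names here
# are placeholders.
train_feats = np.load("train_conv_feats.npy")
train_feats = train_feats.reshape(len(train_feats), -1)   # flatten the conv maps
test_feats = np.load("test_conv_feats.npy")
test_feats = test_feats.reshape(len(test_feats), -1)

svm = SVC(kernel="rbf", C=1.0, probability=True).fit(train_feats, y_train_labels)
gbm = GradientBoostingClassifier(n_estimators=200).fit(train_feats, y_train_labels)

svm_probs = svm.predict_proba(test_feats)
gbm_probs = gbm.predict_proba(test_feats)
```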
Uses a sliding window to feed subsets of image into CNN for classification
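A minimal sketch of the sliding-window idea, assuming a Keras classifier and a known NoF column index; the window size, stride, and window-selection rule are illustrative:

```python
import numpy as np

def sliding_windows(img, size=224, stride=112):
    """Yield overlapping square patches from an HxWx3 image array."""
    h, w = img.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield img[y:y + size, x:x + size]

def classify_by_window(model, img, nof_index):
    patches = np.stack(list(sliding_windows(img)))
    probs = model.predict(patches)               # (n_windows, n_classes)
    fish_conf = 1.0 - probs[:, nof_index]        # how "fishy" each window looks
    return probs[np.argmax(fish_conf)]           # use the most confident window
```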
Transformations that could be useful for preprocessing / image segmentation.
Combines submission files into simple ensemble
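A minimal sketch of the blending step, assuming standard submission CSVs with an `image` column followed by the class-probability columns, rows already aligned across files, and hypothetical file names:

```python
import numpy as np
import pandas as pd

subs = [pd.read_csv("ensemble_model_sub.csv"),   # hypothetical file names
        pd.read_csv("conv_model_sub.csv")]
class_cols = [c for c in subs[0].columns if c != "image"]

blend = subs[0].copy()
blend[class_cols] = np.mean([s[class_cols].values for s in subs], axis=0)
blend.to_csv("blend_sub.csv", index=False)
```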
- Unable to get any improvement over vanilla CNNs using any of the bounding box, cropping, sliding window, or pre-filtering strategies above. In particular, the bb_end_to_end model scored significantly worse -- my theories include:
  - not enough accuracy from the bb regressor to make the strategy work
  - size/aspect ratio mismatches introduced at some point in the pipeline
  - user error somewhere in the pipeline (i.e. filenames not aligned with predictions, etc.)
- Relabeling the dataset also didn't lead to improved performance using these methods
- Unable to get comparable performance from the Resnet and Inception models; this might just be an issue with my implementation
(credit to Jeremy Howard, Pai Peng, Naive Shuai, Craig Glastonbury, and others for portions of this code)