Scripts/notebooks for The Nature Conservancy's fish classification competition
- Current best score before blending: 0.886 log loss on the LB, from ensemble_model.ipynb using the vgg16bn architecture.
- Current best score after blending: 0.822 log loss on the LB, from blending ensemble_model.ipynb results with conv_model.ipynb results.
Fits a group of CNNs of (optionally) varying architectures on the image data; predicts on augmented test data
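A minimal sketch of the ensemble + test-time-augmentation idea, assuming a Keras setup with pretrained conv bases; the architectures, augmentation parameters, and variable names (e.g. `x_test`) are illustrative, not the notebook's exact code:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16, ResNet50
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def build_member(base_fn, n_classes=8):
    """One ensemble member: pretrained conv base + small dense head."""
    base = base_fn(include_top=False, weights="imagenet",
                   input_shape=(224, 224, 3), pooling="avg")
    x = layers.Dense(256, activation="relu")(base.output)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(base.input, out)

ensemble = [build_member(VGG16), build_member(ResNet50)]
# ... compile and fine-tune each member on the training images here ...

# Test-time augmentation: predict on several randomly augmented copies of the
# test set and average the probabilities, then average across the ensemble.
aug = ImageDataGenerator(rotation_range=10, width_shift_range=0.1,
                         height_shift_range=0.1, horizontal_flip=True)

def predict_tta(model, x_test, n_aug=5, batch_size=32):
    preds = []
    for _ in range(n_aug):
        flow = aug.flow(x_test, shuffle=False, batch_size=batch_size)
        preds.append(model.predict(flow))
    return np.mean(preds, axis=0)

# probs = np.mean([predict_tta(m, x_test) for m in ensemble], axis=0)
```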
Computes and saves the convolutional features from a pretrained VGG model w/ batch normalization; uses these features to train a new model on the image data
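A rough sketch of the precompute-then-train pattern. Keras ships no VGG16 variant with batch normalization, so plain VGG16 stands in for vgg16bn here; `x_train` / `y_train` (one-hot labels) are assumed to be preprocessed arrays already in memory:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

conv_base = VGG16(include_top=False, weights="imagenet", input_shape=(224, 224, 3))

# Run the frozen conv base over the images once and cache the features, so the
# small dense head can be retrained / iterated on cheaply afterwards.
train_feats = conv_base.predict(x_train, batch_size=32)
np.save("train_conv_feats.npy", train_feats)

head = models.Sequential([
    layers.Flatten(input_shape=train_feats.shape[1:]),   # (7, 7, 512) -> 25088
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(8, activation="softmax"),                # 8 classes in this competition
])
head.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
head.fit(train_feats, y_train, epochs=10, batch_size=64, validation_split=0.1)
```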
Uses pretrained PyTorch models with both train-time and test-time augmentation. Only the resnet architecture has been tested so far, but it currently produces the third-best single model (after "ensemble_model" and "conv_model").
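A rough PyTorch sketch of that approach, assuming a torchvision resnet and a non-shuffling test DataLoader; hyperparameters and the fine-tuning loop are placeholders:

```python
import torch
import torch.nn as nn
from torchvision import transforms, models

# The same random transforms are used at train time and (repeatedly) at test time.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

model = models.resnet34(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 8)   # replace the head for the 8 classes
# ... fine-tune with CrossEntropyLoss / SGD over an ImageFolder using `augment` ...

def predict_tta(model, test_loader, n_aug=5):
    """Average softmax outputs over several augmented passes of the test set.

    `test_loader` must not shuffle, so that rows stay aligned across passes.
    """
    model.eval()
    runs = []
    with torch.no_grad():
        for _ in range(n_aug):
            probs = [torch.softmax(model(x), dim=1) for x, _ in test_loader]
            runs.append(torch.cat(probs))
    return torch.stack(runs).mean(dim=0)
```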
Trains a model to distinguish images that contain fish from those that don't
Similar to ensemble model, but uses the "fish detector" model to segment out the "NoF" class first
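One plausible way to fold the detector's output into the final probabilities (my reading of the idea, not necessarily the notebook's exact scheme): treat 1 - p(fish) as the NoF probability and rescale the species probabilities by p(fish).

```python
import numpy as np

# p_fish:    (n,)   detector probability that each test image contains a fish
# p_species: (n, 7) species probabilities from the main classifier (no NoF column)
def merge_detector(p_fish, p_species):
    p_nof = (1.0 - p_fish)[:, None]          # probability mass for the "NoF" class
    p_rest = p_species * p_fish[:, None]     # rescale species probs so rows still sum to 1
    probs = np.hstack([p_rest, p_nof])
    # NOTE: columns must still be reordered to match the submission header.
    return probs
```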
Uses annotations to crop training images down to a bounding box surrounding the fish
Trains a model on the cropped data to predict the coordinates of the bounding box in test images
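A minimal sketch of what such a regressor can look like, assuming a Keras conv base with a 4-unit linear head trained on annotation-derived box coordinates (layer sizes and loss are placeholders):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(include_top=False, weights="imagenet",
             input_shape=(224, 224, 3), pooling="avg")
x = layers.Dense(256, activation="relu")(base.output)
bbox = layers.Dense(4, activation="linear", name="bbox")(x)   # x, y, width, height

regressor = models.Model(base.input, bbox)
regressor.compile(optimizer="adam", loss="mse")
# regressor.fit(x_train, bbox_train, ...)   # bbox_train comes from the annotations
```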
Similar to end_to_end, but uses the "bb regressor" model to try to crop test images first
Multi-input CNN using the Keras functional API to incorporate bounding box coordinates as a feature in the model; predicts both coordinates and class probabilities
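A sketch of the multi-input / multi-output wiring with the functional API; the conv base, layer sizes, and loss weights are assumptions, not the notebook's exact configuration:

```python
from tensorflow.keras import Input, layers, models
from tensorflow.keras.applications import VGG16

img_in = Input(shape=(224, 224, 3), name="image")
bb_in = Input(shape=(4,), name="bbox_in")        # x, y, w, h fed in as an extra feature

conv = VGG16(include_top=False, weights="imagenet", pooling="avg")(img_in)
hidden = layers.Dense(256, activation="relu")(layers.concatenate([conv, bb_in]))

class_out = layers.Dense(8, activation="softmax", name="class")(hidden)
bb_out = layers.Dense(4, activation="linear", name="bbox_out")(hidden)

model = models.Model([img_in, bb_in], [class_out, bb_out])
model.compile(optimizer="adam",
              loss={"class": "categorical_crossentropy", "bbox_out": "mse"},
              loss_weights={"class": 1.0, "bbox_out": 0.01})
# model.fit({"image": x_train, "bbox_in": bb_train},
#           {"class": y_train, "bbox_out": bb_train}, ...)
```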
Uses conv features of the image data as input to an SVM model
Same as above, but using gradient boosting instead of SVM
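A sketch of both variants on top of the cached conv features, assuming scikit-learn stand-ins (the notebooks' exact SVM / boosting implementations may differ) and integer labels `y_train_labels`:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier

# Conv features cached to disk by the feature-extraction step; all names here
# are placeholders.
train_feats = np.load("train_conv_feats.npy")
train_feats = train_feats.reshape(len(train_feats), -1)   # flatten the conv maps
test_feats = np.load("test_conv_feats.npy")
test_feats = test_feats.reshape(len(test_feats), -1)

svm = SVC(kernel="rbf", C=1.0, probability=True).fit(train_feats, y_train_labels)
gbm = GradientBoostingClassifier(n_estimators=200).fit(train_feats, y_train_labels)

svm_probs = svm.predict_proba(test_feats)
gbm_probs = gbm.predict_proba(test_feats)
```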
Uses a sliding window to feed subsets of image into CNN for classification
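A minimal sketch of the sliding-window idea, assuming a Keras classifier and a known NoF column index; the window size, stride, and window-selection rule are illustrative:

```python
import numpy as np

def sliding_windows(img, size=224, stride=112):
    """Yield overlapping square patches from an HxWx3 image array."""
    h, w = img.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield img[y:y + size, x:x + size]

def classify_by_window(model, img, nof_index):
    patches = np.stack(list(sliding_windows(img)))
    probs = model.predict(patches)               # (n_windows, n_classes)
    fish_conf = 1.0 - probs[:, nof_index]        # how "fishy" each window looks
    return probs[np.argmax(fish_conf)]           # use the most confident window
```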
Transformations that could be useful for preprocessing / image segmentation.
Combines submission files into simple ensemble
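A minimal sketch of the blending step, assuming standard submission CSVs with an `image` column followed by the class-probability columns, rows already aligned across files, and hypothetical file names:

```python
import numpy as np
import pandas as pd

subs = [pd.read_csv("ensemble_model_sub.csv"),   # hypothetical file names
        pd.read_csv("conv_model_sub.csv")]
class_cols = [c for c in subs[0].columns if c != "image"]

blend = subs[0].copy()
blend[class_cols] = np.mean([s[class_cols].values for s in subs], axis=0)
blend.to_csv("blend_sub.csv", index=False)
```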
- Unable to get any improvement over vanilla CNNs using any of the bounding box, cropping, sliding window, or pre-filtering strategies above. In particular, the bb_end_to_end model scored significantly worse -- my theories include:
  - not enough accuracy from the bb regressor to make the strategy work
  - size/aspect ratio mismatches introduced at some point in the pipeline
  - user error somewhere in the pipeline (i.e. filenames not aligned with predictions, etc.)
- Relabeling the dataset also didn't lead to improved performance using these methods
- Unable to get comparable performance from the Resnet and Inception models; this might just be an issue with my implementation
(credit to Jeremy Howard, Pai Peng, Naive Shuai, Craig Glastonbury, and others for portions of this code)