{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# SSD300 MS COCO Evaluation Tutorial\n", "\n", "This brief tutorial shows how to evaluate a trained SSD300 on one of the MS COCO datasets using the official MS COCO Python tools available here:\n", "\n", "https://github.com/cocodataset/cocoapi\n", "\n", "Follow the instructions in the GitHub repository above to install `pycocotools`. Note that you will need to set the path to your local copy of the `PythonAPI` directory in the next code cell.\n", "\n", "The evaluation procedure described here is identical for SSD512; you just need to build a different model." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from keras import backend as K\n", "from keras.models import load_model\n", "from keras.optimizers import Adam\n", "from scipy.misc import imread\n", "import numpy as np\n", "from matplotlib import pyplot as plt\n", "import sys\n", "\n", "# TODO: Specify the directory that contains the `pycocotools` here.\n", "pycocotools_dir = '../cocoapi/PythonAPI/'\n", "if pycocotools_dir not in sys.path:\n", " sys.path.insert(0, pycocotools_dir)\n", "\n", "from pycocotools.coco import COCO\n", "from pycocotools.cocoeval import COCOeval\n", "\n", "from models.keras_ssd300 import ssd_300\n", "from keras_loss_function.keras_ssd_loss import SSDLoss\n", "from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes\n", "from keras_layers.keras_layer_DecodeDetections import DecodeDetections\n", "from keras_layers.keras_layer_DecodeDetectionsFast import DecodeDetectionsFast\n", "from keras_layers.keras_layer_L2Normalization import L2Normalization\n", "from data_generator.object_detection_2d_data_generator import DataGenerator\n", "from eval_utils.coco_utils import get_coco_category_maps, predict_all_to_json\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true 
}, "outputs": [], "source": [ "# Set the input image size for the model.\n", "img_height = 300\n", "img_width = 300" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Load a trained SSD\n", "\n", "Either load a trained model or build a model and load trained weights into it. Since the HDF5 files I'm providing contain only the weights for the various SSD versions, not the complete models, you'll have to go with the latter option when using this implementation for the first time. You can then of course save the model and next time load the full model directly, without having to build it.\n", "\n", "You can find the download links to all the trained model weights in the README." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1. Build the model and load trained weights into it" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# 1: Build the Keras model\n", "\n", "K.clear_session() # Clear previous models from memory.\n", "\n", "model = ssd_300(image_size=(img_height, img_width, 3),\n", " n_classes=80,\n", " mode='inference',\n", " l2_regularization=0.0005,\n", " scales=[0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05], # The scales for Pascal VOC are [0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05]\n", " aspect_ratios_per_layer=[[1.0, 2.0, 0.5],\n", " [1.0, 2.0, 0.5, 3.0, 1.0/3.0],\n", " [1.0, 2.0, 0.5, 3.0, 1.0/3.0],\n", " [1.0, 2.0, 0.5, 3.0, 1.0/3.0],\n", " [1.0, 2.0, 0.5],\n", " [1.0, 2.0, 0.5]],\n", " two_boxes_for_ar1=True,\n", " steps=[8, 16, 32, 64, 100, 300],\n", " offsets=[0.5, 0.5, 0.5, 0.5, 0.5, 0.5],\n", " clip_boxes=False,\n", " variances=[0.1, 0.1, 0.2, 0.2],\n", " normalize_coords=True,\n", " subtract_mean=[123, 117, 104],\n", " swap_channels=[2, 1, 0],\n", " confidence_thresh=0.01,\n", " iou_threshold=0.45,\n", " top_k=200,\n", " nms_max_output_size=400)\n", "\n", "# 2: Load the trained weights into the model.\n", "\n", "# TODO: Set the path of the trained 
weights.\n", "weights_path = 'path/to/trained/weights/VGG_coco_SSD_300x300_iter_400000.h5'\n", "\n", "model.load_weights(weights_path, by_name=True)\n", "\n", "# 3: Compile the model so that Keras won't complain the next time you load it.\n", "\n", "adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)\n", "\n", "ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)\n", "\n", "model.compile(optimizer=adam, loss=ssd_loss.compute_loss)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2. Load a trained model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# TODO: Set the path to the `.h5` file of the model to be loaded.\n", "model_path = 'path/to/trained/model.h5'\n", "\n", "# We need to create an SSDLoss object in order to pass that to the model loader.\n", "ssd_loss = SSDLoss(neg_pos_ratio=3, n_neg_min=0, alpha=1.0)\n", "\n", "K.clear_session() # Clear previous models from memory.\n", "\n", "model = load_model(model_path, custom_objects={'AnchorBoxes': AnchorBoxes,\n", " 'L2Normalization': L2Normalization,\n", " 'DecodeDetections': DecodeDetections,\n", " 'compute_loss': ssd_loss.compute_loss})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Create a data generator for the evaluation dataset\n", "\n", "Instantiate a `DataGenerator` that will serve the evaluation dataset during the prediction phase." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "dataset = DataGenerator()\n", "\n", "# TODO: Set the paths to the dataset here.\n", "MS_COCO_dataset_images_dir = '../../datasets/MicrosoftCOCO/val2017/'\n", "MS_COCO_dataset_annotations_filename = '../../datasets/MicrosoftCOCO/annotations/instances_val2017.json'\n", "\n", "dataset.parse_json(images_dirs=[MS_COCO_dataset_images_dir],\n", " annotations_filenames=[MS_COCO_dataset_annotations_filename],\n", " ground_truth_available=False, # It doesn't matter whether you set this `True` or `False` because the ground truth won't be used anyway, but the parsing goes faster if you don't load the ground truth.\n", " include_classes='all',\n", " ret=False)\n", "\n", "# We need the `classes_to_cats` dictionary. Read the documentation of this function to understand why.\n", "cats_to_classes, classes_to_cats, cats_to_names, classes_to_names = get_coco_category_maps(MS_COCO_dataset_annotations_filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Run the predictions over the evaluation dataset\n", "\n", "Now that we have instantiated a model and a data generator to serve the dataset, we can make predictions on the entire dataset and save them to a JSON file in the format that `COCOeval` expects for the evaluation.\n", "\n", "Read the documentation to learn what the arguments mean. The values preset below are the parameters used in the evaluation of the original Caffe models." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# TODO: Set the desired output file name and the batch size.\n", "results_file = 'detections_val2017_ssd300_results.json'\n", "batch_size = 20 # Ideally, choose a batch size that divides the number of images in the dataset." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of images in the evaluation dataset: 5000\n", "Producing results file: 100%|██████████| 250/250 [04:11<00:00, 1.05s/it]\n", "Prediction results saved in 'detections_val2017_ssd300_results.json'\n" ] } ], "source": [ "predict_all_to_json(out_file=results_file,\n", " model=model,\n", " img_height=img_height,\n", " img_width=img_width,\n", " classes_to_cats=classes_to_cats,\n", " data_generator=dataset,\n", " batch_size=batch_size,\n", " data_generator_mode='resize',\n", " model_mode='inference',\n", " confidence_thresh=0.01,\n", " iou_threshold=0.45,\n", " top_k=200,\n", " normalize_coords=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Run the evaluation\n", "\n", "Now we'll load the JSON file containing all the predictions that we produced in the last step and feed it to `COCOeval`. Note that the evaluation may take a while." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "loading annotations into memory...\n", "Done (t=0.46s)\n", "creating index...\n", "index created!\n", "Loading and preparing results...\n", "DONE (t=5.87s)\n", "creating index...\n", "index created!\n" ] } ], "source": [ "coco_gt = COCO(MS_COCO_dataset_annotations_filename)\n", "coco_dt = coco_gt.loadRes(results_file)\n", "image_ids = sorted(coco_gt.getImgIds())" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running per image evaluation...\n", "Evaluate annotation type *bbox*\n", "DONE (t=64.15s).\n", "Accumulating evaluation results...\n", "DONE (t=10.58s).\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.247\n", " Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.424\n", " Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.253\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.059\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.264\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.414\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.232\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.341\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.362\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.102\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.401\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.577\n" ] } ], "source": [ "cocoEval = COCOeval(cocoGt=coco_gt,\n", " cocoDt=coco_dt,\n", " iouType='bbox')\n", "cocoEval.params.imgIds = image_ids\n", "cocoEval.evaluate()\n", "cocoEval.accumulate()\n", "cocoEval.summarize()" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.3" } }, "nbformat": 4, "nbformat_minor": 2 }