Anchors for Object Detection

This repo has two purposes:

  1. generate anchors for one-stage detection models' predictions
  2. assign targets to the anchors for computing the loss

Anchor-Maker

Anchor-based detection models have mainly been developed along four series:

  • Faster R-CNN
  • YOLO
  • RetinaNet
  • SSD

Each series has its own manner of creating anchor priors, creating anchors, and combining them with the model's regression predictions. Despite slight variations from model to model, the models within a series basically share the same manner.

Now let's look at some examples of how Anchor-Maker provides anchors to each series 🔥

YOLO

import torch

# YOLOv3 anchor sizes (w, h) in pixels, grouped per stride
img_size = 416
anchor_sizes = [
    [(10, 13), (16, 30), (33, 23)],
    [(30, 61), (62, 45), (59, 119)],
    [(116, 90), (156, 198), (373, 326)]
]
strides = [8, 16, 32]
anchors = yolo_anchors(img_size, anchor_sizes, strides)

# pred[i]: raw model output for the i-th stride, last dim starting with (x, y, w, h)
for i, s in enumerate(strides):
    pred[i][..., :2] = torch.sigmoid(pred[i][..., :2]) * s + anchors[i][..., :2]
    pred[i][..., 2:4] = torch.exp(pred[i][..., 2:4]) * anchors[i][..., 2:]

Note that in YOLO, unlike the other series, the anchors and model predictions are lists of per-stride tensors rather than single tensors flattened over all strides.
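
For intuition, here is a minimal sketch of how such per-stride anchor grids could be laid out. It is illustrative only; the repo's yolo_anchors may use a different layout (e.g. flattened grids, or offsets instead of centers).

import torch

def make_grid_anchors(img_size, anchor_sizes, strides):
    # one tensor per stride, shaped (H, W, A, 4) as (cx, cy, w, h) in pixels
    anchors = []
    for sizes, stride in zip(anchor_sizes, strides):
        feat = img_size // stride
        ys, xs = torch.meshgrid(torch.arange(feat), torch.arange(feat), indexing='ij')
        cx = (xs.float() + 0.5) * stride                # grid-cell centers
        cy = (ys.float() + 0.5) * stride
        wh = torch.tensor(sizes, dtype=torch.float32)   # (A, 2) anchor widths/heights
        level = torch.zeros(feat, feat, len(sizes), 4)
        level[..., 0] = cx[..., None]
        level[..., 1] = cy[..., None]
        level[..., 2:] = wh                             # broadcast over the grid
        anchors.append(level)
    return anchors

With img_size = 416, this gives 52×52, 26×26, and 13×13 grids for strides 8, 16, and 32, each carrying 3 anchors per cell.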

RetinaNet

# FPN levels P3-P7: one base size per level, 3 scales x 3 aspect ratios = 9 anchors per cell
img_size = 608
anchor_sizes = [32, 64, 128, 256, 512]
anchor_scales = [1, 2**(1/3), 2**(2/3)]
aspect_ratios = [(2**(1/2), 2**(-1/2)), (1, 1), (2**(-1/2), 2**(1/2))]
strides = [8, 16, 32, 64, 128]

anchors = retinanet_anchors(img_size, anchor_sizes, anchor_scales, aspect_ratios, strides)

# pred: regression output flattened over all strides, last dim starting with (x, y, w, h)
pred[..., :2] = anchors[..., :2] + (pred[..., :2] * anchors[..., 2:])
pred[..., 2:4] = torch.exp(pred[..., 2:4]) * anchors[..., 2:]
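
To show where those 9 anchors per cell come from, here is a minimal sketch of the per-cell anchor shapes for a single level. It is only an illustration (the function name level_anchor_shapes is hypothetical), not the repo's retinanet_anchors, which also tiles the shapes over every cell of every level.

import torch

def level_anchor_shapes(base_size, scales, ratios):
    # every (scale, aspect-ratio) pair gives one (w, h): 3 x 3 = 9 shapes per grid cell
    shapes = [(base_size * s * rw, base_size * s * rh) for s in scales for rw, rh in ratios]
    return torch.tensor(shapes)

# e.g. the stride-8 level uses base size 32
print(level_anchor_shapes(32, [1, 2**(1/3), 2**(2/3)], [(2**0.5, 2**-0.5), (1, 1), (2**-0.5, 2**0.5)]))

The full anchor tensor tiles these shapes over every cell of every level and flattens the result, which is why pred and anchors above share a single flattened dimension.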

SSD

# SSD300 configuration: each level's prior size, the next level's size, and priors per cell
img_size = 300
anchor_sizes = [21, 45, 99, 153, 207, 261]
upper_sizes = [45, 99, 153, 207, 261, 315]
strides = [8, 16, 32, 64, 100, 300]
num_anchors = [4, 6, 6, 6, 4, 4]

anchors = ssd_anchors(img_size, anchor_sizes, upper_sizes, strides, num_anchors)

# standard SSD box-encoding variances
center_variance, size_variance = 0.1, 0.2

pred[..., :2] = anchors[..., :2] + (center_variance * pred[..., :2] * anchors[..., 2:])
pred[..., 2:4] = torch.exp(size_variance * pred[..., 2:4]) * anchors[..., 2:]
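
For reference, SSD conventionally derives each cell's priors from the level size and the next level's size: a square prior at the level size, a square prior at the geometric mean of the two, and aspect-ratio priors (2 and 1/2, plus 3 and 1/3 on the 6-anchor levels). A minimal sketch of that rule, separate from the repo's ssd_anchors:

import math

def ssd_prior_shapes(size, upper_size, num_anchors):
    extra = math.sqrt(size * upper_size)            # geometric mean of s_k and s_(k+1)
    shapes = [(size, size), (extra, extra)]
    ratios = [2, 1/2] if num_anchors == 4 else [2, 1/2, 3, 1/3]
    for r in ratios:
        shapes.append((size * math.sqrt(r), size / math.sqrt(r)))
    return shapes

# first level of the 300x300 config: 4 priors per cell
print(ssd_prior_shapes(21, 45, num_anchors=4))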

Anchor-Assigner

To compute a detection model's loss, predictions have to be matched with target labels. Anchor-Assigner assigns targets to the anchors, then returns the indices of the target-assigned anchors and the labels of the assigned targets.

This implementation applies to all series of detection models.
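
Conceptually, assignment is IoU matching between anchors and ground-truth boxes: anchors whose best IoU clears a foreground threshold are matched to that target, anchors below a background threshold become negatives, and the rest are ignored. A minimal sketch of that idea, assuming (x1, y1, x2, y2) boxes and torchvision's box_iou, not the repo's actual implementation:

import torch
from torchvision.ops import box_iou

def simple_assign(gt_boxes, anchors, fg_thresh=0.5, bg_thresh=0.4):
    # gt_boxes: (M, 4), anchors: (N, 4), both as (x1, y1, x2, y2)
    iou = box_iou(anchors, gt_boxes)        # (N, M) pairwise IoU
    best_iou, best_gt = iou.max(dim=1)      # best-matching target per anchor
    fore_idx = torch.nonzero(best_iou >= fg_thresh).squeeze(1)
    back_idx = torch.nonzero(best_iou < bg_thresh).squeeze(1)
    return fore_idx, best_gt[fore_idx], back_idx

With this repo, the same matching is handled by the assigner; the arguments 0.5 and 0.4 below are RetinaNet's usual foreground and background IoU thresholds.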

assigner = retinanet_assigner(0.5, 0.4)
assigns = assigner(labels, anchors)

for pred, assign in zip(preds, assigns):
    # indices of target-assigned (foreground) and background anchors, plus assigned labels
    fore_idx, fore_label = assign['foreground']
    back_idx, _ = assign['background']

    cls_loss = self.focal_loss(pred[..., 4:], fore_idx, back_idx, fore_label[..., 4:])
    reg_loss = self.smoothL1_loss(pred[..., :4], anchors, fore_idx, fore_label[..., :4])
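
To make the role of those indices concrete, here is a hedged sketch of how the gathered tensors can feed standard losses. The repo's focal_loss and smoothL1_loss wrappers take the indices directly; the shapes, the use of raw logits, and the plain BCE standing in for focal loss below are all assumptions for illustration.

import torch
import torch.nn.functional as F

# pred: (N, 4 + C) predictions for one image; fore_label: (num_fore, 4 + C) assigned targets
pos_cls = pred[fore_idx, 4:]    # class scores of target-assigned anchors
neg_cls = pred[back_idx, 4:]    # class scores of background anchors
pos_box = pred[fore_idx, :4]    # box predictions of foreground anchors only

# classification uses foreground and background anchors; regression uses foreground only
cls_loss = F.binary_cross_entropy_with_logits(
    torch.cat([pos_cls, neg_cls]),
    torch.cat([fore_label[:, 4:], torch.zeros_like(neg_cls)])
)
reg_loss = F.smooth_l1_loss(pos_box, fore_label[:, :4])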

License

BSD 3-Clause License Copyright (c) 2022