Venue: CVPR 2016

Key Question

  • The training set is distinguished by a large imbalance between the small number of annotated objects and the vast number of background examples.

Contribution

  • Makes training more effective and efficient.
  • OHEM is a simple and intuitive algorithm that eliminates several heuristics and hyperparameters commonly used in region-based ConvNets.
  • Candidate examples are subsampled according to a distribution that favors diverse, high-loss instances.
  • It yields consistent and significant boosts in mean average precision.
  • Its effectiveness increases as the training set becomes larger and more difficult, as demonstrated by results on the MS COCO dataset.

Architecture

Fast R-CNN

Training Region-based Object Detectors with Online Hard Example Mining

OHEM

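The core online selection step can be sketched as follows (a minimal sketch; the function and variable names are mine, not the paper's): after a forward pass over all RoIs in the mini-batch, rank them by loss and backpropagate only the hardest ones, so easy examples contribute zero gradient.

```python
import numpy as np

def select_hard_examples(roi_losses, batch_size):
    """Pick the batch_size RoIs with the highest loss; only these
    contribute to the backward pass (all others get zero gradient)."""
    order = np.argsort(roi_losses)[::-1]  # indices sorted by descending loss
    return order[:batch_size]

# toy example: 6 candidate RoIs, keep the 2 hardest
losses = np.array([0.1, 2.3, 0.05, 1.7, 0.4, 0.9])
hard = select_hard_examples(losses, 2)  # → indices [1, 3]
```

Because the selection happens inside each SGD iteration, the mined set always reflects the current model, unlike offline bootstrapping.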

Experiments


Conclusion

  • OHEM eliminates several heuristics and hyperparameters in common use by automatically selecting hard examples, thus simplifying training.
  • Though we used Fast R-CNN throughout this paper, OHEM can be used for training any region-based ConvNet detector.

Unknown Key Words

  • bootstrapping (hard negative mining) relies on the aforementioned alternation template: (a) for some period of time, a fixed model is used to find new examples to add to the active training set; (b) then, for some period of time, the model is trained on the fixed active training set.
  • hard negative example = false positive example (a background region the current model confidently scores as an object)
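The mining half of the alternation, step (a), can be illustrated with a toy example (everything here is hypothetical; a simple score threshold stands in for the frozen detector):

```python
def mine_hard_negatives(scores, labels, threshold):
    """(a) Model fixed: collect false positives, i.e. background
    examples (label 0) that the current model scores above threshold."""
    return [i for i, (s, y) in enumerate(zip(scores, labels))
            if y == 0 and s > threshold]

# toy data: detection scores and ground-truth labels (1 = object, 0 = background)
scores = [0.9, 0.2, 0.8, 0.1]
labels = [1, 0, 0, 1]
hard_negatives = mine_hard_negatives(scores, labels, 0.5)  # → [2]
```

In step (b), the mined indices would be added to the active training set before the model is trained again on that fixed set.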

Questions

  • However, there is a small caveat: co-located RoIs with high overlap are likely to have correlated losses.
    • Use standard non-maximum suppression (NMS) to deduplicate the hard examples before selection.
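The NMS-based deduplication can be sketched as follows (a minimal NumPy sketch, not the paper's implementation): rank RoIs by loss, then greedily keep the highest-loss RoI and suppress co-located RoIs whose IoU with it exceeds a threshold, so correlated losses are not selected twice.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms_dedup(boxes, losses, iou_thresh=0.7):
    """Greedy NMS driven by loss: repeatedly keep the highest-loss box
    and suppress co-located boxes overlapping it above iou_thresh."""
    order = np.argsort(losses)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        ious = np.array([iou(boxes[i], boxes[j]) for j in rest])
        order = rest[ious <= iou_thresh]
    return keep

# two nearly identical boxes plus one distinct box
boxes = [[0, 0, 10, 10], [0, 0, 10, 11], [50, 50, 60, 60]]
losses = np.array([1.0, 2.0, 0.5])
keep = nms_dedup(boxes, losses)  # → [1, 2]: box 0 is suppressed by box 1
```

Hard-example selection then proceeds on the surviving RoIs only.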

Self-Learning

  • Plain SGD does not fit the bootstrapping alternation template: the model must be frozen for long stretches while new hard examples are mined, which makes training slow.
  • 2 methods of hard example mining:
    • Remove easy examples and add some hard examples.
    • Add false positives to the dataset and train the model again.
  • A proposal whose IoU with the ground truth falls in the interval [bg_lo, 0.5), with bg_lo = 0.1, is treated as background; this heuristic is helpful but ignores some infrequent, but important, difficult background regions.
  • OHEM is robust in case one needs fewer images per batch in order to reduce GPU memory usage.
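The bg_lo heuristic above can be sketched as follows (names are mine, not the paper's); OHEM removes this rule and lets loss-based selection decide instead:

```python
def background_rois(max_ious, bg_lo=0.1, bg_hi=0.5):
    """Fast R-CNN background heuristic (which OHEM removes): a RoI is
    a background example only if its best IoU with any ground-truth box
    falls in [bg_lo, bg_hi). RoIs below bg_lo are discarded, dropping
    some infrequent but important difficult background regions."""
    return [i for i, v in enumerate(max_ious) if bg_lo <= v < bg_hi]

# per-RoI maximum IoU with any ground-truth box
ious = [0.0, 0.05, 0.3, 0.6, 0.45]
bg = background_rois(ious)  # → [2, 4]; RoIs 0 and 1 are silently dropped
```

With OHEM, RoIs like indices 0 and 1 remain candidates and are kept whenever their loss is high.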

Related Articles: