Wednesday, May 4, 2016

[AMMAI] [Lecture 10] - "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"

Paper Information:
  Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems. 2015.

Motivation:
  For state-of-the-art object detection networks, the region proposal computation is a bottleneck.

Contributions:
  A nearly cost-free region proposal method: the Region Proposal Network (RPN), which shares full-image convolutional features with the detection network.

Technical summary:
  Region proposal networks:

    An RPN takes an image as input and outputs a set of rectangular object proposals, each with an objectness score. To generate proposals, a small network slides over the convolutional feature map, and each sliding window is mapped to a lower-dimensional vector. This vector is then fed into two sibling fully connected layers: a box-regression layer (reg) and a box-classification layer (cls).
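    The sliding-window head described here can be sketched in plain Python. This is only a toy illustration, not the authors' implementation: the names rpn_head, matvec and rand_matrix are made up for this post, biases are omitted, the weights are random, and the sizes are tiny (the paper uses a 3x3 window, a 256-d or 512-d intermediate vector, and k = 9 anchors per position).

```python
import random

def matvec(vec, weights):
    """Multiply a vector of length n by a matrix stored as n rows of m columns."""
    return [sum(v * row[j] for v, row in zip(vec, weights))
            for j in range(len(weights[0]))]

def rpn_head(fmap, w_mid, w_cls, w_reg):
    """fmap[y][x] is a list of C channel values. Returns per-position cls and reg outputs."""
    H, W, C = len(fmap), len(fmap[0]), len(fmap[0][0])
    zeros = [0.0] * C
    cls_out, reg_out = [], []
    for y in range(H):
        cls_row, reg_row = [], []
        for x in range(W):
            # Gather the 3x3 window around (y, x), zero-padding at the border.
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < H and 0 <= xx < W
                    window.extend(fmap[yy][xx] if inside else zeros)
            # Map the window to a lower-dimensional vector (with ReLU).
            hidden = [max(h, 0.0) for h in matvec(window, w_mid)]
            cls_row.append(matvec(hidden, w_cls))  # 2 scores per anchor
            reg_row.append(matvec(hidden, w_reg))  # 4 box offsets per anchor
        cls_out.append(cls_row)
        reg_out.append(reg_row)
    return cls_out, reg_out

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
H, W, C, d, k = 4, 5, 3, 6, 9  # toy sizes chosen for this example
fmap = [[[rng.gauss(0.0, 1.0) for _ in range(C)] for _ in range(W)] for _ in range(H)]
w_mid = rand_matrix(9 * C, d, rng)
w_cls = rand_matrix(d, 2 * k, rng)
w_reg = rand_matrix(d, 4 * k, rng)
cls_out, reg_out = rpn_head(fmap, w_mid, w_cls, w_reg)
```

    Every position of the feature map gets 2k classification scores and 4k regression values, all computed with the same three weight matrices.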

  Translation invariant:
    No matter where an object appears in the image, the same function should be able to predict its proposal: if the object is translated, the proposal should translate with it.
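    One way to see this property is through the anchors: the same set of reference boxes is placed at every sliding-window position, so only the center changes. The sketch below assumes the paper's default of 3 scales (128, 256, 512) and 3 aspect ratios (1:2, 1:1, 2:1); the function names base_anchors and anchors_at are hypothetical.

```python
def base_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) boxes (x1, y1, x2, y2) centered at the origin."""
    anchors = []
    for s in scales:
        for r in ratios:          # r is the height / width ratio
            w = s / r ** 0.5      # width shrinks as r grows
            h = s * r ** 0.5      # so that the area stays s * s
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

def anchors_at(cx, cy, base):
    """Translate the base anchors so they are centered at (cx, cy)."""
    return [(x1 + cx, y1 + cy, x2 + cx, y2 + cy) for x1, y1, x2, y2 in base]

base = base_anchors()
a = anchors_at(160, 96, base)    # anchors at one image location
b = anchors_at(480, 320, base)   # the same anchors at another location
```

    The boxes in b are exactly the boxes in a shifted by (320, 224), which is the sense in which the anchor set does not depend on position.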

  A Loss Function for Learning Region Proposals:

 
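    To train the RPN, they minimize a multi-task objective function, in which p_i is the predicted probability that anchor i is an object, p_i* is the ground-truth label (1 for a positive anchor, 0 for a negative one), t_i is a vector of the 4 parameterized coordinates of the predicted bounding box, and t_i* is the same vector for the ground-truth box associated with a positive anchor.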
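    For reference, the objective described in this paragraph can be written as follows. The original figure is not available here, so this is my transcription of the loss as I understand it from the paper and should be checked against it. L_cls is a log loss over the two classes (object vs. not object), L_reg is the smooth L1 loss, N_cls and N_reg are normalization terms, and lambda is a balancing weight.

```latex
L(\{p_i\}, \{t_i\}) =
  \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)
```

    The factor p_i* in the second term means the regression loss is active only for positive anchors.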



  Optimization: 
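    To avoid a bias toward negative samples, which dominate, they randomly sample 256 anchors per image to compute the loss of a mini-batch, with a positive-to-negative ratio of up to 1:1. If an image has fewer than 128 positive anchors, the mini-batch is padded with negative ones.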
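    A minimal sketch of this sampling rule, assuming each anchor already carries a label. The convention 1 = positive, 0 = negative, -1 = ignored is my own choice for the example, and sample_anchors is a hypothetical helper, not code from the paper.

```python
import random

def sample_anchors(labels, batch_size=256, max_pos_fraction=0.5, rng=None):
    """Pick up to batch_size anchor indices with at most half of them positive."""
    rng = rng or random.Random()
    pos = [i for i, label in enumerate(labels) if label == 1]
    neg = [i for i, label in enumerate(labels) if label == 0]
    n_pos = min(len(pos), int(batch_size * max_pos_fraction))
    # When positives are scarce, the remaining slots are filled with negatives.
    n_neg = min(len(neg), batch_size - n_pos)
    return rng.sample(pos, n_pos), rng.sample(neg, n_neg)

# An image with few positives: the batch is padded with negatives.
few_pos = [1] * 40 + [0] * 5000 + [-1] * 300
pos_idx, neg_idx = sample_anchors(few_pos, rng=random.Random(0))

# An image with many positives: the ratio is capped at 1:1.
many_pos = [1] * 300 + [0] * 1000
pos_idx2, neg_idx2 = sample_anchors(many_pos, rng=random.Random(0))
```

    In the first case the batch holds 40 positives and 216 negatives; in the second it holds 128 of each.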

  Sharing Convolutional Features for Region Proposal and Object Detection:
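    There are 4 steps for sharing convolutional features:
      1. Train the RPN.
      2. Train a separate Fast R-CNN detection network using the proposals generated by the step-1 RPN. At this point the two networks do not share conv layers.
      3. Use the detector network to initialize RPN training, but fix the shared conv layers and only fine-tune the layers unique to the RPN.
      4. Keeping the shared conv layers fixed, fine-tune the fc layers of the Fast R-CNN.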
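    The bookkeeping behind these steps can be sketched as below. Nothing is actually trained here: the strings are just placeholder tags for sets of weights, and run_alternating_training is a hypothetical name. The only point is to show which layers are updated in each step and why both networks end up with the same conv layers.

```python
def run_alternating_training():
    """Track which weights each network holds after each of the 4 steps."""
    log = []

    # Step 1: train the RPN (conv layers and RPN-specific layers are both updated).
    rpn = {"conv": "conv@step1", "rpn_layers": "rpn@step1"}
    log.append((1, "RPN", ["conv", "rpn_layers"]))

    # Step 2: train a separate Fast R-CNN on the step-1 proposals.
    # Its conv layers are its own, so nothing is shared yet.
    det = {"conv": "conv@step2", "fc": "fc@step2"}
    log.append((2, "Fast R-CNN", ["conv", "fc"]))

    # Step 3: initialize the RPN from the detector's conv layers, freeze them,
    # and fine-tune only the layers unique to the RPN.
    rpn = {"conv": det["conv"], "rpn_layers": "rpn@step3"}
    log.append((3, "RPN", ["rpn_layers"]))

    # Step 4: keep the conv layers frozen and fine-tune only the detector's fc layers.
    det = {"conv": det["conv"], "fc": "fc@step4"}
    log.append((4, "Fast R-CNN", ["fc"]))

    return rpn, det, log

rpn, det, log = run_alternating_training()
shared = rpn["conv"] == det["conv"]
```

    After step 4, shared is True: the conv weights are touched for the last time in step 2 and are used by both networks from step 3 onward.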

My comment:

    The picture below shows that RPN is much faster than Selective Search (SS). It also shows that a trained proposal network can outperform a hand-designed proposal algorithm.


  The picture below shows the recall of RPN proposals at different IoU ratios.
  It shows that RPN's recall stays quite stable as the number of proposals changes.
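  For readers unfamiliar with the metric, here is a small sketch of how recall at an IoU threshold can be computed. The boxes are made-up toy data, not results from the paper, and the exact evaluation protocol the authors used may differ in its details.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def recall_at(gt_boxes, proposals, thresh):
    """Fraction of ground-truth boxes covered by at least one proposal with IoU >= thresh."""
    hits = sum(1 for g in gt_boxes if any(iou(g, p) >= thresh for p in proposals))
    return hits / len(gt_boxes)

# Toy data: two ground-truth boxes and three proposals.
gt = [(0, 0, 10, 10), (20, 20, 40, 40)]
proposals = [(0, 0, 10, 6), (22, 22, 40, 40), (50, 50, 60, 60)]

recalls = {t: recall_at(gt, proposals, t) for t in (0.5, 0.7, 0.9)}
```

  With this toy data the best overlaps are 0.6 and 0.81, so recall drops from 1.0 at IoU 0.5 to 0.5 at IoU 0.7 and 0.0 at IoU 0.9.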

  Moreover, the comparison with Selective Search shows that RPN benefits from being trained. However, RPN needs bounding-box annotations for training and SS does not. SS therefore remains a good object proposal method, because bounding-box annotations for large-scale datasets are usually not easy to obtain.












