
[Tadeas] Image Detection

Date: 2019-10-18 00:03:51

Object Detection

Output: a bounding box (the minimum enclosing rectangle) and a category label

Differences from related tasks:

  1. single object: classification <---> classification and localization
  2. multiple objects: object detection <---> instance segmentation

Contests

ImageNet Large Scale Visual Recognition Challenge (ILSVRC)

URL: http://image-net.org/challenges/LSVRC/2017/index

  • object detection
  1. 2013-2017, 200 categories; an image may contain several categories, each object annotated with a bounding box
  • other tasks
  1. image classification
  2. scene classification
  3. object localization
  4. scene parsing

PASCAL VOC

URL: http://host.robots.ox.ac.uk/pascal/VOC/


  • evolution: R-CNN --> SPP-Net --> Fast R-CNN --> Faster R-CNN --> YOLO --> SSD --> R-FCN

R-CNN

  • model structure:
  1. Region proposal: generate diverse candidate boxes at different positions and sizes
  2. Classification: a CNN classifier, computationally heavy

    [Figure: R-CNN model]
    [Figure: R-CNN in models]
  • R-CNN training process
  1. pre-train a CNN model on ImageNet --> Model1
  2. fine-tune Model1 on proposal regions from SS (selective search) --> Model2
    -- Log loss
    -- softmax layer --> (N+1)-way
    -- 32 positive samples (N classes): IoU with ground truth > 0.5
    -- 96 negative samples (1 background class): IoU < 0.5
  3. based on Model2, train linear SVM classifiers on its Fc7 features --> Model3
    -- Hinge loss
    -- one SVM classifier per class (N classes)
    -- positive samples: all ground-truth regions
    -- negative samples: SS regions with IoU < 0.3
  4. based on Model2, train a bounding box regression model on its Conv5 features
    -- improve bounding box accuracy
    -- one regression model per class (N classes)
    --- regress SS bounding boxes: P --> G
    --- input:
    ----- bounding box pairs \(\{(P^i, G^i)\}_{i=1,...,N}\)
    ----- center: \((x, y)\), width and height: \((w, h)\)
    ----- \(P^i=(P^i_x, P^i_y, P^i_w, P^i_h)\), \(G^i=(G^i_x, G^i_y, G^i_w, G^i_h)\)
    ----- Conv5 feature of the CNN: \(\phi_5(P)\)
    --- only proposals P with IoU > 0.6 against a ground-truth box are used
    --- Squared loss (\(*\) is one of \(x, y, w, h\)):
    ----- \({\rm w}_* = \mathop{\arg\min}\limits_{\hat{\rm w}_*} \sum_{i=1}^{N} ( t_*^i - \hat{\rm w}_*^{\rm T} \phi_5(P^i) )^2 + \lambda ||\hat{\rm w}_*||^2\)
    ----- where \(t_x = (G_x-P_x)/P_w, t_y = (G_y-P_y)/P_h, t_w = \log(G_w/P_w), t_h = \log(G_h/P_h)\)
    --- test process:
    ----- \(\hat G_x = P_w d_x(P)+P_x, \hat G_y = P_h d_y(P)+P_y, \hat G_w = P_w \exp(d_w(P)), \hat G_h = P_h \exp(d_h(P)), d_*(P) = {\rm w}^{\rm T}_* \phi_5(P)\)

Explanation of the cost function: Ref
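The regression targets and the test-time transform can be checked with a small round trip. The following is a minimal sketch, not the original implementation: it assumes boxes are stored as (center x, center y, width, height), that offsets are normalised by the proposal's width and height as in the R-CNN parameterisation, and the function names `encode` and `decode` are hypothetical.

```python
import math

def encode(P, G):
    """Targets (t_x, t_y, t_w, t_h) that map proposal P onto ground truth G.

    Both boxes are (center_x, center_y, width, height) tuples.
    """
    px, py, pw, ph = P
    gx, gy, gw, gh = G
    return (
        (gx - px) / pw,      # t_x: center shift, in units of proposal width
        (gy - py) / ph,      # t_y: center shift, in units of proposal height
        math.log(gw / pw),   # t_w: log-scale change of width
        math.log(gh / ph),   # t_h: log-scale change of height
    )

def decode(P, d):
    """Apply predicted offsets d = (d_x, d_y, d_w, d_h) to proposal P."""
    px, py, pw, ph = P
    dx, dy, dw, dh = d
    return (
        pw * dx + px,
        ph * dy + py,
        pw * math.exp(dw),
        ph * math.exp(dh),
    )

# Illustrative boxes (made-up numbers).
P = (50.0, 60.0, 20.0, 40.0)
G = (55.0, 58.0, 30.0, 36.0)
t = encode(P, G)
G_hat = decode(P, t)  # a perfect regressor would output d == t
```

If the regressor predicted exactly `t`, `decode` would recover `G`; in practice `d` comes from the learned weights applied to the Conv5 features, which this sketch leaves out.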

  • R-CNN test
    -- SS (fast mode) produces about 2000 proposal regions per image
    -- resize (expand and scale) each region to 227*227
    -- run Model2 to extract features, then produce two results: a proposal-region subset and regressed Bboxes
    --- for each class
    1). Fc7 feature --> SVM classifier --> class score
    2). non-maximum suppression with IoU > 0.5 --> non-redundant region subset
      • sort all regions by score in decreasing order
      • remove redundant regions: those with IoU > 0.5 against the highest-scoring region
      • keep the highest-scoring region; the remaining regions go back to the candidate set
    3). Conv5 feature --> Bounding box regression model --> regressed Bbox
    4). refine the region subset with the regressed Bboxes
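The redundancy-removal step for one class is a greedy non-maximum suppression. Below is a minimal sketch under a few assumptions: boxes are (x1, y1, x2, y2) corner tuples, scores are the per-class SVM scores, and the helper names `iou` and `nms` are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, threshold=0.5):
    """Return indices of the boxes kept after greedy suppression."""
    # Candidate indices sorted by decreasing score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)   # highest-scoring remaining region
        keep.append(best)
        # Drop candidates that overlap the kept region too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep

# Illustrative input: two heavily overlapping boxes and one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with IoU of about 0.82, so it is suppressed, while the third box does not overlap and is kept.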
  • performance evaluation
    -- mAP@0.5 (mean Average Precision at IoU threshold 0.5)
    -- calculate AP for each class, then take the mean
    -- AP is the area under the Precision-Recall curve
    -- precision: TP/(TP+FP)
    -- recall: TP/(TP+FN)
    ** True positive region: IoU >= 0.5 with ground truth
    ** False positive region: IoU < 0.5
    ** False negative region: a ground-truth region that was not detected
    ** IoU = Intersection over Union = (A∩B)/(A∪B)
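To make the AP and mAP definitions concrete, here is a minimal sketch. It assumes each detection has already been marked as a true or false positive with the IoU >= 0.5 rule and sorted by decreasing score. It integrates the curve with a monotone precision envelope (all-point interpolation), which is one common convention; benchmarks differ in how they integrate the curve. The function name `average_precision` and the class names are hypothetical.

```python
def average_precision(is_tp, num_gt):
    """Area under the precision-recall curve for one class.

    is_tp:  list of booleans, one per detection, sorted by decreasing score.
    num_gt: number of ground-truth boxes for the class (TP + FN), > 0.
    """
    tp = fp = 0
    recalls, precisions = [], []
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

# Illustrative per-class results (made-up data).
aps = {
    "cat": average_precision([True, False, True], num_gt=2),
    "dog": average_precision([True, True], num_gt=2),
}
mean_ap = sum(aps.values()) / len(aps)
```

With these made-up inputs the "cat" class loses some area because a false positive is ranked above its second true positive, while "dog" detects every ground-truth box without errors.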
  • defects:
      • training takes too long
        -- fine-tuning (18 h) + feature extraction (63 h) + SVM/Bbox training (3 h) = 84 h
      • testing takes too long: 47 s per image with VGG16
      • the training process is complex and has multiple stages


Original: https://www.cnblogs.com/tadeas/p/11669158.html
