Question

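I just installed Fast RCNN and ran the demo, and I began wondering whether it is possible to extract features from all the bounding boxes in an image (and to do this for an entire dataset).

For example, if Fast RCNN detects a cat, a dog and a car in an image, I would like to extract separate CNN features for each of the cat, the dog and the car, and to do this for thousands of images.

The feature extraction example on Fast RCNN's GitHub (https://github.com/rbgirshick/caffe-fast-rcnn/tree/master/examples/feature_extraction) seems to be a copy of the Caffe example that extracts features for the entire image, not for each bounding box.

Can anyone help me with this?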

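Update:

Apparently, feature extraction for each bounding box is done in the following part of the code in lib/fast_rcnn/test.py (https://github.com/rbgirshick/fast-rcnn/blob/master/lib/fast_rcnn/test.py):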

# When mapping from image ROIs to feature map ROIs, there's some aliasing
# (some distinct image ROIs get mapped to the same feature ROI).
# Here, we identify duplicate feature ROIs, so we only compute features
# on the unique subset.
if cfg.DEDUP_BOXES > 0:
    v = np.array([1, 1e3, 1e6, 1e9, 1e12])
    hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
    _, index, inv_index = np.unique(hashes, return_index=True,
                                    return_inverse=True)
    blobs['rois'] = blobs['rois'][index, :]
    boxes = boxes[index, :]

# reshape network inputs
net.blobs['data'].reshape(*(blobs['data'].shape))
net.blobs['rois'].reshape(*(blobs['rois'].shape))
blobs_out = net.forward(data=blobs['data'].astype(np.float32, copy=False),
                        rois=blobs['rois'].astype(np.float32, copy=False))
if cfg.TEST.SVM:
    # use the raw scores before softmax under the assumption they
    # were trained as linear SVMs
    scores = net.blobs['cls_score'].data
else:
    # use softmax estimated probabilities
    scores = blobs_out['cls_prob']

if cfg.TEST.BBOX_REG:
    # Apply bounding-box regression deltas
    box_deltas = blobs_out['bbox_pred']
    pred_boxes = _bbox_pred(boxes, box_deltas)
    pred_boxes = _clip_boxes(pred_boxes, im.shape)
else:
    # Simply repeat the boxes, once for each class
    pred_boxes = np.tile(boxes, (1, scores.shape[1]))

if cfg.DEDUP_BOXES > 0:
    # Map scores and predictions back to the original set of boxes
    scores = scores[inv_index, :]
    pred_boxes = pred_boxes[inv_index, :]

return scores, pred_boxes

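I am trying to figure out how to adapt this function to save the features, the same way features for entire images are saved with Caffe, where they are written to an mdb file.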

Answer 1

Update

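In the process of identifying the right bounding box, Fast-RCNN extracts CNN features from a large number (roughly 800-2000) of image regions, called object proposals. These regions are obtained through different algorithms, typically selective search. After this computation, it uses those features to recognize the "right" proposals and to find the "right" bounding box. This is called bounding box regression.

Of course, Fast-RCNN optimizes this process, but it still needs to extract CNN features from many more regions than the ones related to the objects of interest.

In short, if you were to save the variable blobs_out from the code snippet you pasted, you would save the features relative to all the object proposals, including the "wrong" ones. But you could save all of them and then try to prune and retrieve only the ones you need. To save the features, just use pickle.dump().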
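As a minimal sketch of the pickle.dump() suggestion: the two helpers below are hypothetical names, not part of Fast-RCNN, and the record layout is an assumption. Note that in the pasted code blobs_out holds the network outputs (cls_prob and bbox_pred), so if you want the activations of an intermediate layer you would presumably copy them from net.blobs after the forward pass and put them in the record as well.

```python
import pickle


def save_features(path, record):
    """Write one picklable record (for example a dict of arrays) to disk."""
    with open(path, 'wb') as f:
        pickle.dump(record, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_features(path):
    """Read back a record written by save_features."""
    with open(path, 'rb') as f:
        return pickle.load(f)
```

One file per image keeps memory use flat when processing thousands of images; a record such as {'boxes': boxes, 'scores': scores, 'feats': feats} is one possible layout.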

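Have a look at the end of the test_net function. The nms_dets variable seems to store the final boxes. There is probably a way to take the blobs_out you stored and discard the features you do not need, but it does not look that simple.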
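One way the pruning might look, as a sketch under assumptions rather than the repository's own method: suppose you copied the per-ROI activations of some layer after net.forward() (for example net.blobs['fc7'].data, where the layer name depends on your prototxt), kept inv_index from the de-duplication step in the pasted code, and have an index array keep for the boxes you decided to retain. The function name and keep are hypothetical.

```python
import numpy as np


def select_box_features(feats_unique, inv_index, keep):
    """Return one feature row per kept box.

    feats_unique: (U, D) activations computed on the de-duplicated ROIs.
    inv_index:    (N,) map from each original box to its row in feats_unique,
                  as returned by np.unique(..., return_inverse=True).
    keep:         indices into the original N boxes that you want to retain.
    """
    # Expand back to one row per original box, mirroring scores[inv_index, :].
    feats_all = feats_unique[inv_index, :]
    return feats_all[keep, :]


# Toy data: 4 original boxes, 2 of which share the same feature-map ROI.
feats_unique = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
inv_index = np.array([0, 1, 1, 2])
keep = np.array([2, 3])
selected = select_box_features(feats_unique, inv_index, keep)
```

The inv_index step matters because, with cfg.DEDUP_BOXES > 0, the network only sees the unique ROIs, so its blobs have fewer rows than the original list of boxes.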

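The easiest solution I can think of is the following:

Let Fast-RCNN compute the final bounding boxes. Then extract the relevant image patches, using something like the following (I am assuming Python):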

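import cv2

img = cv2.imread('/path/to/image')
for bbox in bboxes_list:
    # each box is (x0, y0, x1, y1) in pixels; slicing needs integers
    x0, y0, x1, y1 = [int(round(v)) for v in bbox]
    cut = img[y0:y1, x0:x1]  # rows (y) first, then columns (x)
    extract_cnn_features(cut)  # placeholder for the extraction step below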

Feature extraction is the same as in the whole-image case:

import caffe

net = caffe.Net('deploy.prototxt', 'caffemodel', caffe.TEST)
# preprocess the input, then copy it into the network's data blob
net.blobs['data'].data[...] = net_input
net.forward()
feats = net.blobs['my_layer'].data.copy()
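The "preprocess input" step is left open above. A minimal sketch of what it could involve, assuming the patch has already been resized to the network's input size and that the model expects mean-subtracted BGR input in channel-first order; the function name and the mean values are placeholders, not the real values for any particular model:

```python
import numpy as np


def to_caffe_blob(patch_bgr, mean_bgr):
    """Convert an H x W x 3 BGR patch into a 1 x 3 x H x W float32 array."""
    blob = patch_bgr.astype(np.float32)              # float copy of the patch
    blob -= np.asarray(mean_bgr, dtype=np.float32)   # per-channel mean subtraction
    blob = blob.transpose(2, 0, 1)                   # HWC -> CHW
    return blob[np.newaxis, ...]                     # add the batch axis


# Toy patch; the mean values here are placeholders chosen for illustration.
patch = np.full((4, 5, 3), 100, dtype=np.uint8)
net_input = to_caffe_blob(patch, mean_bgr=(10.0, 20.0, 30.0))
```

The result can then be assigned to net.blobs['data'].data[...] as in the snippet above, provided its shape matches the data blob.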

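Of course this approach is computationally expensive, because you basically compute the CNN features twice. Whether that is acceptable depends on your speed requirements and on the size of the CNN model.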

Extract features from all bounding boxes of Fast R-CNN
