I want to know the size of the bounding box in the object-detection API

【Question title】: I want to know the size of bounding box in object-detection api
【Posted】: 2017-09-12 05:22:00
【Question】:

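I have used the API

(https://github.com/tensorflow/models/tree/master/object_detection)

Then,

how can I find the length of a bounding box?

I ran the Tutorial IPython notebook from GitHub in real time.

But I don't know which command to use to calculate the length of the boxes.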

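【Comments】:

Welcome to SO! If you explain the problem in more detail, it will be more useful to you and to other users of this site. Please also include what you have tried so far.

Tags: tensorflow object-detection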

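【Solution 1】:

Just to extend Beta's answer:

You can get the predicted bounding boxes from the detection graph. An example is given in the Tutorial IPython notebook on GitHub, which is where Beta's code snippet comes from. Access the detection_graph and extract the coordinates of the predicted bounding boxes from the detection_boxes:0 tensor.

By calling np.squeeze(boxes) you reshape them to (m, 4), where m denotes the number of predicted boxes. You can now access the boxes and compute the length, area, or whatever you want.

But remember that the predicted box coordinates are normalized! They are in the following order: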

[ymin, xmin, ymax, xmax]

So computing the length in pixels would look something like this:

def length_of_bounding_box(bbox):
    return bbox[3]*IMG_WIDTH - bbox[1]*IMG_WIDTH
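
The same idea extends to the height and the area. Below is a minimal sketch, assuming the [ymin, xmin, ymax, xmax] order described above. The function name box_size_in_pixels, the 640x480 image size and the sample box are made up for illustration and are not part of the API.

```python
def box_size_in_pixels(bbox, img_width, img_height):
    """Return (width, height) in pixels for a normalized [ymin, xmin, ymax, xmax] box."""
    ymin, xmin, ymax, xmax = bbox
    width = (xmax - xmin) * img_width     # horizontal extent in pixels
    height = (ymax - ymin) * img_height   # vertical extent in pixels
    return width, height


# Hypothetical example: a 640x480 image and one normalized box.
bbox = [0.25, 0.125, 0.75, 0.5]
width, height = box_size_in_pixels(bbox, img_width=640, img_height=480)
area = width * height
print(width, height, area)
```

Passing the image size as arguments instead of relying on a global such as IMG_WIDTH keeps the function usable for images of different sizes.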

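【Comments】:

Thanks for the detailed answer!! But how can I "access the detection_graph and extract the coordinates of the predicted bounding boxes from the tensor"? The code is not very clear to me.
@SUNJIAWEI You access the coordinates of the i-th box with boxes[0][i]
@iTiger I'm not sure whether TensorFlow changed between the dates of the answers here (and below), but I did not get the expected results when following the [xmin, ymin, xmax, ymax] convention. On 2 August 2018 I looked at the source code of the draw_bounding_box_on_image_array function on GitHub using this link (github.com/tensorflow/models/blob/master/research/…), and the order is [ymin, xmin, ymax, xmax]
@KolaB Yes, it looks like they changed the order of the coordinates. Thanks for your comment! I will edit my answer.
@ITiger The link you specified is broken. I guess it points to an older version. As of now, I think the code in object_dectection.ipynb has changed as well. What changes should I make to the new file so that I can get the coordinates of the bounding boxes? When I print output_dict['detection_boxes'], I get 100 tuples even for test images where the model detects nothing. Thanks for your help.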
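
On the 100 tuples: a likely explanation, stated here as an assumption and not checked against any particular model, is that the graph returns a fixed number of candidate boxes per image along with a score for each, so low-scoring rows still appear in the output. A minimal sketch of keeping only the confident boxes follows. The helper name keep_confident_boxes, the 0.5 threshold and the sample values are hypothetical, and the output is shortened to 4 rows.

```python
def keep_confident_boxes(boxes, scores, min_score=0.5):
    """Return the boxes whose score is at least min_score."""
    return [box for box, score in zip(boxes, scores) if score >= min_score]


# Hypothetical output for one image, shortened to 4 rows instead of 100.
detection_boxes = [
    [0.25, 0.125, 0.75, 0.5],
    [0.5, 0.5, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
detection_scores = [0.875, 0.625, 0.0, 0.0]

confident = keep_confident_boxes(detection_boxes, detection_scores)
print(len(confident))
```

With the notebook's output_dict, the two lists would presumably be output_dict['detection_boxes'] and output_dict['detection_scores'].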

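【Solution 2】:

I wrote a complete answer on how to find the bounding box coordinates here, and thought it might also be useful to someone on this thread.

The Google Object Detection API returns bounding boxes in the format [ymin, xmin, ymax, xmax] and in normalized form (full explanation here). To find the (x, y) pixel coordinates, we need to multiply the results by the width and height of the image. First, get the width and height of the image: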

width, height = image.size

Then extract ymin, xmin, ymax, xmax from the boxes object and multiply them to get the (x, y) coordinates:

ymin = boxes[0][i][0]*height
xmin = boxes[0][i][1]*width
ymax = boxes[0][i][2]*height
xmax = boxes[0][i][3]*width

Finally, print the coordinates of the box corners:

print('Top left')
print((xmin, ymin))
print('Bottom right')
print((xmax, ymax))
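
The steps above can be wrapped in one small helper. This is only a sketch under the [ymin, xmin, ymax, xmax] assumption. The name to_pixel_corners, the rounding to whole pixels and the 640x480 image size are illustrative choices, not something the API prescribes.

```python
def to_pixel_corners(box, width, height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel corners."""
    ymin, xmin, ymax, xmax = box
    # x values scale with the image width, y values with the image height.
    top_left = (int(round(xmin * width)), int(round(ymin * height)))
    bottom_right = (int(round(xmax * width)), int(round(ymax * height)))
    return top_left, bottom_right


# Hypothetical 640x480 image and a single normalized box.
top_left, bottom_right = to_pixel_corners(
    [0.25, 0.125, 0.75, 0.5], width=640, height=480)
print('Top left', top_left)
print('Bottom right', bottom_right)
```

Returning (x, y) tuples matches the order most drawing routines expect for points, even though the API lists y first.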

【Comments】:

【Solution 3】:

You can get the boxes like this:

boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

The same goes for scores and classes.

Then evaluate them in the session run.

(boxes, scores, classes) = sess.run(
              [boxes, scores, classes],
              feed_dict={image_tensor: imageFile})

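【Comments】:

Thanks for the detailed answer! But after getting the boxes, how can I get the length in the session?
@SUNJIAWEI: If you check the values of the boxes, they give you the coordinates of the object. Suppose you are checking whether an image contains a person. The boxes give you the coordinates (or location) where that person appears in the image. If you only want the length of the person, you can derive it from those coordinates.

【Solution 4】:

Basically, you can get all of these from the graph: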

image_tensor = graph.get_tensor_by_name('image_tensor:0')
boxes = graph.get_tensor_by_name('detection_boxes:0')
scores = graph.get_tensor_by_name('detection_scores:0')
classes = graph.get_tensor_by_name('detection_classes:0')
num_detections = graph.get_tensor_by_name('num_detections:0')

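Once evaluated in a session, boxes[0] contains all the predicted bounding box coordinates, which is what you are looking for. As noted elsewhere in this thread, the format is [ymin, xmin, ymax, xmax], that is, [top_left_y, top_left_x, bottom_right_y, bottom_right_x], with normalized values.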

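Check out this repo, where you may find more details: https://github.com/KleinYuan/tf-object-detection

【Comments】:

【Solution 5】:

The following code identifies the objects and returns the location and confidence information: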

(boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})

Loop over the boxes:

 for i,b in enumerate(boxes[0]):

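Get the width and height (as fractions of the image size, since the coordinates are normalized [ymin, xmin, ymax, xmax]):

 width = boxes[0][i][3]-boxes[0][i][1]    # xmax - xmin
 height = boxes[0][i][2]-boxes[0][i][0]   # ymax - ymin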
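
Put together, the loop might look like the sketch below. It assumes the [ymin, xmin, ymax, xmax] order discussed earlier in this thread, so each size is the difference between opposite edges. The nested list stands in for the array returned by sess.run, and its values are made up.

```python
# Hypothetical sess.run output for one image: shape (1, m, 4), normalized.
boxes = [[
    [0.25, 0.125, 0.75, 0.5],
    [0.5, 0.5, 1.0, 1.0],
]]

sizes = []
for i, b in enumerate(boxes[0]):
    ymin, xmin, ymax, xmax = b
    width = xmax - xmin    # fraction of the image width
    height = ymax - ymin   # fraction of the image height
    sizes.append((i, width, height))
    print(i, width, height)
```

Multiply width by the image width in pixels, and height by the image height, to get sizes in pixels.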

You can find more details here: [https://pythonprogramming.net/detecting-distances-self-driving-car/]

【Comments】: