Faster RCNN for multi-channel images with Tensorflow Object Detection API (or others): answers

【Question title】: Faster RCNN for multi-channel images with Tensorflow Object Detection API (or others)
【Posted】: 2020-02-11 15:57:38
【Question】:

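I would like to implement a Faster RCNN architecture for multi-channel images (5 channels), but if YOLO is simpler, that would work just as well for me.

I started with the Tensorflow Object Detection API:

Can this be done easily, or does it require a deep rewrite of the architecture?
Which strategy is best?
Which files should I change?

I have no particular preference among TF, PyTorch, etc., so if anyone has suggestions or hints about other libraries, they are welcome!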

【Comments on the question】:

Tags: python tensorflow object-detection object-detection-api faster-rcnn

【Solution 1】:

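You need to make some changes in the Tensorflow Object Detection API:

The main change is in preparing the input. You need to change the format of the .tfrecord file so that it stores multi-dimensional arrays: prepare the input as a numpy array and encode it as a string, as in the code sample below: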

import cv2
import numpy as np

# Read the image and the extra input; cv2 loads BGR, so reverse the channel axis to get RGB
image = cv2.imread('path/to/image')[:, :, ::-1]
background = cv2.imread('path/to/background')[:, :, ::-1]
# image and background are numpy arrays of shape H x W x 3
# Concatenate them along the channel axis to create an H x W x 6 input
inputs_stacked = np.concatenate([image, background], axis=-1)
# Encode the input as a byte string (tobytes() replaces the deprecated tostring())
encoded_inputs = inputs_stacked.tobytes()
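
One point the snippet above leaves implicit: a raw byte string carries no shape or dtype, so whatever reads it back has to know both. The following is a minimal numpy-only sketch of that round trip, not the answer's own code. The helper names `encode_array` and `decode_array` and the small synthetic arrays are hypothetical, used only for illustration. In an actual .tfrecord pipeline the shape would presumably be stored as extra features next to the bytes, and decoding would likely go through something like `tf.io.decode_raw` followed by a reshape.

```python
import numpy as np


def encode_array(arr):
    """Return the raw bytes plus the metadata needed to rebuild the array.

    Hypothetical helper: the byte string alone does not record shape or dtype.
    """
    arr = np.ascontiguousarray(arr)
    return {
        "data": arr.tobytes(),
        "shape": arr.shape,
        "dtype": str(arr.dtype),
    }


def decode_array(record):
    """Rebuild the array from the bytes and the stored shape and dtype."""
    flat = np.frombuffer(record["data"], dtype=record["dtype"])
    return flat.reshape(record["shape"])


# Synthetic stand-ins for the two H x W x 3 images that cv2.imread would return
image = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
background = np.full((4, 5, 3), 7, dtype=np.uint8)

# Stack along the channel axis, encode, then decode again
inputs_stacked = np.concatenate([image, background], axis=-1)
record = encode_array(inputs_stacked)
restored = decode_array(record)
```

Storing height, width, and channel count alongside the bytes keeps the decoder independent of any hard-coded image size, which matters if the dataset mixes resolutions.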

For the architecture code and other details, you can refer to the link.

Hope this helps. Happy learning!

【Comments】:

@scaramouche88 - If this answers your question, please upvote and accept the answer. Thanks.