Apply a 1-channel mask to a 3-channel Tensor in tensorflow: answers

【Question title】: Apply 1 channel mask to 3 channel Tensor in tensorflow
【Posted】: 2016-03-17 04:18:35
【Question description】:

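I'm trying to apply a mask (binary, only one channel) to an RGB image (3 channels, normalized to [0, 1]). My current solution is to split the RGB image into its channels, multiply each channel by the mask, and concatenate the channels again: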

with tf.variable_scope('apply_mask') as scope:
  # Output mask is in range [-1, 1], bring to range [0, 1] first
  zero_one_mask = (output_mask + 1) / 2
  # Apply mask to all channels.
  channels = tf.split(3, 3, output_img)
  channels = [tf.mul(c, zero_one_mask) for c in channels]
  output_img = tf.concat(3, channels)

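However, this seems quite inefficient, especially since, as far as I understand, none of these computations are done in place. Is there a more efficient way to do this?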

【Discussion on the question】:

Tags: mask tensorflow

【Solution 1】:

The tf.mul() operator supports numpy-style broadcasting, which would allow you to simplify and optimize the code slightly.
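
As a rough illustration of what numpy-style broadcasting means here, the sketch below uses NumPy itself rather than TensorFlow. The sizes b, m, n are made-up values, and img and mask are hypothetical stand-ins for the image batch and a 2-D mask. Shapes are compared from the last dimension backwards, so a bare m x n mask does not line up with a b x m x n x 3 image, while an m x n x 1 mask does:

```python
import numpy as np

b, m, n = 2, 4, 5  # hypothetical batch size, height, width
img = np.ones((b, m, n, 3))   # stand-in for a batch of RGB images
mask = np.ones((m, n))        # stand-in for a 2-D mask

# Shapes are aligned from the last dimension backwards, so the mask's
# width (n = 5) is compared with the image's channel dimension (3).
try:
    img * mask
    raw_mask_broadcasts = True
except ValueError:
    raw_mask_broadcasts = False

# With a trailing dimension of size 1, the mask lines up with the
# channel dimension and is stretched across all 3 channels.
masked = img * mask[:, :, np.newaxis]
masked_shape = masked.shape
```

The size-1 trailing dimension is what lets one mask value be reused for every channel without copying the mask.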

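Let's assume that zero_one_mask is an m x n tensor and output_img is a b x m x n x 3 tensor (where b is the batch size; I'm inferring this from the fact that you split output_img on dimension 3)*. You can use tf.expand_dims() to make zero_one_mask broadcastable to the channels by reshaping it to be an m x n x 1 tensor: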

with tf.variable_scope('apply_mask') as scope:
  # Output mask is in range [-1, 1], bring to range [0, 1] first
  # NOTE: Assumes `output_mask` is a 2-D `m x n` tensor.
  zero_one_mask = tf.expand_dims((output_mask + 1) / 2, 2)
  # Apply mask to all channels.
  # NOTE: Assumes `output_img` is a 4-D `b x m x n x c` tensor.
  output_img = tf.mul(output_img, zero_one_mask)

(* This would work equally well if output_img is a 4-D b x m x n x c tensor (for any number of channels c) or a 3-D m x n x c tensor, because of the way broadcasting works.)
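
To check that the broadcast multiply gives the same values as the split/multiply/concatenate approach from the question, here is a minimal sketch in NumPy. It assumes that np.expand_dims, np.split and np.concatenate behave like their TensorFlow counterparts for this purpose. The sizes and the random data are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
b, m, n, c = 2, 4, 5, 3  # hypothetical batch size, height, width, channels
output_img = rng.random((b, m, n, c))               # values in [0, 1)
output_mask = rng.choice([-1.0, 1.0], size=(m, n))  # mask in {-1, 1}

# Bring the mask to {0, 1} and add a trailing channel dimension (m x n x 1).
zero_one_mask = np.expand_dims((output_mask + 1) / 2, 2)

# Approach from the question: split into channels, multiply, concatenate.
channels = np.split(output_img, c, axis=3)           # each is b x m x n x 1
channels = [ch * zero_one_mask for ch in channels]
split_result = np.concatenate(channels, axis=3)

# Approach from the answer: a single broadcast multiply.
broadcast_result = output_img * zero_one_mask
same_4d = np.array_equal(split_result, broadcast_result)

# The same mask also applies to a single 3-D (m x n x c) image.
masked_single = output_img[0] * zero_one_mask
same_3d = np.array_equal(masked_single, broadcast_result[0])
```

Both comparisons should come out True, since each output element is the same product of one image value and one mask value in either approach.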

【Discussion】:

Does tensorflow support this kind of broadcasting? I didn't come across it while reading the documentation...
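I believe it isn't documented; indeed, the docstrings for tf.mul etc. leave something to be desired. I created a GitHub issue for this, and it should be fixed in time for the next release.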