Edit image as tensorflow tensor python: answers

【Question title】: Edit image as tensorflow tensor python
【Posted】: 2019-01-12 21:28:53
【Question description】:

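I'll do my best to provide a reproducible example here.

I have an image:

This image of Aaron Eckhart has shape (150, 150).

My goal is to perturb an ROI of this image by doing math on its pixels. The catch is that the math has to be done on a tensorflow tensor, because the operation is to multiply the tensor by its scaled gradient (which is also a tensor of size (row_pixels, column_pixels, 3)).

So this is the process I have in mind:

Read the image in as an RGB numpy array of size (1, 150, 150, 3) (the 1 is the batch size)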

w, h = img.shape

ret = np.empty((w, h, 3), dtype=np.uint8)

ret[:, :, 0] = ret[:, :, 1] = ret[:, :, 2] = img
Scale the pixel values to lie between 0 and 1

img = (faces1 - min_pixel) / (max_pixel - min_pixel)
for i in range(steps):

(a) Extract the ROI of the image. This is the part I don't understand how to do

(b) Compute the gradient of the loss for the smaller img ROI tensor

loss = utils_tf.model_loss(y, preds, mean=False)
grad, = tf.gradients(loss, x)

scaled_grad = eps * normalized_grad
adv_img = img + scaled_grad

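(d) Put the newly perturbed ROI tensor back at the same location in the original tensor. This is the other part I don't understand how to do

The result would be an image in which only some pixel values are perturbed and the rest are unchanged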
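The preprocessing described at the start of the process above (grayscale to RGB, min-max scaling to [0, 1], adding a batch axis) can be sketched in NumPy roughly as follows. This is only an illustration under assumptions: the random `gray` array is a hypothetical stand-in for the real photo, and the scaling assumes the image is not constant (`max_pixel > min_pixel`).

```python
import numpy as np

# Hypothetical grayscale input standing in for the (150, 150) photo.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(150, 150), dtype=np.uint8)

# Copy the single channel into R, G and B to get shape (150, 150, 3).
rgb = np.stack([gray, gray, gray], axis=-1).astype(np.float32)

# Min-max scale to [0, 1]; assumes max_pixel > min_pixel.
min_pixel, max_pixel = rgb.min(), rgb.max()
img = (rgb - min_pixel) / (max_pixel - min_pixel)

# Add the batch axis: (1, 150, 150, 3).
batch = img[np.newaxis, ...]
```

Converting to float before scaling matters here: dividing `uint8` values directly would not give the intended fractional range.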

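【Comments】:

I assume you mean region of interest (ROI). This isn't easy to answer in a few lines. For doing it directly in Tensorflow, look at some of the examples in github.com/deepsense-ai/roi-pooling. I haven't done this in TF, but in OpenCV it is fairly simple. Let me know if you'd like an example
@geekonedge Yes, I mean region of interest. Unfortunately, I think I need to do this on tensors, otherwise it will slow the algorithm down considerably. To do it with OpenCV, I think I would have to convert the tensor to a numpy array, apply the OpenCV ROI, convert back to a tensor, take the gradient, apply the transformation, convert back to a numpy array again, put it back into the image, and then convert back to a tensor
Unfortunately I don't know your current pipeline, but I doubt it needs to be that complicated. If you have the raw image format, you can always store the regions in a text file after processing with OpenCV and use that as a feature of the data; once it is imported as a tensor you won't have to do any conversions. All the best!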

Tags: python tensorflow image-processing

【Solution 1】:

Given an image:

(a) Get the region of interest ((440, 240), (535, 380)) from the image:

roi_slice = tf.slice(
  image_in,
  [top_left_x, top_left_y, top_left_z],
  [roi_len_x, roi_len_y, bottom_right_z]
)

Get a boolean mask of the ROI that is the same size as the image

roi_mask = tf.ones_like(roi_slice)
mask_canvas = tf.image.pad_to_bounding_box(
  [roi_mask],
  top_left_x,
  top_left_y,
  np_image.shape[0],
  np_image.shape[1]
)
bool_mask = tf.cast(mask_canvas, tf.bool)
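As a small self-contained illustration of the slice-and-pad idea, here is a sketch that assumes TensorFlow 2.x in eager mode. The tiny 6x8 image and the ROI coordinates (`top`, `left`, `roi_h`, `roi_w`) are hypothetical values chosen for the example, not the ones from the original image. Note that `tf.image.pad_to_bounding_box` takes `offset_height` before `offset_width`, so the first offset is a row index.

```python
import tensorflow as tf

# Hypothetical stand-in image: 6 rows x 8 columns x 3 channels.
H, W, C = 6, 8, 3
image_in = tf.zeros([H, W, C], dtype=tf.float32)

# Hypothetical ROI: starts at row 1, column 2; spans 3 rows and 4 columns.
top, left, roi_h, roi_w = 1, 2, 3, 4

# tf.slice takes a begin index and a size per axis (rows, columns, channels).
roi_slice = tf.slice(image_in, [top, left, 0], [roi_h, roi_w, C])

# Ones over the ROI, zero-padded back out to the full image size.
roi_mask = tf.ones_like(roi_slice)
mask_canvas = tf.image.pad_to_bounding_box(roi_mask, top, left, H, W)

# True inside the ROI, False everywhere else.
bool_mask = tf.cast(mask_canvas, tf.bool)
```

Here `bool_mask` has the same shape as `image_in`, so it can be used directly as the condition of `tf.where`.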

(b) For the purposes of this example I am using fake gradients, but you can substitute real ones.

fake_gradients = tf.ones_like(image_in) * 0.2

masked_gradients = tf.where(bool_mask[0], fake_gradients, mask_canvas[0])

(d) Make an editable copy of the image and update it with the masked gradients

# Make an editable copy of the image
editable_image = tf.get_variable(
    name='editable_image', shape=image_in.shape, dtype=tf.float32)
init_op = tf.assign(editable_image, image_in)

# Make sure we don't update the image before we've set its initial value.
with tf.control_dependencies([init_op]):
  update_roi_op = tf.assign_add(editable_image, masked_gradients)

You can find a complete Colab example on GitHub.
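For readers on TensorFlow 2.x, where `tf.get_variable` and `tf.control_dependencies` are no longer the usual route, the same masked update could be sketched in eager mode roughly like this. This is not the answer author's code; the image size, ROI coordinates and the constant 0.2 "gradient" are hypothetical placeholders.

```python
import tensorflow as tf

# Hypothetical image filled with 0.5, and a hypothetical rectangular ROI.
H, W, C = 6, 8, 3
image_in = tf.fill([H, W, C], 0.5)
top, left, roi_h, roi_w = 1, 2, 3, 4

# Boolean mask: True inside the ROI, False elsewhere.
roi_mask = tf.ones([roi_h, roi_w, C])
mask_canvas = tf.image.pad_to_bounding_box(roi_mask, top, left, H, W)
bool_mask = tf.cast(mask_canvas, tf.bool)

# Placeholder gradients; a real gradient tensor of the same shape would go here.
fake_gradients = tf.ones_like(image_in) * 0.2

# Keep gradient values inside the ROI and zero them outside it.
masked_gradients = tf.where(
    bool_mask, fake_gradients, tf.zeros_like(fake_gradients))

# A tf.Variable is the editable copy; assign_add updates it in place.
editable_image = tf.Variable(image_in)
editable_image.assign_add(masked_gradients)
result = editable_image.numpy()
```

In eager mode the variable is initialized when it is created, so no explicit control dependency is needed before the update.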

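【Comments】:

Is the boolean mask tensor simply 1s and 0s? That is, if I had another image with only white and black pixels and converted it to a tensor, could I use that tensor as my boolean mask instead of a rectangle?
Yes, of course, the mask doesn't have to be rectangular. tf.where needs something of type tf.bool, but if you have something with 1s and 0s (e.g. a black-and-white image), you can use tf.cast to get the right type.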
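A minimal sketch of that non-rectangular case, assuming TensorFlow 2.x in eager mode: the 4x4 plus-shaped `mask_np` is a hypothetical black-and-white mask image (1 = perturb, 0 = leave alone), and the constant 0.2 gradient is again a placeholder.

```python
import numpy as np
import tensorflow as tf

# Hypothetical single-channel mask image with only 0s and 1s.
mask_np = np.array([[0, 1, 0, 0],
                    [1, 1, 1, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 0]], dtype=np.uint8)

# Hypothetical image and placeholder gradients of shape (4, 4, 3).
image = tf.fill([4, 4, 3], 0.5)
gradients = tf.ones_like(image) * 0.2

# Repeat the mask across the 3 channels, then cast 0/1 values to tf.bool.
mask_3c = np.repeat(mask_np[:, :, np.newaxis], 3, axis=2)
bool_mask = tf.cast(mask_3c, tf.bool)

# Apply the gradients only where the mask is True.
masked_gradients = tf.where(bool_mask, gradients, tf.zeros_like(gradients))
perturbed = (image + masked_gradients).numpy()
```

If the mask image stores white as 255 rather than 1, the cast to `tf.bool` still treats any nonzero value as True.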