Draw a line in TensorFlow: answer

【Question title】: Draw a line in TensorFlow
【Posted】: 2023-03-05 09:16:01
【Question description】:

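I want to build a network for human pose skeleton estimation. The network has two parts: the first part outputs 16 heatmaps (one per joint, so a keypoint can be extracted from each). I want to use those 16 keypoints to build a human skeleton and feed it to the second half of the network. My question is: how do I draw lines between the keypoints to create the skeleton? I can't find a way to do this on tensor objects with TensorFlow or Keras.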

Tags: tensorflow keras

【Solution 1】:

I know I'm late, but here is some code that I think does what you need (in TF v2.3). Hopefully it saves someone some time in the future!

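It uses only TensorFlow ops, so you can use it inside a data loader and similar places. The real pain here is that TensorFlow does not allow item assignment on eager tensors, so you can't simply update a tensor by index. The code works around this by building two sparse tensors: one for mask (where to apply the line) and one for new_values (which values to apply along the line). The line-drawing logic itself is simple and may not fit your case (it is based on https://stackoverflow.com/a/47381058), but it has been ported from NumPy.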
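
To make the mask-and-blend idea concrete, here is a minimal sketch on a tiny 4x4 tensor. The pixel coordinates and values are made up for illustration. It also shows `tf.tensor_scatter_nd_update`, a built-in op that writes values at given indices and, under these assumptions, gives the same result.

```python
import tensorflow as tf

# Toy 4x4 image; the coordinates and values below are illustrative only.
img = tf.zeros((4, 4), tf.float32)
indices = tf.constant([[0, 1], [1, 2], [2, 3]], tf.int64)  # (row, col), already sorted
vals = tf.constant([0.5, 1.0, 0.25], tf.float32)

# Approach described in the answer: densify a mask and a value image, then blend.
mask = tf.sparse.to_dense(
    tf.sparse.SparseTensor(indices, tf.ones([3], tf.float32), dense_shape=[4, 4]))
new_values = tf.sparse.to_dense(
    tf.sparse.SparseTensor(indices, vals, dense_shape=[4, 4]))
blended = img * (1 - mask) + new_values * mask

# Built-in alternative: write the values at the given indices in a single call.
scattered = tf.tensor_scatter_nd_update(img, indices, vals)
```

Both `blended` and `scattered` hold the three values at the listed positions and zeros everywhere else.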

import tensorflow as tf


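def trapez(y, y0, w):
    # Anti-aliasing weight in [0, 1]: 1 within w/2 of the line centre y0,
    # falling linearly to 0 one pixel further out.
    return tf.clip_by_value(tf.minimum(y + 1 + w/2 - y0, -y + 1 + w/2 + y0), 0, 1)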


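def apply_output(img, yy, xx, val):
    # Build (row, col) index pairs for every pixel the line touches.
    stack = tf.stack([yy, xx], axis=1)
    stack = tf.cast(stack, tf.int64)
    # Dense mask: 1 where the line is drawn, 0 elsewhere.
    values = tf.ones(stack.shape[0], tf.float32)
    mask = tf.sparse.SparseTensor(indices=stack, values=values, dense_shape=img.shape)
    mask = tf.sparse.reorder(mask)
    mask = tf.sparse.to_dense(mask)
    mask = tf.cast(mask, tf.float32)

    # Dense image holding the line's pixel values at the same positions.
    new_values = tf.sparse.SparseTensor(indices=stack, values=val, dense_shape=img.shape)
    new_values = tf.sparse.reorder(new_values)
    new_values = tf.sparse.to_dense(new_values)

    # Keep the original pixels outside the mask and the line values inside it.
    img = img * (1 - mask) + new_values * mask
    # Scale to 0-255 and add a channel axis so the result can be encoded as an image.
    img = tf.cast(tf.expand_dims(img * 255, axis=-1), tf.uint8)
    return img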


def weighted_line(img, r0, c0, r1, c1, w):
    # Draws an anti-aliased line of width w from (r0, c0) to (r1, c1).
    # Assumes c0 < c1: a vertical line (c0 == c1) divides by zero, and
    # c0 > c1 yields an empty x range, so nothing is drawn.
    output = img
    x = tf.range(c0, c1 + 1, dtype=tf.float32)
    slope = (r1-r0) / (c1-c0)
    w *= tf.sqrt(1 + tf.abs(slope)) / 2
    # Exact (fractional) row of the line centre for each column x.
    y = x * slope + (c1*r0-c0*r1) / (c1-c0)

    # For each column, consider a band of rows around the line centre.
    thickness = tf.math.ceil(w/2)
    yy = (tf.reshape(tf.math.floor(y), [-1, 1]) + tf.reshape(tf.range(-thickness-1, thickness+2), [1, -1]))
    xx = tf.repeat(x, yy.shape[1])
    values = tf.reshape(trapez(yy, tf.reshape(y, [-1, 1]), w), [-1])
    yy = tf.reshape(yy, [-1])

    # Keep only pixels inside the image that have a non-zero weight.
    limits_y = tf.math.logical_and(yy >= 0, yy < img.shape[0])
    limits_x = tf.math.logical_and(xx >= 0, xx < img.shape[1])
    limits = tf.math.logical_and(limits_y, limits_x)
    limits = tf.math.logical_and(limits, values > 0)
    yy = tf.cast(yy[limits], tf.float32)
    xx = tf.cast(xx[limits], tf.float32)

    return yy, xx, values[limits], apply_output(output, yy, xx, values[limits])

As a sanity check, you can call it as follows and display the result with OpenCV:

if __name__ == "__main__":
    import cv2

    # Draw a line of width 5 from (row 10, col 20) to (row 100, col 200) on a blank image.
    IMG = tf.zeros((500, 500), tf.float32)
    yy, xx, vals, FINAL_IMG = weighted_line(IMG, 10, 20, 100, 200, 5)
    # Save as JPEG, then read it back and show it with OpenCV.
    jpeg_string = tf.io.encode_jpeg(FINAL_IMG)
    tf.io.write_file("output.jpg", jpeg_string)
    img = cv2.imread("output.jpg")
    cv2.imshow("Output", img)
    cv2.waitKey(0)
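
For the skeleton case in the question, one possible sketch is shown below. It is not the anti-aliased method above but a simpler swapped-in technique: sample points along each segment by linear interpolation, round them to pixel indices, and write them with `tf.tensor_scatter_nd_update`. The names `draw_segment`, `draw_skeleton`, `keypoints`, and `bones` are hypothetical, and the 4-joint layout is made up for illustration. A real skeleton would list the edges between its 16 joints.

```python
import tensorflow as tf


def draw_segment(img, p0, p1, value=1.0):
    """Set pixels along the straight segment p0 -> p1 (each (row, col)) to `value`."""
    p0 = tf.cast(p0, tf.float32)
    p1 = tf.cast(p1, tf.float32)
    # One sample per pixel step along the longer axis, so the line has no gaps.
    n = tf.cast(tf.reduce_max(tf.abs(p1 - p0)), tf.int32) + 1
    t = tf.cast(tf.range(n), tf.float32) / tf.cast(tf.maximum(n - 1, 1), tf.float32)
    pts = tf.cast(tf.round(p0 + t[:, tf.newaxis] * (p1 - p0)), tf.int32)  # shape [n, 2]
    # Drop samples that fall outside the 2-D image.
    inside = tf.reduce_all((pts >= 0) & (pts < tf.shape(img)), axis=1)
    pts = tf.boolean_mask(pts, inside)
    updates = tf.fill([tf.shape(pts)[0]], tf.cast(value, img.dtype))
    return tf.tensor_scatter_nd_update(img, pts, updates)


def draw_skeleton(img, keypoints, bones, value=1.0):
    """Draw one segment per (i, j) pair in `bones` between keypoints[i] and keypoints[j]."""
    for i, j in bones:
        img = draw_segment(img, keypoints[i], keypoints[j], value)
    return img


# Hypothetical 4-joint example on a 10x10 canvas.
keypoints = tf.constant([[2, 2], [2, 7], [7, 7], [7, 2]], tf.int32)  # (row, col)
bones = [(0, 1), (1, 2), (2, 3)]
canvas = draw_skeleton(tf.zeros((10, 10), tf.float32), keypoints, bones)
```

Unlike the slope-based code, this sketch handles vertical segments and either point order, at the cost of hard (non-anti-aliased) edges. Because the coordinates are rounded to pixel indices, no gradient flows back to the keypoints, which matters only if the two halves of the network are trained end to end.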
