How to construct a sobel filter for kernel initialization in input layer for images of size 128x128x3?

【Question title】: How to construct a sobel filter for kernel initialization in input layer for images of size 128x128x3?
【Posted】: 2020-09-10 23:33:18
【Question description】:

This is my code for the Sobel filter:

def init_f(shape, dtype=None):

    sobel_x = tf.constant([[-5, -4, 0, 4, 5], [-8, -10, 0, 10, 8], [-10, -20, 0, 20, 10], [-8, -10, 0, 10, 8], [-5, -4, 0, 4, 5]])

    ker = np.zeros(shape, dtype)
    ker_shape = tf.shape(ker)
    kernel = tf.tile(sobel_x, ker_shape)  # Is this correct?
    return kernel

model.add(Conv2D(filters=30, kernel_size=(5,5), kernel_initializer=init_f, strides=(1,1), activation='relu'))

This is as far as I have got. However, it gives me this error:

Shape must be rank 2 but is rank 4 for 'conv2d_17/Tile' (op: 'Tile') with input shapes: [5,5], [4].

TensorFlow version: 2.1.0

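【Comments】:

Please include your TensorFlow version so people can look up the API more easily. Why are you using tf.tile? According to the docs, it builds a tensor by repeating its input. You have already created the 5x5 kernel, so what else needs to be repeated?
Your Conv2D has 30 filters, but it seems you only need 1 filter?
I referred to vijay m's answer on this question: stackoverflow.com/a/50913430/13368020
So what should be done about those 30 filters?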

Tags: tf.keras conv-neural-network sobel

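【Solution 1】:

You are close, but the arguments to tile are not correct. tf.tile expects one multiple per dimension of its input, and you are passing four multiples for a rank 2 tensor. That is why you get the error "Shape must be rank 2 but is rank 4 for ...". Your sobel_x must be a rank 4 tensor, so you need to add two more dimensions. I used reshape in this example.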

from tensorflow import keras
import tensorflow as tf
import numpy

def kernelInitializer(shape, dtype=None):
    print(shape)    
    sobel_x = tf.constant(
        [
            [-5, -4, 0, 4, 5], 
            [-8, -10, 0, 10, 8], 
            [-10, -20, 0, 20, 10], 
            [-8, -10, 0, 10, 8], 
            [-5, -4, 0, 4, 5]
        ], dtype=dtype )
    #create the missing dims.
    sobel_x = tf.reshape(sobel_x, (5, 5, 1, 1))

    print(tf.shape(sobel_x))
    #tile the last 2 axis to get the expected dims.
    sobel_x = tf.tile(sobel_x, (1, 1, shape[-2],shape[-1]))

    print(tf.shape(sobel_x))
    return sobel_x

x1 = keras.layers.Input((128, 128, 3))

cvl = keras.layers.Conv2D(30, kernel_size=(5,5), kernel_initializer=kernelInitializer, strides=(2,2), activation='relu')

model = keras.Sequential()
model.add(x1)
model.add(cvl)

data = numpy.ones((1, 128, 128, 3))
data[:, 0:64, 0:64, :] = 0

pd = model.predict(data)
print(pd.shape)

d = pd[0, :, :, 0]
for row in d:
    for col in row:
        m = '0'
        if col != 0:
            m = 'X'
        print(m, end="")
    print("")

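I looked at using expand_dims instead of reshape, but there did not seem to be any advantage. broadcast_to looks ideal, but you still have to add the dimensions first, so I don't think it is any better than tile.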
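To make that comparison concrete, here is a minimal sketch of both routes. It uses NumPy as a stand-in for the shape mechanics only (np.expand_dims and np.broadcast_to play the role of tf.expand_dims and tf.broadcast_to); the helper names tiled_kernel and broadcast_kernel are made up for illustration, and whether a given Keras version accepts a NumPy array from an initializer is not checked here.

```python
import numpy as np

# 5x5 Sobel-like kernel copied from the answer above.
SOBEL_X = np.array(
    [
        [-5, -4, 0, 4, 5],
        [-8, -10, 0, 10, 8],
        [-10, -20, 0, 20, 10],
        [-8, -10, 0, 10, 8],
        [-5, -4, 0, 4, 5],
    ],
    dtype=np.float32,
)


def tiled_kernel(shape):
    """Hypothetical helper: reshape to (5, 5, 1, 1), then tile the last two axes."""
    k = SOBEL_X.reshape(5, 5, 1, 1)
    return np.tile(k, (1, 1, shape[-2], shape[-1]))


def broadcast_kernel(shape):
    """Hypothetical helper: add two trailing axes, then broadcast to the full shape."""
    k = np.expand_dims(np.expand_dims(SOBEL_X, -1), -1)  # shape (5, 5, 1, 1)
    # np.broadcast_to returns a read-only view, so take a writable copy.
    return np.broadcast_to(k, shape).copy()


# Conv2D kernel layout: (height, width, in_channels, filters).
shape = (5, 5, 3, 30)
a = tiled_kernel(shape)
b = broadcast_kernel(shape)
print(a.shape, b.shape, np.array_equal(a, b))
```

Both helpers need the two extra axes before they can expand the kernel, which is the point made above: broadcasting saves no steps over tiling here.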

Why do you have 30 filters that are all the same filter, though? Are they going to be changed later?
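The reason for asking can be seen in a small NumPy sketch (an illustration, not part of the original code): when every one of the 30 filters holds the same 5x5 kernel for every input channel, all 30 filters give the same response at any window of the image. The random patch below is an assumed stand-in for one 5x5 window of a 3-channel input.

```python
import numpy as np

sobel_x = np.array(
    [
        [-5, -4, 0, 4, 5],
        [-8, -10, 0, 10, 8],
        [-10, -20, 0, 20, 10],
        [-8, -10, 0, 10, 8],
        [-5, -4, 0, 4, 5],
    ],
    dtype=np.float64,
)

in_channels, filters = 3, 30

# Same construction as the initializer: one kernel repeated for every
# (input channel, filter) pair, giving shape (5, 5, 3, 30).
kernel = np.tile(sobel_x.reshape(5, 5, 1, 1), (1, 1, in_channels, filters))

# Assumed stand-in for one 5x5 window of a 3-channel image.
rng = np.random.default_rng(0)
patch = rng.random((5, 5, in_channels))

# Pre-activation response of each filter at this window:
# sum over height, width and input channels (cross-correlation, no flip).
responses = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))

print(responses.shape)
print(np.allclose(responses, responses[0]))
```

Right after initialization the 30 output channels therefore carry the same information; they can only differ once training updates the filters separately.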

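【Discussion】:

I tried modifying my code according to your answer. It now shows this error: Tensor-typed variable initializers must either be wrapped in an init_scope or callable (e.g., tf.Variable(lambda : tf.truncated_normal([10, 40]))) when building functions.
The current layer is only used to pass the image through a high-pass filter. The actual task is to classify images as real or forged, which is done in the later layers. I am implementing this research paper: ieeexplore.ieee.org/document/7823911
@drum_stick Which line is that? I tested it on tf 1.13 and assumed it would be fine, since the API is the same for the functions being used. I can update it for tf 2.1.
The line that defines the Conv2D layer: model.add(Conv2D(filters=30, kernel_size=(5,5), kernel_initializer=kernelInitializer, strides=(2,2), activation='relu'))
LOL! The error was actually caused by using keras instead of tensorflow.keras. Thanks again!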