【Question title】: Topological sort failed when using Keras TimeDistributed layer
【Posted】: 2020-06-17 14:58:43
【Question description】:

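I am trying to use a Keras TimeDistributed layer to dot the last slice along the lookback dimension with the earlier lookback periods of a 4D tensor (sample, time step, lookback, feature). The model builds and runs, but when I call Model.fit() it emits a warning that the graph couldn't be sorted in topological order.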

This forum says the warning can mess up model training. What can I do to prevent it?

Environment:

  1. TensorFlow-GPU 1.15.0
  2. CUDA V10.0.130
  3. Python 3.6.5
  4. Keras 2.3.1
  5. Keras-Applications 1.0.8
  6. Keras-Preprocessing 1.1.0

import numpy as np
import tensorflow as tf  # needed for tf.multiply / tf.reduce_sum below
import keras
from keras.models import Model
from keras.layers import Input, TimeDistributed
# Dot layer
class Dot(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(Dot, self).__init__(**kwargs)

    def call(self, x):

        ht, hT = x[:,:-1,:],x[:,-1:,:]
        ml = tf.multiply(ht, hT)

        # I believe the problem comes from reduce_sum
        dot = tf.reduce_sum(ml, axis=-1)
        return dot

    def compute_output_shape(self, input_shape):

        return (None,input_shape[1]-1)

num_fea = 11
num_lookback = 5
time_step = 3
sample = 2

# create model
input = Input(shape=(time_step,num_lookback,num_fea))
dot = Dot()
output = TimeDistributed(dot)(input)

M = Model(inputs=[input], outputs=[output])
M.compile(keras.optimizers.Adam(learning_rate=0.0001), loss='mse')

# create test data
data = np.arange(num_lookback*num_fea).reshape((num_lookback,num_fea))
data = np.broadcast_to(data,shape=(sample,time_step,num_lookback,num_fea))
y = np.ones(shape=(sample,time_step,num_lookback-1))

# fit model to demonstrate error
M.fit(x=data,y=y, batch_size=2, epochs=10)
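
For reference, here is a minimal NumPy sketch of what the Dot layer above is meant to compute once TimeDistributed applies it to every time step. It uses the same sizes and test data as the question. The helper name `dot_last_lookback` is hypothetical and not part of Keras or TensorFlow; it only reproduces the arithmetic, not the graph that triggers the warning.

```python
import numpy as np

# Sizes taken from the question.
num_fea = 11
num_lookback = 5
time_step = 3
sample = 2

# Same test data as in the question.
data = np.arange(num_lookback * num_fea).reshape((num_lookback, num_fea))
data = np.broadcast_to(data, (sample, time_step, num_lookback, num_fea))


def dot_last_lookback(x):
    """NumPy reference for Dot wrapped in TimeDistributed (hypothetical helper).

    x has shape (sample, time_step, lookback, feature).
    """
    ht = x[:, :, :-1, :]  # every lookback slice except the last
    hT = x[:, :, -1:, :]  # last lookback slice, kept as a length-1 axis
    # Elementwise product broadcast over the lookback axis, then summed
    # over features: one dot product per earlier lookback slice.
    return np.sum(ht * hT, axis=-1)


reference = dot_last_lookback(data)
print(reference.shape)  # (2, 3, 4)
```

The result has shape (sample, time_step, num_lookback - 1), which matches the target `y` built in the question.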

Warning log

2020-03-05 08:36:17.558396: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2020-03-05 08:36:17.558777: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:533] layout failed: Invalid argument: The graph couldn't be sorted in topological order.
2020-03-05 08:36:17.559302: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:533] model_pruner failed: Invalid argument: MutableGraphView::MutableGraphView error: node 'loss/time_distributed_1_loss/mean_squared_error/weighted_loss/concat' has self cycle fanin 'loss/time_distributed_1_loss/mean_squared_error/weighted_loss/concat'.
2020-03-05 08:36:17.560121: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:533] remapper failed: Invalid argument: MutableGraphView::MutableGraphView error: node 'loss/time_distributed_1_loss/mean_squared_error/weighted_loss/concat' has self cycle fanin 'loss/time_distributed_1_loss/mean_squared_error/weighted_loss/concat'.
2020-03-05 08:36:17.560575: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:533] arithmetic_optimizer failed: Invalid argument: The graph couldn't be sorted in topological order.
2020-03-05 08:36:17.560853: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2020-03-05 08:36:17.561141: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
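
To illustrate what the messages in the log mean in general terms, the sketch below runs Kahn's algorithm on a tiny dependency graph. It is not Grappler's implementation, and the node names are hypothetical stand-ins ("concat" mirrors the node named in the log). It only shows why a node that lists itself as a fan-in makes a topological order impossible; it does not show where such a cycle arises in this model.

```python
from collections import deque


def topological_sort(fanins):
    """Kahn's algorithm. `fanins` maps each node to the nodes it reads from."""
    # Count unresolved inputs per node and build the reverse (fan-out) map.
    pending = {node: len(inputs) for node, inputs in fanins.items()}
    fanouts = {node: [] for node in fanins}
    for node, inputs in fanins.items():
        for src in inputs:
            fanouts[src].append(node)

    # Nodes with no inputs can be emitted immediately.
    ready = deque(node for node, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dst in fanouts[node]:
            pending[dst] -= 1
            if pending[dst] == 0:
                ready.append(dst)

    # Any node left over is part of, or downstream of, a cycle.
    if len(order) != len(fanins):
        raise ValueError("The graph couldn't be sorted in topological order.")
    return order


# Hypothetical graphs: the second one gives "concat" itself as an input.
acyclic = {
    "y_true": [],
    "y_pred": [],
    "concat": ["y_true", "y_pred"],
    "loss": ["concat"],
}
self_cycle = {
    "y_true": [],
    "y_pred": [],
    "concat": ["y_true", "y_pred", "concat"],
    "loss": ["concat"],
}

print(topological_sort(acyclic))  # ['y_true', 'y_pred', 'concat', 'loss']
try:
    topological_sort(self_cycle)
except ValueError as err:
    print("failed:", err)
```

In the self-cycle case, "concat" can never reach zero pending inputs because it waits on itself, so it and everything downstream of it are never emitted.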

【Discussion】:

    Tags: python tensorflow keras deep-learning


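    【Solution 1】:

    You could consider moving to TensorFlow 2.x.

    I have migrated your code and verified that it works on Google Colab. See here for more information on how to migrate code to TensorFlow 2.x.

    Please refer to the code below.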

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Input, TimeDistributed
    #import keras
    # Dot layer
    class Dot(tf.keras.layers.Layer):
        def __init__(self, **kwargs):
            super(Dot, self).__init__(**kwargs)
    
        def call(self, x):
    
            ht, hT = x[:,:-1,:],x[:,-1:,:]
            ml = tf.multiply(ht, hT)
    
            # I believe the problem comes from reduce_sum
            dot = tf.reduce_sum(ml, axis=-1)
            return dot
    
        def compute_output_shape(self, input_shape):
    
            return (None,input_shape[1]-1)
    
    num_fea = 11
    num_lookback = 5
    time_step = 3
    sample = 2
    
    # create model
    input = Input(shape=(time_step,num_lookback,num_fea))
    dot = Dot()
    output = TimeDistributed(dot)(input)
    
    M = Model(inputs=[input], outputs=[output])
    M.compile(optimizer='adam', loss='mse')
    
    # create test data
    data = np.arange(num_lookback*num_fea).reshape((num_lookback,num_fea))
    data = np.broadcast_to(data,shape=(sample,time_step,num_lookback,num_fea))
    y = np.ones(shape=(sample,time_step,num_lookback-1))
    
    # fit model to demonstrate error
    M.fit(x=data,y=y, batch_size=2, epochs=10)
    

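    【Comments】:

    • Thanks for the code, but this is only part of my code. If I change the TF version, I would have to port all 1,000 lines of it to TF2 and debug everything again.
    • Hi @RonakritW. You can refer to the following explanation of that warning: stackoverflow.com/questions/52607063/…
    • I have seen that post, but I don't understand where the cycle is in this implementation.
    • The problem may be this line: ml = tf.multiply(ht, hT). Also see github.com/tensorflow/tensorflow/issues/24816
    • I have seen that one too, but it did not help. I have spent a week looking for a solution and read almost every available answer, and none of them offers one. Also, multiplying ht and hT does not create a cycle; if one does arise in the backend, please show me code or something else that proves it.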