【Question title】: A3C with LSTM using keras
【Posted】: 2018-03-30 08:09:05
【Question description】:

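I'm trying to implement an A3C model with an LSTM in keras. I started from a version of A3C without an LSTM, https://github.com/coreylynch/async-rl, and tried to modify only the network code, but I'm having trouble getting the whole model to compile.

Am I missing something?

Here is my model: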

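import tensorflow as tf
from keras.models import Sequential, Model
from keras.layers import Conv2D, Dense, Flatten, Input, LSTM, TimeDistributed

# agent_history_length, resized_width and resized_height are defined elsewhere in the script
state = tf.placeholder("float", [None, agent_history_length, resized_width, resized_height])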

vision_model = Sequential()
vision_model.add(Conv2D(activation="relu", filters=16, kernel_size=(8, 8), name="conv1", padding="same", strides=(4, 4),input_shape=(agent_history_length,resized_width, resized_height)))
vision_model.add(Conv2D(activation="relu", filters=32, kernel_size=(4, 4), name="conv2", padding="same", strides=(2, 2)))
vision_model.add(Flatten())
vision_model.add(Dense(activation="relu", units=256, name="h1"))

# Now let's get a tensor with the output of our vision model:

state_input = Input(shape=(1,agent_history_length,resized_width,resized_height))

encoded_frame_sequence = TimeDistributed(vision_model)(state_input)
encoded_video = LSTM(256)(encoded_frame_sequence)  # the output will be a vector

action_probs = Dense(activation="softmax", units=4, name="p")(encoded_video)
state_value = Dense(activation="linear", units=1, name="v")(encoded_video)

policy_network = Model(inputs=state_input, outputs=action_probs)
value_network = Model(inputs=state_input, outputs=state_value)

p_params = policy_network.trainable_weights
v_params = value_network.trainable_weights

policy_network.summary()
value_network.summary()

p_out = policy_network(state_input)
v_out = value_network(state_input)

【Question discussion】:

    Tags: deep-learning lstm reinforcement-learning


    【Solution 1】:

    The keras-rl example library does not support input shapes with more than 2 dimensions!
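    When debugging a model like the one in the question, it can help to trace the tensor shapes by hand before building anything. The sketch below is plain Python, not Keras: it walks a frame shape through two `padding="same"` convolutions, the flatten, the `TimeDistributed` wrapper, the LSTM, and the two heads. The helper names `conv_same` and `trace_shapes` are made up for this illustration, and the values 4, 84 and 84 for `agent_history_length`, `resized_width` and `resized_height` are assumptions, since the question does not state them. Two things may be worth checking against it: the `state` placeholder has one axis fewer than `state_input`, and Keras normally defaults to a channels-last layout, in which case `(agent_history_length, resized_width, resized_height)` would be read as (rows, cols, channels).

```python
import math


def conv_same(shape, filters, stride, channels_first):
    """Output shape (no batch axis) of a 2D convolution with padding="same"."""
    if channels_first:
        _, rows, cols = shape
    else:
        rows, cols, _ = shape
    # With "same" padding each spatial size becomes ceil(size / stride).
    rows = math.ceil(rows / stride)
    cols = math.ceil(cols / stride)
    return (filters, rows, cols) if channels_first else (rows, cols, filters)


def trace_shapes(frame_shape, timesteps=1, channels_first=True):
    """Trace per-sample shapes through the layer stack from the question."""
    shapes = {"state_input": (timesteps,) + tuple(frame_shape)}
    conv1 = conv_same(frame_shape, 16, 4, channels_first)
    conv2 = conv_same(conv1, 32, 2, channels_first)
    shapes["conv1"] = conv1
    shapes["conv2"] = conv2
    shapes["flatten"] = (conv2[0] * conv2[1] * conv2[2],)
    shapes["h1"] = (256,)
    # TimeDistributed applies the vision model to every timestep.
    shapes["time_distributed"] = (timesteps, 256)
    # LSTM(256) without return_sequences keeps only the last output.
    shapes["lstm"] = (256,)
    shapes["p"] = (4,)
    shapes["v"] = (1,)
    return shapes


# Assumed values: 4 stacked frames of 84 x 84 pixels.
frame = (4, 84, 84)
for layout in (True, False):
    print("channels_first" if layout else "channels_last")
    for name, shape in trace_shapes(frame, channels_first=layout).items():
        print("  ", name, shape)
```

    Comparing the two layouts shows how much the flattened size changes depending on which axis is treated as the channel axis, so it may be worth confirming the `image_data_format` setting before looking further.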

    【Discussion】:
