【Question title】: Keras - GRU layer with recurrent dropout - loss: 'nan', accuracy: 0
【Posted】: 2020-07-02 23:33:25
【Question description】:

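Problem description

I am working through "Deep Learning with Python" by François Chollet (publisher webpage, notebooks on github). While replicating the examples from Chapter 6, I ran into problems with (I believe) the GRU layer with recurrent dropout.

The code in which I first observed the errors is quite long, so I decided to stick to the simplest problem that reproduces them: classifying IMDB reviews into "positive" and "negative" categories.

When I use a GRU layer with recurrent dropout, the training loss takes the "value" nan after a couple of batches of the first epoch, while the training accuracy takes the value 0 from the start of the second epoch.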

   64/12000 [..............................] - ETA: 3:05 - loss: 0.6930 - accuracy: 0.4844
  128/12000 [..............................] - ETA: 2:09 - loss: 0.6926 - accuracy: 0.4766
  192/12000 [..............................] - ETA: 1:50 - loss: 0.6910 - accuracy: 0.5573
(...) 
 3136/12000 [======>.......................] - ETA: 59s - loss: 0.6870 - accuracy: 0.5635
 3200/12000 [=======>......................] - ETA: 58s - loss: 0.6862 - accuracy: 0.5650
 3264/12000 [=======>......................] - ETA: 58s - loss: 0.6860 - accuracy: 0.5650
 3328/12000 [=======>......................] - ETA: 57s - loss: nan - accuracy: 0.5667   
 3392/12000 [=======>......................] - ETA: 57s - loss: nan - accuracy: 0.5560
 3456/12000 [=======>......................] - ETA: 56s - loss: nan - accuracy: 0.5457
(...)
11840/12000 [============================>.] - ETA: 1s - loss: nan - accuracy: 0.1593
11904/12000 [============================>.] - ETA: 0s - loss: nan - accuracy: 0.1584
11968/12000 [============================>.] - ETA: 0s - loss: nan - accuracy: 0.1576
12000/12000 [==============================] - 83s 7ms/step - loss: nan - accuracy: 0.1572 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 2/20

   64/12000 [..............................] - ETA: 1:16 - loss: nan - accuracy: 0.0000e+00
  128/12000 [..............................] - ETA: 1:15 - loss: nan - accuracy: 0.0000e+00
  192/12000 [..............................] - ETA: 1:16 - loss: nan - accuracy: 0.0000e+00
(...)
11840/12000 [============================>.] - ETA: 1s - loss: nan - accuracy: 0.0000e+00
11904/12000 [============================>.] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
11968/12000 [============================>.] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
12000/12000 [==============================] - 82s 7ms/step - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 3/20

   64/12000 [..............................] - ETA: 1:18 - loss: nan - accuracy: 0.0000e+00
  128/12000 [..............................] - ETA: 1:18 - loss: nan - accuracy: 0.0000e+00
  192/12000 [..............................] - ETA: 1:16 - loss: nan - accuracy: 0.0000e+00
(...)
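
The accuracy column in the log can be read as a running mean over the samples seen so far in the epoch. The sketch below is a plain-Python illustration, not Keras code: it assumes accuracy is computed by rounding each prediction and comparing it with the label, and the helper name `binary_accuracy` is made up for this example. Under that assumption a nan prediction never matches a label, which would explain both the slow decay during epoch 1 and the flat 0 from epoch 2 on.

```python
import math


def binary_accuracy(y_true, y_pred):
    """Share of predictions that equal the label after rounding (illustrative)."""
    hits = 0
    for label, pred in zip(y_true, y_pred):
        # Assumption: rounding leaves nan as nan, and nan == label is always False.
        rounded = pred if math.isnan(pred) else float(round(pred))
        hits += rounded == label
    return hits / len(y_true)


labels = [1.0, 0.0, 1.0, 0.0]
acc_finite = binary_accuracy(labels, [0.9, 0.2, 0.4, 0.1])  # three of four match
acc_nan = binary_accuracy(labels, [math.nan] * 4)           # nothing matches

# Figures taken from the log: running accuracy 0.5667 after 3328 samples.
# If no later sample counts as correct, the mean over all 12000 samples is:
hits_before_nan = 0.5667 * 3328
epoch_1_accuracy = hits_before_nan / 12000

print(acc_finite, acc_nan, round(epoch_1_accuracy, 4))
```

The last value agrees with the 0.1572 reported at the end of epoch 1, which is consistent with every batch after the first nan contributing no correct predictions.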

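Localizing the problem

To track down the cause, I wrote the code given below. It goes through several models (GRU/LSTM, {no dropout, only "normal" dropout, only recurrent dropout, both "normal" and recurrent dropout}, rmsprop/adam) and plots the loss and accuracy of all of them. (It also creates smaller, separate graphs for each model.)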

# Based on examples from "Deep Learning with Python" by François Chollet:
## Constants, modules:
VERSION = 2

import os
from keras import models
from keras import layers
import matplotlib.pyplot as plt
import pylab

## Loading data:
from keras.datasets import imdb
(x_train, y_train), (x_test, y_test) = \
    imdb.load_data(num_words=10000)

from keras.preprocessing import sequence
x_train = sequence.pad_sequences(x_train, maxlen=500)
x_test = sequence.pad_sequences(x_test, maxlen=500)


## Dictionary with models' hyperparameters:
MODELS = [
    # GRU:
    {"no": 1,
     "layer_type": "GRU",
     "optimizer": "rmsprop",
     "dropout": None,
     "recurrent_dropout": None},

    {"no": 2,
     "layer_type": "GRU",
     "optimizer": "rmsprop",
     "dropout": 0.3,
     "recurrent_dropout": None},

    {"no": 3,
     "layer_type": "GRU",
     "optimizer": "rmsprop",
     "dropout": None,
     "recurrent_dropout": 0.3},

    {"no": 4,
     "layer_type": "GRU",
     "optimizer": "rmsprop",
     "dropout": 0.3,
     "recurrent_dropout": 0.3},

    {"no": 5,
     "layer_type": "GRU",
     "optimizer": "adam",
     "dropout": None,
     "recurrent_dropout": None},

    {"no": 6,
     "layer_type": "GRU",
     "optimizer": "adam",
     "dropout": 0.3,
     "recurrent_dropout": None},

    {"no": 7,
     "layer_type": "GRU",
     "optimizer": "adam",
     "dropout": None,
     "recurrent_dropout": 0.3},

    {"no": 8,
     "layer_type": "GRU",
     "optimizer": "adam",
     "dropout": 0.3,
     "recurrent_dropout": 0.3},

    # LSTM:
    {"no": 9,
     "layer_type": "LSTM",
     "optimizer": "rmsprop",
     "dropout": None,
     "recurrent_dropout": None},

    {"no": 10,
     "layer_type": "LSTM",
     "optimizer": "rmsprop",
     "dropout": 0.3,
     "recurrent_dropout": None},

    {"no": 11,
     "layer_type": "LSTM",
     "optimizer": "rmsprop",
     "dropout": None,
     "recurrent_dropout": 0.3},

    {"no": 12,
     "layer_type": "LSTM",
     "optimizer": "rmsprop",
     "dropout": 0.3,
     "recurrent_dropout": 0.3},

    {"no": 13,
     "layer_type": "LSTM",
     "optimizer": "adam",
     "dropout": None,
     "recurrent_dropout": None},

    {"no": 14,
     "layer_type": "LSTM",
     "optimizer": "adam",
     "dropout": 0.3,
     "recurrent_dropout": None},

    {"no": 15,
     "layer_type": "LSTM",
     "optimizer": "adam",
     "dropout": None,
     "recurrent_dropout": 0.3},

    {"no": 16,
     "layer_type": "LSTM",
     "optimizer": "adam",
     "dropout": 0.3,
     "recurrent_dropout": 0.3},
]

## Adding name:
for model_dict in MODELS:
    model_dict["name"] = f"{model_dict['layer_type']}"
    model_dict["name"] += f"_d{model_dict['dropout']}" if model_dict['dropout'] is not None else f"_dN"
    model_dict["name"] += f"_rd{model_dict['recurrent_dropout']}" if model_dict['recurrent_dropout'] is not None else f"_rdN"
    model_dict["name"] += f"_{model_dict['optimizer']}"

## Function - defining and training model:
def train_model(model_dict):
    """Defines and trains a model, outputs history."""

    ## Defining:
    model = models.Sequential()
    model.add(layers.Embedding(10000, 32))

    recurrent_layer_kwargs = dict()
    if model_dict["dropout"] is not None:
        recurrent_layer_kwargs["dropout"] = model_dict["dropout"]
    if model_dict["recurrent_dropout"] is not None:
        recurrent_layer_kwargs["recurrent_dropout"] = model_dict["recurrent_dropout"]

    if model_dict["layer_type"] == 'GRU':
        model.add(layers.GRU(32, **recurrent_layer_kwargs))
    elif model_dict["layer_type"] == 'LSTM':
        model.add(layers.LSTM(32, **recurrent_layer_kwargs))
    else:
        raise ValueError("Wrong model_dict['layer_type'] value...")
    model.add(layers.Dense(1, activation='sigmoid'))

    ## Compiling:
    model.compile(
        optimizer=model_dict["optimizer"],
        loss='binary_crossentropy',
        metrics=['accuracy'])

    ## Training:
    history = model.fit(x_train, y_train,
                        epochs=20,
                        batch_size=64,
                        validation_split=0.2)

    return history

## Multi-model graphs' parameters:
graph_all_nrow = 4
graph_all_ncol = 4
graph_all_figsize = (20, 20)

assert graph_all_nrow * graph_all_ncol >= len(MODELS)

## Figs and axes of multi-model graphs:
graph_all_loss_fig, graph_all_loss_axs = plt.subplots(graph_all_nrow, graph_all_ncol, figsize=graph_all_figsize)
graph_all_acc_fig, graph_all_acc_axs = plt.subplots(graph_all_nrow, graph_all_ncol, figsize=graph_all_figsize)

## Loop through all models:
for i, model_dict in enumerate(MODELS):
    history = train_model(model_dict)

    ## Metrics extraction:
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']

    epochs = range(1, len(loss) + 1)

    ## Single-model graph - loss:
    graph_loss_fname = fr"{os.path.basename(__file__).replace('.py', '')}"
    graph_loss_fname += fr"_v{VERSION}_{model_dict['no']}_{model_dict['name']}_loss_graph.png"

    graph_loss_fig, graph_loss_ax = plt.subplots()
    graph_loss_ax.plot(epochs, loss, 'bo', label='Training loss')
    graph_loss_ax.plot(epochs, val_loss, 'b', label='Validation loss')
    graph_loss_ax.legend()
    graph_loss_fig.suptitle("Training and validation loss")
    graph_loss_fig.savefig(graph_loss_fname)
    pylab.close(graph_loss_fig)


    ## Single-model graph - accuracy:
    graph_acc_fname = fr"{os.path.basename(__file__).replace('.py', '')}"
    graph_acc_fname += fr"_v{VERSION}_{model_dict['no']}_{model_dict['name']}_acc_graph.png"

    graph_acc_fig, graph_acc_ax = plt.subplots()
    graph_acc_ax.plot(epochs, acc, 'bo', label='Training accuracy')
    graph_acc_ax.plot(epochs, val_acc, 'b', label='Validation accuracy')
    graph_acc_ax.legend()
    graph_acc_fig.suptitle("Training and validation acc")
    graph_acc_fig.savefig(graph_acc_fname)
    pylab.close(graph_acc_fig)

    ## Position of axes on multi-model graph:
    i_row = i // graph_all_ncol
    i_col = i % graph_all_ncol

    ## Adding model metrics to multi-model graph - loss:
    graph_all_loss_axs[i_row, i_col].plot(epochs, loss, 'bo', label='Training loss')
    graph_all_loss_axs[i_row, i_col].plot(epochs, val_loss, 'b', label='Validation loss')
    graph_all_loss_axs[i_row, i_col].set_title(fr"{model_dict['no']}. {model_dict['name']}")

    ## Adding model metrics to multi-model graph - accuracy:
    graph_all_acc_axs[i_row, i_col].plot(epochs, acc, 'bo', label='Training acc')
    graph_all_acc_axs[i_row, i_col].plot(epochs, val_acc, 'b', label='Validation acc')
    graph_all_acc_axs[i_row, i_col].set_title(fr"{model_dict['no']}. {model_dict['name']}")


## Saving multi-model graphs:
# Output files are quite big (8000x8000 PNG), you may want to decrease DPI.
graph_all_loss_fig.savefig(fr"{os.path.basename(__file__).replace('.py', '')}_ALL_loss_graph.png", dpi=400)
graph_all_acc_fig.savefig(fr"{os.path.basename(__file__).replace('.py', '')}_ALL_acc_graph.png", dpi=400)
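
The sixteen hand-written dictionaries in MODELS cover every combination of layer type, optimizer, dropout and recurrent dropout. As a sketch, the same grid could be generated with itertools.product; the function name `build_models` and the constants `LAYER_TYPES`, `OPTIMIZERS` and `RATES` are hypothetical and not part of the original script.

```python
from itertools import product

LAYER_TYPES = ["GRU", "LSTM"]
OPTIMIZERS = ["rmsprop", "adam"]
RATES = [None, 0.3]  # None means "leave the argument at its default"


def build_models():
    """Return one dict per combination, numbered like the hand-written list."""
    models_list = []
    # The last factor of product() varies fastest, so dropout alternates first,
    # then recurrent_dropout, then optimizer, then layer type.
    combos = product(LAYER_TYPES, OPTIMIZERS, RATES, RATES)
    for no, (layer_type, optimizer, rec_dropout, dropout) in enumerate(combos, start=1):
        d_tag = "N" if dropout is None else dropout
        rd_tag = "N" if rec_dropout is None else rec_dropout
        models_list.append({
            "no": no,
            "layer_type": layer_type,
            "optimizer": optimizer,
            "dropout": dropout,
            "recurrent_dropout": rec_dropout,
            "name": f"{layer_type}_d{d_tag}_rd{rd_tag}_{optimizer}",
        })
    return models_list


MODELS = build_models()
print(len(MODELS), MODELS[2]["name"])
```

Because the name is filled in while the dictionary is built, the separate "Adding name" loop would no longer be needed with this variant.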

Please find the two main graphs below: Loss - binary crossentropy, Accuracy (I am not allowed to embed images in the post due to low reputation).

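I have also run into similarly strange problems in a regression model: the MAE was in the range of several [...], in a problem where the $y$ range was maybe several tens [...]. (I decided not to include that model here because it would make this question even longer.)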

Versions of modules and libraries, hardware

  • Modules:
Keras                    2.3.1
Keras-Applications       1.0.8
Keras-Preprocessing      1.1.0
matplotlib               3.1.3
tensorflow-estimator     1.14.0
tensorflow-gpu           2.1.0
tensorflow-gpu-estimator 2.1.0
  • keras.json file:
{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}
  • CUDA - I have CUDA 10.0 and CUDA 10.1 installed on my system.
  • CUDnn - I have three versions: cudnn-10.0 v7.4.2.24, cudnn-10.0 v7.6.4.38, cudnn-9.0 v7.4.2.24
  • GPU: Nvidia GTX 1050Ti 4gb
  • Windows 10 Home
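
When reporting a problem like this one, the module list above can be collected programmatically. The following is a small sketch using the standard-library importlib.metadata (Python 3.8+); the helper name `installed_versions` and the package list are choices made for this example, and packages that are not installed are reported as None.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as they appear in the list above.
PACKAGES = ["Keras", "tensorflow-gpu", "tensorflow-estimator", "matplotlib"]

report = installed_versions(PACKAGES)
for name, version in report.items():
    print(f"{name:<24} {version}")
```

A standalone Keras entry next to a TensorFlow 2.x entry, as in the list above, means that `import keras` and `import tensorflow.keras` load two separate implementations.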

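Questions

  1. Do you know what might be causing this behavior?
  2. Could it be caused by multiple CUDA and CUDnn installations? Before observing the problem I had trained several models (both from the book and my own) that seemed to behave more or less as expected, while having 2 CUDA and 2 CUDnn versions installed (those listed above, without cudnn-10.0 v7.6.4.38).
  3. Is there any official/good source of adequate combinations of keras, tensorflow, CUDA, CUDnn (and other relevant things, e.g. maybe Visual Studio)? I cannot really find any authoritative and up-to-date source.

I hope I have described everything clearly enough. If you have any questions, please ask.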

【Question discussion】:

    Tags: machine-learning keras lstm recurrent-neural-network gated-recurrent-unit


    【Solution 1】:

    I finally found the solution (sort of). It is enough to change keras to tensorflow.keras.

    Modified code

    # Based on examples from "Deep Learning with Python" by François Chollet:
    ## Constants, modules:
    VERSION = 2
    
    import os
    #U: from keras import models
    #U: from keras import layers
    from tensorflow.keras import models
    from tensorflow.keras import layers
    
    import matplotlib.pyplot as plt
    import pylab
    
    ## Loading data:
    from keras.datasets import imdb
    
    (x_train, y_train), (x_test, y_test) = \
        imdb.load_data(num_words=10000)
    
    from keras.preprocessing import sequence
    
    x_train = sequence.pad_sequences(x_train, maxlen=500)
    x_test = sequence.pad_sequences(x_test, maxlen=500)
    
    ## Dictionary with models' hyperparameters:
    MODELS_ALL = [
        # GRU:
        {"no": 1,
         "layer_type": "GRU",
         "optimizer": "rmsprop",
         "dropout": None,
         "recurrent_dropout": None},
    
        {"no": 2,
         "layer_type": "GRU",
         "optimizer": "rmsprop",
         "dropout": 0.3,
         "recurrent_dropout": None},
    
        {"no": 3,
         "layer_type": "GRU",
         "optimizer": "rmsprop",
         "dropout": None,
         "recurrent_dropout": 0.3},
    
        {"no": 4,
         "layer_type": "GRU",
         "optimizer": "rmsprop",
         "dropout": 0.3,
         "recurrent_dropout": 0.3},
    
        {"no": 5,
         "layer_type": "GRU",
         "optimizer": "adam",
         "dropout": None,
         "recurrent_dropout": None},
    
        {"no": 6,
         "layer_type": "GRU",
         "optimizer": "adam",
         "dropout": 0.3,
         "recurrent_dropout": None},
    
        {"no": 7,
         "layer_type": "GRU",
         "optimizer": "adam",
         "dropout": None,
         "recurrent_dropout": 0.3},
    
        {"no": 8,
         "layer_type": "GRU",
         "optimizer": "adam",
         "dropout": 0.3,
         "recurrent_dropout": 0.3},
    
        # LSTM:
        {"no": 9,
         "layer_type": "LSTM",
         "optimizer": "rmsprop",
         "dropout": None,
         "recurrent_dropout": None},
    
        {"no": 10,
         "layer_type": "LSTM",
         "optimizer": "rmsprop",
         "dropout": 0.3,
         "recurrent_dropout": None},
    
        {"no": 11,
         "layer_type": "LSTM",
         "optimizer": "rmsprop",
         "dropout": None,
         "recurrent_dropout": 0.3},
    
        {"no": 12,
         "layer_type": "LSTM",
         "optimizer": "rmsprop",
         "dropout": 0.3,
         "recurrent_dropout": 0.3},
    
        {"no": 13,
         "layer_type": "LSTM",
         "optimizer": "adam",
         "dropout": None,
         "recurrent_dropout": None},
    
        {"no": 14,
         "layer_type": "LSTM",
         "optimizer": "adam",
         "dropout": 0.3,
         "recurrent_dropout": None},
    
        {"no": 15,
         "layer_type": "LSTM",
         "optimizer": "adam",
         "dropout": None,
         "recurrent_dropout": 0.3},
    
        {"no": 16,
         "layer_type": "LSTM",
         "optimizer": "adam",
         "dropout": 0.3,
         "recurrent_dropout": 0.3},
    ]
    
    MODELS_GRU_RECCURENT = [
        # GRU:
        {"no": 3,
         "layer_type": "GRU",
         "optimizer": "rmsprop",
         "dropout": None,
         "recurrent_dropout": 0.3},
    
        {"no": 4,
         "layer_type": "GRU",
         "optimizer": "rmsprop",
         "dropout": 0.3,
         "recurrent_dropout": 0.3},
    
        {"no": 7,
         "layer_type": "GRU",
         "optimizer": "adam",
         "dropout": None,
         "recurrent_dropout": 0.3},
    
        {"no": 8,
         "layer_type": "GRU",
         "optimizer": "adam",
         "dropout": 0.3,
         "recurrent_dropout": 0.3},
    ]
    
    MODELS = MODELS_ALL   # "MODELS = MODELS_ALL" or "MODELS = MODELS_GRU_RECCURENT"
    
    ## Adding name:
    for model_dict in MODELS:
        model_dict["name"] = f"{model_dict['layer_type']}"
        model_dict["name"] += f"_d{model_dict['dropout']}" if model_dict['dropout'] is not None else f"_dN"
        model_dict["name"] += f"_rd{model_dict['recurrent_dropout']}" if model_dict['recurrent_dropout'] is not None else f"_rdN"
        model_dict["name"] += f"_{model_dict['optimizer']}"
    
    
    ## Function - defining and training model:
    def train_model(model_dict):
        """Defines and trains a model, outputs history."""
    
        ## Defining:
        model = models.Sequential()
        model.add(layers.Embedding(10000, 32))
    
        recurrent_layer_kwargs = dict()
        if model_dict["dropout"] is not None:
            recurrent_layer_kwargs["dropout"] = model_dict["dropout"]
        if model_dict["recurrent_dropout"] is not None:
            recurrent_layer_kwargs["recurrent_dropout"] = model_dict["recurrent_dropout"]
    
        if model_dict["layer_type"] == 'GRU':
            model.add(layers.GRU(32, **recurrent_layer_kwargs))
        elif model_dict["layer_type"] == 'LSTM':
            model.add(layers.LSTM(32, **recurrent_layer_kwargs))
        else:
            raise ValueError("Wrong model_dict['layer_type'] value...")
        model.add(layers.Dense(1, activation='sigmoid'))
    
        ## Compiling:
        model.compile(
            optimizer=model_dict["optimizer"],
            loss='binary_crossentropy',
            metrics=['accuracy'])
    
        ## Training:
        history = model.fit(x_train, y_train,
                            epochs=20,
                            batch_size=64,
                            validation_split=0.2)
    
        return history
    
    
    ## Multi-model graphs' parameters:
    graph_all_nrow = 4
    graph_all_ncol = 4
    graph_all_figsize = (20, 20)
    
    assert graph_all_nrow * graph_all_ncol >= len(MODELS)
    
    # fig and axes of multi-model graphs:
    graph_all_loss_fig, graph_all_loss_axs = plt.subplots(graph_all_nrow, graph_all_ncol, figsize=graph_all_figsize)
    graph_all_acc_fig, graph_all_acc_axs = plt.subplots(graph_all_nrow, graph_all_ncol, figsize=graph_all_figsize)
    
    ## Loop through all models:
    for i, model_dict in enumerate(MODELS):
        history = train_model(model_dict)
    
        ## Metrics extraction:
        loss = history.history['loss']
        val_loss = history.history['val_loss']
        acc = history.history['accuracy']
        val_acc = history.history['val_accuracy']
    
        epochs = range(1, len(loss) + 1)
    
        ## Single-model graph - loss:
        graph_loss_fname = fr"{os.path.basename(__file__).replace('.py', '')}"
        graph_loss_fname += fr"_v{VERSION}_{model_dict['no']}_{model_dict['name']}_loss_graph.png"
    
        graph_loss_fig, graph_loss_ax = plt.subplots()
        graph_loss_ax.plot(epochs, loss, 'bo', label='Training loss')
        graph_loss_ax.plot(epochs, val_loss, 'b', label='Validation loss')
        graph_loss_ax.legend()
        graph_loss_fig.suptitle("Training and validation loss")
        graph_loss_fig.savefig(graph_loss_fname)
        pylab.close(graph_loss_fig)
    
        ## Single-model graph - accuracy:
        graph_acc_fname = fr"{os.path.basename(__file__).replace('.py', '')}"
        graph_acc_fname += fr"_v{VERSION}_{model_dict['no']}_{model_dict['name']}_acc_graph.png"
    
        graph_acc_fig, graph_acc_ax = plt.subplots()
        graph_acc_ax.plot(epochs, acc, 'bo', label='Training accuracy')
        graph_acc_ax.plot(epochs, val_acc, 'b', label='Validation accuracy')
        graph_acc_ax.legend()
        graph_acc_fig.suptitle("Training and validation acc")
        graph_acc_fig.savefig(graph_acc_fname)
        pylab.close(graph_acc_fig)
    
        ## Position of axes on multi-model graph:
        i_row = i // graph_all_ncol
        i_col = i % graph_all_ncol
    
        ## Adding model metrics to multi-model graph - loss:
        graph_all_loss_axs[i_row, i_col].plot(epochs, loss, 'bo', label='Training loss')
        graph_all_loss_axs[i_row, i_col].plot(epochs, val_loss, 'b', label='Validation loss')
        graph_all_loss_axs[i_row, i_col].set_title(fr"{model_dict['no']}. {model_dict['name']}")
    
        ## Adding model metrics to multi-model graph - accuracy:
        graph_all_acc_axs[i_row, i_col].plot(epochs, acc, 'bo', label='Training acc')
        graph_all_acc_axs[i_row, i_col].plot(epochs, val_acc, 'b', label='Validation acc')
        graph_all_acc_axs[i_row, i_col].set_title(fr"{model_dict['no']}. {model_dict['name']}")
    
    graph_all_loss_fig.suptitle(f"Loss - binary crossentropy [v{VERSION}]")
    graph_all_acc_fig.suptitle(f"Accuracy [v{VERSION}]")
    
    ## Saving multi-model graphs:
    graph_all_loss_fig.savefig(fr"{os.path.basename(__file__).replace('.py', '')}_ALL_v{VERSION}_loss_graph.png", dpi=400)
    graph_all_acc_fig.savefig(fr"{os.path.basename(__file__).replace('.py', '')}_ALL_v{VERSION}_acc_graph.png", dpi=400)
    
    ## Saving multi-model graphs (SMALL):
    graph_all_loss_fig.savefig(fr"{os.path.basename(__file__).replace('.py', '')}_ALL_v{VERSION}_loss_graph_SMALL.png", dpi=150)
    graph_all_acc_fig.savefig(fr"{os.path.basename(__file__).replace('.py', '')}_ALL_v{VERSION}_acc_graph_SMALL.png", dpi=150)
    

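    Results

    The graphs analogous to those in the question: Loss - binary crossentropy, Accuracy

    More on keras vs. tensorflow.keras

    As written in François Chollet's tweets (found here: https://stackoverflow.com/a/54117754), from now on tensorflow.keras will be used instead of standalone keras (that is, Keras as the official API of TensorFlow). (I am not entirely sure I am 100% correct, so feel free to correct me.)

    I think it is better to use tensorflow.keras instead of keras in future projects.

    【Discussion】:

    • Thanks. I am working on an Azure Machine Learning Ubuntu Linux VM with Keras 2.3.1. Changing to tensorflow.keras instead of keras gave me results similar to the book's.
    【Solution 2】:

    The same happens to me when training with the R interface to Keras. The problem seems to be related to recurrent dropout and the length of the "time" dimension. It only occurs with GRU (LSTM has no problem).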

    # remotes::install_github("rstudio/keras#1032")
    library(keras)
    
    
    reticulate::py_config()
    #> python:         /home/clanera/anaconda3/envs/r-tensorflow/bin/python
    #> libpython:      /home/clanera/anaconda3/envs/r-tensorflow/lib/libpython3.6m.so
    #> pythonhome:     /home/clanera/anaconda3/envs/r-tensorflow:/home/clanera/anaconda3/envs/r-tensorflow
    #> version:        3.6.10 |Anaconda, Inc.| (default, Jan  7 2020, 21:14:29)  [GCC 7.3.0]
    #> numpy:          /home/clanera/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/numpy
    #> numpy_version:  1.18.1
    #> tensorflow:     /home/clanera/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/tensorflow
    #> 
    #> NOTE: Python version was forced by RETICULATE_PYTHON
    tensorflow::tf_config()
    #> TensorFlow v2.0.0 (~/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/tensorflow)
    #> Python v3.6 (~/anaconda3/envs/r-tensorflow/bin/python)
    tensorflow::tf_gpu_configured()
    #> TensorFlow built with CUDA:  FALSE 
    #> GPU device name:
    #> [1] FALSE
    
    
    n <- 100
    t <- 80 # with 72 or fewer there seems to be no problem
    q <- 10
    
    x <- array(sample(n*t*q), c(n, t, q))
    y <- sample(0:1, n, replace = TRUE)
    
    
    input <- layer_input(c(t, q))
    output <- input %>% 
    #  ## no problem using LSTM
    #  layer_lstm(units = 2, recurrent_dropout = 0.5) %>%
      layer_gru(units = 2, recurrent_dropout = 0.5) %>%
      layer_dense(units = 1, activation = "sigmoid")
    
    model <- keras_model(input, output)
    
    summary(model)
    #> Model: "model"
    #> ________________________________________________________________________________
    #> Layer (type)                        Output Shape                    Param #     
    #> ================================================================================
    #> input_1 (InputLayer)                [(None, 80, 10)]                0           
    #> ________________________________________________________________________________
    #> gru (GRU)                           (None, 2)                       78          
    #> ________________________________________________________________________________
    #> dense (Dense)                       (None, 1)                       3           
    #> ================================================================================
    #> Total params: 81
    #> Trainable params: 81
    #> Non-trainable params: 0
    #> ________________________________________________________________________________
    
    history <- model %>%
      compile(optimizer = "adam", loss = "binary_crossentropy") %>% 
      fit(x, y, 2, 3)
    
    history
    #> Trained on 100 samples (batch_size=2, epochs=3)
    #> Final epoch (plot to see history):
    #> loss: NaN
    

    Created on 2020-05-10 by the reprex package (v0.3.0)

    sessionInfo()
    #> R version 4.0.0 (2020-04-24)
    #> Platform: x86_64-pc-linux-gnu (64-bit)
    #> Running under: Ubuntu 18.04.4 LTS
    #> 
    #> Matrix products: default
    #> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
    #> 
    #> locale:
    #>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
    #>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
    #>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
    #>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
    #>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    #> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    #> 
    #> attached base packages:
    #> [1] stats     graphics  grDevices datasets  utils     methods   base     
    #> 
    #> other attached packages:
    #> [1] keras_2.2.5.0
    #> 
    #> loaded via a namespace (and not attached):
    #>  [1] Rcpp_1.0.4.6         whisker_0.4          knitr_1.28          
    #>  [4] magrittr_1.5         lattice_0.20-41      R6_2.4.1            
    #>  [7] rlang_0.4.6          stringr_1.4.0        highr_0.8           
    #> [10] tools_4.0.0          grid_4.0.0           xfun_0.13           
    #> [13] htmltools_0.4.0      tfruns_1.4           yaml_2.2.1          
    #> [16] digest_0.6.25        tensorflow_2.0.0     Matrix_1.2-18       
    #> [19] base64enc_0.1-3      zeallot_0.1.0        evaluate_0.14       
    #> [22] rmarkdown_2.1        stringi_1.4.6        compiler_4.0.0      
    #> [25] generics_0.0.2       reticulate_1.15-9000 jsonlite_1.6.1      
    #> [28] renv_0.10.0
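
The reprex reports NaN at t = 80 and no problem at 72 or below. To narrow such a threshold down, one option is a binary search over the sequence length. The sketch is in Python rather than R and does not train anything: `loss_becomes_nan` is a hypothetical callback you would implement by fitting the model at a given length, and the search assumes that once a length fails, all longer lengths fail too. The stub used here merely mimics the numbers reported above.

```python
def first_failing_length(ok_length, bad_length, loss_becomes_nan):
    """Smallest length in (ok_length, bad_length] for which the callback is True.

    Assumes loss_becomes_nan(ok_length) is False, loss_becomes_nan(bad_length)
    is True, and the outcome is monotone in the length.
    """
    lo, hi = ok_length, bad_length
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if loss_becomes_nan(mid):
            hi = mid   # failure reproduced: threshold is at or below mid
        else:
            lo = mid   # still fine: threshold is above mid
    return hi


def stub_loss_becomes_nan(length):
    """Stand-in for a real training run; pretends lengths above 72 fail."""
    return length > 72


threshold = first_failing_length(72, 80, stub_loss_becomes_nan)
print(threshold)
```

With a real callback, each probe costs one training run, so the eight candidate lengths between 72 and 80 need three runs instead of eight. Because dropout is random, a real callback would probably need a fixed seed or several repetitions per length.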
    

    【Discussion】:
