RNN neural network predict returns unexpected predictions [closed]

【Question title】: Rnn Neural Network predict return unexpected predictions [closed]
【Posted】: 2018-09-19 18:49:11
【Question】:

I am trying to configure an RNN to predict 5 different types of text entities. I am using the following configuration:

    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(seed)
            .iterations(100)
            .updater(Updater.ADAM)  //To configure: .updater(Adam.builder().beta1(0.9).beta2(0.999).build())
            .regularization(true).l2(1e-5)
            .weightInit(WeightInit.XAVIER)
            .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue).gradientNormalizationThreshold(1.0)
            .learningRate(2e-2)
            .trainingWorkspaceMode(WorkspaceMode.SEPARATE).inferenceWorkspaceMode(WorkspaceMode.SEPARATE)   //https://deeplearning4j.org/workspaces
            .list()
            .layer(0, new GravesLSTM.Builder().nIn(500).nOut(3)
                    .activation(Activation.TANH).build())
            .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT).activation(Activation.SOFTMAX)        //MCXENT + softmax for classification
                    .nIn(3).nOut(5).build())
            .pretrain(false).backprop(true).build();
    MultiLayerNetwork net = new MultiLayerNetwork(conf);
    net.init();

I train it and then evaluate it, and that works. However, when I use:

    int[] prediction = net.predict(features);

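it sometimes returns an unexpected prediction. It returns correct predictions such as 1, 2, ..., 5, but sometimes it returns numbers such as 9, 14, 12, ... These numbers do not correspond to any recognizable prediction/label.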

Why does this configuration return unexpected output?

【Comments】:

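There is an example at github.com/deeplearning4j/deeplearning4j/blob/master/…
Can you share the code of the initialization function?
I use the official word2vecsentiment example. The only change is the number of possible outputs.
I use this example: github.com/deeplearning4j/dl4j-examples/tree/master/… with the input changed and a few more possible outputs added.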

Tags: java deep-learning rnn deeplearning4j

【Solution 1】:

Don't use net.predict here; it is mainly meant for 2d output. Use net.output and Nd4j.argMax(outputOfNeuralNet, -1) instead.
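To illustrate why the two calls can disagree, here is a minimal plain-Java sketch that does not use DL4J itself. It relies on the fact that the output of an RNN in DL4J is 3d, shaped [miniBatch, nClasses, timeSteps]. It also assumes, and this is an assumption rather than something stated in the answer, that a 2d-oriented predict effectively takes the arg max over each example's flattened [nClasses * timeSteps] block. The class name, method names and probability values are made up for the illustration.

```java
import java.util.Arrays;

public class RnnArgMaxSketch {

    /** Index of the largest value in a 1d array of class probabilities. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Class index per example at one time step; output is [miniBatch][nClasses][timeSteps]. */
    static int[] predictAtTimeStep(double[][][] output, int timeStep) {
        int[] predictions = new int[output.length];
        for (int example = 0; example < output.length; example++) {
            double[] classProbs = new double[output[example].length];
            for (int c = 0; c < classProbs.length; c++) {
                classProbs[c] = output[example][c][timeStep];
            }
            predictions[example] = argMax(classProbs);
        }
        return predictions;
    }

    /**
     * ASSUMPTION: mimics an arg max taken over each example's flattened
     * (row-major) [nClasses * timeSteps] block, as a 2d-oriented predict might do.
     */
    static int[] flattenedArgMax(double[][][] output) {
        int[] result = new int[output.length];
        for (int example = 0; example < output.length; example++) {
            int timeSteps = output[example][0].length;
            int best = 0;
            double bestValue = Double.NEGATIVE_INFINITY;
            for (int c = 0; c < output[example].length; c++) {
                for (int t = 0; t < timeSteps; t++) {
                    if (output[example][c][t] > bestValue) {
                        bestValue = output[example][c][t];
                        best = c * timeSteps + t; // flat index, not a class index
                    }
                }
            }
            result[example] = best;
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up softmax output: 1 example, 5 classes, 3 time steps.
        double[][][] output = {{
            {0.10, 0.10, 0.05},
            {0.20, 0.10, 0.05},
            {0.40, 0.10, 0.05},
            {0.20, 0.10, 0.05},
            {0.10, 0.60, 0.80}
        }};
        System.out.println(Arrays.toString(predictAtTimeStep(output, 2))); // prints [4]
        System.out.println(Arrays.toString(flattenedArgMax(output)));      // prints [14]
    }
}
```

Under that assumption, a value like 14 is a flat position (class 4, time step 2, with 3 time steps), not a label. In DL4J terms, the first method roughly corresponds to calling net.output(features), selecting a single time step, and then applying Nd4j.argMax to the remaining class vector.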

【Comments】:

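Please add an example of the solution.
These two functions are not similar. net.output returns an INDArray, whereas net.predict returns an int array with the predicted classes. Can you give some examples of how to use it?
Nd4j.argMax gives you the indices. You can use the INDArray much like any int array. The dl4j examples already cover this in several places. One example is: github.com/deeplearning4j/dl4j-examples/blob/… - change the 1 to -1. -1 means "run along the last dimension, whatever it is", which follows the numpy convention. And to correct you: yes, the two functions are similar. My answer is just the more general version.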