TensorflowJS in a 2-dimension simple problem: answers

【Question title】: TensorflowJS in a 2-dimension simple problem
【Posted】: 2019-12-26 02:25:22
【Question】:

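I'm working with TensorflowJS and I'm surprised by the (bad) results I get. Here is the problem I'm working on: you have a 2D square going from the top-left corner (0,0) to the bottom-right corner (1,1). Each corner has an RGB color, as follows:

Top-left: black

Top-right: red

Bottom-right: green

Bottom-left: blue

I want to infer the color of a point inside the square.

I've set up a simple Tensorflow model. After a simple training run, I tested it on the bottom-right corner... and instead of something close to green, I got a bad result. Could you tell me where I'm going wrong? Thanks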

async function test() 
{
  tf.setBackend('cpu');

  const model = tf.sequential();
  model.add(tf.layers.dense({units: 3, inputShape: [2] }));

  model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

  const xs = tf.tensor([0,0, 1,0, 1,1, 0,1 ], [4, 2]);
  const ys = tf.tensor([ 

                         [ 0, 0, 0 ], // black
                         [ 1, 0, 0 ], // red
                         [ 0, 1, 0 ], // green
                         [ 0, 0, 1 ], // blue

                       ], [4, 3]);

  await model.fit(xs, ys, {epochs: 5000});

  const input = tf.tensor([1,1], [1, 2]);
  console.log(model.predict(input).dataSync());
}

My result:

Float32Array(3) [0.25062745809555054, 0.7481716275215149, 0.2501324415206909]

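【Comments】:

There are a few problems with your learning algorithm and your data. But I think the biggest one is that your layer has the default ("linear") activation. Since the output values behave like probabilities over several classes, you want to use actions: 'softmax'. Also, typical machine learning data is far more varied (and noisier) than the data you are using in this example.
Sorry, I meant activation: 'softmax' for your tf.layers.dense() calls.
I tried setting the activation to softmax... but that did not solve the problem. I have drawn the square with its colors. See the linked image.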

Tags: javascript tensorflow tensorflow.js

【Solution 1】:

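The model uses a linear activation, which can only output correct results when the features and the labels are linearly related (y = ax + b). A different activation is needed.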
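The limitation described above can be checked without TensorFlow.js. The sketch below is plain JavaScript and fits a single linear layer (color = W·[x, y] + b) to the four corners with full-batch gradient descent on the mean squared error. The learning rate, step count and zero initialization are assumptions chosen for this illustration, not the settings TensorFlow.js uses. The best such fit predicts roughly [0.25, 0.75, 0.25] at (1,1), which matches the result reported in the question, so training a single linear layer for longer would not be expected to help.

```javascript
// Plain-JavaScript sketch (no TensorFlow.js): fit color = W·[x, y] + b
// to the four corners with full-batch gradient descent on mean squared error.
const xs = [[0, 0], [1, 0], [1, 1], [0, 1]];
const ys = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]; // black, red, green, blue

// One row [wx, wy, bias] per output channel (R, G, B), starting from zero.
const weights = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
const learningRate = 0.1; // assumed value, chosen so this sketch converges quickly

function predict(point) {
  return weights.map(([wx, wy, b]) => wx * point[0] + wy * point[1] + b);
}

for (let step = 0; step < 5000; step++) {
  // Accumulate the gradient over all four points before updating.
  const grads = weights.map(() => [0, 0, 0]);
  xs.forEach((point, i) => {
    predict(point).forEach((value, c) => {
      const err = (2 * (value - ys[i][c])) / xs.length;
      grads[c][0] += err * point[0];
      grads[c][1] += err * point[1];
      grads[c][2] += err;
    });
  });
  weights.forEach((row, c) => {
    for (let k = 0; k < row.length; k++) row[k] -= learningRate * grads[c][k];
  });
}

const bottomRight = predict([1, 1]);
console.log(bottomRight.map((v) => v.toFixed(3))); // close to 0.25, 0.75, 0.25
```

The green channel is 1 only where both x and y are 1, which behaves like the product x·y. A sum of the form ax + by + c cannot represent that product, which is why a non-linear activation or a different formulation of the problem is needed.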

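A model usually has to be fine-tuned: you try different sets of parameters until you find the model with the best accuracy. Below is a model with one set of parameters for which the loss is low. Keep in mind that this is not "the" set of parameters; it is "a" set of parameters. See the linked answer for how to fine-tune a model.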

(async () => {
  const model = tf.sequential();
  model.add(tf.layers.dense({units: 18, inputShape: [2]}));
  model.add(tf.layers.dense({units: 14, activation: 'relu6'}));
  model.add(tf.layers.dense({units: 3, activation: 'relu6'}));

  // Corners of the unit square and their RGB colors.
  const xs = tf.tensor([0,0, 1,0, 1,1, 0,1], [4, 2]);
  const ys = tf.tensor([
    [0, 0, 0], // black
    [1, 0, 0], // red
    [0, 1, 0], // green
    [0, 0, 1], // blue
  ], [4, 3]);

  model.compile({loss: 'meanSquaredError', optimizer: 'adam'});
  await model.fit(xs, ys, {
    epochs: 500,
    // Log the loss after every epoch to watch it decrease.
    callbacks: {onEpochEnd: (epoch, logs) => console.log(logs.loss)}
  });

  // Predict the color of the bottom-right corner.
  const input = tf.tensor([1, 1], [1, 2]);
  console.log(model.predict(input).dataSync());
})();

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>

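Even though the model above reaches a low loss, and therefore gives better predictions, the problem being solved here does not really look like a regression problem. If the goal is simply to pick one color out of a fixed set of colors, it is better treated as a classification problem. The difference is that the last layer uses a softmax activation, and the loss function is binaryCrossentropy when the last layer has 2 units, or categoricalCrossentropy when it has more than 2.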

(async () => {
  const model = tf.sequential();
  model.add(tf.layers.dense({units: 18, inputShape: [2]}));
  model.add(tf.layers.dense({units: 14, activation: 'relu6'}));
  model.add(tf.layers.dense({units: 4, activation: 'softmax'}));

  const xs = tf.tensor([0,0, 1,0, 1,1, 0,1], [4, 2]);
  // One-hot labels, cast to float32 to match the model's float outputs.
  // 0 for black
  // 1 for red
  // 2 for green
  // 3 for blue
  const ys = tf.oneHot([0, 1, 2, 3], 4).cast('float32');
  ys.print();

  model.compile({loss: 'categoricalCrossentropy', optimizer: 'sgd'});
  await model.fit(xs, ys, {
    epochs: 100,
    callbacks: {onEpochEnd: (epoch, logs) => console.log(logs.loss)}
  });

  const input = tf.tensor([1, 1], [1, 2]);
  const output = model.predict(input);
  output.print(); // one probability per class
  output.argMax(1).print(); // expected to be 2 (green) when training has converged
})();

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>
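To make the classification framing more concrete, here is a small plain-JavaScript sketch of what a softmax output and a categorical cross-entropy loss compute for one sample. It does not use TensorFlow.js, the function names are illustrative, and the logits are made-up numbers standing in for the scores a trained last layer might produce for the point (1,1).

```javascript
// Softmax turns raw scores (logits) into probabilities that sum to 1.
function softmax(logits) {
  const max = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

// Categorical cross-entropy against a one-hot label: -log(p[trueClass]).
function categoricalCrossentropy(oneHot, probs) {
  return -oneHot.reduce((acc, t, i) => acc + t * Math.log(probs[i]), 0);
}

// Index of the largest value, i.e. the predicted class.
function argMax(values) {
  return values.indexOf(Math.max(...values));
}

const classNames = ['black', 'red', 'green', 'blue'];
const logits = [0.1, 0.3, 2.0, -0.5]; // hypothetical scores for the point (1, 1)
const probs = softmax(logits);
const greenLabel = [0, 0, 1, 0]; // one-hot label for class 2 (green)

console.log(probs);
console.log(classNames[argMax(probs)]); // green
console.log(categoricalCrossentropy(greenLabel, probs));
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1, which is what training with categoricalCrossentropy pushes the network toward.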

【Comments】:

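Thanks edkeveked! The loss value is good... but when I try the following, I still get bad results... this is so strange! ` var input = tf.tensor([0,0], [1, 2]); console.log("top left", model.predict(input).dataSync()); input = tf.tensor([0,1], [1, 2]); console.log("top right", model.predict(input).dataSync()); input = tf.tensor([1,1], [1, 2]); console.log("bottom right", model.predict(input).dataSync()); input = tf.tensor([1,0], [1, 2]); console.log("bottom left", model.predict(input).dataSync()); `
Every time the code runs, the weights are initialized randomly. Depending on those values, the model's predictions may be wrong. But if you run the code several times, you will see that it does predict the correct value for [0,0]. You can try saving the weights once the model reaches the accuracy you are looking for.
I confirm... it depends too much on randomness... sometimes the top-left corner is right, sometimes it is wrong. Unfortunately, I don't think there is a way to find a solution that is good everywhere.
Save the model each time you train; these saved models are called checkpoints. It is a very common thing to do when training DL models. The second model is more stable than the first because, as explained in the answer, it is the appropriate way to approach the problem you are solving.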
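One way to act on the advice above is to train several times and keep the run with the lowest loss. The helper below is a minimal sketch of that idea: `keepBestOfRuns` and `trainOnce` are hypothetical names, not part of TensorFlow.js, and the stand-in trainer with hard-coded losses only exists so the sketch runs on its own.

```javascript
// Run a training function several times and keep the run with the lowest loss.
// `trainOnce` is a hypothetical callback: it should build and train a fresh
// model (so the weights are re-initialized) and resolve to { model, loss }.
async function keepBestOfRuns(trainOnce, runs) {
  let best = null;
  for (let i = 0; i < runs; i++) {
    const result = await trainOnce(i);
    if (best === null || result.loss < best.loss) best = result;
  }
  return best;
}

// Stand-in trainer so the sketch runs without TensorFlow.js.
const fakeLosses = [0.31, 0.02, 0.18];
const fakeTrainOnce = async (i) => ({ model: `model-${i}`, loss: fakeLosses[i] });

const bestPromise = keepBestOfRuns(fakeTrainOnce, fakeLosses.length);
bestPromise.then((best) => console.log(best.model, best.loss)); // model-1 0.02
```

With TensorFlow.js, `trainOnce` could wrap the model-building code from the answer, await `model.fit(...)`, and read the final loss from the `history.loss` array of the History object that `fit` resolves to. The selected model could then be stored with `model.save(...)`, which is the checkpointing mentioned in the comments. A low training loss on four points says nothing about how the model behaves between the corners, so this only reduces the run-to-run variation discussed above.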