Basic neural net in tensorflow: answers

【Question title】: Basic neural net in tensorflow
【Posted】: 2017-06-12 13:17:37
【Question】:

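I've been trying to implement a basic neural network in tensorflow. The input is just random 1/0 data in (x,y,z), but I want my network to output 1 when x = 1 and 0 otherwise.

Here is my network code: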

import tensorflow as tf
import numpy as np

x_data = np.array([[0,0,1],
                   [0,1,1],
                   [1,0,0],
                   [0,1,0],
                   [1,1,1],
                   [0,1,1],
                   [1,1,1]])

x_test = np.array([[1,1,1], [0,1,0], [0,0,0]])
y_data = np.array([0,0,1,0,1,0,1])


iters = 1000
learning_rate = 0.1
weights = {
    'w1': tf.Variable(tf.random_normal([3, 5])),
    'w2': tf.Variable(tf.random_normal([5, 1])),
}
bias = {
    'b1': tf.Variable(tf.random_normal([5])),
    'b2': tf.Variable(tf.random_normal([1])),
}

def predict(x, weights, bias):
    l1 = tf.add(tf.matmul(x, weights['w1']), bias['b1'])
    l1 = tf.nn.sigmoid(l1)
    out = tf.add(tf.matmul(l1, weights['w2']), bias['b2'])
    return out


x = tf.placeholder(tf.float32, shape=(None,3))
y = tf.placeholder(tf.float32, shape=(None))

pred = predict(x, weights, bias)

cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

init = tf.global_variables_initializer()

# graph
with tf.Session() as sess:
    sess.run(init)

    for i in range(0, iters):
        _, c = sess.run([optimizer, cost], feed_dict={x: x_data, y: y_data})
        if i % 100 == 0:
            print("cost: " + str(c))

    print(sess.run(weights['w1']))
    print(sess.run(pred, feed_dict={x: x_test}))

This outputs:

[-0.37119362]
[-0.23264697]
[-0.14701667]

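But my test data should output [1,0,0], and I'm really not sure what is wrong here. I've tried playing with the hyperparameters and searching stackoverflow. I also tried using softmax_cross_entropy as the cost function, though it gave me an error saying the logits don't have the same shape as the labels.

Does anyone know why this isn't outputting what I expect?

【Comments on the question】:

I can't tell you why it isn't learning the correlation, but the cost applies a sigmoid before computing the cross entropy, so you should probably also print the sigmoid of the prediction in the last line to make the outputs comparable: print(sess.run(tf.nn.sigmoid(pred), feed_dict={x: x_test})). With that, you will at least get positive outputs.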
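To make the suggestion in this comment concrete, here is a minimal NumPy sketch that applies a sigmoid to the three raw outputs reported in the question. The `sigmoid` helper is defined here purely for illustration (it is not part of the original code) and is assumed to compute the same logistic function as tf.nn.sigmoid.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps a raw logit to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Raw network outputs (logits) for x_test, copied from the question.
logits = np.array([[-0.37119362], [-0.23264697], [-0.14701667]])

# Squash the logits so they are comparable with the 0/1 labels.
probs = sigmoid(logits)
print(probs)
```

All three values land between 0.4 and 0.5, so the outputs become positive but still do not separate the test rows.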

Tags: tensorflow deep-learning

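【Solution 1】:

First, you need to pass the output through an activation function (i.e. tf.nn.sigmoid) before reading it as a prediction.

Make sure tf.nn.sigmoid_cross_entropy_with_logits still receives the logits (the values before the sigmoid activation).

Your input y_data also has a shape problem: it is (7,) rather than (7, 1).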
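A rough NumPy sketch of why the label shape matters. It assumes TensorFlow's elementwise ops broadcast the way NumPy does and that the placeholder declared with shape=(None) accepts the 1-D feed without complaint. The helper `sigmoid_xent_from_logits` is a hypothetical stand-in for tf.nn.sigmoid_cross_entropy_with_logits, written with the stable formula max(x, 0) - x*z + log(1 + exp(-|x|)), and the all-zero logits are made up for illustration.

```python
import numpy as np

def sigmoid_xent_from_logits(logits, labels):
    """Elementwise sigmoid cross-entropy computed from raw logits.

    Numerically stable form: max(x, 0) - x*z + log(1 + exp(-|x|)).
    """
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

# Stand-in network output with the same shape as pred: (7, 1).
logits = np.zeros((7, 1))

labels_flat = np.array([0, 0, 1, 0, 1, 0, 1], dtype=float)  # shape (7,)
labels_col = labels_flat.reshape(-1, 1)                     # shape (7, 1)

# (7, 1) against (7,) broadcasts to a full (7, 7) grid of losses.
loss_flat = sigmoid_xent_from_logits(logits, labels_flat)
# (7, 1) against (7, 1) keeps one loss per example.
loss_col = sigmoid_xent_from_logits(logits, labels_col)

print(loss_flat.shape)  # (7, 7)
print(loss_col.shape)   # (7, 1)
```

With the flat labels, every prediction is scored against all seven labels, so the mean loss is smallest when each output equals the label mean (3/7) regardless of the input. That would be consistent with the nearly identical outputs in the question, if TensorFlow broadcasts the same way here.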

Here is a working version of your code:

import tensorflow as tf
import numpy as np

x_data = np.array([[0,0,1],
                   [0,1,1],
                   [1,0,0],
                   [0,1,0],
                   [1,1,1],
                   [0,1,1],
                   [1,1,1]])

x_test = np.array([[1,1,1], [0,1,0], [0,0,0]])
y_data = np.array([[0],[0],[1],[0],[1],[0],[1]])


iters = 1000
learning_rate = 0.1
weights = {
    'w1': tf.Variable(tf.random_normal([3, 5])),
    'w2': tf.Variable(tf.random_normal([5, 1])),
}
bias = {
    'b1': tf.Variable(tf.random_normal([5])),
    'b2': tf.Variable(tf.random_normal([1])),
}

def predict(x, weights, bias):
    l1 = tf.add(tf.matmul(x, weights['w1']), bias['b1'])
    l1 = tf.nn.sigmoid(l1)    
    out = tf.add(tf.matmul(l1, weights['w2']), bias['b2'])
    return out


x = tf.placeholder(tf.float32, shape=(None,3))
y = tf.placeholder(tf.float32, shape=(None,1))

pred = predict(x, weights, bias)
pred_postactivation = tf.nn.sigmoid(pred)

cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

init = tf.global_variables_initializer()

# graph
with tf.Session() as sess:
    sess.run(init)

    for i in range(0, iters):
        _, c = sess.run([optimizer, cost], feed_dict={x: x_data, y: y_data})
        if i % 100 == 0:
            print("cost: " + str(c))

    print(sess.run(weights['w1']))
    print(sess.run(pred_postactivation, feed_dict={x: x_test}))

This outputs:

cost: 1.23954
cost: 0.583582
cost: 0.455403
cost: 0.327644
cost: 0.230051
cost: 0.165296
cost: 0.123712
cost: 0.0962315
cost: 0.0772587
cost: 0.0636141
[[ 0.94488049  0.78105074  0.81608331  1.75763154 -4.47565413]
 [-2.61545444  0.26020721  0.151407    1.33066297  1.00578034]
 [-1.2027328   0.05413296 -0.13530347 -0.39841765  0.16014417]]
[[ 0.92521071]
 [ 0.05481482]
 [ 0.07227208]]
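The question expected [1,0,0] for the test rows. One way to get hard labels from the probabilities above is to threshold them, sketched here in NumPy. The 0.5 cutoff is an assumption (a common default for a sigmoid output), not something the original code specifies.

```python
import numpy as np

# Post-sigmoid predictions for x_test, copied from the output above.
probs = np.array([[0.92521071], [0.05481482], [0.07227208]])

# Assumed decision threshold of 0.5: values above it become class 1.
labels = (probs > 0.5).astype(int).ravel()
print(labels)  # [1 0 0]
```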

【Discussion】: