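【Posted】: 2019-10-25 00:42:30
【Question】:
I am new to machine learning and TensorFlow. I am trying to implement this in a simple way, without any particular architecture and without much data transformation, but I am stuck: whatever I do, I get NaN. The code is below; please tell me if I am doing something wrong here.
I got this code from one of my learning courses, where it was implemented for the IRIS data; I am just trying to do the same thing with the Titanic data.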
! kaggle competitions download -c 'titanic'
import pandas as pd
train_data = pd.read_csv("train.csv")
train_data.head(), train_data.columns
n_input = 7   # 7 feature columns
n_output = 2  # one-hot label: index 0 = did not survive, index 1 = survived
import tensorflow as tf
tf.reset_default_graph()
input_shape = [None,n_input]
inputplaceholder = tf.placeholder(dtype=tf.float32, shape=input_shape, name="input_placeholder")
weights = tf.Variable(tf.random_normal([n_input,n_output]), name="weights")
biases = tf.Variable(tf.zeros([n_output]), name="biases")
layer_1 = tf.matmul(inputplaceholder, weights)
layer_2 = tf.add(layer_1, biases)
outputlayer = tf.nn.sigmoid(layer_2)
learning_rate = 0.001
labelsplaceholder = tf.placeholder(dtype=tf.float32, shape=[None,n_output], name="labels_placeholder")
cost = tf.losses.mean_squared_error(labelsplaceholder, outputlayer)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
scaled_data=train_data[[ 'Pclass', 'Sex', 'Age', 'SibSp','Parch', 'Fare', 'Embarked']]
genderMap = {"male":1,"female":2,"":""}
embarkMap = {"C":1,"Q":2,"S":3,"":""}
scaled_data['Sex'].replace(genderMap, inplace=True)
scaled_data['Embarked'].replace(embarkMap, inplace=True)
scaled_data['Sex'] = scaled_data['Sex'].astype('float32')
scaled_data['Pclass'] = scaled_data['Pclass'].astype('float32')
scaled_data['SibSp'] = scaled_data['SibSp'].astype('float32')
scaled_data['Parch'] = scaled_data['Parch'].astype('float32')
scaled_data['Embarked'] = scaled_data['Embarked'].astype('float32')
scaled_data['Age'] = scaled_data['Age'].astype('float32')
scaled_data['Fare'] = scaled_data['Fare'].astype('float32')
import random
mydata = list(zip(scaled_data.values, train_data.Survived))
batch_size = 891
iterations = 400
history_loss = list()
for _ in range(iterations):
    inputdata = list()
    output_data = list()
    for _ in range(batch_size):
        input_output_pairs = random.choice(mydata)
        inputdata.append(input_output_pairs[0])
        output_one_hot = [0,0]
        output_one_hot[input_output_pairs[1]] = 1
        output_data.append(output_one_hot)
    res_optimizer, res_cost = sess.run([optimizer, cost], feed_dict={inputplaceholder: inputdata, labelsplaceholder: output_data})
    print(res_cost)
    history_loss.append(res_cost)
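Running this, I expected to see some cost values, but every value comes out as NaN. I did try lower learning rates, namely 0.00001 and 0.00005, but the result is still the same.
【Comments】: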
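One possible explanation, offered as an assumption and not a confirmed diagnosis: the Kaggle Titanic train.csv has missing values in Age and Embarked, and the code in the question never fills them in. A NaN in any fed row would propagate through the matmul, the sigmoid and the mean squared error, so the cost would be NaN whatever the learning rate. The sketch below illustrates this with NumPy standing in for the TensorFlow graph and a made-up four-row frame; `sample`, `encode` and `mse_after_linear_sigmoid` are hypothetical names, and median/mode imputation is only one of several reasonable choices (dropping the incomplete rows is another).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows of train.csv. The values are made up,
# but Age and Embarked contain gaps, as they do in the real file.
sample = pd.DataFrame({
    "Pclass": [3, 1, 3, 1],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, np.nan, 54.0],
    "SibSp": [1, 1, 0, 0],
    "Parch": [0, 0, 0, 0],
    "Fare": [7.25, 71.2833, 7.925, 51.8625],
    "Embarked": ["S", "C", None, "S"],
})

# Step 1: count missing values per column before feeding anything to the model.
missing_per_column = sample.isna().sum()
print(missing_per_column)


def encode(frame):
    """Map the two text columns to numbers, using the same codes as the question."""
    out = frame.copy()  # work on a copy so the caller's frame is untouched
    out["Sex"] = out["Sex"].map({"male": 1, "female": 2})
    out["Embarked"] = out["Embarked"].map({"C": 1, "Q": 2, "S": 3})
    return out.astype("float32")


def mse_after_linear_sigmoid(features, weights):
    """Linear layer (bias left at zero) + sigmoid, then MSE against all-zero targets."""
    logits = features @ weights
    outputs = 1.0 / (1.0 + np.exp(-logits))
    return float(np.mean(outputs ** 2))


# Small fixed weights (an assumption, chosen only to keep the sigmoid well-behaved).
weights = np.full((7, 2), 0.01, dtype="float32")

# Step 2: with the gaps left in, a single NaN row makes the whole loss NaN.
raw_loss = mse_after_linear_sigmoid(encode(sample).to_numpy(), weights)
print("loss with missing values:", raw_loss)

# Step 3: impute Age with the median and Embarked with the most frequent port.
cleaned = sample.copy()
cleaned["Age"] = cleaned["Age"].fillna(cleaned["Age"].median())
cleaned["Embarked"] = cleaned["Embarked"].fillna(cleaned["Embarked"].mode()[0])

clean_loss = mse_after_linear_sigmoid(encode(cleaned).to_numpy(), weights)
print("loss after imputation:", clean_loss)
```

Applied to the question, the same check would be `train_data[['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked']].isna().sum()` run before the columns are cast to float32. If it reports non-zero counts, filling or dropping those rows before building `mydata` would be the first thing to try.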