【Posted】: 2021-07-03 03:18:59
【Question】:
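I am trying to adapt the standard Boston housing problem to my own dataset. The difference is that my dataset contains negative values and I want to predict negative values in the output.
From what I read on StackOverflow, to predict negative values I should use the tanh activation function at the output layer. I also understand that I should normalize my dataset to the [-1, 1] range.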
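So I have two questions, one for each of my two code variants.
- Is my first code variant correct? I did not find any public dataset with negative numbers to check against, and I do not know how to make sure it works well.
- In the second variant my NN predicts values like "0.9", but my dataset values are like "24". I assume this is because this code has no proper normalization. Please advise how to implement normalization.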
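The gap between "0.9" and "24" in the second question follows from the range of tanh, which can be checked without Keras. The snippet below is only a sketch: the pre-activation values and the target of 24 are made-up numbers chosen to match the magnitudes mentioned above, not values from the actual dataset.

```python
import math

# Hypothetical pre-activations of the output neuron, from very negative to very positive.
pre_activations = [-1000.0, -3.0, 0.0, 3.0, 1000.0]
tanh_outputs = [math.tanh(z) for z in pre_activations]
print(tanh_outputs)

# An unscaled target of about 24 (the magnitude mentioned in the question).
target = 24.0
# tanh never exceeds 1, so this is the smallest squared error a tanh output can reach.
best_squared_error = (target - 1.0) ** 2
print(best_squared_error)
```

However large the pre-activation gets, the output stays within [-1, 1], so a tanh output layer can only fit targets that were scaled into that range first. A linear output (no activation) has no such bound and can also produce negative values, which is one common choice for regression.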
My experience with Keras is limited and my Python is not strong either, so I just tried to assemble the code from different places.
First code:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
#read in training data
train_df = pd.read_csv('train.csv', index_col='ID')
train_df.head()
target = 'medv'
scaler = MinMaxScaler(feature_range=(-1, 1))  # was (0, 1) before
scaled_train = scaler.fit_transform(train_df)
# Print out the adjustment that the scaler applied to the target (medv) column
print("Note: median values were scaled by multiplying by {:.10f} and adding {:.6f}".format(scaler.scale_[13], scaler.min_[13]))
multiplied_by = scaler.scale_[13]
added = scaler.min_[13]
scaled_train_df = pd.DataFrame(scaled_train, columns=train_df.columns.values)
#build our model
model = Sequential()
model.add(Dense(50, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='tanh'))  # there was no activation here before
model.compile(loss='mean_squared_error', optimizer='adam')
X = scaled_train_df.drop(target, axis=1).values
Y = scaled_train_df[[target]].values
# Train the model (the first 10 rows are held back)
model.fit(
    X[10:],
    Y[10:],
    epochs=100,
    shuffle=True,
    verbose=2
)
# Inference on held-back rows
prediction = model.predict(X[:4])
y_0 = prediction[0][0]
print('Prediction with scaling - {}'.format(y_0))
# Undo the MinMaxScaler transform: original = (scaled - min_) / scale_
y_0 -= added
y_0 /= multiplied_by
print("Housing Price Prediction - ${}".format(y_0))
Prediction with scaling - -0.1745799034833908
Housing Price Prediction - $23.571952171623707
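One way to check the scaling and un-scaling steps of the first variant without a public dataset is to run them on a few made-up numbers that include negatives. The sketch below mirrors the documented `MinMaxScaler` relation `scaled = value * scale_ + min_` in plain Python. `fit_minmax` is a hypothetical helper written for this illustration, and the target values are invented.

```python
def fit_minmax(values, feature_range=(-1.0, 1.0)):
    """Return (scale, offset) such that scaled = value * scale + offset."""
    lo, hi = feature_range
    data_min, data_max = min(values), max(values)
    scale = (hi - lo) / (data_max - data_min)
    offset = lo - data_min * scale
    return scale, offset

# Made-up target values that include negatives (not from the question's data).
targets = [-12.5, -3.0, 0.0, 7.25, 24.0]
scale, offset = fit_minmax(targets)

# Forward transform, as the scaler does before training.
scaled = [t * scale + offset for t in targets]
# Inverse transform, the same arithmetic as `y_0 -= added; y_0 /= multiplied_by`.
restored = [(s - offset) / scale for s in scaled]

print(scaled)
print(restored)
```

If the restored values match the originals, the subtract-then-divide step is the right inverse for negative targets too. With a fitted scikit-learn scaler, `scaler.inverse_transform` performs the same operation on a full row.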
Second code variant:
# Regression Example With Boston Dataset: Standardized and Larger
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import numpy
# load dataset
dataframe = read_csv("housing.csv", delim_whitespace=True, header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
# define the model
def larger_model():
    # create model
    model = Sequential()
    model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
    model.add(Dense(6, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation='tanh'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=larger_model, epochs=50, batch_size=5, verbose=1)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold)
print("Larger: %.2f (%.2f) MSE" % (results.mean(), results.std()))
pipeline.fit(X, Y)
#prediction = pipeline.predict(numpy.array([[0.0273, 0., 7.07, 0., 0.469, 6.421, 78.9, 4.9671, 2., 242., 17.8, 396.9, 9.14]]))
prediction = pipeline.predict(numpy.array([[0.7258, 0., 8.14, 0., 0.538, 5.727, 69.5, 3.7965, 4., 307., 21.0, 390.95, 11.28]]))
print(prediction)
Result:
......
......
102/102 [==============================] - 0s 927us/step - loss: 548.0819
Epoch 50/50
102/102 [==============================] - 0s 912us/step - loss: 548.0818
1/1 [==============================] - 0s 0s/step
0.99998754
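In the second variant the `StandardScaler` inside the `Pipeline` is applied only to `X`, so the target stays on its original scale while the tanh output is limited to [-1, 1]. One possible way to scale the target as well is scikit-learn's `TransformedTargetRegressor`, sketched below. This is an illustration under assumptions, not a tested fix for the code above: the data is synthetic, the names `X_demo`, `y_demo`, `feature_pipeline` and `wrapped` are invented, and `LinearRegression` stands in for `KerasRegressor` so the snippet runs without Keras.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic data with negative targets (made up for illustration).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 10.0 * X_demo[:, 0] - 4.0 * X_demo[:, 1] + 2.0

# Features are standardized inside the pipeline, as in the second variant.
# LinearRegression is a stand-in for KerasRegressor (assumption for this sketch).
feature_pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("model", LinearRegression()),
])

# The wrapper scales y to [-1, 1] before fitting and inverts the scaling on predict.
wrapped = TransformedTargetRegressor(
    regressor=feature_pipeline,
    transformer=MinMaxScaler(feature_range=(-1, 1)),
)
wrapped.fit(X_demo, y_demo)

predictions = wrapped.predict(X_demo[:5])
print(predictions)
print(y_demo[:5])
```

Because the wrapper inverts the target scaling inside `predict`, the predictions come back on the original scale, including negative values, and no manual subtract-and-divide step is needed.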
【Comments】:
Tags: tensorflow keras neural-network regression activation-function