How to shape and train multicolumn input and multicolumn output (many to many) with RNN LSTM model in TensorFlow?

【Question title】: How to shape and train multicolumn input and multicolumn output (many to many) with RNN LSTM model in TensorFlow?
【Posted】: 2022-01-01 23:26:33
【Question】:

I have a problem training an LSTM model with multicolumn input and output. My code is below:

time_step = 60

#Create a data structure with n-time steps
X = []
y = []
for i in range(time_step + 1, len(training_set_scaled)):
    X.append(training_set_scaled[i-time_step-1:i-1, 0:len(training_set.columns)]) #take all columns into the set
    y.append(training_set_scaled[i, 0:len(training_set.columns)]) #take all columns into the set
X_train_arr, y_train_arr = np.array(X), np.array(y)
print(X_train_arr.shape) #(2494, 60, 5)
print(y_train_arr.shape) #(2494, 5)

#Split data
X_train_splitted = X_train_arr[:split]
y_train_splitted = y_train_arr[:split]
X_test_splitted = X_train_arr[split:]
y_test_splitted = y_train_arr[split:]

#Initialize the RNN
model = Sequential()

#Add the LSTM layers and some dropout regularization
model.add(LSTM(units= 50, activation = 'relu', return_sequences = True, input_shape = (X_train_arr.shape[1], X_train_arr.shape[2]))) #time_step/columns
model.add(Dropout(0.2))
model.add(LSTM(units= 40, activation = 'relu', return_sequences = True))
model.add(Dropout(0.2))
model.add(LSTM(units= 80, activation = 'relu', return_sequences = True))
model.add(Dropout(0.2))

#Add the output layer.
model.add(Dense(units = 1))

#Compile the RNN
model.compile(optimizer='adam', loss = 'mean_squared_error')

#Fit to the training set
model.fit(X_train_splitted, y_train_splitted, epochs=3, batch_size=32)

The idea is to train the model on the 60 steps leading up to i, with the 5 target columns taken at i:

for i in range(time_step + 1, len(training_set_scaled)):
    X.append(training_set_scaled[i-time_step-1:i-1, 0:len(training_set.columns)]) #take all columns into the set
    y.append(training_set_scaled[i, 0:len(training_set.columns)]) #take all columns into the set

So my X train (input) and y train (targets) are:

X_train_arr, y_train_arr = np.array(X), np.array(y)
print(X_train_arr.shape) #(2494, 60, 5)
print(y_train_arr.shape) #(2494, 5)
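As a sanity check of the windowing above, here is a minimal sketch that runs the same loop on made-up data. The array `data` is a hypothetical stand-in for `training_set_scaled` (200 random rows, 5 columns), so the sample count differs from the 2494 in the question.

```python
import numpy as np

# Made-up stand-in for training_set_scaled: 200 rows, 5 columns.
rng = np.random.default_rng(0)
data = rng.random((200, 5))
time_step = 60

X, y = [], []
for i in range(time_step + 1, len(data)):
    X.append(data[i - time_step - 1:i - 1, :])  # 60 rows: i-61 .. i-2
    y.append(data[i, :])                        # target: row i, all columns
X_arr, y_arr = np.array(X), np.array(y)

print(X_arr.shape)  # (139, 60, 5)
print(y_arr.shape)  # (139, 5)
```

Note that with this slicing the window ends at row i-2 while the target is row i, so row i-1 is not part of that sample's input. If the intent is to predict the row immediately following the window (an assumption about the intent), slicing `i-time_step:i` would do that.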

Unfortunately, when fitting the model:

model.fit(X_train_splitted, y_train_splitted, epochs=3, batch_size=32)

I get an error:

Dimensions must be equal, but are 60 and 5 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](mean_squared_error/remove_squeezable_dimensions/Squeeze, IteratorGetNext:1)' with input shapes: [?,60], [?,5].

I understand that X_train_arr and y_train_arr need to be the same. But when testing with the case below, everything is fine:

X_train_arr, y_train_arr = np.array(X), np.array(y)
print(X_train_arr.shape) #(2494, 60, 5)
print(y_train_arr.shape) #(2494, 1)
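A note on the shapes in the error: the last LSTM layer keeps return_sequences = True and is followed by Dense(units = 1), so the model emits one value per time step, (batch, 60, 1), which is squeezed to (batch, 60) before being compared with the (batch, 5) target. The sketch below reproduces both cases with plain NumPy arrays, on the assumption that the loss follows the same broadcasting rules; the array names are made up for illustration.

```python
import numpy as np

batch = 32
pred = np.zeros((batch, 60))      # one value per time step, as in the error message
target_5 = np.zeros((batch, 5))   # five target columns
target_1 = np.zeros((batch, 1))   # single target column

try:
    (pred - target_5) ** 2
    mismatch_error = None
except ValueError as exc:
    # Trailing dimensions 60 and 5 cannot be broadcast against each other.
    mismatch_error = str(exc)

# A trailing dimension of 1 broadcasts: the single column is compared with all 60 steps.
broadcast_shape = ((pred - target_1) ** 2).shape

print(mismatch_error is not None)  # True
print(broadcast_shape)             # (32, 60)
```

So the (2494, 1) case running without an error is probably broadcasting rather than a sign that the shapes line up. One common option is to set return_sequences = False on the last LSTM layer and use Dense(units = 5), so that the output is (batch, 5).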

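The idea behind having print(y_train_arr.shape) #(2494, 5) is to be able to predict n steps into the future, where each prediction iteration generates a whole new row of data with values for all 5 columns.

【Question comments】:

Tags: python tensorflow many-to-many lstm recurrent-neural-network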

【Solution 1】:

Well, after going through this tutorial, I understood what needed to be done. Below is the final code, with comments:

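#Imports (training_set, training_set_scaled and scaler come from the preprocessing step, which is not shown here)
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Bidirectional, RepeatVector, TimeDistributed

#Variables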
future_prediction = 30
time_step = 60 #learning step
split_percent = 0.80 #train/test data split percent (80%)
split = int(split_percent*len(training_set_scaled)) #split percent multiplying by data rows

#Create a data structure with n-time steps
X = []
y = []
for i in range(time_step + 1, len(training_set_scaled)):
    X.append(training_set_scaled[i-time_step-1:i-1, 0:len(training_set.columns)]) #take all columns into the set, over a window of time_step rows
    y.append(training_set_scaled[i, 0:len(training_set.columns)]) #take all columns into the set

X_train_arr, y_train_arr = np.array(X), np.array(y) #must be numpy array for TF inputs

print(X_train_arr.shape) #(2494, 60, 5) <-- train data, having now 2494 rows, with 60 time steps, each row has 5 features (MANY)
print(y_train_arr.shape) #(2494, 5) <-- target data, having now 2494 rows, with 1 time step, but 5 features (TO MANY)

#Split data
X_train_splitted = X_train_arr[:split] #(80%) model train input data
y_train_splitted = y_train_arr[:split] #(80%) model train target data
X_test_splitted = X_train_arr[split:] #(20%) test prediction input data
y_test_splitted = y_train_arr[split:] #(20%) test prediction compare data

#Reshaping to rows/time_step/columns
X_train_splitted = np.reshape(X_train_splitted, (X_train_splitted.shape[0], X_train_splitted.shape[1], X_train_splitted.shape[2])) #(samples, time-steps, features), already in this shape, so this is a no-op
y_train_splitted = np.reshape(y_train_splitted, (y_train_splitted.shape[0], 1, y_train_splitted.shape[1]))  #(samples, time-steps, features)
X_test_splitted = np.reshape(X_test_splitted, (X_test_splitted.shape[0], X_test_splitted.shape[1], X_test_splitted.shape[2])) #(samples, time-steps, features), already in this shape, so this is a no-op
y_test_splitted = np.reshape(y_test_splitted, (y_test_splitted.shape[0], 1, y_test_splitted.shape[1]))  #(samples, time-steps, features)

print(X_train_splitted.shape) #(2044, 60, 5)
print(y_train_splitted.shape) #(2044, 1, 5)
print(X_test_splitted.shape) #(450, 60, 5)
print(y_test_splitted.shape) #(450, 1, 5)

#Initialize the RNN
model = Sequential()

#Add Bidirectional LSTM, has better performance than stacked LSTM
model.add(Bidirectional(LSTM(100, activation='relu', input_shape = (X_train_splitted.shape[1], X_train_splitted.shape[2])))) #input_shape is (time steps, features) = (60, 5); the number of samples is not part of it
model.add(RepeatVector(5)) #for 5 column of features in output, in other cases used for time_step in output
model.add(Bidirectional(LSTM(100, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(1)))

#Compile the RNN
model.compile(optimizer='adam', loss = 'mean_squared_error')

#Fit to the training set
model.fit(X_train_splitted, y_train_splitted, epochs=3, batch_size=32, validation_split=0.2, verbose=1)

#Test results
y_pred = model.predict(X_test_splitted, verbose=1)
print(y_pred.shape) #(450, 5, 1) - needs to be reshaped to (450, 5) before inverse transforming

#Reshaping data for inverse transforming
y_test_splitted = np.reshape(y_test_splitted, (y_test_splitted.shape[0], 5)) #reshaping (450, 1, 5) to (450, 5)
y_pred = np.reshape(y_pred, (y_pred.shape[0], 5)) #reshaping (450, 5, 1) to (450, 5)

#Reversing transform to get proper data values
y_test_splitted = scaler.inverse_transform(y_test_splitted)
y_pred = scaler.inverse_transform(y_pred)

#Plot data
plt.figure(figsize=(14,5))
plt.plot(y_test_splitted[-time_step:, 3], label = "Real values") #I am only interested in displaying column index 3
plt.plot(y_pred[-time_step:, 3], label = 'Predicted values') #I am only interested in displaying column index 3
plt.title('Prediction test')
plt.xlabel('Time')
plt.ylabel('Column index 3')
plt.legend()
plt.show()
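One thing that may be worth checking in the code above: the targets are reshaped to (samples, 1, 5), while the printed model output is (samples, 5, 1). In NumPy, subtracting arrays of these two shapes broadcasts to (samples, 5, 5) instead of pairing prediction j with target column j. If the loss applies the same broadcasting rules (an assumption, not verified against this model), each output would be compared with all five target columns. A minimal sketch of the effect on made-up arrays, with one way to align the axes:

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
y_true = rng.random((n, 1, 5))   # targets as reshaped above: (samples, 1, features)
y_pred = rng.random((n, 5, 1))   # model output as printed above: (samples, 5, 1)

# Subtracting as-is broadcasts: every prediction meets every target column.
print((y_pred - y_true).shape)   # (4, 5, 5)

# Moving the feature axis last pairs prediction j with target column j.
y_pred_aligned = np.transpose(y_pred, (0, 2, 1))   # (samples, 1, 5)
diff = y_pred_aligned - y_true
print(diff.shape)                # (4, 1, 5)

mse = np.mean(diff ** 2)         # element-wise mean squared error
```

On the model side, the equivalent would be to reshape the targets to (samples, 5, 1) so that they match the output of TimeDistributed(Dense(1)) after RepeatVector(5).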

#todo: future prediction
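
The remaining todo, future prediction, could be approached by feeding each predicted row back into the input window. The sketch below is only an illustration of that loop: `rolling_forecast`, `predict_next` and `mean_model` are hypothetical names, and the toy `mean_model` stands in for the trained network so that the example runs without TensorFlow.

```python
import numpy as np

def rolling_forecast(predict_next, last_window, n_steps):
    """Forecast n_steps rows by feeding each predicted row back into the window.

    predict_next: hypothetical callable mapping a (time_step, features) window
                  to one (features,) row; a stand-in for the trained model.
    """
    window = np.array(last_window, dtype=float)
    rows = []
    for _ in range(n_steps):
        next_row = np.asarray(predict_next(window), dtype=float)
        rows.append(next_row)
        # Drop the oldest row and append the prediction as the newest one.
        window = np.vstack([window[1:], next_row])
    return np.array(rows)

# Toy stand-in model: predict the column-wise mean of the window.
def mean_model(window):
    return window.mean(axis=0)

last_window = np.ones((60, 5))   # made-up final 60 rows, 5 columns
forecast = rolling_forecast(mean_model, last_window, n_steps=30)
print(forecast.shape)            # (30, 5)
```

With the model above, `predict_next` could wrap `model.predict(window[np.newaxis, ...])` and reshape the (1, 5, 1) result to (5,), and `n_steps` would be `future_prediction`. The forecast rows are still in scaled units, so they would go through `scaler.inverse_transform` like the test predictions. Errors tend to accumulate in this scheme, since later steps are predicted from earlier predictions.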

【Comments】: