How to approximate the determinant with keras

【Question title】: How to approximate the determinant with keras
【Posted】: 2017-10-13 16:18:53
【Question】:

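As an experiment, I am building a keras model to approximate the determinant of a matrix. However, when I run it, the training loss goes down at every epoch while the validation loss goes up! For example: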

8s - loss: 7573.9168 - val_loss: 21831.5428
Epoch 21/50
8s - loss: 7345.0197 - val_loss: 23594.8540
Epoch 22/50
13s - loss: 7087.7454 - val_loss: 24718.3967
Epoch 23/50
7s - loss: 6851.8714 - val_loss: 25624.8609
Epoch 24/50
6s - loss: 6637.8168 - val_loss: 26616.7835
Epoch 25/50
7s - loss: 6446.8898 - val_loss: 28856.9654
Epoch 26/50
7s - loss: 6255.7414 - val_loss: 30122.7924
Epoch 27/50
7s - loss: 6054.5280 - val_loss: 32458.5306
Epoch 28/50

The full code is below:

import numpy as np
import sys
from scipy.stats import pearsonr
from scipy.linalg import det
from sklearn.model_selection import train_test_split
from tqdm import tqdm
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import math
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from keras import backend as K

def baseline_model():
        # Create the model: one hidden ReLU layer followed by a linear output unit
        model = Sequential()
        model.add(Dense(200, input_dim=n**2, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1))  # input_dim is only needed on the first layer
        #        model.add(Dense(1, kernel_initializer='normal'))
        # Compile model
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model


n = 15

print("Making the input data using seed 7", file=sys.stderr)
np.random.seed(7)
U = np.random.choice([0, 1], size=(n**2,n))
# U is a random (n**2, n) matrix with 0/1 entries; note that it is not orthogonal
X =[]
Y =[]
# print(U)
for i in tqdm(range(100000)):
        I = np.random.choice(n**2, size = n)
        # Pick out the random rows and sort the rows of the matrix lexicographically.
        A = U[I][np.lexsort(np.rot90(U[I]))] 
        X.append(A.ravel())
        Y.append(det(A))

X = np.array(X)
Y = np.array(Y)

print("Data created")

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=50, batch_size=32, verbose=2)))
pipeline = Pipeline(estimators)
X_train, X_test, y_train, y_test = train_test_split(X, Y,
                                                    train_size=0.75, test_size=0.25)
pipeline.fit(X_train, y_train, mlp__validation_split=0.3)

How can I stop it from overfitting so badly?

Update 1

I tried adding more layers and L_2 regularization. However, it makes almost no difference.

from keras import regularizers

def baseline_model():
        # Create the model
        model = Sequential()
        model.add(Dense(n**2, input_dim=n**2, kernel_initializer='glorot_normal', activation='relu'))
        model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
        model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
        model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
        model.add(Dense(1, kernel_initializer='glorot_normal'))
        # Compile model
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model

I increased the number of epochs to 100, and the run ended like this:

19s - loss: 788.9504 - val_loss: 18423.2807
Epoch 97/100
24s - loss: 760.2046 - val_loss: 18305.9273
Epoch 98/100
20s - loss: 806.0941 - val_loss: 18174.8706
Epoch 99/100
24s - loss: 780.0487 - val_loss: 18356.7482
Epoch 100/100
27s - loss: 749.2595 - val_loss: 18331.5859

Is it possible to approximate the determinant of a matrix using keras?

【Comments】:

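This is not overfitting; your model does not fit the data. The model is too simple.
@MatiasValdenegro The reason I call it overfitting is that the training loss keeps dropping toward 0 while the validation loss keeps going up. Increasing the number of nodes in the hidden layer does not help at all. What would you try next?
Increase the number of hidden layers. Use glorot initialization for the hidden layers. Use dropout or an l2 regularizer.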
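One way to put the validation numbers in this thread into context is to compare them with the mean squared error of a constant predictor that always outputs the mean determinant, which equals the variance of the targets. Since the `StandardScaler` in the pipeline only rescales `X`, the reported losses are in raw determinant units, so this comparison is meaningful. The sketch below is only a stand-in for the question's data: it uses the standard library instead of numpy, a hypothetical helper `int_det`, a smaller sample of 1000 matrices, and Python's own list ordering for the lexicographic sort, so its numbers will not match the original run exactly.

```python
import random
from statistics import pvariance


def int_det(matrix):
    """Exact determinant of an integer matrix (Bareiss fraction-free elimination)."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a nonzero entry in column k.
            swap = next((r for r in range(k + 1, n) if a[r][k] != 0), None)
            if swap is None:
                return 0  # the whole column is zero, so the matrix is singular
            a[k], a[swap] = a[swap], a[k]
            sign = -sign  # a row swap flips the sign of the determinant
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division is exact for integer matrices.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


rng = random.Random(7)
n = 15
# Pool of n**2 random 0/1 rows, mirroring U in the question.
pool = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n * n)]

targets = []
for _ in range(1000):
    # Draw n rows with replacement and sort them lexicographically.
    rows = sorted(rng.choice(pool) for _ in range(n))
    targets.append(int_det(rows))

# MSE of always predicting the mean determinant.
baseline_mse = pvariance(targets)
print("baseline MSE of a constant predictor:", baseline_mse)
```

If the validation loss stays at or above this baseline, the network generalizes no better than a constant, no matter how low the training loss gets.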

Tags: python matrix machine-learning tensorflow keras

【Answer 1】:

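I tested your code and got the same result. But let's take a closer look at the matrix determinant (DET). DET is a sum of n! products, so you cannot approximate it with n*n weights in a few layers of a neural network. That would require a number of weights that does not scale to n=15, since 15! means 1307674368000 multiplication terms in the DET.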
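To make the term count concrete, here is a minimal sketch of the permutation expansion (the Leibniz formula) that this answer refers to. The function names `permutation_sign` and `leibniz_det` are illustrative, not part of any library, and the loop is only practical for very small n.

```python
import math
from itertools import permutations


def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(matrix):
    """Determinant as a signed sum over all n! permutations of the columns."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        product = 1
        for row, col in enumerate(perm):
            product *= matrix[row][col]
        total += permutation_sign(perm) * product
    return total


small = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(leibniz_det(small))   # sums 3! = 6 signed products
print(math.factorial(15))   # number of signed products for n = 15
```

Each term multiplies n entries and carries a weight of +1 or -1, which is why the expansion is easy to state and expensive to evaluate directly.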

【Comments】:

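This is not clear to me. DET can of course be computed in n^3 time (not n!). Also, if you just run the keras model for a few hundred epochs, the loss on the training set drops to nearly 0.
In fact, it is a well-defined formula with only +1 and -1 as weights, but it involves multiplying a large number of inputs. I am not sure this is a suitable problem for a simple neural network.
@eleanora You are confusing the number of terms with the computational complexity.
@denfromufa My point is that you never actually compute those n! products.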
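The n^3 remark in this thread can be illustrated with Gaussian elimination, which reduces the matrix to triangular form and multiplies the pivots without ever enumerating the n! products. This is a minimal sketch rather than what `scipy.linalg.det` does internally; the name `elimination_det` is illustrative, and exact `Fraction` arithmetic is an assumption chosen so that the 0/1 matrices from the question give exact integer results.

```python
from fractions import Fraction


def elimination_det(matrix):
    """Determinant via Gaussian elimination, using O(n^3) arithmetic operations."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]  # exact working copy
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot_row = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # no pivot available: the matrix is singular
        if pivot_row != col:
            a[col], a[pivot_row] = a[pivot_row], a[col]
            det = -det  # a row swap flips the sign
        pivot = a[col][col]
        det *= pivot
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = a[r][col] / pivot
            if factor:
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
    return det


print(elimination_det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```

The three nested loops give the cubic cost, so a 15 x 15 matrix needs a few thousand arithmetic operations instead of 15! products.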