PyTorch forward pass using weights trained by Theano: answers

【Question title】: PyTorch forward pass using weights trained by Theano
【Posted】: 2018-09-01 23:44:45
【Question】:

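I trained a small CNN binary classifier in Theano. To simplify the code, I want to port the trained weights to a PyTorch or NumPy forward pass for prediction. The original Theano program's predictions are satisfactory, but the PyTorch forward pass assigns every example to the same class.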

Here is how I saved the trained weights in Theano using h5py:

layer0_w = layer0.W.get_value(borrow=True)
layer0_b = layer0.b.get_value(borrow=True)
layer1_w = layer1.W.get_value(borrow=True)
layer1_b = layer1.b.get_value(borrow=True)
layer2_w = layer2.W.get_value(borrow=True)
layer2_b = layer2.b.get_value(borrow=True)
sm_w = layer_softmax.W.get_value(borrow=True)
sm_b = layer_softmax.b.get_value(borrow=True)

h5_l0w = h5py.File('./model/layer0_w.h5', 'w')
h5_l0w.create_dataset('layer0_w', data=layer0_w)
h5_l0b = h5py.File('./model/layer0_b.h5', 'w')
h5_l0b.create_dataset('layer0_b', data=layer0_b)
h5_l1w = h5py.File('./model/layer1_w.h5', 'w')
h5_l1w.create_dataset('layer1_w', data=layer1_w)
h5_l1b = h5py.File('./model/layer1_b.h5', 'w')
h5_l1b.create_dataset('layer1_b', data=layer1_b)
h5_l2w = h5py.File('./model/layer2_w.h5', 'w')
h5_l2w.create_dataset('layer2_w', data=layer2_w)
h5_l2b = h5py.File('./model/layer2_b.h5', 'w')
h5_l2b.create_dataset('layer2_b', data=layer2_b)
h5_smw = h5py.File('./model/softmax_w.h5', 'w')
h5_smw.create_dataset('softmax_w', data=sm_w)
h5_smb = h5py.File('./model/softmax_b.h5', 'w')
h5_smb.create_dataset('softmax_b', data=sm_b)

I then load the weights and build the forward pass with PyTorch and NumPy:

import torch
import numpy as np
import torch.nn.functional as F
def model(data):

    conv0_out = F.conv2d(input=np2var(data),
                         weight=np2var(layer0_w),
                         bias=np2var(layer0_b)
                        )
    layer0_out = relu(var2np(conv0_out))

    conv1_out = F.conv2d(input=np2var(layer0_out),
                         weight=np2var(layer1_w),
                         bias=np2var(layer1_b)
                        )
    layer1_out = np.max(relu(var2np(conv1_out)), axis=2)

    dense_out=relu(np.matmul(layer1_out, layer2_w) + layer2_b)

    softmax_out = softmax(np.matmul(dense_out, softmax_w) + softmax_b)

    return softmax_out

def relu(x):
    return x * (x > 0)
def np2var(x):
    return torch.autograd.Variable(torch.from_numpy(x))
def var2np(x):
    return x.data.numpy()
def softmax(x):
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

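The input and kernel shapes expected by the conv2d function are the same in Theano and PyTorch, and the network structure is identical in both frameworks. Stepping through the code, I could not find any error. What could be going wrong here?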

【Comments on the question】:

Tags: python theano conv-neural-network pytorch

【Answer 1】:

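Theano uses true convolutions (filter_flip=True by default), whereas PyTorch uses cross-correlation. So for each convolutional layer, you need to flip the weights along both spatial axes (rows and columns) before using them in PyTorch.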
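To see why the flip matters, here is a minimal 1-D sketch in NumPy (the arrays `x` and `k` are made-up illustration values, not the asker's data). Convolution and cross-correlation of the same signal and kernel give different results unless the kernel is reversed first:

```python
import numpy as np

# Made-up signal and an asymmetric kernel, purely for illustration.
x = np.array([1, 2, 4, 8])
k = np.array([1, 2, 0])

# True convolution: the kernel is reversed before sliding (Theano's default).
conv = np.convolve(x, k, mode="valid")

# Cross-correlation: the kernel slides as-is (what PyTorch's conv2d does).
xcorr = np.correlate(x, k, mode="valid")

# Cross-correlating with the reversed kernel reproduces the convolution.
xcorr_flipped = np.correlate(x, k[::-1], mode="valid")

print(conv, xcorr, xcorr_flipped)
```

The same relationship holds in 2-D, where the kernel has to be reversed along both rows and columns.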

Keras's convert_kernel function performs this flip for you.
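If you would rather not depend on Keras, the flip can probably be done directly in NumPy. The following is a sketch only: `flip_conv_kernel` is a hypothetical helper name, and it assumes the kernels are stored in Theano's (output channels, input channels, rows, columns) layout.

```python
import numpy as np


def flip_conv_kernel(w):
    """Reverse a 4-D conv kernel along its two spatial axes.

    Assumption: w has the layout (out_channels, in_channels, rows, cols).
    """
    w = np.asarray(w)
    if w.ndim != 4:
        raise ValueError("expected a 4-D kernel, got shape %r" % (w.shape,))
    # Reverse rows and columns. The slice is a view with negative strides,
    # which torch.from_numpy rejects, so return a contiguous copy.
    return np.ascontiguousarray(w[:, :, ::-1, ::-1])


# Tiny stand-in kernel; the real weights would come from the saved h5 files.
example_w = np.arange(2 * 3 * 2 * 2, dtype=np.float32).reshape(2, 3, 2, 2)
example_flipped = flip_conv_kernel(example_w)
print(example_w[0, 0])
print(example_flipped[0, 0])
```

With the question's code, this would mean passing `flip_conv_kernel(layer0_w)` and `flip_conv_kernel(layer1_w)` to `np2var` instead of the raw arrays. Only the convolution kernels are flipped; the biases and dense-layer weights stay as they are.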

【Comments】: