【Question title】: Accessing reduced dimensionality of trained autoencoder
【Posted】: 2023-03-16 23:05:02
【Question】:

Here is an autoencoder trained on MNIST using PyTorch:

import torch
import torchvision
import torch.nn as nn
from torch.autograd import Variable

cuda = torch.cuda.is_available() # True if cuda is available, False otherwise
FloatTensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor
print('Training on %s' % ('GPU' if cuda else 'CPU'))

# Loading the MNIST data set
transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor(),
                torchvision.transforms.Normalize((0.1307,), (0.3081,))])
mnist = torchvision.datasets.MNIST(root='../data/', train=True, transform=transform, download=True)

# Loader to feed the data batch by batch during training.
batch = 100
data_loader = torch.utils.data.DataLoader(mnist, batch_size=batch, shuffle=True)

autoencoder = nn.Sequential(
                # Encoder
                nn.Linear(28 * 28, 512),
                nn.PReLU(512),
                nn.BatchNorm1d(512),

                # Low-dimensional representation
                nn.Linear(512, 128),   
                nn.PReLU(128),
                nn.BatchNorm1d(128),

                # Decoder
                nn.Linear(128, 512),
                nn.PReLU(512),
                nn.BatchNorm1d(512),
                nn.Linear(512, 28 * 28))

autoencoder = autoencoder.type(FloatTensor)

optimizer = torch.optim.Adam(params=autoencoder.parameters(), lr=0.005)

epochs = 10
data_size = len(mnist)  # number of training images

for i in range(epochs):
    for j, (images, _) in enumerate(data_loader):
        images = images.view(images.size(0), -1) # from (batch, 1, 28, 28) to (batch, 784)
        images = Variable(images).type(FloatTensor)

        autoencoder.zero_grad()
        reconstructions = autoencoder(images)
        loss = torch.dist(images, reconstructions)
        loss.backward()
        optimizer.step()
    print('Epoch %i/%i loss %.2f' % (i + 1, epochs, loss.item()))  # loss of the last batch

print('Optimization finished.')

I am trying to compare the low-dimensional representation of each image.

Printing the dimensions of each layer:

for l in autoencoder.parameters():
    print(l.shape)

displays:

torch.Size([512, 784])
torch.Size([512])
torch.Size([512])
torch.Size([512])
torch.Size([512])
torch.Size([128, 512])
torch.Size([128])
torch.Size([128])
torch.Size([128])
torch.Size([128])
torch.Size([512, 128])
torch.Size([512])
torch.Size([512])
torch.Size([512])
torch.Size([512])
torch.Size([784, 512])
torch.Size([784])

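It seems the reduced dimensionality is not stored in the learned vectors?

In other words, if I have 10000 images, each containing 100 pixels, shouldn't running the above autoencoder to reduce the dimensionality to 10 pixels give access to the 10-pixel representation of all 10000 images?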

【Comments】:

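  • You are reducing the 28x28 images to a 128-dimensional space. Then you print out tensors of size 128. So I don't understand why you say the encoded images are not in the printed tensors?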
  • @user2653663 Do I have control
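  • @user2653663 For a single image, how do I access its reduced-dimension representation? Since the encoder and decoder have the same dimensions, do I need to access the weights of the hidden layer to get the reduced dimensionality?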
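One way to get the reduced representation of a single image from the single nn.Sequential model in the question, without restructuring it, is to slice off the first six layers and call them on the image. The sketch below is illustrative only: the model is rebuilt untrained, a random tensor stands in for a flattened MNIST image, and the names encoder_part, image and code are made up for this example.

```python
import torch
import torch.nn as nn

# Same layer layout as the autoencoder in the question (untrained here).
autoencoder = nn.Sequential(
    nn.Linear(28 * 28, 512), nn.PReLU(512), nn.BatchNorm1d(512),
    nn.Linear(512, 128), nn.PReLU(128), nn.BatchNorm1d(128),  # bottleneck
    nn.Linear(128, 512), nn.PReLU(512), nn.BatchNorm1d(512),
    nn.Linear(512, 28 * 28),
)

# Slicing an nn.Sequential returns a new nn.Sequential that reuses the
# same layer objects, so trained weights would carry over.
encoder_part = autoencoder[:6]

# In training mode BatchNorm1d needs more than one sample per batch;
# eval() switches it to its running statistics so a single image works.
encoder_part.eval()

image = torch.randn(1, 28 * 28)  # assumption: stand-in for one flattened image
with torch.no_grad():
    code = encoder_part(image)

print(code.shape)  # one 128-value vector for the one input image
```

The reduced representation is the value returned by the call, not something read out of the layer weights.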

Tags: machine-learning neural-network pytorch autoencoder dimensionality-reduction


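【Answer 1】:

I'm not very familiar with PyTorch, but splitting the autoencoder into an encoder model and a decoder model seems to work (I changed the hidden layer size from 512 to 64 and the dimension of the encoded image from 128 to 4 so that the example runs faster):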

import torch
import torchvision
import torch.nn as nn
from torch.autograd import Variable

cuda = torch.cuda.is_available() # True if cuda is available, False otherwise
FloatTensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor
print('Training on %s' % ('GPU' if cuda else 'CPU'))

# Loading the MNIST data set
transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor(),
                torchvision.transforms.Normalize((0.1307,), (0.3081,))])
mnist = torchvision.datasets.MNIST(root='../data/', train=True, transform=transform, download=True)

# Loader to feed the data batch by batch during training.
batch = 100
data_loader = torch.utils.data.DataLoader(mnist, batch_size=batch, shuffle=True)


encoder = nn.Sequential(
                # Encoder
                nn.Linear(28 * 28, 64),
                nn.PReLU(64),
                nn.BatchNorm1d(64),

                # Low-dimensional representation
                nn.Linear(64, 4),
                nn.PReLU(4),
                nn.BatchNorm1d(4))

decoder = nn.Sequential(
                # Decoder
                nn.Linear(4, 64),
                nn.PReLU(64),
                nn.BatchNorm1d(64),
                nn.Linear(64, 28 * 28))

autoencoder = nn.Sequential(encoder, decoder)

encoder = encoder.type(FloatTensor)
decoder = decoder.type(FloatTensor)
autoencoder = autoencoder.type(FloatTensor)

optimizer = torch.optim.Adam(params=autoencoder.parameters(), lr=0.005)

epochs = 10
data_size = len(mnist)  # number of training images

for i in range(epochs):
    for j, (images, _) in enumerate(data_loader):
        images = images.view(images.size(0), -1) # from (batch, 1, 28, 28) to (batch, 784)
        images = Variable(images).type(FloatTensor)

        autoencoder.zero_grad()
        reconstructions = autoencoder(images)
        loss = torch.dist(images, reconstructions)
        loss.backward()
        optimizer.step()
    print('Epoch %i/%i loss %.2f' % (i + 1, epochs, loss.item()))  # loss of the last batch

print('Optimization finished.')

# Get the encoded images here
encoded_images = []
for j, (images, _) in enumerate(data_loader):
    images = images.view(images.size(0), -1) 
    images = Variable(images).type(FloatTensor)

    encoded_images.append(encoder(images))
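The collection loop above could be tightened in a few ways, sketched below under stated assumptions: the encoder is put in eval mode so BatchNorm1d uses its running statistics, gradients are switched off, the loader does not shuffle so row i of the result corresponds to dataset item i, and the per-batch outputs are concatenated into one tensor. Random tensors stand in for MNIST so the block runs on its own; fake_images, fake_labels, eval_loader and chunks are hypothetical names.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Same encoder layout as in the answer (untrained here).
encoder = nn.Sequential(
    nn.Linear(28 * 28, 64), nn.PReLU(64), nn.BatchNorm1d(64),
    nn.Linear(64, 4), nn.PReLU(4), nn.BatchNorm1d(4),
)

# Assumption: 250 random "images" with dummy labels instead of MNIST.
fake_images = torch.randn(250, 1, 28, 28)
fake_labels = torch.zeros(250, dtype=torch.long)
eval_loader = DataLoader(TensorDataset(fake_images, fake_labels),
                         batch_size=100, shuffle=False)

encoder.eval()  # BatchNorm1d uses running statistics instead of batch statistics
chunks = []
with torch.no_grad():  # no autograd graph is kept for the encodings
    for images, _ in eval_loader:
        images = images.view(images.size(0), -1)  # (batch, 1, 28, 28) -> (batch, 784)
        chunks.append(encoder(images))

# One row per image, in dataset order because shuffle=False.
encoded_images = torch.cat(chunks, dim=0)
print(encoded_images.shape)
```

With the real data, the same loop would run over a second DataLoader built on mnist with shuffle=False, after training has finished.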

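【Comments】:

  • Thanks. Where in your code do you specify access to the layer weights? I understand that the encodings are stored in the list 'encoded_images', but what determines which hidden weights are stored here? Is it because the last layer in the encoder is the "low-dimensional representation", so that when encoder(images) is executed, nn.Linear(64, 4) is what gets stored?
  • I'm not sure I understand your question. In the encoder, the image (of size 28^2) is first transformed into a 64-dimensional representation. That 64-dimensional representation is then transformed into 4 dimensions (the encoded image). nn.Linear(64, 4) contains the weights needed to transform 64 float values into 4 float values. So the encoder object contains all of the network structure and the trained weights needed to transform an image into a 4-d encoded image.
  • I'm trying to understand how encoder(images) stores the reduced feature encoding. Looking at the last line of code in the encoder, nn.BatchNorm1d(4)): is nn.BatchNorm1d(4) what encoder(images) calls to store the encoding?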
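The distinction discussed in these comments can be checked directly: the layers hold fixed-size weights, while the encoding is simply the tensor returned by calling the encoder and is not stored in any layer. A minimal sketch, assuming an untrained copy of the answer's encoder and random inputs (weights_before, codes_a and codes_b are names made up for this example):

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Linear(28 * 28, 64), nn.PReLU(64), nn.BatchNorm1d(64),
    nn.Linear(64, 4), nn.PReLU(4), nn.BatchNorm1d(4),
)
encoder.eval()

# Parameters have a fixed size that does not depend on the number of images.
# nn.Linear stores its weight as (out_features, in_features).
print(encoder[3].weight.shape)
weights_before = encoder[3].weight.detach().clone()

with torch.no_grad():
    codes_a = encoder(torch.randn(10, 28 * 28))  # 10 stand-in images
    codes_b = encoder(torch.randn(3, 28 * 28))   # 3 stand-in images

# The encodings are the return values: one 4-value row per input image.
print(codes_a.shape, codes_b.shape)

# A forward pass in eval mode leaves the weights untouched.
weights_unchanged = torch.equal(weights_before, encoder[3].weight)
print(weights_unchanged)
```

This is why iterating over parameters() only ever shows weight and bias shapes: the per-image encodings exist only as outputs of a forward pass and have to be kept by the caller, as the encoded_images list does.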