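【Question title】: PyTorch will not fit straight line to two data points
【Posted】: 2019-04-08 00:50:03
【Question description】:

I'm having trouble fitting a simple line, y = 4*x1, to 2 data points using PyTorch. When I run the inference code, the model outputs the same value for every input, which is strange. The code is below, along with the data files I used. Any help is appreciated.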

import torch
import numpy as np
import pandas as pd

df = pd.read_csv('data.csv')
test_data = pd.read_csv('test_data.csv')

inputs = df[['x1']]
target = df['y']
inputs = torch.tensor(inputs.values).float()
target = torch.tensor(target.values).float()

test_data = torch.tensor(test_data.values).float()
#Defining Network Architecture
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):

  def __init__(self):
    super(Net,self).__init__()

    hidden1 = 3
#     hidden2 = 5 

    self.fc1 = nn.Linear(1,hidden1)
    self.fc3 = nn.Linear(hidden1,1)


  def forward(self,x):
    x = F.relu(self.fc1(x))
    x = self.fc3(x)
    return x

#instantiate the model

model = Net()
print(model)

criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(),lr=0.01)

model.train()

#epochs
epochs = 100


for x in range(epochs):
  #initialize the training loss to 0
  train_loss = 0
  #clear out gradients
  optimizer.zero_grad() 

  #calculate the output
  output = model(inputs)

  #calculate loss
  loss = criterion(output,target)

  #backpropagate
  loss.backward() 

  #update parameters
  optimizer.step()

  if ((x%5)==0):
    print('Training Loss after epoch {:2d} is {:2.6f}'.format(x,loss))

#set the model in evaluation mode
model.eval()

#Test the model on unseen data

test_output = model(test_data)

print(test_output)

Below is the model output:

#model output
tensor([[56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579],
        [56.7579]], grad_fn=<AddmmBackward>)

【Question discussion】:

    Tags: linear-regression pytorch


    【Solution 1】:

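    Your model is collapsing. You can probably see that from the printed training losses. You may want to use a lower learning rate (1e-5, 1e-6, etc.). If you don't have much experience and want less trouble tuning these hyperparameters, switching from SGD(...) to Adam(...) may be easier. Also, 100 epochs may not be enough. Since you did not share an MCVE (minimal, complete, verifiable example), I cannot tell for sure which of these it is. Here is an MCVE of line fitting that uses the same Net you used: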

    import torch
    import numpy as np
    import torch.nn as nn
    import torch.nn.functional as F
    
    epochs = 1000
    max_range = 40
    interval = 4
    
    # DATA
    x_train = torch.arange(0, max_range, interval).view(-1, 1).float()
    x_train += torch.rand(x_train.size(0), 1) - 0.5  # small noise
    y_train = (4 * x_train) 
    y_train += torch.rand(x_train.size(0), 1) - 0.5  # small noise
    
    x_test  = torch.arange(interval // 2, max_range, interval).view(-1, 1).float()
    y_test  = 4 * x_test
    
    class Net(nn.Module):
      def __init__(self):
        super(Net, self).__init__()
        hidden1 = 3
        self.fc1 = nn.Linear(1, hidden1)
        self.fc3 = nn.Linear(hidden1, 1)
    
      def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc3(x)
        return x
    
    model = Net()
    print(model)
    
    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)
    
    # TRAIN
    model.train()
    for epoch in range(epochs):
      optimizer.zero_grad()
      y_pred = model(x_train)
      loss = criterion(y_pred, y_train)
      loss.backward()
      optimizer.step()
    
      if epoch % 10 == 0:
        print('Training Loss after epoch {:2d} is {:2.6f}'.format(epoch, loss))
    
    # TEST
    model.eval()
    y_pred = model(x_test)
    print(torch.cat((x_test, y_pred, y_test), dim=-1))
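
A side note on tensor shapes, offered as a possibility rather than a diagnosis: in the MCVE above, `y_train` has shape (N, 1), the same as the model output. In the question's code, `target` is built from `df['y'].values`, which is one-dimensional, so it has shape (N,) while `model(inputs)` has shape (N, 1). Subtracting those two shapes broadcasts to (N, N), and `nn.MSELoss` warns about such a size mismatch in the PyTorch versions I know of. The sketch below uses made-up values to show the effect with plain tensor arithmetic; the variable names are illustrative only.

```python
import torch

# Hypothetical values: predictions that match the targets exactly.
output = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), like model(inputs)
target = torch.tensor([1.0, 2.0, 3.0])        # shape (3,), like torch.tensor(df['y'].values)

diff = output - target  # broadcasting compares every prediction with every target
print(diff.shape)       # shape is (3, 3), not (3, 1)

# Mean squared difference with and without the shape mismatch.
mismatched = (diff ** 2).mean()
matched = ((output - target.view(-1, 1)) ** 2).mean()  # target reshaped to (3, 1)
print(mismatched.item(), matched.item())
```

With the mismatch, the averaged squared difference is nonzero even though every prediction equals its own target. Reshaping the target with `target.view(-1, 1)` (or `unsqueeze(1)`) keeps the comparison one-to-one.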
    

    This is what the data looks like:

    [figure: plot of the training data]

    This is what training looks like:

    Training Loss after epoch  0 is 7416.805664
    Training Loss after epoch 10 is 6645.655273
    Training Loss after epoch 20 is 5792.936523
    Training Loss after epoch 30 is 4700.106445
    Training Loss after epoch 40 is 3245.384277
    Training Loss after epoch 50 is 1779.370728
    Training Loss after epoch 60 is 747.418579
    Training Loss after epoch 70 is 246.781311
    Training Loss after epoch 80 is 68.635155
    Training Loss after epoch 90 is 17.332235
    Training Loss after epoch 100 is 4.280161
    Training Loss after epoch 110 is 1.170808
    Training Loss after epoch 120 is 0.453974
    ...
    Training Loss after epoch 970 is 0.232296
    Training Loss after epoch 980 is 0.232090
    Training Loss after epoch 990 is 0.231888
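
To see why the step size matters so much at this input scale, here is a minimal pure-Python sketch. It is a simplification, not the `Net` above: it fits a single slope `w` in `y_hat = w * x` by plain gradient descent on the same noise-free grid (x = 0, 4, ..., 36). The function name `fit_slope` and its parameters are made up for illustration. For this toy model each update multiplies the error `w - 4` by `1 - 2 * lr * mean(x^2)`, and `mean(x^2)` is 456 here, so `lr = 1e-2` gives a factor of about -8.12 (the error grows each step), while `lr = 1e-5` gives a factor of about 0.991 (the error shrinks slowly).

```python
def fit_slope(lr, steps=1000):
    """Gradient descent on the MSE of a one-parameter model y_hat = w * x."""
    xs = [float(v) for v in range(0, 40, 4)]  # same grid as the MCVE, without noise
    ys = [4.0 * x for x in xs]                # targets from y = 4x
    w = 0.0
    for _ in range(steps):
        # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        if abs(w) > 1e12:  # stop early once the iterates have clearly blown up
            break
    return w

w_small = fit_slope(lr=1e-5)  # approaches the true slope 4
w_large = fit_slope(lr=1e-2)  # overshoots further on every step
print(w_small)
print(w_large)
```

The real network has more parameters and a ReLU, so its behaviour differs in detail, but the same scale argument suggests why a learning rate that is fine for small inputs can blow up when the inputs are in the tens.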
    

    This is what the output looks like:

    |  x_test |  y_pred  |  y_test  |
    |:-------:|:--------:|:--------:|
    |  2.0000 |   8.6135 |   8.0000 |
    |  6.0000 |  24.5276 |  24.0000 |
    | 10.0000 |  40.4418 |  40.0000 |
    | 14.0000 |  56.3303 |  56.0000 |
    | 18.0000 |  72.1884 |  72.0000 |
    | 22.0000 |  88.0465 |  88.0000 |
    | 26.0000 | 103.9047 | 104.0000 |
    | 30.0000 | 119.7628 | 120.0000 |
    | 34.0000 | 135.6210 | 136.0000 |
    | 38.0000 | 151.4791 | 152.0000 |
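
For the `Adam(...)` alternative mentioned at the start of this answer, a minimal sketch might look like the following. It reuses the same `Net` layout and a noise-free version of the training grid; the learning rate `1e-2`, the fixed seed, and the `losses` list are assumptions made for illustration, not values from the original post, and results will vary with initialization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed only so the sketch is repeatable

# Noise-free version of the MCVE's training data
x_train = torch.arange(0, 40, 4).view(-1, 1).float()
y_train = 4 * x_train

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(1, 3)
        self.fc3 = nn.Linear(3, 1)

    def forward(self, x):
        return self.fc3(F.relu(self.fc1(x)))

model = Net()
criterion = nn.MSELoss()
# Assumed starting learning rate; Adam rescales each parameter's step size
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
model.train()
for epoch in range(1000):
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(losses[0], losses[-1])  # first and last training loss
```

Because Adam normalizes each update by a running estimate of the gradient magnitude, it tends to be less sensitive to the scale of the inputs than plain SGD, which is why it can need less manual tuning here.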
    

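    【Discussion】:

    • Thanks @Berriel. The learning rate was the problem; once I corrected it, the issue seems to be resolved. Sorry for not sharing the data for the MCVE earlier.
    • @RajkumarKaliyaperumal Good to know! No worries, but keep in mind that when you provide an MCVE, there will always be more people to help you :)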