Question

Can someone help me understand why the weights are not updating?

    import torch

    unet = Unet()
    optimizer = torch.optim.Adam(unet.parameters(), lr=0.001)
    loss_fn = torch.nn.MSELoss()
    # Variable is deprecated since PyTorch 0.4; plain tensors carry autograd state.
    input = torch.randn(32, 1, 64, 64, 64, requires_grad=True)
    target = torch.randn(32, 1, 64, 64, 64)

    optimizer.zero_grad()
    y_pred = unet(input)
    y = target[:, :, 20:44, 20:44, 20:44]  # crop the target to the 24^3 output size

    loss = loss_fn(y_pred, y)
    print(unet.conv1.weight.data[0][0])  # weights of the first layer in the unet
    loss.backward()
    optimizer.step()
    print(unet.conv1.weight.data[0][0])  # weights haven't changed

The model is defined as follows:

    import torch.nn as nn
    import torch.nn.functional as F

    class Unet(nn.Module):

        def __init__(self):
            super(Unet, self).__init__()

            # Down hill1
            self.conv1 = nn.Conv3d(1, 2, kernel_size=3, stride=1)
            self.conv2 = nn.Conv3d(2, 2, kernel_size=3, stride=1)

            # Down hill2
            self.conv3 = nn.Conv3d(2, 4, kernel_size=3, stride=1)
            self.conv4 = nn.Conv3d(4, 4, kernel_size=3, stride=1)

            # Bottom
            self.convbottom1 = nn.Conv3d(4, 8, kernel_size=3, stride=1)
            self.convbottom2 = nn.Conv3d(8, 8, kernel_size=3, stride=1)

            # Up hill1
            self.upConv0 = nn.Conv3d(8, 4, kernel_size=3, stride=1)
            self.upConv1 = nn.Conv3d(4, 4, kernel_size=3, stride=1)
            self.upConv2 = nn.Conv3d(4, 2, kernel_size=3, stride=1)

            # Up hill2
            self.upConv3 = nn.Conv3d(2, 2, kernel_size=3, stride=1)
            self.upConv4 = nn.Conv3d(2, 1, kernel_size=1, stride=1)

            self.mp = nn.MaxPool3d(kernel_size=3, stride=2, padding=1)
            # some more irrelevant properties...

The forward function is as follows:

    # Method of the Unet class above.
    def forward(self, input):
        # Use U-net Theory to Update the filters.
        # Example Approach...
        input = F.relu(self.conv1(input))
        input = F.relu(self.conv2(input))

        input = self.mp(input)

        input = F.relu(self.conv3(input))
        input = F.relu(self.conv4(input))

        input = self.mp(input)

        input = F.relu(self.convbottom1(input))
        input = F.relu(self.convbottom2(input))

        input = F.interpolate(input, scale_factor=2, mode='trilinear')

        input = F.relu(self.upConv0(input))
        input = F.relu(self.upConv1(input))

        input = F.interpolate(input, scale_factor=2, mode='trilinear')

        input = F.relu(self.upConv2(input))
        input = F.relu(self.upConv3(input))

        input = F.relu(self.upConv4(input))

        return input

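I have followed the approach of every example and piece of documentation I could find, and I cannot understand why it does not work.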

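What I could figure out is that, after the backward call, y_pred.grad is None, which it should not be. If there is no gradient, then of course the optimizer cannot change the weights in any direction, but why is there no gradient?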
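A note on the check described above: `y_pred` is a non-leaf tensor, and PyTorch does not keep `.grad` on non-leaf tensors unless `retain_grad()` is called, so `None` there does not by itself show that gradients are missing. The gradients the optimizer uses live on the parameters. A minimal sketch of inspecting them, assuming a small stand-in network (`tiny` is hypothetical and is not the Unet from the question):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model; the real Unet could be inspected the same way.
tiny = nn.Sequential(
    nn.Conv3d(1, 2, kernel_size=3),
    nn.ReLU(),
    nn.Conv3d(2, 1, kernel_size=1),
)
x = torch.randn(2, 1, 8, 8, 8)
target = torch.randn(2, 1, 6, 6, 6)  # 8 - (3 - 1) = 6 after the 3x3x3 convolution

y_pred = tiny(x)
y_pred.retain_grad()  # opt in to keeping the gradient of a non-leaf tensor
loss = nn.functional.mse_loss(y_pred, target)
loss.backward()

# Sum of absolute gradient values per parameter; all zeros would mean no update.
grad_norms = {name: p.grad.abs().sum().item() for name, p in tiny.named_parameters()}
for name, g in grad_norms.items():
    print(name, g)
```

If every entry printed here is zero for the real model, the optimizer has nothing to apply, which matches the symptom in the question.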

Answer 1

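I identified this problem as the "dying ReLU problem". Because the data is in Hounsfield units and PyTorch draws the initial weights from a uniform distribution, many neurons start out in ReLU's zero region, which leaves them paralyzed and dependent on other neurons to produce a gradient that could pull them out of the zero region. As training progresses, all the neurons get pushed into ReLU's zero region.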
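The effect can be reproduced in isolation. The sketch below is an illustration under assumed values, not the asker's setup: the weights are set by hand to 0.1 and the input is a constant -1000 (roughly the Hounsfield value of air), so every pre-activation is negative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One conv filter with hand-set weights so the outcome is deterministic.
conv = nn.Conv3d(1, 1, kernel_size=3)
with torch.no_grad():
    conv.weight.fill_(0.1)  # assumed positive weights
    conv.bias.fill_(0.0)

# Strongly negative input, similar to air in a CT volume.
x = torch.full((1, 1, 5, 5, 5), -1000.0)
out = F.relu(conv(x))  # every pre-activation is 27 * 0.1 * -1000 = -2700
out.sum().backward()

max_out = out.abs().max().item()                # 0.0: the filter is silent
max_grad = conv.weight.grad.abs().max().item()  # 0.0: no gradient, so no update
print(max_out, max_grad)
```

With a zero gradient on the weights, `optimizer.step()` leaves them unchanged, no matter how many iterations are run.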

There are several solutions to this problem. You can use leaky_relu or another activation function that has no zero region.
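To see why an activation without a zero region helps, compare the gradients of `F.relu` and `F.leaky_relu` on the same inputs. The input values and the `negative_slope` of 0.01 are illustrative choices:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 1.0], requires_grad=True)

# ReLU: the gradient is 0 for negative inputs.
F.relu(x).sum().backward()
relu_grad = x.grad.clone()

# Leaky ReLU: the gradient equals negative_slope for negative inputs.
x.grad.zero_()
F.leaky_relu(x, negative_slope=0.01).sum().backward()
leaky_grad = x.grad.clone()

print(relu_grad)   # zeros for the two negative inputs, 1 for the positive one
print(leaky_grad)  # 0.01 for the two negative inputs, 1 for the positive one
```

In the model from the question, this would amount to swapping the `F.relu` calls in `forward` for `F.leaky_relu`.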

You can also normalize the input data, for example with batch normalization, and initialize the weights to be positive only.
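A rough sketch of that combination follows. The layer sizes, the uniform range (0, 0.1) for the weights, and the synthetic input are all assumptions made for illustration; the answer does not specify them.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

block = nn.Sequential(
    nn.BatchNorm3d(1),               # normalizes the raw input per channel
    nn.Conv3d(1, 2, kernel_size=3),
    nn.ReLU(),
)

# Assumed scheme: uniform values in [0, 0.1), so no weight is negative.
with torch.no_grad():
    nn.init.uniform_(block[1].weight, a=0.0, b=0.1)
    block[1].bias.zero_()

# Synthetic stand-in for a batch of volumes with large, mostly negative values.
x = torch.randn(4, 1, 8, 8, 8) * 500.0 - 500.0
out = block(x)

# Fraction of ReLU outputs that are active (non-zero).
active_fraction = (out > 0).float().mean().item()
print(active_fraction)
```

Because the input is centered by the normalization layer before it reaches the convolution, a good share of the units end up on the active side of the ReLU instead of all starting at zero.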

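The second solution is probably the better one: both solve the problem, but leaky_relu prolongs training, whereas batch normalization does the opposite and also improves accuracy. On the other hand, leaky_relu is an easy fix, while the other solution requires a little extra work.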

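For Hounsfield data, you can also add a constant of 1000 to the input to eliminate the negative units from the data. This still requires a weight initialization different from PyTorch's standard initialization.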
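The offset itself is a one-line preprocessing step. The sample values below are hypothetical Hounsfield readings chosen for illustration (air is about -1000, water is 0):

```python
import torch

# Hypothetical CT samples in Hounsfield units.
volume_hu = torch.tensor([-1000.0, -500.0, 0.0, 400.0])

# Shift by the constant suggested in the answer so that air maps to 0.
shifted = volume_hu + 1000.0
print(shifted)  # values are now 0, 500, 1000, 1400
```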

Answer 2

I don't think the weights should be printed with the command you used. Try a different way of inspecting them instead of print(unet.conv1.weight.data[0][0]).
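One possible way to inspect the weights, offered as an assumption and not necessarily what this answer had in mind, is to read them from the layer's `state_dict()` and compare a cloned snapshot taken before the step with the values after it. The `layer` below is a hypothetical stand-in for `unet.conv1`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Conv3d(1, 2, kernel_size=3)  # hypothetical stand-in for unet.conv1
optimizer = torch.optim.Adam(layer.parameters(), lr=0.001)

# clone() matters: state_dict tensors share storage with the live parameters.
before = layer.state_dict()["weight"].clone()

x = torch.randn(2, 1, 8, 8, 8)
loss = layer(x).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

after = layer.state_dict()["weight"]
max_change = (after - before).abs().max().item()
print(max_change)  # greater than 0 when the step changed the weights
```

Comparing a snapshot numerically avoids relying on printed values, which can look identical when the change per step is on the order of the learning rate.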

Pytorch: "Model weights not changing"
