How to specify different layer sizes in a PyTorch LSTM / GRU / RNN

Time: 2020-09-21 16:18:51

Tags: machine-learning pytorch lstm recurrent-neural-network

I know how to use LSTMs with PyTorch in general. What bothers me is that you can only specify a single hidden_size for all layers of the LSTM, like this:

lstm = nn.LSTM(input_size=26, hidden_size=128, num_layers=3, dropout=dropout_chance, batch_first=True)

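So all three layers will have size 128. Is there really no way to say that the first layer should be 128, the second 32, and the third 128? If I missed something in the documentation, or you know a workaround, please let me know. Thanks!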

1 Answer:

Answer 0: (score: 0)

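Actually, this depends on the shape of your input; see How to decide input and hidden layer dimension to torch.nn.RNN?. You also need to understand what the inputs and outputs are, because there are different ways of handling them. A Beginner’s Guide on Recurrent Neural Networks with PyTorch shows how input data is fed to the model. (Figure omitted: image taken from Andrej Karpathy’s blog post.) Your model could be: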

# One single-layer LSTM per size; each input_size must equal the previous hidden_size.
# The dropout argument is omitted: nn.LSTM only applies it between layers inside one module.
lstm = nn.LSTM(input_size=26, hidden_size=128, num_layers=1, batch_first=True)
lstm2 = nn.LSTM(input_size=128, hidden_size=32, num_layers=1, batch_first=True)
lstm3 = nn.LSTM(input_size=32, hidden_size=128, num_layers=1, batch_first=True)
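
One way to get per-layer sizes of 128, 32, and 128 is to chain single-layer modules so that each input_size equals the previous hidden_size. The following is only a sketch under that assumption: the class name StackedLSTM and the default dropout_chance=0.2 are illustrative choices, not taken from the question.

```python
import torch
import torch.nn as nn


class StackedLSTM(nn.Module):
    """Three single-layer LSTMs with hidden sizes 128 -> 32 -> 128 (hypothetical name)."""

    def __init__(self, input_size=26, dropout_chance=0.2):
        super().__init__()
        self.lstm1 = nn.LSTM(input_size=input_size, hidden_size=128, batch_first=True)
        self.lstm2 = nn.LSTM(input_size=128, hidden_size=32, batch_first=True)
        self.lstm3 = nn.LSTM(input_size=32, hidden_size=128, batch_first=True)
        # nn.LSTM's own dropout argument only acts between layers inside one module,
        # so dropout between the separate modules is applied explicitly here.
        self.dropout = nn.Dropout(dropout_chance)

    def forward(self, x):
        # x: [batch_size, seq_len, input_size]
        out, _ = self.lstm1(x)
        out, _ = self.lstm2(self.dropout(out))
        out, (h_n, c_n) = self.lstm3(self.dropout(out))
        return out, (h_n, c_n)


model = StackedLSTM()
x = torch.randn(4, 10, 26)  # 4 sequences, 10 time steps, 26 features
out, (h_n, c_n) = model(x)
print(out.shape)  # [batch_size, seq_len, 128]
print(h_n.shape)  # [1, batch_size, 128]
```

Only the final hidden and cell states are returned here; if the intermediate states matter, they can be collected instead of being discarded with `_`.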

For a multi-layer example, see this model:

import torch.nn as nn

# sequence classification model
class M1(nn.Module):
    def __init__(self):
        super(M1, self).__init__()

        # batch_first=True makes inputs and outputs [batch_size, seq_len, features],
        # which is what the indexing in forward() assumes.
        # Each nn.LSTM stacks 5 layers of one size: 5 x 100, then 5 x 200, then 5 x 300.
        self.recurrent_layer  = nn.LSTM(hidden_size = 100, input_size = 75, num_layers = 5, batch_first = True)
        self.recurrent_layer1  = nn.LSTM(hidden_size = 200, input_size = 100, num_layers = 5, batch_first = True)
        self.recurrent_layer2  = nn.LSTM(hidden_size = 300, input_size = 200, num_layers = 5, batch_first = True)
        self.project_layer     = nn.Linear(300, 200)
        self.project_layer1    = nn.Linear(200, 100)
        self.project_layer2    = nn.Linear(100, 10)

    # the size of input is [batch_size, seq_len(15), input_dim(75)]
    # the size of logits is [batch_size, num_class]
    # h_t_1 and c_t_1 are accepted but not used; every LSTM starts from zero states
    def forward(self, input, h_t_1=None, c_t_1=None):
        # the size of rnn_outputs is [batch_size, seq_len, rnn_size]
        # self.recurrent_layer.flatten_parameters()
        rnn_outputs, (hn, cn) = self.recurrent_layer(input)
        rnn_outputs, (hn, cn) = self.recurrent_layer1(rnn_outputs)
        rnn_outputs, (hn, cn) = self.recurrent_layer2(rnn_outputs)
        # classify the last time step of rnn_outputs
        # the size of logits is [batch_size, num_class]
        logits = self.project_layer(rnn_outputs[:,-1])
        logits = self.project_layer1(logits)
        logits = self.project_layer2(logits)
        return logits
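
The same pattern should carry over to GRU and plain RNN cells, since nn.LSTM, nn.GRU, and nn.RNN share the input_size, hidden_size, and batch_first arguments. Below is a rough sketch that builds the stack from a list of sizes; VariableSizeRNN, hidden_sizes, and cell are hypothetical names introduced for illustration, not part of PyTorch.

```python
import torch
import torch.nn as nn


class VariableSizeRNN(nn.Module):
    """Stack of single-layer recurrent modules, one per entry in hidden_sizes."""

    def __init__(self, input_size, hidden_sizes, cell=nn.LSTM):
        super().__init__()
        sizes = [input_size] + list(hidden_sizes)
        # Layer i maps sizes[i] features to sizes[i + 1] features.
        self.layers = nn.ModuleList([
            cell(input_size=n_in, hidden_size=n_out, batch_first=True)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])
        ])

    def forward(self, x):
        # x: [batch_size, seq_len, input_size]
        for layer in self.layers:
            # The second return value is h_n (GRU/RNN) or (h_n, c_n) (LSTM); it is ignored.
            x, _ = layer(x)
        return x  # [batch_size, seq_len, hidden_sizes[-1]]


x = torch.randn(4, 15, 75)
for cell in (nn.LSTM, nn.GRU, nn.RNN):
    model = VariableSizeRNN(75, [100, 200, 300], cell=cell)
    print(cell.__name__, model(x).shape)
```

Using nn.ModuleList rather than a plain Python list matters here: it registers each layer so that its parameters appear in model.parameters() and are moved by model.to(device).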