PyTorch - Make a custom DataLoader run forever

Posted: 2019-04-16 00:33:10

Tags: pytorch

I need the data loader to run forever. Right now, it always terminates once it reaches 10000000 (or whatever maximum integer size I set). How can I make it run indefinitely? I don't care about the index; I'm not using it. I'm only using this class for its worker functionality.
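For concreteness, here is a hypothetical minimal sketch of the kind of setup described above (the class name, the sample shape, the batch size, and the worker count are assumptions, not the asker's actual code):

import torch
from torch.utils.data import Dataset, DataLoader

class StreamDataset(Dataset):  # hypothetical name
    def __len__(self):
        # Reporting a huge length does not make the loader infinite:
        # iteration still stops after this many samples.
        return 10000000

    def __getitem__(self, index):
        # index is ignored; every call produces a fresh sample
        return torch.randn(8)

loader = DataLoader(StreamDataset(), batch_size=4, num_workers=2)
for batch in loader:
    pass  # terminates after 10000000 // 4 batches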

1 Answer:

Answer 0 (score: 0)

Since you need to iterate over the same batches multiple times during training, the following code skeleton should work for you.

import torch

def train(args, data_loader):
    for idx, ex in enumerate(data_loader):
        # iterate over each mini-batch
        # add your training code here
        pass

def validate(args, data_loader):
    with torch.no_grad():
        for idx, ex in enumerate(data_loader):
            # iterate over each mini-batch
            # add your validation code here
            pass

# args = an object (e.g., argparse.Namespace) holding the required parameters
for epoch in range(start_epoch, args.num_epochs):
    # train_loader = data loader for the training data
    train(args, train_loader)
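The epoch loop above runs for a fixed number of epochs. If you want training to run indefinitely, as the question asks, one common pattern (not part of the original answer) is to replace the bounded range with an unbounded counter; each epoch re-iterates the DataLoader from scratch, so the stream of mini-batches never ends:

import itertools

# Loop over epochs forever; every pass restarts the DataLoader,
# so the sampler reshuffles and batches keep coming indefinitely.
for epoch in itertools.count(start_epoch):
    train(args, train_loader)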

You can set up the data loader as follows.

import torch
from torch.utils.data import Dataset

class ReaderDataset(Dataset):
    def __init__(self, examples):
        # examples = a list of examples
        self.examples = examples

    def __len__(self):
        # return the total dataset size
        return len(self.examples)

    def __getitem__(self, index):
        # return a single example; mini-batches are
        # assembled later by the collate function
        return self.examples[index]

train_dataset = ReaderDataset(train_examples)
train_sampler = torch.utils.data.sampler.RandomSampler(train_dataset)
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=args.batch_size,
    sampler=train_sampler,
    num_workers=args.data_workers,
    collate_fn=batchify,  # batchify is a custom function that prepares the mini-batches
    pin_memory=args.cuda,
    drop_last=args.parallel,
)
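The collate function batchify is left undefined in the answer. As an illustration only, here is one way it could look, assuming each example returned by __getitem__ is a (feature_tensor, label) pair (that structure is an assumption, not part of the original answer):

import torch

def batchify(batch):
    # batch = list of items returned by ReaderDataset.__getitem__;
    # assumed here to be (feature_tensor, label) pairs
    features = torch.stack([ex[0] for ex in batch])
    labels = torch.tensor([ex[1] for ex in batch])
    return features, labels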