Parallel processing in R with the preProcess function does not work

Time: 2019-03-14 19:27:43

Tags: r parallel-processing

I have a large piece of code containing a for loop that takes several hours to run, and by the end my PC freezes. To improve it, I used foreach together with parallel processing to reduce the run time. My code looks like this:

    library(doParallel)
    cores=detectCores()
    cl <- makeCluster(cores[1]-1)
    registerDoParallel(cl)

    foreach (i=1:3)%dopar%{
    {some R code with i index}
    preProc <- preProcess(method="bagImpute", train[, 1:lengthvar])
    train[, 1:lengthvar] <- predict(preProc, train[, 1:lengthvar])
    test[, 1:lengthvar] <- predict(preProc, test[, 1:lengthvar])
    }
    stopCluster(cl)

The error I get is that the preProcess function (used to impute missing values) cannot be found. My other concern is that using parallel computing seems to increase the run time.

Thanks in advance for your valuable guidance.

1 answer:

Answer 0: (score: 1)

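You need to add the .packages argument to the foreach call, so that caret (which provides preProcess) is loaded on each worker: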

    library(doParallel)
    cores <- detectCores()
    cl <- makeCluster(cores[1] - 1)  # leave one core free
    registerDoParallel(cl)

    # .packages = "caret" loads caret on every worker so preProcess() is found
    foreach(i = 1:3, .packages = "caret") %dopar% {
      # {some R code with i index}
      preProc <- preProcess(train[, 1:lengthvar], method = "bagImpute")
      train[, 1:lengthvar] <- predict(preProc, train[, 1:lengthvar])
      test[, 1:lengthvar] <- predict(preProc, test[, 1:lengthvar])
    }
    stopCluster(cl)
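
One more thing to watch: with %dopar% the loop body runs in separate worker processes, so assignments to train and test inside the loop do not change the copies in your main session. What comes back is the value of the last expression of each iteration. Below is a minimal sketch of one way to collect the imputed data. The toy data frames, the missing-value positions, the fixed worker count of 2, and the names results, train_imputed and test_imputed are all assumptions made for illustration, not taken from your code.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical toy data standing in for the asker's `train` and `test`.
set.seed(1)
train <- data.frame(a = rnorm(40), b = rnorm(40), c = rnorm(40))
test  <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
train$a[c(3, 17)] <- NA
test$b[5] <- NA
lengthvar <- 3

cl <- makeCluster(2)  # assumption: two workers, to keep the sketch light
registerDoParallel(cl)

# Each iteration returns a list; foreach gathers these into `results`.
results <- foreach(i = 1:3, .packages = "caret") %dopar% {
  preProc <- preProcess(train[, 1:lengthvar], method = "bagImpute")
  list(
    train = predict(preProc, train[, 1:lengthvar]),
    test  = predict(preProc, test[, 1:lengthvar])
  )
}

stopCluster(cl)

# Pull the imputed data out of the first iteration's result.
train_imputed <- results[[1]]$train
test_imputed  <- results[[1]]$test
```

As for the run time, starting the workers and copying train and test to each of them has a cost of its own, so for only three iterations the parallel version may well be slower than a plain loop unless each iteration does a lot of work.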