Speeding up WebClient calls inside a foreach loop

Date: 2016-06-27 15:31:47

Tags: c# asp.net parallel-processing task-parallel-library

I am working on an ASP.NET MVC 5 web application, and I make sequential WebClient() calls to a third-party application as follows:

public async Task<List<Technology>> GetResource(int? filtertype)
{
    try
    {
        using (WebClient wc = new WebClient())
        {
            string url = currentURL + "resources?AUTHTOKEN=" + token;
            var json = await wc.DownloadStringTaskAsync(url);
            resourcesinfo = JsonConvert.DeserializeObject<ResourcesInfo>(json);
        }

        // for each resource get its tag + add the tag to the list
        foreach (var c in resourcesinfo.operation.Details)
        {
            ResourceAccountListInfo resourceAccountListInfo = new ResourceAccountListInfo();
            using (WebClient wc = new WebClient())
            {
                string url = currentURL + "resources/" + c.RESOURCEID + "?AUTHTOKEN=" + token;
                string tempurl = url.Trim();
                var json = await wc.DownloadStringTaskAsync(tempurl);
                resourceAccountListInfo = JsonConvert.DeserializeObject<ResourceAccountListInfo>(json);
                AllTags.Add(resourceAccountListInfo.SingleOrDefault().CUSTOMFIELDVALUE.ToLower());
            }
        }
    }
    // ... (catch block and return statement are not shown in the post)
}

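Currently the first WebClient call returns about 1,500 records, so the second WebClient call, inside the foreach loop, is executed about 1,500 times, one request after another. As a result the whole process takes around 20 minutes to complete. How can I speed this up?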

1 Answer:

Answer 0 (score: 2)

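You need some way to throttle the calls to DownloadStringTaskAsync. You can do this manually with a semaphore and Task.Run, or you can use the TPL Dataflow library: feed in all the URLs and specify the degree of parallelism you want. Dataflow blocks accept async delegates (unlike Parallel.For).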

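// Requires the TPL Dataflow NuGet package (System.Threading.Tasks.Dataflow).
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

private static async Task<Thing[]> ProcessAllUrls(string[] urls)
{
    // process up to 20 URLs concurrently; the block awaits the async delegate
    var workBlock = new TransformBlock<string, Thing>(
        async url => await DownloadAndProcessUrl(url),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20 }
        );

    var outputBlock = new BufferBlock<Thing>();

    using (workBlock.LinkTo(outputBlock, new DataflowLinkOptions { PropagateCompletion = true }))
    {
        foreach (var url in urls)
        {
            workBlock.Post(url);
        }

        // signal no more input going into workblock
        workBlock.Complete();

        // wait for workblock to pump all data into outputblock
        await workBlock.Completion;

        // TryReceiveAll returns false when the buffer is empty (e.g. no URLs were posted)
        IList<Thing> finalResult;
        if (!outputBlock.TryReceiveAll(out finalResult))
        {
            return new Thing[0];
        }
        return finalResult.ToArray();
    }
}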
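The snippet above leaves DownloadAndProcessUrl and Thing undefined. A minimal sketch of what they might look like for the question's per-resource request, assuming Json.NET (which the question already uses). Thing with a single CUSTOMFIELDVALUE property and the ResourceDownloader class name are placeholders, since the post does not show the real response shape. Each call creates its own WebClient because one instance cannot run several operations at the same time.

```csharp
using System.Net;
using System.Threading.Tasks;
using Newtonsoft.Json;

// Hypothetical result type: the post does not show the real JSON shape.
public class Thing
{
    public string CUSTOMFIELDVALUE { get; set; }
}

public static class ResourceDownloader
{
    // Pure parsing step, kept separate so it can be exercised without a network call.
    public static Thing Parse(string json)
    {
        return JsonConvert.DeserializeObject<Thing>(json);
    }

    // Downloads one URL and deserializes the response body.
    public static async Task<Thing> DownloadAndProcessUrl(string url)
    {
        // One WebClient per call: an instance does not support concurrent operations.
        using (var wc = new WebClient())
        {
            string json = await wc.DownloadStringTaskAsync(url.Trim());
            return Parse(json);
        }
    }
}
```

To call it unqualified, as the snippet above does, the method would have to live in the same class as ProcessAllUrls; otherwise qualify it as ResourceDownloader.DownloadAndProcessUrl.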

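You do want to be careful about running parallel operations inside a web server process. While the WebClient calls are truly asynchronous with respect to the CPU, the work you do to deserialize each response runs on thread-pool threads, which means it competes with ASP.NET for the CPU resources needed to serve requests while it is running.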
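For completeness, the manual route mentioned at the start of this answer (a semaphore) might look roughly like the following. This is a sketch, not the answer author's code. Throttle and ForEachAsync are made-up names, and it relies on SemaphoreSlim.WaitAsync alone, without Task.Run, on the assumption that the work being throttled is already asynchronous. Keeping maxParallel modest also caps how many responses can be deserialized at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs `work` for every input with at most `maxParallel` operations in flight.
    // Results come back in the same order as the inputs.
    public static async Task<TResult[]> ForEachAsync<TInput, TResult>(
        IEnumerable<TInput> inputs,
        int maxParallel,
        Func<TInput, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            // ToList() starts every task now; each one waits at the gate for its turn.
            var tasks = inputs.Select(async input =>
            {
                await gate.WaitAsync();
                try
                {
                    return await work(input);
                }
                finally
                {
                    // Free the slot even if the work threw.
                    gate.Release();
                }
            }).ToList();

            return await Task.WhenAll(tasks);
        }
    }
}
```

Usage would be along the lines of `var things = await Throttle.ForEachAsync(urls, 20, DownloadAndProcessUrl);`, where 20 mirrors the MaxDegreeOfParallelism used above and is a starting point to tune, not a recommendation.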
