Question

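I had to write a console application that called Microsoft Dynamics CRM web services to perform an action on over eight thousand CRM objects. The details of the web service call are irrelevant and are not shown here, but I needed a multi-threaded client so that I could make calls in parallel. I wanted to be able to control the number of threads from a config setting, and I also wanted the application to cancel the whole operation if the number of service errors reached a config-defined threshold.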

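I wrote it using the Task Parallel Library's Task.Run and ContinueWith, keeping track of how many calls (threads) were in progress, how many errors we had received, and whether the user had cancelled from the keyboard. Everything worked fine, and I had extensive logging to assure myself that threads were finishing cleanly and that everything was tidy at the end of the run. I could see that the program was using the maximum number of threads in parallel and, once our maximum limit was reached, waiting until a running task completed before starting another one.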

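During my code review, my colleague suggested that it would be better to use async/await instead of tasks and continuations, so I created a branch and rewrote it that way. The results were interesting - the async/await version was almost twice as slow, and it never reached the maximum number of allowed parallel operations/threads. The TPL version always got to 10 threads in parallel, whereas the async/await version never got beyond 5.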

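My question is: have I made a mistake in the way I have written the async/await code (or even the TPL code)? If I have not coded it wrong, can you explain why async/await is less efficient, and does that mean it is better to carry on using TPL for multi-threaded code?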

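Note that the code I tested with did not actually call CRM - the CrmClient class simply thread-sleeps for a duration specified in the config (five seconds) and then throws an exception. This means there were no external variables that could affect the performance.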

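For the purposes of this question I created a stripped-down program that combines both versions; which one is called is determined by a config setting. Each of them starts with a bootstrap runner that sets up the environment, creates the queue class, and then uses a TaskCompletionSource to wait for completion. A CancellationTokenSource is used to signal a cancellation from the user. The list of ids to process is read from an embedded file and pushed onto a ConcurrentQueue. Both versions start off by calling StartCrmRequest as many times as max-threads; subsequently, every time a result is processed, the ProcessResult method calls StartCrmRequest again, and this keeps going until all of the ids are processed.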

You can clone/download the complete program from here: https://bitbucket.org/kentrob/pmgfixso/

Here is the relevant configuration:

<appSettings>
    <add key="TellUserAfterNCalls" value="5"/>
    <add key="CrmErrorsBeforeQuitting" value="20"/>
    <add key="MaxThreads" value="10"/>
    <add key="CallIntervalMsecs" value="5000"/>
    <add key="UseAsyncAwait" value="True" />
</appSettings>

Starting with the TPL version, here is the bootstrap runner that kicks off the queue manager:

public static class TplRunner
{
    private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();

    public static void StartQueue(RuntimeParameters parameters, IEnumerable<string> idList)
    {
        Console.CancelKeyPress += (s, args) =>
        {
            CancelCrmClient();
            args.Cancel = true;
        };

        var start = DateTime.Now;
        Program.TellUser("Start: " + start);

        var taskCompletionSource = new TplQueue(parameters)
            .Start(CancellationTokenSource.Token, idList);

        while (!taskCompletionSource.Task.IsCompleted)
        {
            if (Console.KeyAvailable)
            {
                if (Console.ReadKey().Key != ConsoleKey.Q) continue;
                Console.WriteLine("When all threads are complete, press any key to continue.");
                CancelCrmClient();
            }
        }

        var end = DateTime.Now;
        Program.TellUser("End: {0}. Elapsed = {1} secs.", end, (end - start).TotalSeconds);
    }

    private static void CancelCrmClient()
    {
        CancellationTokenSource.Cancel();
        Console.WriteLine("Cancelling Crm client. Web service calls in operation will have to run to completion.");
    }
}

And here is the TPL queue manager itself:

public class TplQueue
{
    private readonly RuntimeParameters parameters;
    private readonly object locker = new object();
    private ConcurrentQueue<string> idQueue = new ConcurrentQueue<string>();
    private readonly CrmClient crmClient;
    private readonly TaskCompletionSource<bool> taskCompletionSource = new TaskCompletionSource<bool>();
    private int threadCount;
    private int crmErrorCount;
    private int processedCount;
    private CancellationToken cancelToken;

    public TplQueue(RuntimeParameters parameters)
    {
        this.parameters = parameters;
        crmClient = new CrmClient();
    }

    public TaskCompletionSource<bool> Start(CancellationToken cancellationToken, IEnumerable<string> ids)
    {
        cancelToken = cancellationToken;

        foreach (var id in ids)
        {
            idQueue.Enqueue(id);
        }

        threadCount = 0;

        // Prime our thread pump with max threads.
        for (var i = 0; i < parameters.MaxThreads; i++)
        {
            Task.Run((Action) StartCrmRequest, cancellationToken);
        }

        return taskCompletionSource;
    }

    private void StartCrmRequest()
    {
        if (taskCompletionSource.Task.IsCompleted)
        {
            return;
        }

        if (cancelToken.IsCancellationRequested)
        {
            Program.TellUser("Crm client cancelling...");
            ClearQueue();
            return;
        }

        var count = GetThreadCount();

        if (count >= parameters.MaxThreads)
        {
            return;
        }

        string id;
        if (!idQueue.TryDequeue(out id)) return;

        IncrementThreadCount();
        crmClient.CompleteActivityAsync(new Guid(id), parameters.CallIntervalMsecs).ContinueWith(ProcessResult);

        processedCount += 1;
        if (parameters.TellUserAfterNCalls > 0 && processedCount%parameters.TellUserAfterNCalls == 0)
        {
            ShowProgress(processedCount);
        }
    }

    private void ProcessResult(Task<CrmResultMessage> response)
    {
        if (response.Result.CrmResult == CrmResult.Failed && ++crmErrorCount == parameters.CrmErrorsBeforeQuitting)
        {
            Program.TellUser(
                "Quitting because CRM error count is equal to {0}. Already queued web service calls will have to run to completion.",
                crmErrorCount);
            ClearQueue();
        }

        var count = DecrementThreadCount();

        if (idQueue.Count == 0 && count == 0)
        {
            taskCompletionSource.SetResult(true);
        }
        else
        {
            StartCrmRequest();
        }
    }

    private int GetThreadCount()
    {
        lock (locker)
        {
            return threadCount;
        }
    }

    private void IncrementThreadCount()
    {
        lock (locker)
        {
            threadCount = threadCount + 1;
        }
    }

    private int DecrementThreadCount()
    {
        lock (locker)
        {
            threadCount = threadCount - 1;
            return threadCount;
        }
    }

    private void ClearQueue()
    {
        idQueue = new ConcurrentQueue<string>();
    }

    private static void ShowProgress(int processedCount)
    {
        Program.TellUser("{0} activities processed.", processedCount);
    }
}

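Please note that I am aware that a couple of the counters are not thread safe, but they are not critical; the threadCount variable is the only critical one.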
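For readers who want the in-flight counter to be safe without a lock, the following is a minimal sketch of one possible approach using Interlocked. It is not part of the program above: the SlotCounter class and its member names are hypothetical. The point it illustrates is that the "check the count, then increment" pair is done as a single atomic step, whereas GetThreadCount and IncrementThreadCount above each take the lock separately.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not in the original program): a bounded in-flight counter.
public sealed class SlotCounter
{
    private readonly int max;
    private int inUse;

    public SlotCounter(int max) { this.max = max; }

    // Atomically claims a slot if one is free; returns false when the limit is reached.
    public bool TryAcquire()
    {
        while (true)
        {
            var current = Volatile.Read(ref inUse);
            if (current >= max) return false;
            // Only succeeds if no other thread changed inUse since we read it.
            if (Interlocked.CompareExchange(ref inUse, current + 1, current) == current)
            {
                return true;
            }
        }
    }

    // Frees a slot and returns the number still in use.
    public int Release()
    {
        return Interlocked.Decrement(ref inUse);
    }

    public int InUse
    {
        get { return Volatile.Read(ref inUse); }
    }
}

public static class SlotCounterDemo
{
    // Tries to acquire a slot from many threads at once and counts the successes.
    public static int CountSuccessfulAcquires(int max, int attempts)
    {
        var counter = new SlotCounter(max);
        var successes = 0;
        Parallel.For(0, attempts, _ =>
        {
            if (counter.TryAcquire()) Interlocked.Increment(ref successes);
        });
        return successes;
    }

    public static void Main()
    {
        Console.WriteLine("Slots granted: " + CountSuccessfulAcquires(10, 100));
    }
}
```

Because nothing is released in the demo, the number of slots granted can never exceed the configured maximum, no matter how many threads race for them.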

Here is the dummy CRM client method:

public Task<CrmResultMessage> CompleteActivityAsync(Guid activityId, int callIntervalMsecs)
{
    // Here we would normally call a CRM web service.
    return Task.Run(() =>
    {
        try
        {
            if (callIntervalMsecs > 0)
            {
                Thread.Sleep(callIntervalMsecs);
            }
            throw new ApplicationException("Crm web service not available at the moment.");
        }
        catch
        {
            return new CrmResultMessage(activityId, CrmResult.Failed);
        }
    });
}

And here are the same classes written with async/await (common methods removed for the sake of brevity):

public static class AsyncRunner
{
    private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();

    public static void StartQueue(RuntimeParameters parameters, IEnumerable<string> idList)
    {
        var start = DateTime.Now;
        Program.TellUser("Start: " + start);

        var taskCompletionSource = new AsyncQueue(parameters)
            .StartAsync(CancellationTokenSource.Token, idList).Result;

        while (!taskCompletionSource.Task.IsCompleted)
        {
            ...
        }

        var end = DateTime.Now;
        Program.TellUser("End: {0}. Elapsed = {1} secs.", end, (end - start).TotalSeconds);
    }
}

The async/await queue manager:

public class AsyncQueue
{
    private readonly RuntimeParameters parameters;
    private readonly object locker = new object();
    private readonly CrmClient crmClient;
    private readonly TaskCompletionSource<bool> taskCompletionSource = new TaskCompletionSource<bool>();
    private CancellationToken cancelToken;
    private ConcurrentQueue<string> idQueue = new ConcurrentQueue<string>();
    private int threadCount;
    private int crmErrorCount;
    private int processedCount;

    public AsyncQueue(RuntimeParameters parameters)
    {
        this.parameters = parameters;
        crmClient = new CrmClient();
    }

    public async Task<TaskCompletionSource<bool>> StartAsync(CancellationToken cancellationToken,
        IEnumerable<string> ids)
    {
        cancelToken = cancellationToken;

        foreach (var id in ids)
        {
            idQueue.Enqueue(id);
        }
        threadCount = 0;

        // Prime our thread pump with max threads.
        for (var i = 0; i < parameters.MaxThreads; i++)
        {
            await StartCrmRequest();
        }

        return taskCompletionSource;
    }

    private async Task StartCrmRequest()
    {
        if (taskCompletionSource.Task.IsCompleted)
        {
            return;
        }

        if (cancelToken.IsCancellationRequested)
        {
            ...
            return;
        }

        var count = GetThreadCount();

        if (count >= parameters.MaxThreads)
        {
            return;
        }

        string id;
        if (!idQueue.TryDequeue(out id)) return;

        IncrementThreadCount();
        var crmMessage = await crmClient.CompleteActivityAsync(new Guid(id), parameters.CallIntervalMsecs);
        ProcessResult(crmMessage);

        processedCount += 1;
        if (parameters.TellUserAfterNCalls > 0 && processedCount%parameters.TellUserAfterNCalls == 0)
        {
            ShowProgress(processedCount);
        }
    }

    private async void ProcessResult(CrmResultMessage response)
    {
        if (response.CrmResult == CrmResult.Failed && ++crmErrorCount == parameters.CrmErrorsBeforeQuitting)
        {
            Program.TellUser(
                "Quitting because CRM error count is equal to {0}. Already queued web service calls will have to run to completion.",
                crmErrorCount);
            ClearQueue();
        }

        var count = DecrementThreadCount();

        if (idQueue.Count == 0 && count == 0)
        {
            taskCompletionSource.SetResult(true);
        }
        else
        {
            await StartCrmRequest();
        }
    }
}

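So, with MaxThreads set to 10 and CrmErrorsBeforeQuitting set to 20, the TPL version on my machine completes in 19 seconds and the async/await version takes 35 seconds. Given that I have over 8000 calls to make, this is a significant difference. Any ideas?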

Answer 1

I think I am seeing the problem here, or at least a part of it. Look closely at the two bits of code below; they are not equivalent.

// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
    Task.Run((Action) StartCrmRequest, cancellationToken);
}

And:

// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
    await StartCrmRequest();
}

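In the original code (I am taking it as a given that it is functionally sound) there is a single call to ContinueWith. That is exactly how many await statements I would expect to see in a trivial rewrite if it is meant to preserve the original behaviour.

This is not a hard and fast rule and it only applies in simple cases, but it is nevertheless a good thing to keep an eye out for.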
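The difference between the two priming loops can be shown with a small sketch. This is not the asker's program: FakeCallAsync is a hypothetical stand-in for one CRM call, and the delays are arbitrary. The first method awaits each call before starting the next, as the async/await loop in the question does; the second starts all the calls and only then awaits them together, which is closer to what the Task.Run loop achieves.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class PrimingDemo
{
    // Hypothetical stand-in for one CRM call: completes after delayMs.
    private static Task FakeCallAsync(int delayMs)
    {
        return Task.Delay(delayMs);
    }

    // Each call is awaited before the next one starts, so the calls
    // run one after another and the total time is roughly count * delayMs.
    public static async Task<long> PrimeSequentiallyAsync(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < count; i++)
        {
            await FakeCallAsync(delayMs);
        }
        return sw.ElapsedMilliseconds;
    }

    // All calls are started first and awaited together, so they overlap
    // and the total time is roughly a single delayMs.
    public static async Task<long> PrimeConcurrentlyAsync(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(_ => FakeCallAsync(delayMs))
            .ToArray(); // ToArray forces every call to start now
        await Task.WhenAll(tasks);
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var sequential = PrimeSequentiallyAsync(5, 200).GetAwaiter().GetResult();
        var concurrent = PrimeConcurrentlyAsync(5, 200).GetAwaiter().GetResult();
        Console.WriteLine("Sequential: {0} ms, concurrent: {1} ms", sequential, concurrent);
    }
}
```

The exact timings depend on the machine, but the sequential version should take several times longer than the concurrent one.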

Answer 2

I think you over-complicated your solution and ended up not getting where you wanted with either implementation.

First of all, connections to any HTTP host are limited by the service point manager. The default limit for client environments is 2, but you can increase it yourself.

No matter how many threads you spawn, there will not be more active requests than the limit allows.
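As a rough illustration of raising that limit in code, here is a minimal sketch using ServicePointManager.DefaultConnectionLimit. It assumes the web service calls go through HttpWebRequest-based clients, which is what this setting governs on .NET Framework; the value 10 simply mirrors the MaxThreads setting from the question, and the class and method names are made up for the example.

```csharp
using System;
using System.Net;

public static class ConnectionLimitDemo
{
    // Raises the per-host connection limit and returns the value now in effect.
    // Assumption: this is called once at startup, before any request is made.
    public static int RaiseConnectionLimit(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        Console.WriteLine("Limit before: " + ServicePointManager.DefaultConnectionLimit);
        Console.WriteLine("Limit after: " + RaiseConnectionLimit(10));
    }
}
```

The setting should be applied before the first request to a host, because service points that already exist keep the limit they were created with.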

Then, as someone pointed out, await logically blocks the execution flow.

And finally, you spent your time creating an AsyncQueue when you should have used TPL Dataflow.
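A minimal sketch of what a TPL Dataflow version might look like follows. It is an illustration under assumptions rather than a drop-in replacement: FakeCallAsync is a hypothetical stand-in for the CRM call, the error-threshold logic from the question is left out, and on .NET Framework the System.Threading.Tasks.Dataflow NuGet package is required.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet: System.Threading.Tasks.Dataflow

public static class DataflowDemo
{
    // Hypothetical stand-in for one CRM web service call.
    private static Task FakeCallAsync(string id)
    {
        return Task.Delay(50);
    }

    // Processes every id with at most maxParallel calls in flight at once
    // and returns how many ids were processed.
    public static async Task<int> ProcessAllAsync(
        IEnumerable<string> ids, int maxParallel, CancellationToken token)
    {
        var processed = 0;

        var block = new ActionBlock<string>(
            async id =>
            {
                await FakeCallAsync(id);
                Interlocked.Increment(ref processed);
            },
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallel, // plays the role of MaxThreads
                CancellationToken = token             // cancelling stops further processing
            });

        foreach (var id in ids)
        {
            block.Post(id);
        }

        block.Complete();       // no more input will be posted
        await block.Completion; // throws if the token was cancelled
        return processed;
    }

    public static void Main()
    {
        var ids = Enumerable.Range(1, 20).Select(i => i.ToString());
        var count = ProcessAllAsync(ids, 10, CancellationToken.None).GetAwaiter().GetResult();
        Console.WriteLine("Processed: " + count);
    }
}
```

The block owns the queue and the throttling, so the hand-written thread counting and re-priming in the question are no longer needed in this sketch.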

Answer 3

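When implemented with async/await, I would expect an I/O-bound algorithm to run on a single thread. Unlike @KirillShlenskiy, I do not believe that the bit responsible for "bringing back" execution to the caller's context is what causes the slowdown. I think you are overrunning the thread pool by trying to use it for I/O-bound operations. It is designed primarily for compute-bound operations.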

Have a look at ForEachAsync. I feel that is what you are looking for (see Stephen Toub's discussion; you will find Wischik's videos meaningful too):

http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx

(Use the degree of concurrency to reduce the memory footprint)

http://vimeo.com/43808831 http://vimeo.com/43808833
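For orientation, a ForEachAsync helper in the spirit of the one discussed in the linked post can be sketched as below. Treat it as an approximation written from memory, not a quotation of the post; the demo class and the Task.Delay body are hypothetical stand-ins for the CRM call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncExtensions
{
    // Runs body over every item, using dop partitions as concurrent workers.
    // Each worker pulls its next item only after the previous body call
    // has completed, so at most dop calls are in flight at any time.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            Partitioner.Create(source).GetPartitions(dop).Select(partition =>
                Task.Run(async () =>
                {
                    using (partition)
                    {
                        while (partition.MoveNext())
                        {
                            await body(partition.Current);
                        }
                    }
                })));
    }
}

public static class ForEachAsyncDemo
{
    // Processes itemCount dummy ids with the given degree of concurrency.
    public static int Run(int itemCount, int dop)
    {
        var ids = Enumerable.Range(1, itemCount).Select(i => i.ToString()).ToList();
        var processed = 0;

        ids.ForEachAsync(dop, async id =>
        {
            await Task.Delay(20); // hypothetical stand-in for the CRM call
            Interlocked.Increment(ref processed);
        }).GetAwaiter().GetResult();

        return processed;
    }

    public static void Main()
    {
        Console.WriteLine("Processed: " + Run(20, 5));
    }
}
```

Here the dop argument plays the role of the MaxThreads setting from the question.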

Why is this TAP async/await code slower than the TPL version?
