Program runs differently without debugger [closed]

【Question title】: Program runs differently without debugger [closed]
【Posted】: 2021-09-11 23:00:23
【Question】:

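I have a program (.NET 5) that uses WebClient.DownloadFile to download a bunch of files (1k+) simultaneously. It works as expected when run with the debugger attached, downloading about 99% of the files in both the Debug and Release configurations; but when run without the debugger, it fails to download more than 50% of the files.

All threads finish before the program ends, because they are foreground threads.

The program's code is: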

using System;
using System.IO;
using System.Net;
using System.Threading;

namespace Dumper
{
    internal sealed class Program
    {
        private static void Main(string[] args)
        {
            Directory.CreateDirectory(args[1]);

            foreach (string uri in File.ReadAllLines(args[0]))
            {
                string filePath = Path.Combine(args[1], uri.Split('/')[^1]);

                new Thread((param) =>
                {
                    (string path, string url) = ((string, string))param!;
                    using WebClient webClient = new();

                    try
                    {
                        webClient.DownloadFile(new Uri(url.Replace("%", "%25")), path);

                        Console.WriteLine($"{path} has been successfully downloaded.");
                    }
                    catch (UriFormatException)
                    {
                        throw;
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine($"{path} failed to download: {e}");
                    }
                }).Start((filePath, uri));
            }
        }
    }
}

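【Comments】:

Fails how? Be specific.
What exactly does "running without the debugger" mean? RELEASE mode?
You aren't waiting for the threads to finish, so the program will end when Main ends. In any case, 1000 threads and 1000 TCP sockets sounds like a bad idea.
This answer is an example of how to wait for threads to finish: stackoverflow.com/a/4190969/1233305
So what happens? What exception do you get, and do you get any console output?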
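
The "wait for the threads to finish" suggestion from the comments can be sketched roughly as follows. This is a minimal illustration rather than the code from the linked answer: `JoinDemo`, `RunAll`, and the `work` delegate are hypothetical names, and `work` stands in for the download logic. Because the threads in the question are foreground threads, the process would not exit before they finish anyway; joining them only makes the wait explicit and lets Main report a result afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class JoinDemo
{
    // Starts one thread per item, then blocks until every thread has finished.
    // `work` is a hypothetical stand-in for the download logic.
    public static int RunAll(IReadOnlyList<string> items, Action<string> work)
    {
        var threads = new List<Thread>();
        int completed = 0;

        foreach (string item in items)
        {
            var thread = new Thread(() =>
            {
                work(item);
                Interlocked.Increment(ref completed); // thread-safe counter
            });
            threads.Add(thread);
            thread.Start();
        }

        // Join blocks the caller until the given thread has terminated.
        foreach (Thread thread in threads)
            thread.Join();

        return completed;
    }

    public static void Main()
    {
        int done = RunAll(new[] { "a", "b", "c" }, item => Thread.Sleep(50));
        Console.WriteLine($"{done} work items finished before Main returned.");
    }
}
```

Note that this still creates one thread per item, which the other comments advise against for 1000+ downloads.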

Tags: c# .net-core webclient

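【Solution 1】:

Your problem has nothing to do with debugging, but overall your code has a number of issues. Here is a more sensible approach, which waits for all downloads to complete.

Note: you could also use Task.WhenAll, but I chose a TPL Dataflow ActionBlock in case you need to manage the degree of parallelism.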
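
For comparison, the Task.WhenAll alternative mentioned in the note might look something like the sketch below. It is an assumption-laden illustration, not part of the original answer: `WhenAllDemo`, `RunAllAsync`, and `maxParallel` are hypothetical names, and a `SemaphoreSlim` is used here as one common way to cap concurrency, since Task.WhenAll on its own does not limit it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Runs `processAsync` for every input, with at most `maxParallel`
    // invocations in flight at once, and completes when all have finished.
    public static async Task RunAllAsync(
        IEnumerable<string> inputs, Func<string, Task> processAsync, int maxParallel)
    {
        using var gate = new SemaphoreSlim(maxParallel);

        var tasks = inputs.Select(async input =>
        {
            await gate.WaitAsync().ConfigureAwait(false);
            try
            {
                await processAsync(input).ConfigureAwait(false);
            }
            finally
            {
                gate.Release();
            }
        }).ToList(); // materialize the query so every task is actually started

        await Task.WhenAll(tasks).ConfigureAwait(false);
    }

    public static async Task Main()
    {
        int finished = 0;

        await RunAllAsync(
            Enumerable.Range(1, 10).Select(i => $"item-{i}"),
            async input =>
            {
                await Task.Delay(20); // stand-in for an HTTP download
                Interlocked.Increment(ref finished);
            },
            maxParallel: 4);

        Console.WriteLine($"{finished} items processed.");
    }
}
```

In the answer's setting, the `processAsync` argument would be a method such as ProcessAsync, which catches its own exceptions, so a single failed download would not fault the combined task.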

Given

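// Namespaces used by the snippets in this answer
using System;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// A single shared HttpClient is reused for every request
private static readonly HttpClient _client = new();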

private static string _basePath;

private static async Task ProcessAsync(string input)
{
   try
   {
      var uri = new Uri(Uri.EscapeUriString(input));

      var filePath = Path.Combine(_basePath, input.Split('/')[^1]);

      using var result = await _client
         .GetAsync(uri)
         .ConfigureAwait(false);

      // fail fast
      result.EnsureSuccessStatusCode();

      await using var fileStream = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.None, 1024 * 1024, FileOptions.Asynchronous);

      await using var stream = await result.Content
         .ReadAsStreamAsync()
         .ConfigureAwait(false);

      await stream.CopyToAsync(fileStream)
         .ConfigureAwait(false);

      Console.WriteLine($"Downloaded : {uri}");

   }
   catch (Exception e)
   {
      Console.WriteLine(e);
   }
}

Usage

private static async Task Main(string[] args)
{
   var file = args.ElementAtOrDefault(0) ?? @"D:\test.txt";
   _basePath = args.ElementAtOrDefault(1) ?? @"D:\test";

   Directory.CreateDirectory(_basePath);

   var actionBlock = new ActionBlock<string>(ProcessAsync,new ExecutionDataflowBlockOptions()
   {
      EnsureOrdered = false,
      MaxDegreeOfParallelism = -1 // -1 means unbounded; set a positive limit if you think the site is throttling you
   });

   foreach (var uri in File.ReadLines(file))
      await actionBlock.SendAsync(uri);

   actionBlock.Complete();
   // wait to make sure everything is completed
   await actionBlock.Completion;

}

【Comments】:

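@DonAlex1 You are right about how foreground threads work, but saying your code is not the best understates how wrong it is to create 1000 threads. The HTTP calls may be timing out while waiting for their threads to be scheduled, or something along those lines. Ideally you want roughly as many threads as CPU cores. Put the 1000 URLs into a queue and have a handful of threads work through that queue.
Or, actually, don't create threads explicitly at all. I/O-bound work should use async/await and the managed thread pool. Only create your own threads for CPU-intensive tasks.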
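I'm actually leaning toward this view. The only explanation you need is, "It could be a lot of things, and it's not worth trying to figure out which." Do it right first, see whether the problem persists, and only then try to solve it.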
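
The queue-plus-workers idea from the first comment could be sketched as below. This is only an illustration under stated assumptions: `WorkerQueueDemo`, `ProcessWithWorkers`, and the `work` delegate are hypothetical names, `work` stands in for the actual download, and `BlockingCollection<string>` is just one convenient queue type for this pattern. As the second comment points out, async/await is usually the better fit for I/O-bound work such as downloads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class WorkerQueueDemo
{
    // Feeds all URLs into a queue and lets `workerCount` threads drain it.
    // Returns the number of items for which `work` completed without throwing.
    public static int ProcessWithWorkers(
        IEnumerable<string> urls, int workerCount, Action<string> work)
    {
        using var queue = new BlockingCollection<string>();
        int processed = 0;
        var workers = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            var worker = new Thread(() =>
            {
                // Blocks while the queue is empty; the loop ends once
                // CompleteAdding has been called and the queue is drained.
                foreach (string url in queue.GetConsumingEnumerable())
                {
                    try
                    {
                        work(url);
                        Interlocked.Increment(ref processed);
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine($"{url} failed: {e.Message}");
                    }
                }
            });
            workers.Add(worker);
            worker.Start();
        }

        foreach (string url in urls)
            queue.Add(url);
        queue.CompleteAdding(); // signal that no more items will arrive

        foreach (Thread worker in workers)
            worker.Join();

        return processed;
    }

    public static void Main()
    {
        var urls = new List<string>();
        for (int i = 0; i < 100; i++)
            urls.Add($"https://example.invalid/file{i}"); // placeholder URLs

        int done = ProcessWithWorkers(
            urls, Environment.ProcessorCount, url => Thread.Sleep(1));
        Console.WriteLine($"{done} of {urls.Count} items processed.");
    }
}
```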