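I tried to benchmark this with BenchmarkDotNet; the code is attached at the end of this post. When the program starts, it initializes BenchmarkDotNet, which in turn calls the GlobalSetup() method once and the two benchmark methods (Pipe() and Tcp()) many times.
In GlobalSetup(), two child processes are started: one for pipe communication and one for TCP communication. Once the child processes are ready, they wait for a trigger signal carrying the number of values N to transfer (provided via stdin) and then start sending data.
When the benchmark methods (Pipe() and Tcp()) are called, they send the trigger signal with the number of values N and wait for the incoming data.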
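The trigger described above is just a 4-byte int32 written to the child's stdin. A minimal sketch of that framing is shown here; the class and method names (`TriggerProtocol`, `SendTrigger`, `ReceiveTrigger`, `TryReadExactly`) are hypothetical and not part of the benchmark code. It loops on `Stream.Read`, because a single call may return fewer bytes than requested.

```csharp
using System;
using System.IO;

// Hypothetical helper names, for illustration only.
public static class TriggerProtocol
{
    // Fills the whole buffer, looping because Stream.Read may return
    // fewer bytes than requested. Returns false on end of stream.
    public static bool TryReadExactly(Stream stream, Span<byte> buffer)
    {
        var offset = 0;
        while (offset < buffer.Length)
        {
            var read = stream.Read(buffer.Slice(offset));
            if (read == 0)
                return false;
            offset += read;
        }
        return true;
    }

    // Host side: send N as a 4-byte int32 and flush so the child sees it.
    public static void SendTrigger(Stream stream, int n)
    {
        stream.Write(BitConverter.GetBytes(n));
        stream.Flush();
    }

    // Child side: wait for the next trigger; null means the stream ended.
    public static int? ReceiveTrigger(Stream stream)
    {
        Span<byte> header = stackalloc byte[sizeof(int)];
        return TryReadExactly(stream, header)
            ? BitConverter.ToInt32(header)
            : (int?)null;
    }
}
```

With a `MemoryStream` standing in for the pipe, `SendTrigger(stream, 10000)` followed by rewinding the stream and calling `ReceiveTrigger(stream)` yields 10000, and a further call yields null because the stream is exhausted.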
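It turned out to be important to set TcpClient.NoDelay = true in order to disable Nagle's algorithm, which first collects small messages until a certain threshold or timeout is reached. Interestingly, this only affected the Linux tests with N = 10000. With NoDelay = false (the default), the mean time for this test jumped from ~40 µs to ~40 ms.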
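As a small illustration of where that option is set, the following sketch opens a loopback connection with NoDelay enabled on both ends and sends a payload across it. The class name `NoDelayDemo` and the use of port 0 (so the OS picks a free port) are assumptions for the example; the benchmark itself uses the fixed port 55555. It demonstrates the setting only and does not measure the latency difference.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

// Hypothetical demo class, not part of the benchmark.
public static class NoDelayDemo
{
    public static byte[] RoundTrip(byte[] payload)
    {
        // Port 0 lets the OS choose a free port (assumption for this demo).
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            var port = ((IPEndPoint)listener.LocalEndpoint).Port;

            // Sending side: disable Nagle's algorithm before connecting.
            using var sender = new TcpClient(AddressFamily.InterNetwork) { NoDelay = true };
            sender.Connect(IPAddress.Loopback, port);

            // Accepting side: the option is set per socket, so set it here too.
            using var receiver = listener.AcceptTcpClient();
            receiver.NoDelay = true;

            sender.GetStream().Write(payload, 0, payload.Length);

            // Read until the whole payload has arrived.
            var received = new byte[payload.Length];
            var stream = receiver.GetStream();
            var offset = 0;
            while (offset < received.Length)
            {
                var read = stream.Read(received, offset, received.Length - offset);
                if (read == 0)
                    throw new IOException("Connection closed early.");
                offset += read;
            }
            return received;
        }
        finally
        {
            listener.Stop();
        }
    }
}
```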
The results are as follows:
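Legend
- N: number of int32 values to be transferred
- Mean: arithmetic mean of all measurements
- Error: half of the 99.9% confidence interval
- StdDev: standard deviation of all measurements
- Median: value separating the higher half of all measurements from the lower half (50th percentile)
- Ratio: mean of the ratio distribution ([Current]/[Baseline])
- RatioSD: standard deviation of the ratio distribution ([Current]/[Baseline])
- 1 μs: 1 microsecond (0.000001 s)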
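To make the legend concrete, here is a minimal sketch of how Mean, StdDev and Median could be computed from a set of measurements. The class name `LegendStats` is hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption about what BenchmarkDotNet reports, not something stated in this post.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class LegendStats
{
    // Arithmetic mean of all measurements.
    public static double Mean(double[] values) => values.Average();

    // Sample standard deviation (n - 1 denominator; assumed here).
    public static double StdDev(double[] values)
    {
        var mean = Mean(values);
        var sumOfSquares = values.Sum(v => (v - mean) * (v - mean));
        return Math.Sqrt(sumOfSquares / (values.Length - 1));
    }

    // Middle value of the sorted measurements; for an even count,
    // the average of the two middle values.
    public static double Median(double[] values)
    {
        var sorted = values.OrderBy(v => v).ToArray();
        var mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

Because Ratio is described as the mean of a ratio distribution, it does not have to equal the quotient of the two Mean values. For N = 1 on the virtual machine, for example, 31.42 / 27.33 ≈ 1.15, while the reported Ratio is 1.39.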
Virtual machine (Ubuntu 20.04):
BenchmarkDotNet=v0.13.0, OS=ubuntu 20.04
AMD Opteron(tm) Processor 4334, 4 CPU, 4 logical and 4 physical cores
.NET SDK=5.0.102
[Host] : .NET 5.0.2 (5.0.220.61120), X64 RyuJIT
DefaultJob : .NET 5.0.2 (5.0.220.61120), X64 RyuJIT
| Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
|--------|---------:|-------------:|-----------:|-----------:|-------------:|------:|--------:|
| Pipe | 1 | 27.33 μs | 1.660 μs | 4.895 μs | 30.75 μs | 1.00 | 0.00 |
| Tcp | 1 | 31.42 μs | 0.620 μs | 0.713 μs | 31.24 μs | 1.39 | 0.21 |
| | | | | | | | |
| Pipe | 100 | 26.72 μs | 1.990 μs | 5.867 μs | 26.63 μs | 1.00 | 0.00 |
| Tcp | 100 | 38.95 μs | 2.146 μs | 6.327 μs | 43.34 μs | 1.53 | 0.43 |
| | | | | | | | |
| Pipe | 10000 | 42.45 μs | 2.804 μs | 8.268 μs | 47.09 μs | 1.00 | 0.00 |
| Tcp | 10000 | 46.97 μs | 3.057 μs | 9.013 μs | 53.93 μs | 1.16 | 0.34 |
| | | | | | | | |
| Pipe | 1000000 | 1,621.87 μs | 116.924 μs | 344.752 μs | 1,893.49 μs | 1.00 | 0.00 |
| Tcp | 1000000 | 1,707.25 μs | 8.066 μs | 7.545 μs | 1,707.24 μs | 0.94 | 0.13 |
| | | | | | | | |
| Pipe | 10000000 | 21,013.86 μs | 166.250 μs | 129.797 μs | 21,007.89 μs | 1.00 | 0.00 |
| Tcp | 10000000 | 20,548.03 μs | 407.779 μs | 814.379 μs | 20,713.44 μs | 0.96 | 0.03 |
Laptop (Ubuntu 20.04 on Windows 10 + WSL2):
BenchmarkDotNet=v0.13.0, OS=ubuntu 20.04
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=5.0.301
[Host] : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
DefaultJob : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
| Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
|--------|---------:|-------------:|-------------:|--------------:|-------------:|------:|--------:|
| Pipe | 1 | 44.66 μs | 0.882 μs | 1.051 μs | 44.45 μs | 1.00 | 0.00 |
| Tcp | 1 | 54.42 μs | 0.411 μs | 0.364 μs | 54.34 μs | 1.21 | 0.03 |
| | | | | | | | |
| Pipe | 100 | 45.07 μs | 0.895 μs | 1.496 μs | 44.63 μs | 1.00 | 0.00 |
| Tcp | 100 | 55.27 μs | 0.735 μs | 0.614 μs | 55.17 μs | 1.21 | 0.05 |
| | | | | | | | |
| Pipe | 10000 | 52.30 μs | 1.018 μs | 1.131 μs | 52.32 μs | 1.00 | 0.00 |
| Tcp | 10000 | 55.47 μs | 0.590 μs | 0.523 μs | 55.32 μs | 1.06 | 0.03 |
| | | | | | | | |
| Pipe | 1000000 | 4,034.01 μs | 77.978 μs | 65.115 μs | 4,035.58 μs | 1.00 | 0.00 |
| Tcp | 1000000 | 1,398.62 μs | 24.230 μs | 21.479 μs | 1,395.20 μs | 0.35 | 0.01 |
| | | | | | | | |
| Pipe | 10000000 | 69,767.35 μs | 4,993.492 μs | 14,723.423 μs | 64,169.46 μs | 1.00 | 0.00 |
| Tcp | 10000000 | 24,660.43 μs | 1,746.809 μs | 4,955.406 μs | 23,947.15 μs | 0.38 | 0.14 |
Laptop (Windows 10):
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19043.1083 (21H1/May2021Update)
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=5.0.203
[Host] : .NET 5.0.6 (5.0.621.22011), X64 RyuJIT
DefaultJob : .NET 5.0.6 (5.0.621.22011), X64 RyuJIT
| Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
|--------|---------:|-------------:|-----------:|-----------:|-------------:|------:|--------:|
| Pipe | 1 | 22.60 μs | 0.441 μs | 1.013 μs | 22.21 μs | 1.00 | 0.00 |
| Tcp | 1 | 27.42 μs | 0.535 μs | 1.019 μs | 27.51 μs | 1.21 | 0.08 |
| | | | | | | | |
| Pipe | 100 | 21.93 μs | 0.146 μs | 0.122 μs | 21.94 μs | 1.00 | 0.00 |
| Tcp | 100 | 26.06 μs | 0.506 μs | 0.474 μs | 25.99 μs | 1.19 | 0.02 |
| | | | | | | | |
| Pipe | 10000 | 29.59 μs | 0.126 μs | 0.099 μs | 29.58 μs | 1.00 | 0.00 |
| Tcp | 10000 | 33.25 μs | 0.655 μs | 0.919 μs | 33.01 μs | 1.14 | 0.04 |
| | | | | | | | |
| Pipe | 1000000 | 1,675.35 μs | 32.862 μs | 43.870 μs | 1,685.37 μs | 1.00 | 0.00 |
| Tcp | 1000000 | 2,553.07 μs | 58.100 μs | 167.631 μs | 2,505.34 μs | 1.63 | 0.10 |
| | | | | | | | |
| Pipe | 10000000 | 23,421.61 μs | 141.337 μs | 132.207 μs | 23,380.19 μs | 1.00 | 0.00 |
| Tcp | 10000000 | 28,182.91 μs | 375.644 μs | 313.679 μs | 28,114.22 μs | 1.20 | 0.01 |
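To put the large-N rows into perspective, the mean times can be converted into an approximate throughput. The sketch below does this arithmetic; the class name `Throughput` is hypothetical, and 1 MB is taken as 10^6 bytes. Since each value is a 4-byte int32, bytes per microsecond is numerically the same as MB per second.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class Throughput
{
    // n int32 values (4 bytes each) transferred in meanMicroseconds.
    // bytes / µs equals 10^6 bytes / s, i.e. MB/s with 1 MB = 10^6 bytes.
    public static double MegabytesPerSecond(long n, double meanMicroseconds)
        => n * sizeof(int) / meanMicroseconds;
}
```

For example, the Pipe row for N = 10000000 on the virtual machine (21,013.86 μs) corresponds to roughly 1.9 GB/s, and the Tcp row for N = 10000000 on Windows 10 (28,182.91 μs) to roughly 1.4 GB/s. These figures include the trigger round trip, so they are a lower bound on the raw transfer rate rather than a precise measurement of it.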
Benchmark code:
Benchmark.csproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.13.0" />
  </ItemGroup>
</Project>
Program.cs
using BenchmarkDotNet.Running;
using System;
using System.IO;
using System.Linq;
using System.Net.Sockets;
using System.Runtime.InteropServices;

namespace Benchmark
{
    public class Program
    {
        public const int MIN_LENGTH = 1;
        public const int MAX_LENGTH = 10_000_000;

        static void Main(string[] args)
        {
            if (!args.Any())
            {
                // no arguments: this is the host process, so run the benchmarks
                BenchmarkRunner.Run<PipeVsTcp>();
            }
            else
            {
                // child process: prepare MAX_LENGTH int32 values (0, 1, 2, ...) as raw bytes
                var data = MemoryMarshal
                    .AsBytes<int>(
                        Enumerable
                            .Range(0, MAX_LENGTH)
                            .ToArray())
                    .ToArray();

                // the trigger always arrives via stdin
                using var readStream = Console.OpenStandardInput();

                if (args[0] == "pipe")
                {
                    // send the data back via stdout
                    using var pipeStream = Console.OpenStandardOutput();
                    RunChildProcess(readStream, pipeStream, data);
                }
                else if (args[0] == "tcp")
                {
                    // send the data back via a TCP connection to the host
                    using var tcpClient = new TcpClient()
                    {
                        NoDelay = true
                    };
                    tcpClient.Connect("localhost", 55555);
                    using var tcpStream = tcpClient.GetStream();
                    RunChildProcess(readStream, tcpStream, data);
                }
                else
                {
                    throw new Exception("Invalid argument (args[0]).");
                }
            }
        }

        static void RunChildProcess(Stream readStream, Stream writeStream, byte[] data)
        {
            Span<byte> buffer = stackalloc byte[sizeof(int)];

            while (true)
            {
                // wait for the start signal: a 4-byte int32 holding N.
                // Read may return fewer bytes than requested, so loop until the buffer is full.
                var offset = 0;
                while (offset < buffer.Length)
                {
                    var length = readStream.Read(buffer.Slice(offset));
                    if (length == 0)
                        throw new Exception("The host process terminated early.");
                    offset += length;
                }

                var N = BitConverter.ToInt32(buffer);

                // write N int32 values
                writeStream.Write(data, 0, N * sizeof(int));
            }
        }
    }
}
PipeVsTcp.cs
using BenchmarkDotNet.Attributes;
using System;
using System.Buffers;
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Reflection;
using System.Runtime.InteropServices;

namespace Benchmark
{
    [MemoryDiagnoser]
    public class PipeVsTcp
    {
        private Process _pipeProcess;
        private Process _tcpProcess;
        private TcpClient _tcpClient;

        [GlobalSetup]
        public void GlobalSetup()
        {
            // assembly path
            // Under Linux the Location property was an empty
            // string (why?), so there I replaced it
            // with a hard-coded string.
            var assemblyPath = Assembly.GetExecutingAssembly().Location;

            // run pipe process (the path is quoted in case it contains spaces)
            var pipePsi = new ProcessStartInfo("dotnet")
            {
                Arguments = $"\"{assemblyPath}\" pipe",
                UseShellExecute = false,
                RedirectStandardInput = true,
                RedirectStandardOutput = true,
                RedirectStandardError = true
            };
            _pipeProcess = new Process() { StartInfo = pipePsi };
            _pipeProcess.Start();

            // start listening before the tcp process is launched,
            // so that its connection attempt cannot arrive too early
            var tcpListener = new TcpListener(IPAddress.Parse("127.0.0.1"), 55555);
            tcpListener.Start();

            // run tcp process
            var tcpPsi = new ProcessStartInfo("dotnet")
            {
                Arguments = $"\"{assemblyPath}\" tcp",
                UseShellExecute = false,
                RedirectStandardInput = true,
                RedirectStandardOutput = true,
                RedirectStandardError = true
            };
            _tcpProcess = new Process() { StartInfo = tcpPsi };
            _tcpProcess.Start();

            // wait for the tcp process to connect; only one connection is needed
            _tcpClient = tcpListener.AcceptTcpClient();
            _tcpClient.NoDelay = true;
            tcpListener.Stop();
        }

        [GlobalCleanup]
        public void GlobalCleanup()
        {
            _tcpClient?.Dispose();
            _pipeProcess?.Kill();
            _tcpProcess?.Kill();
        }

        [Params(Program.MIN_LENGTH, 100, 10_000, 1_000_000, Program.MAX_LENGTH)]
        public int N;

        // Note for both benchmarks: the returned Memory<byte> refers to a pooled
        // buffer that is handed back to the pool when `owner` is disposed. It is
        // returned only so that BenchmarkDotNet consumes the result and must not
        // be read afterwards.

        [Benchmark(Baseline = true)]
        public Memory<byte> Pipe()
        {
            var pipeReadStream = _pipeProcess.StandardOutput.BaseStream;
            var pipeWriteStream = _pipeProcess.StandardInput.BaseStream;
            using var owner = MemoryPool<byte>.Shared.Rent(N * sizeof(int));
            return ReadFromStream(pipeReadStream, pipeWriteStream, owner.Memory);
        }

        [Benchmark()]
        public Memory<byte> Tcp()
        {
            var tcpReadStream = _tcpClient.GetStream();
            // the trigger is still sent through the child's stdin pipe
            var pipeWriteStream = _tcpProcess.StandardInput.BaseStream;
            using var owner = MemoryPool<byte>.Shared.Rent(N * sizeof(int));
            return ReadFromStream(tcpReadStream, pipeWriteStream, owner.Memory);
        }

        private Memory<byte> ReadFromStream(Stream readStream, Stream writeStream, Memory<byte> buffer)
        {
            // trigger: send N as a 4-byte int32
            var Nbuffer = BitConverter.GetBytes(N);
            writeStream.Write(Nbuffer);
            writeStream.Flush();

            // receive data; Read may return partial chunks, so loop until all bytes have arrived
            var remaining = N * sizeof(int);
            var offset = 0;

            while (remaining > 0)
            {
                var span = buffer.Slice(offset, remaining).Span;
                var readBytes = readStream.Read(span);

                if (readBytes == 0)
                    throw new Exception("The child process terminated early.");

                remaining -= readBytes;
                offset += readBytes;
            }

            var intBuffer = MemoryMarshal.Cast<byte, int>(buffer.Span);

            // validate first 3 values
            for (int i = 0; i < Math.Min(N, 3); i++)
            {
                if (intBuffer[i] != i)
                    throw new Exception($"Invalid data received. Data is {intBuffer[i]}, index = {i}.");
            }

            return buffer;
        }
    }
}