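[Posted]: 2021-03-07 04:29:41
[Question]:
I have a large multithreaded C# system, and I noticed a big performance difference between two of its threads. I have now built two almost identical threads, one of which runs 4-5 times faster (the gap scales linearly if you change the number of loop iterations they have to run). And the difference? A goofy condition wrapped around the actual heavy code in one of them. This makes no sense to me, and if such a tiny detail can have such a huge impact, I feel powerless to optimize any further. This was tested in Unity, so results may differ in other environments.
ThreadA finished in: 2.8 seconds. ThreadB finished in: 0.6 seconds.
Note that ThreadB is the one with the condition (which evaluates to true immediately, on the first iteration). How can such a silly addition make the actual payload (the for loops and the number crunching) run so much faster? Furthermore, if I replace the "delay" variable in ThreadB's condition with a literal "0.0", it performs like ThreadA again. In other words: whether a double is a hard-coded value or a referenced variable makes a 4-5x difference in performance.
Never mind the actual algorithm; it is only there to make the computer crunch some numbers. I know I am comparing the same data over and over, and that is not the point.
I am not a compiler nerd, and I cannot dig into how this differs in the actual machine/assembly code. I only know that the difference is huge and makes no sense to me. What am I missing? I found this by accident, and in the future I may have no way of knowing that a given thread is running at 20% of its possible speed and that a slight change would fix it.
Please. I need a nerd to turn this from pure magic into "Oh, that's why...! Now I know how to avoid it in the future...". I know C# compilation is wrapped in many layers of managed machinery, but there must be a logical reason. Right?
Here is some test code and a few simple structs to support it. I would be happy if someone has the time to check whether they get the same results as I do.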
using System.Threading;

public class ThreadTest
{
    Thread threadA;
    Thread threadB;
    bool runThreadA = false;
    bool runThreadB = false;
    System.Diagnostics.Stopwatch stopWatch;
    double elapsedTimeA = 0;
    double elapsedTimeB = 0;

    public ThreadTest()
    {
        stopWatch = new System.Diagnostics.Stopwatch();
        StartThreads();
    }

    public void StartThreads ()
    {
        stopWatch.Reset();
        stopWatch.Start();
        threadA = new Thread(ThreadA);
        threadB = new Thread(ThreadB);
        runThreadA = true;
        runThreadB = true;
        elapsedTimeA = 0;
        elapsedTimeB = 0;
        threadA.Start();
        threadB.Start();
    }

    void ThreadA ()
    {
        while (runThreadA)
        {
            runThreadA = false;
            double preTicks = stopWatch.ElapsedTicks;
            Line3Double lineA = new Line3Double(new Vector3DoublePrecision(10, 20, 30), new Vector3DoublePrecision(100, 140, 180));
            Line3Double lineB = new Line3Double(new Vector3DoublePrecision(-10, -20, -30), new Vector3DoublePrecision(-100, -140, -180));
            int lines = 1000;
            // Payload: number crunching on the same two lines; the results are never used.
            for (int i = 0; i < 8; i++)
            {
                for (int j = 0; j < lines; j++)
                {
                    double aStartX = lineA.startX;
                    double aStartY = lineA.startY;
                    double aStartZ = lineA.startZ;
                    double aEndX = lineA.endX;
                    double aEndY = lineA.endY;
                    double aEndZ = lineA.endZ;
                    double aDirX = lineA.dirX;
                    double aDirY = lineA.dirY;
                    double aDirZ = lineA.dirZ;
                    double aDotSelf = lineA.dotSelf;
                    for (int k = 0; k < 8; k++)
                    {
                        for (int l = 0; l < lines; l++)
                        {
                            double wX = aStartX - lineB.startX;
                            double wY = aStartY - lineB.startY;
                            double wZ = aStartZ - lineB.startZ;
                            double b = aDirX * lineB.dirX + aDirY * lineB.dirY + aDirZ * lineB.dirZ;
                            double d = aDirX * wX + aDirY * wY + aDirZ * wZ;
                            double e = lineB.dirX * wX + lineB.dirY * wY + lineB.dirZ * wZ;
                            double D = aDotSelf * lineB.dotSelf - b * b;
                            double sc, tc;
                            if (D < 0.0000001)
                            {
                                sc = 0.0f;
                                tc = (b > lineB.dotSelf ? d / b : e / lineB.dotSelf);
                            }
                            else
                            {
                                sc = (b * e - lineB.dotSelf * d) / D;
                                tc = (aDotSelf * e - b * d) / D;
                            }
                            double shortestX = wX + (sc * aDirX) - (tc * lineB.dirX);
                            double shortestY = wY + (sc * aDirY) - (tc * lineB.dirY);
                            double shortestZ = wZ + (sc * aDirZ) - (tc * lineB.dirZ);
                            double distance = shortestX * shortestX + shortestY * shortestY + shortestZ * shortestZ;
                        }
                    }
                }
            }
            double postTicks = stopWatch.ElapsedTicks;
            // Elapsed time of the payload in milliseconds.
            double time = ((postTicks - preTicks) / System.Diagnostics.Stopwatch.Frequency) * 1000;
            elapsedTimeA = time;
        }
    }

    void ThreadB()
    {
        long startTicks = stopWatch.ElapsedTicks;
        double delay = 0;
        while (runThreadB)
        {
            // The only difference from ThreadA: the payload is wrapped in this condition,
            // which is already true on the first iteration because delay is 0.
            if ((double)(stopWatch.ElapsedTicks - startTicks) / System.Diagnostics.Stopwatch.Frequency >= delay)
            {
                runThreadB = false;
                double preTicks = stopWatch.ElapsedTicks;
                Line3Double lineA = new Line3Double(new Vector3DoublePrecision(10, 20, 30), new Vector3DoublePrecision(100, 140, 180));
                Line3Double lineB = new Line3Double(new Vector3DoublePrecision(-10, -20, -30), new Vector3DoublePrecision(-100, -140, -180));
                int lines = 1000;
                // Payload: identical to the one in ThreadA.
                for (int i = 0; i < 8; i++)
                {
                    for (int j = 0; j < lines; j++)
                    {
                        double aStartX = lineA.startX;
                        double aStartY = lineA.startY;
                        double aStartZ = lineA.startZ;
                        double aEndX = lineA.endX;
                        double aEndY = lineA.endY;
                        double aEndZ = lineA.endZ;
                        double aDirX = lineA.dirX;
                        double aDirY = lineA.dirY;
                        double aDirZ = lineA.dirZ;
                        double aDotSelf = lineA.dotSelf;
                        for (int k = 0; k < 8; k++)
                        {
                            for (int l = 0; l < lines; l++)
                            {
                                double wX = aStartX - lineB.startX;
                                double wY = aStartY - lineB.startY;
                                double wZ = aStartZ - lineB.startZ;
                                double b = aDirX * lineB.dirX + aDirY * lineB.dirY + aDirZ * lineB.dirZ;
                                double d = aDirX * wX + aDirY * wY + aDirZ * wZ;
                                double e = lineB.dirX * wX + lineB.dirY * wY + lineB.dirZ * wZ;
                                double D = aDotSelf * lineB.dotSelf - b * b;
                                double sc, tc;
                                if (D < 0.0000001)
                                {
                                    sc = 0.0f;
                                    tc = (b > lineB.dotSelf ? d / b : e / lineB.dotSelf);
                                }
                                else
                                {
                                    sc = (b * e - lineB.dotSelf * d) / D;
                                    tc = (aDotSelf * e - b * d) / D;
                                }
                                double shortestX = wX + (sc * aDirX) - (tc * lineB.dirX);
                                double shortestY = wY + (sc * aDirY) - (tc * lineB.dirY);
                                double shortestZ = wZ + (sc * aDirZ) - (tc * lineB.dirZ);
                                double distance = shortestX * shortestX + shortestY * shortestY + shortestZ * shortestZ;
                            }
                        }
                    }
                }
                double postTicks = stopWatch.ElapsedTicks;
                // Elapsed time of the payload in milliseconds.
                double time = ((postTicks - preTicks) / System.Diagnostics.Stopwatch.Frequency) * 1000;
                elapsedTimeB = time;
            }
        }
    }
}

public struct Vector3DoublePrecision
{
    public double x;
    public double y;
    public double z;

    public Vector3DoublePrecision(double x, double y, double z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

public struct Line3Double
{
    public double startX;
    public double startY;
    public double startZ;
    public double endX;
    public double endY;
    public double endZ;
    public double dirX;
    public double dirY;
    public double dirZ;
    public double dotSelf;

    public Line3Double(Vector3DoublePrecision start, Vector3DoublePrecision end)
    {
        startX = start.x;
        startY = start.y;
        startZ = start.z;
        endX = end.x;
        endY = end.y;
        endZ = end.z;
        dirX = end.x - start.x;
        dirY = end.y - start.y;
        dirZ = end.z - start.z;
        dotSelf = dirX * dirX + dirY * dirY + dirZ * dirZ;
    }
}
[Comments]:
-
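This is not a proper way to measure performance, since the result may depend on scheduling. 1. Run only one operation at a time (no threads!). 2. Run a warm-up pass for each one to reduce the effect of compilation. 3. Stopwatch is not guaranteed to be thread-safe. 4. Run the code a fixed number of times, so that the total time is about one second. 5. When in doubt, use Benchmark.Net (BenchmarkDotNet).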
-
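I cannot reproduce this on .NET Framework. I would change the loops to run the method N times, with a separate stopwatch for each and a warm-up. You may be measuring first-time initialization effects on the first thread.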
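To make the suggestions above concrete, here is a minimal sketch of a sequential measurement with a warm-up and its own Stopwatch per measurement. The names `SequentialBenchmark` and `MeasureMilliseconds` are made up for illustration and are not part of the question's code; treat this as one possible setup under those assumptions, not a replacement for a real benchmarking library.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the question): times one piece of work
// on the calling thread, so no other test thread competes with it.
public static class SequentialBenchmark
{
    // Runs `work` untimed `warmupRuns` times so first-call costs such as
    // JIT compilation are paid up front, then times `measuredRuns` runs
    // with a dedicated Stopwatch and returns the mean time per run in ms.
    public static double MeasureMilliseconds(Action work, int warmupRuns, int measuredRuns)
    {
        if (work == null) throw new ArgumentNullException(nameof(work));
        if (measuredRuns <= 0) throw new ArgumentOutOfRangeException(nameof(measuredRuns));

        for (int i = 0; i < warmupRuns; i++)
        {
            work();
        }

        // Each measurement owns its Stopwatch; nothing is shared between threads.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < measuredRuns; i++)
        {
            work();
        }
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds / measuredRuns;
    }
}
```

Assuming the two loop bodies are extracted into methods (say `PayloadA` and `PayloadB`, also hypothetical names), they could be measured one after the other with `SequentialBenchmark.MeasureMilliseconds(PayloadA, 3, 10)` and the same call for `PayloadB`. It may also be worth having each payload return or store its `distance` result, so that a compiler is not free to discard the unused computation in one variant and keep it in the other.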
-
I have not used this forum before, sorry. I have now posted it as an "answer" :)
-
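@HenrikKragh - Please use @ to send a notification to the recipient. Without it, they may never see your reply.
-
@HenrikKragh - That is what we are here to do: improve the quality of questions, answers, and conduct. That is what makes this site so great.
Tags: c# multithreading performance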