【Posted】:2020-09-20 04:50:49
【Question】:
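I have the following code, in which I want to retrieve employee information for a given list of employee IDs. I have validation in place that throws an exception if the count exceeds 1 million. In most cases a request will contain fewer than 200K IDs, so I split the request into 4 partitions, each containing the same number of employee IDs. All 4 partitions run in parallel and are joined together after Task.WhenAll. Can someone give me some tips for improving this further? I looked at ParallelForEachAsync and "Parallel foreach async in C#", but could not get them to work. The code below works, but it is hard-coded to split into 4 partitions. How can I make it more parallel, with dynamic partitioning and a maximum degree of parallelism of 50? If the input is 100K IDs, I want to split it into 10 partitions and run all 10 in parallel.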
public class Service
{
    private async Task<List<EmployeeEntity>> GetInfo(List<long> input)
    {
        var breakup = input.Split(4);
        var result1Task = GetResult(breakup.First().ToList());
        var result2Task = GetResult(breakup.Skip(1).Take(1).First().ToList());
        var result3Task = GetResult(breakup.Skip(2).Take(1).First().ToList());
        var result4Task = GetResult(breakup.Skip(3).Take(1).First().ToList());
        await Task.WhenAll(result1Task, result2Task, result3Task, result4Task);
        List<EmployeeEntity> result1 = await result1Task;
        List<EmployeeEntity> result2 = await result2Task;
        List<EmployeeEntity> result3 = await result3Task;
        List<EmployeeEntity> result4 = await result4Task;
        return result1.Union(result2.Union(result3.Union(result4))).ToList();
    }

    private async Task<List<EmployeeEntity>> GetResult(List<long> employees)
    {
        using var context = new MyAppDBContext();
        var EmployeeBand = await context.EmployeeBand.Where(x => employees.Contains(x.EmployeeId)).ToListAsync();
        var EmployeeClient = await context.EmployeeClient.Where(x => employees.Contains(x.EmployeeId)).ToListAsync();
        return await context.Employee.Where(x => employees.Contains(x.EmployeeId)).ToListAsync();
    }
}
public static class ExtensionMethods
{
    public static List<List<T>> Split<T>(this List<T> myList, int parts)
    {
        int i = 0;
        var splits = from item in myList
                     group item by i++ % parts into part
                     select part.ToList();
        return splits.ToList();
    }
}
public class EmployeeEntity
{
    public EmployeeEntity()
    {
        EmployeeBands = new HashSet<EmployeeBandEntity>();
        EmployeeClients = new HashSet<EmployeeClientEntity>();
    }

    public long EmployeeId { get; set; }
    public ICollection<EmployeeBandEntity> EmployeeBands { get; set; }
    public ICollection<EmployeeClientEntity> EmployeeClients { get; set; }
}

public class EmployeeBandEntity
{
    public long EmployeeBandId { get; set; }
    public long EmployeeId { get; set; }
    public EmployeeEntity EmployeeEntity { get; set; }
}

public class EmployeeClientEntity
{
    public long EmployeeClientId { get; set; }
    public long EmployeeId { get; set; }
    public EmployeeEntity EmployeeEntity { get; set; }
}
public partial class MyAppDBContext : DbContext
{
    public virtual DbSet<EmployeeEntity> Employee { get; set; }
    public virtual DbSet<EmployeeBandEntity> EmployeeBand { get; set; }
    public virtual DbSet<EmployeeClientEntity> EmployeeClient { get; set; }
}
【Comments】:
-
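ToList is the problem. You are materializing the entire data set just to create the tasks.
The input and output of GetResult should be IQueryable<T>.
-
Usually the database is the bottleneck, and in your case that is very likely so. First, you need to find the sweet spot for GetResult. By sweet spot I mean: why stop at only 4 splits? And why not split dynamically, based on where GetResult performs best?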
-
How can this be improved? Do you need to do the parallelism yourself? Why doesn't the database do it for you automatically?
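-
The dynamic partitioning asked about in the question (and hinted at in the sweet-spot comment above) might look roughly like the sketch below. It is not taken from the original post: the names BatchRunner, ChunkBy, RunAsync, batchSize and maxDegreeOfParallelism are hypothetical, and it assumes that each call to the worker opens its own DbContext, as GetResult already does. A SemaphoreSlim caps how many batches are in flight at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original post.
public static class BatchRunner
{
    // Splits `source` into consecutive chunks of at most `size` items.
    public static List<List<T>> ChunkBy<T>(IReadOnlyList<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var chunks = new List<List<T>>();
        for (int start = 0; start < source.Count; start += size)
        {
            int end = Math.Min(start + size, source.Count);
            var chunk = new List<T>(end - start);
            for (int i = start; i < end; i++) chunk.Add(source[i]);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Runs `worker` once per chunk, with at most `maxDegreeOfParallelism`
    // calls in flight at any time, and concatenates the results in chunk order.
    public static async Task<List<TResult>> RunAsync<TInput, TResult>(
        IReadOnlyList<TInput> input,
        int batchSize,
        int maxDegreeOfParallelism,
        Func<List<TInput>, Task<List<TResult>>> worker)
    {
        if (maxDegreeOfParallelism < 1)
            throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));

        using var throttle = new SemaphoreSlim(maxDegreeOfParallelism);

        // ToList() forces every task to be created now; each one waits
        // on the semaphore before it actually calls the worker.
        var tasks = ChunkBy(input, batchSize).Select(async batch =>
        {
            await throttle.WaitAsync();
            try { return await worker(batch); }
            finally { throttle.Release(); }
        }).ToList();

        List<TResult>[] results = await Task.WhenAll(tasks);
        return results.SelectMany(r => r).ToList();
    }
}
```

Inside GetInfo this could be called as `await BatchRunner.RunAsync(input, 10_000, 50, ids => GetResult(ids))`, so that 100K IDs become 10 batches of 10K. The batch size and the cap of 50 are only the figures from the question; whether that many concurrent queries actually helps depends on the database, so both values would need to be measured.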
Tags: c# async-await entity-framework-core parallel.foreach