Optimizing loading performance: Understanding the Async Upload Pipeline

Nobody likes loading screens. Did you know that you can quickly adjust Async Upload Pipeline (AUP) parameters to significantly improve your loading times? This article details how meshes and textures are loaded through the AUP. This understanding could help you speed up loading time significantly – some projects have seen over 2x performance improvements!

Read on to learn how the AUP works from a technical standpoint and what APIs you should be using to get the most out of it.

Try it out

The latest and most optimized implementation of the Async Upload Pipeline is available in the 2018.3 beta.

First, let’s take a detailed look at when the AUP is used and how the loading process works.

When is the Async Upload Pipeline used?

Prior to 2018.3, the AUP only handled textures. Starting with the 2018.3 beta, the AUP loads both textures and meshes, with some exceptions: textures that are read/write enabled, and meshes that are read/write enabled or compressed, do not use the AUP. (Note that Texture Mipmap Streaming, which was introduced in 2018.2, also uses the AUP.)

How the loading process works

During the build process, the Texture or Mesh object is written to a serialized file, and the large binary data (texture or vertex data) is written to an accompanying .resS file. This layout applies to both player data and asset bundles. Separating the object from its binary data allows faster loading of the serialized file (which generally contains small objects) and enables streamlined loading of the large binary data from the .resS file afterward. When the Texture or Mesh object is deserialized, it submits a command to the AUP’s command queue. Once that command completes, the texture or mesh data has been uploaded to the GPU and the object can be integrated on the main thread.

Figure: Layout of mesh and texture data when serialized for a build.

During the upload process, the large binary data from the .resS file is read into a fixed-size ring buffer. Once in memory, the data is uploaded to the GPU in a time-sliced fashion on the render thread. The size of the ring buffer and the duration of the time slice are the two parameters that you can change to affect the behavior of the system.

The Async Upload Pipeline processes each command as follows:

  1. Wait until the required memory is available in the ring buffer.
  2. Read data from the source .resS file into the allocated memory.
  3. Perform post-processing (texture decompression, mesh collision generation, per-platform fixup, etc.).
  4. Upload in a time-sliced manner on the render thread.
  5. Release the ring buffer memory.
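
The five steps above can be sketched as a small frame-by-frame model. This is a toy simulation in plain C#, not Unity code: the class name `AupModel`, the method `FramesToUpload`, and the idea of measuring the time slice as "megabytes uploadable per frame" are all assumptions made for illustration. Post-processing (step 3) is treated as free, and a command larger than the whole buffer is admitted once the buffer is empty.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the AUP command flow. All names and numbers are hypothetical.
public static class AupModel
{
    private sealed class Command
    {
        public int SizeMB;       // memory held in the ring buffer
        public int RemainingMB;  // data still to be uploaded
    }

    // Returns the number of frames needed to upload every command.
    public static int FramesToUpload(int[] commandSizesMB, int bufferMB, int uploadMBPerFrame)
    {
        if (uploadMBPerFrame <= 0)
            throw new ArgumentOutOfRangeException(nameof(uploadMBPerFrame));

        var pending = new Queue<int>(commandSizesMB);  // waiting for ring buffer memory
        var inBuffer = new Queue<Command>();           // read into the ring buffer
        int usedMB = 0;
        int frames = 0;

        while (pending.Count > 0 || inBuffer.Count > 0)
        {
            // Steps 1-2: admit commands while they fit. An oversized command
            // is admitted only when the buffer is empty (modeling assumption).
            while (pending.Count > 0 &&
                   (usedMB + pending.Peek() <= bufferMB || usedMB == 0))
            {
                int size = pending.Dequeue();
                usedMB += size;
                inBuffer.Enqueue(new Command { SizeMB = size, RemainingMB = size });
            }

            // Step 4: upload oldest-first until this frame's budget is spent.
            int budgetMB = uploadMBPerFrame;
            while (budgetMB > 0 && inBuffer.Count > 0)
            {
                Command head = inBuffer.Peek();
                int uploaded = Math.Min(budgetMB, head.RemainingMB);
                head.RemainingMB -= uploaded;
                budgetMB -= uploaded;

                if (head.RemainingMB == 0)
                {
                    usedMB -= head.SizeMB;  // Step 5: release ring buffer memory
                    inBuffer.Dequeue();
                }
            }

            frames++;
        }

        return frames;
    }

    public static void Main()
    {
        int[] sizes = { 3, 3, 3, 3 };
        Console.WriteLine(FramesToUpload(sizes, 4, 4));   // prints 4
        Console.WriteLine(FramesToUpload(sizes, 16, 4));  // prints 3
    }
}
```

With the small buffer, only one 3MB command fits at a time, so 1MB of every frame's upload budget goes unused. With the larger buffer, the budget is always filled and the same work finishes a frame earlier.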

Multiple commands can be in progress simultaneously, but all must allocate their required memory out of the same shared ring buffer. When the ring buffer fills up, new commands wait. This waiting does not block the main thread or affect frame rate; it simply slows the async loading process.

These impacts are summarized as follows:

Load Pipeline Comparison

|                 | Without AUP                                                               | AUP                                         | Impact on you                  |
|-----------------|---------------------------------------------------------------------------|---------------------------------------------|--------------------------------|
| Memory usage    | Allocated out of the default heap as data is read (high memory watermarks) | Fixed-size ring buffer                      | Reduced high memory watermarks |
| Upload process  | Upload as data is available                                               | Amortized uploading with a fixed time slice | Hitchless uploading            |
| Post-processing | Performed on the loading thread (blocks the loading thread)               | Performed on jobs in the background         | Faster loading                 |

What public APIs are available to adjust loading parameters

To take full advantage of the AUP in 2018.3, there are three parameters that can be adjusted at runtime for this system:

  • QualitySettings.asyncUploadTimeSlice – The amount of time in milliseconds spent uploading textures and mesh data on the render thread for each frame. When an async load operation is in progress, the system will perform two time slices of this size. The default value is 2ms. If this value is too small, you could become bottlenecked on texture/mesh GPU uploading. A value too large, on the other hand, might result in framerate hitching.

  • QualitySettings.asyncUploadBufferSize – The size of the ring buffer in megabytes. When the upload time slice occurs each frame, we want to be sure that we have enough data in the ring buffer to utilize the entire time slice. If the ring buffer is too small, the upload time slice will be cut short. The default was 4MB in 2018.2 but has been increased to 16MB in 2018.3.

  • QualitySettings.asyncUploadPersistentBuffer – Introduced in 2018.3, this flag determines whether the upload ring buffer stays allocated (true) or is deallocated (false) when all pending reads are complete. Allocating and deallocating this buffer can often cause memory fragmentation, so it should generally be left at its default (true). If you really need to reclaim memory when you are not loading, you can set this value to false.

These settings can be adjusted through the scripting API or via the QualitySettings menu.

Example workflow

Let’s examine a workload with lots of textures and meshes being uploaded through the Async Upload Pipeline using the default 2ms time slice and a 4MB ring buffer. Since we’re loading, we get 2 time-slices per render frame, so we should have 4 milliseconds of upload time. Looking at the profiler data, we only use about 1.5 milliseconds. We can also see that immediately after the upload, a new read operation is issued now that memory is available in the ring buffer. This is a sign that a larger ring buffer is needed.
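
The arithmetic in this paragraph can be written out explicitly. The helper below is only a sketch in plain C#; `TimeSliceMath` and its method names are hypothetical and are not part of any Unity API.

```csharp
using System;

// Hypothetical helper that reproduces the numbers from the profiler reading.
public static class TimeSliceMath
{
    // Upload budget per render frame: slice length times slices per frame.
    public static double BudgetMs(int timeSliceMs, int slicesPerFrame)
    {
        return timeSliceMs * slicesPerFrame;
    }

    // Fraction of the budget that was actually spent uploading.
    public static double Utilization(double usedMs, int timeSliceMs, int slicesPerFrame)
    {
        return usedMs / BudgetMs(timeSliceMs, slicesPerFrame);
    }

    public static void Main()
    {
        double budget = BudgetMs(2, 2);                // 2 ms slice, 2 slices while loading: 4 ms
        double utilization = Utilization(1.5, 2, 2);   // 1.5 / 4 = 0.375
        Console.WriteLine(budget);
        Console.WriteLine(utilization);
    }
}
```

A utilization of 0.375 means more than half of the available upload time is idle, which is consistent with the ring buffer running dry before the time slice ends.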

Let’s try increasing the ring buffer. Since we’re in a loading screen, it is also a good idea to increase the upload time slice. Here’s what a 16MB ring buffer and a 4-millisecond time slice look like:

Now we can see that we are spending almost all our render thread time uploading, and just a short time between uploads rendering the frame.

Below are the loading times of the sample workload with a variety of upload time slices and ring buffer sizes. Tests were run on a MacBook Pro with a 2.8GHz Intel Core i7 running OS X El Capitan. Upload speeds and I/O speeds will vary on different platforms and devices. The workload is a subset of the Viking Village sample project that we use internally for performance testing. Because other objects are being loaded as well, we can’t isolate the precise performance win of each setting. It’s safe to say in this case, however, that texture and mesh loading is at least twice as fast when switching from the 4MB/2ms settings to the 16MB/4ms settings.

Experimenting with these parameters outputs the following results.

To optimize loading times for this particular sample project, we should, therefore, configure settings like this:

QualitySettings.asyncUploadTimeSlice = 4;
QualitySettings.asyncUploadBufferSize = 16;
QualitySettings.asyncUploadPersistentBuffer = true;

Takeaways and recommendations

General recommendations for optimizing loading speed of textures and meshes:

  • Choose the largest QualitySettings.asyncUploadTimeSlice that doesn’t result in dropped frames.
  • During loading screens, temporarily increase QualitySettings.asyncUploadTimeSlice.
  • Use the profiler to examine the time slice utilization. The time slice shows up as AsyncUploadManager.AsyncResourceUpload in the profiler. Increase QualitySettings.asyncUploadBufferSize if your time slice is not being fully utilized.
  • Things will generally load faster with a larger QualitySettings.asyncUploadBufferSize, so if you can afford the memory, increase it to 16MB or 32MB.
  • Leave QualitySettings.asyncUploadPersistentBuffer set to true unless you have a compelling reason to reduce your runtime memory usage while not loading.
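
The second recommendation, raising the time slice only while a loading screen is showing, could be wired up roughly as follows. This is a minimal sketch rather than a recommended implementation: the class name, the `sceneName` and `loadingTimeSliceMs` fields, and the value of 4 ms are assumptions, and it assumes the next scene is loaded additively so that this component survives the load and can restore the previous value.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical loading-screen component; attach it to an object in the loading scene.
public class LoadingScreenUploadTuner : MonoBehaviour
{
    [SerializeField] private string sceneName = "Level1";   // placeholder scene name
    [SerializeField] private int loadingTimeSliceMs = 4;    // value used in the example workflow

    private IEnumerator Start()
    {
        // Remember the gameplay value so it can be restored afterwards.
        int previousTimeSliceMs = QualitySettings.asyncUploadTimeSlice;

        // Frame rate matters less behind a loading screen, so allow longer uploads.
        QualitySettings.asyncUploadTimeSlice = loadingTimeSliceMs;

        // Additive loading keeps this object alive until the load completes.
        AsyncOperation load = SceneManager.LoadSceneAsync(sceneName, LoadSceneMode.Additive);
        yield return load;

        // Back to the value chosen to avoid dropped frames during gameplay.
        QualitySettings.asyncUploadTimeSlice = previousTimeSliceMs;
    }
}
```

The buffer size is left untouched here because resizing it at runtime means reallocating the ring buffer; setting it once at startup is the simpler choice.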

FAQ

Q: How often will time-sliced uploading occur on the render thread?

  • Time-sliced uploading occurs once per render frame, or twice during an async load operation. VSync affects this pipeline: while the render thread is waiting for a VSync, you could be uploading. If you are running at 16ms frames and one frame runs long, say 17ms, you will end up waiting 15ms for the next VSync. In general, the higher the frame rate, the more frequently upload time slices occur.
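
The 15ms figure follows from rounding the frame time up to the next VSync boundary. The snippet below is a simplified illustration in plain C#; `VsyncMath.WaitMs` is a hypothetical helper, and it assumes a fixed VSync interval with no other scheduling effects.

```csharp
using System;

// Hypothetical helper: time spent waiting for the next VSync after a frame finishes.
public static class VsyncMath
{
    public static double WaitMs(double frameTimeMs, double vsyncIntervalMs)
    {
        // Number of whole VSync intervals needed to cover the frame.
        double intervals = Math.Ceiling(frameTimeMs / vsyncIntervalMs);
        return intervals * vsyncIntervalMs - frameTimeMs;
    }

    public static void Main()
    {
        Console.WriteLine(WaitMs(16, 16));  // frame ends on the boundary: wait is 0
        Console.WriteLine(WaitMs(17, 16));  // frame misses by 1 ms: wait is 2 * 16 - 17 = 15
    }
}
```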

Q: What is loaded through the AUP?

  • Textures that are not read/write enabled are uploaded through the AUP.
  • As of 2018.2, texture mipmaps are streamed through the AUP.
  • As of 2018.3, meshes are also uploaded through the AUP as long as they are uncompressed and not read/write enabled.

Q: What if the ring buffer is not large enough to hold the data being uploaded (for example, a really large texture)?

  • An upload command that is larger than the ring buffer waits until the ring buffer is fully consumed; the ring buffer is then reallocated to fit the large allocation. Once the upload is complete, the ring buffer is reallocated to its original size.

Q: How do synchronous load APIs (for example, Resources.Load and AssetBundle.LoadAsset) work?

  • Synchronous loading calls use the AUP and will essentially block the main thread until the async upload operation completes. The type of loading API used is not relevant.
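
For contrast, here is a minimal sketch of the asynchronous counterpart of Resources.Load, which lets frames keep rendering while the upload is in flight instead of blocking the main thread. The class name and the resource path are placeholders; the example assumes a texture exists at that path under a Resources folder.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component that loads a texture without blocking the main thread.
public class AsyncTextureLoader : MonoBehaviour
{
    [SerializeField] private string resourcePath = "Textures/Example";  // placeholder path

    private IEnumerator Start()
    {
        // Starts the load and returns immediately.
        ResourceRequest request = Resources.LoadAsync<Texture2D>(resourcePath);

        // Resumes this coroutine once the asset has finished loading.
        yield return request;

        Texture2D texture = request.asset as Texture2D;
        if (texture == null)
        {
            Debug.LogWarning("No texture found at " + resourcePath);
            yield break;
        }

        Debug.Log("Loaded " + texture.name + " (" + texture.width + "x" + texture.height + ")");
    }
}
```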

Tell us what you think

We’re always looking for feedback. Let us know what you think in the comments or on the Unity 2018.3 beta forum!

Translated from: https://blogs.unity3d.com/2018/10/08/optimizing-loading-performance-understanding-the-async-upload-pipeline/
