Draft: render/vulkan: Speed up write_pixels

Kenny Levinsen requested to merge kennylevinsen/wlroots:no-staging into master

After stepping through both radeonsi and radv too many times, I realized that the performance difference between the fast gles2 path and slow vulkan path was that radeonsi caches the memory mapping of the target buffer.

At the same time, I noted that radeonsi would only use staging buffers for the first 10 transfers on APUs. After that, it would reallocate the target texture as LINEAR and use direct mappings instead. Only dGPUs or mismatched buffers would lead to continued staging buffer usage.
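As a rough illustration of that heuristic, here is a minimal sketch in plain C. It only models the behaviour described above and is not radeonsi's actual code: `upload_state`, `choose_upload_path`, `STAGING_TRANSFER_LIMIT` and the `buffers_match` flag are all hypothetical names, and "mismatched buffers" is reduced to a single boolean for simplicity.

```c
#include <stdbool.h>

/* Hypothetical model of the heuristic described above; not radeonsi code. */
#define STAGING_TRANSFER_LIMIT 10

struct upload_state {
	bool is_apu;        /* integrated GPU sharing system memory */
	bool buffers_match; /* assumption: source and target are compatible */
	bool linear;        /* target has been reallocated as LINEAR */
	unsigned transfers; /* staged uploads seen so far */
};

enum upload_path { UPLOAD_STAGING, UPLOAD_DIRECT_MAP };

static enum upload_path choose_upload_path(struct upload_state *s) {
	/* dGPUs and mismatched buffers keep using the staging buffer. */
	if (!s->is_apu || !s->buffers_match) {
		return UPLOAD_STAGING;
	}
	/* On APUs, the first few transfers still go through staging. */
	if (s->transfers < STAGING_TRANSFER_LIMIT) {
		s->transfers++;
		return UPLOAD_STAGING;
	}
	/* After that, switch to a LINEAR target and map it directly. */
	s->linear = true;
	return UPLOAD_DIRECT_MAP;
}
```

In this model an APU with matching buffers gets `UPLOAD_STAGING` for its first 10 calls and `UPLOAD_DIRECT_MAP` from then on, while a dGPU never leaves the staging path.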

Skipping the staging buffer makes it easy for us to just cache the memory mapping with the texture, so that is the route I went, netting me around 7x faster memcpy and about 1/5th of the compositor CPU load.
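To make the idea concrete, here is a minimal sketch of caching the mapping alongside the texture. It is not the code in this merge request: `struct texture`, `write_pixels`, `map_fn` and `counting_map` are hypothetical, and the map callback stands in for whatever driver call maps the memory (`vkMapMemory` in a real Vulkan renderer), so the sketch builds without Vulkan headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the call that maps the texture's memory. */
typedef void *(*map_fn)(void *memory);

struct texture {
	void *memory;  /* opaque handle to the backing memory */
	void *mapped;  /* cached CPU pointer, NULL until the first write */
	size_t stride; /* bytes per row in the linear texture */
};

/* Demo stub: counts how often a mapping is requested. */
static int map_calls;
static void *counting_map(void *memory) {
	map_calls++;
	return memory;
}

/* Copy `height` rows of `row_bytes` bytes each into the texture.
 * The mapping is created on first use and reused afterwards. */
static int write_pixels(struct texture *tex, map_fn map, const uint8_t *src,
		size_t src_stride, size_t row_bytes, size_t height) {
	if (tex->mapped == NULL) {
		tex->mapped = map(tex->memory);
		if (tex->mapped == NULL) {
			return -1;
		}
	}
	uint8_t *dst = tex->mapped;
	for (size_t y = 0; y < height; y++) {
		memcpy(dst + y * tex->stride, src + y * src_stride, row_bytes);
	}
	return 0;
}
```

Calling `write_pixels` repeatedly on the same texture invokes the map callback only once; every later upload is just the row-by-row `memcpy` into the cached pointer. A real implementation would also need to unmap when the texture is destroyed and deal with memory coherency, which this sketch leaves out.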

We can later expand this with a parallel path that uses the staging buffer for VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU, effectively mimicking the radeonsi heuristic, but I felt that this was a fine starting point.
