- 26 November 2019
- Benjamin Anuworakarn
It’s time for some exciting news! As usual, our engineers here in the Developer Technology team have been hard at work making your life a little bit easier when developing for PowerVR devices.
Today, we’re releasing PVRPerfDoc, a custom version of the Vulkan® validation layer PerfDoc, tailored to be more useful for developers working with PowerVR hardware. If that’s all you need to hear, why not check out PVRPerfDoc on GitHub?
However, for the uninitiated…
What is PerfDoc?
PerfDoc is a Vulkan validation layer produced by our friends at ARM. Validation layers are an important component of the Vulkan API, and they are part of the reason Vulkan has a much lower CPU overhead than older APIs like OpenGL® ES. Vulkan itself has very little built-in error checking. Instead, a developer can enable a layer, such as the LunarG validation layer, which checks that the API is being used correctly while the application runs.
So, what makes PerfDoc different from any other validation layer? Well, instead of checking whether Vulkan is being used correctly (in line with the Vulkan specification), it focuses on whether the API is being used efficiently. When a Vulkan application is run with the PerfDoc layer active, any issues it finds are reported to the application as callbacks or printed in an available console. This means inefficient uses of the API, such as creating a pipeline without a pipeline cache or using too many instanced vertex buffers, can be caught and fixed during development. Addressing the issues raised by PerfDoc can improve application performance and get the best out of Vulkan.
Unfortunately, since PerfDoc validates API usage against ARM’s performance recommendations for Mali, some of these recommendations may actually hurt performance on PowerVR hardware.
This is where we come in: we’ve forked PerfDoc to validate API usage against our own performance recommendations for PowerVR hardware.
What have we done?
We’ve customised PerfDoc for PowerVR in two ways:
- We’ve made all of the standard checks toggle-able, so you can select which of the original performance recommendations your application will be checked against. By default, we’ve enabled the checks which tell you how to improve performance on PowerVR.
- Next, we’ve added lots of extra checks which are based on our own PowerVR Performance Recommendations. This set of recommendations was developed by our experienced engineers here in the Developer Technology team, to help you get the most out of your applications running on PowerVR devices.
Whether you’re a Vulkan newbie or a more experienced developer, our customised layer is a great way to ensure you’re always thinking about performance when developing for PowerVR with Vulkan.
For the rest of this post we’ll take a look at some of the performance recommendations that this new layer checks against. This will give you an idea of how this new layer can help you.
Remember to use mipmapped and compressed textures
Using compressed texture formats, such as PVRTC, ASTC, and ETC2, together with mipmapping helps to reduce the strain on memory bandwidth, because the texture data being loaded is smaller. Memory bandwidth is a common bottleneck for graphics-heavy applications, so reducing bandwidth usage will generally give a nice performance boost.
The layer will output an error when a texture uses an uncompressed format or doesn’t have any mipmap levels specified in its associated image view. The exception is when the image view is a render target.
An easy way to get mipmapped and compressed textures is with our texture processing tool, PVRTexTool. This powerful tool allows texture encoding into a variety of compressed and uncompressed formats including PVRTC, ASTC, and ETC. You can also generate a full mipmap chain with just one click.
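To get a feel for what a full mipmap chain involves, here is a minimal sketch in C. It is our own illustration rather than anything taken from PVRPerfDoc or PVRTexTool, and the function names `mip_level_count` and `mip_chain_texels` are hypothetical. The first function gives the number of levels in a full chain (the kind of value you would pass as `mipLevels` when creating a Vulkan image), and the second adds up the texels across the whole chain.

```c
#include <stdint.h>

/* Number of levels in a full mipmap chain: halve the largest
   dimension until it reaches 1, counting each level on the way.
   A zero-sized image has no levels. */
uint32_t mip_level_count(uint32_t width, uint32_t height)
{
    uint32_t largest = width > height ? width : height;
    uint32_t levels = 0;
    while (largest > 0) {
        ++levels;
        largest >>= 1;
    }
    return levels;
}

/* Total number of texels across every level of the chain.
   Each dimension halves per level and is clamped to 1. */
uint64_t mip_chain_texels(uint32_t width, uint32_t height)
{
    uint64_t total = 0;
    uint32_t levels = mip_level_count(width, height);
    for (uint32_t i = 0; i < levels; ++i) {
        uint32_t w = width >> i;
        uint32_t h = height >> i;
        if (w == 0) w = 1;
        if (h == 0) h = 1;
        total += (uint64_t)w * h;
    }
    return total;
}
```

For a 256x256 texture this gives 9 levels and 87,381 texels in total, against 65,536 for the base level alone. The chain costs roughly a third more memory, but distant surfaces then sample from the much smaller levels.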
Try to use indexed draw calls when a call involves a lot of geometry
Indexed draw calls can help to reduce the amount of geometry by eliminating redundant vertices in the vertex buffer. PowerVR GPUs are also optimised for indexed draw calls. This means you can really notice the difference when rendering scenes with complex geometry.
If you really want to push your performance further, you can also sort both the index and vertex buffers. This will improve cache efficiency. PVRGeoPOD, our scene exporter and optimisation tool, can do this automatically for you.
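If you are wondering where the savings from indexing come from, the sketch below turns a plain vertex list into a vertex buffer of unique vertices plus an index buffer. It is a simplified illustration under our own assumptions: the `Vertex` type and the `build_index_buffer` function are hypothetical, vertices are compared byte for byte, and the search is a simple linear scan. It is not how PVRGeoPOD works internally, and a real exporter would likely use a hash table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical position-only vertex, used purely for illustration. */
typedef struct { float x, y, z; } Vertex;

/* Deduplicates `count` input vertices.
   unique_out  receives each distinct vertex once (room for `count`).
   indices_out receives one index per input vertex (room for `count`).
   Returns the number of unique vertices written. */
size_t build_index_buffer(const Vertex *in, size_t count,
                          Vertex *unique_out, uint32_t *indices_out)
{
    size_t unique_count = 0;
    for (size_t i = 0; i < count; ++i) {
        /* Linear search for a byte-identical vertex seen earlier. */
        size_t j = 0;
        while (j < unique_count &&
               memcmp(&in[i], &unique_out[j], sizeof(Vertex)) != 0) {
            ++j;
        }
        if (j == unique_count) {
            unique_out[unique_count++] = in[i]; /* first occurrence */
        }
        indices_out[i] = (uint32_t)j;
    }
    return unique_count;
}
```

A quad drawn as two triangles is the classic case: six vertices go in, but only four unique vertices come out, along with six small indices. On a real mesh, where most vertices are shared between several triangles, the saving is much larger.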
Ensure framebuffer compression is used as much as possible
PowerVR GPUs use PowerVR Image Compression (PVRIC), Imagination’s proprietary, lossless framebuffer compression and decompression (FBCDC) algorithm. This compression scheme helps to reduce the demands on memory bandwidth by shrinking image sizes by around 50%. The layer will advise when framebuffer compression should be used.
Try to use optimal subpass dependency flags when creating subpasses
The layer can check which subpass dependency flag has been set when a subpass is created, and output an error if the selected flag is not optimal for PowerVR architectures. For PowerVR and other tile-based architectures, VK_DEPENDENCY_BY_REGION_BIT is the best flag for performance, as whole tiles can be kept in fast on-chip memory.
Avoid using partial clears of the framebuffer
Clear commands which do not clear the entire framebuffer can be detected with this layer. Partial clears are generally bad for performance as they result in overdraw.
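As a rough idea of what counts as a partial clear, the sketch below tests whether a clear rectangle covers the whole framebuffer. This is our own simplified illustration and not the check as implemented in PVRPerfDoc. The flat `Rect` type and the `is_partial_clear` function are hypothetical (Vulkan’s own VkRect2D nests an offset and an extent instead).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flat rectangle: signed offset, unsigned size. */
typedef struct {
    int32_t  x, y;
    uint32_t width, height;
} Rect;

/* Returns true when `clear` leaves any part of a framebuffer of
   fb_width x fb_height untouched. */
bool is_partial_clear(Rect clear, uint32_t fb_width, uint32_t fb_height)
{
    /* Widen to 64 bits so offset + size cannot overflow. */
    int64_t right  = (int64_t)clear.x + (int64_t)clear.width;
    int64_t bottom = (int64_t)clear.y + (int64_t)clear.height;

    bool covers_all = clear.x <= 0 && clear.y <= 0 &&
                      right  >= (int64_t)fb_width &&
                      bottom >= (int64_t)fb_height;
    return !covers_all;
}
```

With a 1920x1080 framebuffer, a clear of the full 1920x1080 area passes, while a clear of only the left half, or one offset by a few pixels, is flagged as partial.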
Keep pipeline optimisation enabled
When creating a pipeline in Vulkan, you can set the flags parameter to VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT. This specifies that the created pipeline will not be optimised, which may help to reduce pipeline creation time. However, it should only be set in debug or developer builds, not release builds, as it can harm overall performance.
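One way to make sure the flag never slips into a release build is to route your pipeline creation flags through a small helper, sketched below. The helper name `pipeline_create_flags` and the `EXAMPLE_DISABLE_OPTIMIZATION_BIT` constant are hypothetical. The constant is a local stand-in with the same value as VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT so that the sketch compiles without the Vulkan headers. In a real application you would use the Vulkan constant directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT,
   which has the value 0x00000001 in the Vulkan headers. */
#define EXAMPLE_DISABLE_OPTIMIZATION_BIT 0x00000001u

/* Returns the flags to use at pipeline creation. In release builds
   the disable-optimisation bit is always cleared; every other
   requested flag is passed through unchanged. */
uint32_t pipeline_create_flags(uint32_t requested, bool release_build)
{
    if (release_build) {
        return requested & ~EXAMPLE_DISABLE_OPTIMIZATION_BIT;
    }
    return requested;
}
```

The result would then be assigned to the flags member of your pipeline create info. Whether a build counts as a release build is up to your own build system, for example a compile-time define.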
Split multiple renderpasses into subpasses wherever possible
If you’re still getting used to Vulkan, it can be easy to forget about subpasses. In Vulkan, renderpasses can be divided into different phases of rendering called subpasses. Each of these subpasses has its own specified dependencies. This allows the driver to perform various optimisations, as it knows exactly what you’re trying to do. The layer can detect when multiple renderpasses are set to act on the same render target and will warn you. In these cases, it’s best to merge the renderpasses into multiple subpasses within a single renderpass.
But there’s much more…
We’ve added 15 new checks in total. These include:
- Checking whether textures are using linear or optimal tiling layout
- Ensuring the workgroup size is optimal for PowerVR
- Advising on texture formats
- And more…