We are very excited to announce that PowerVR Series6 GPUs have achieved full conformance with OpenGL® ES 3.1, the latest version of the well-known graphics API developed and maintained by the Khronos Group. Being able to offer PowerVR developers access to the latest graphics and compute feature set is something we firmly believe in and work really hard to achieve for every family of GPUs we release.
OpenGL ES 3.1 brings a number of improvements over the previous API release, but perhaps the most exciting feature is the presence of compute shaders. The demo presented below offers an overview of how OpenGL ES 3.1 compute shaders can be used for image processing. Take it away, James!
At the most basic level, the Face Detect demo is a chained sequence of image processing algorithms. The input image is taken either from a video, live camera feed or static image and the end result is a box drawn around regions that contain faces.
All images used in the algorithm are 32-bit RGBA8, and the input image is taken at a typical standard-definition resolution (640×480). Because each of the four channels can hold one binary image, a single RGBA8 image can store up to four binary images.
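To make the channel packing concrete, here is a minimal CPU-side sketch in Python. It is not the demo's code: the function names `pack_masks` and `unpack_mask` are hypothetical, and the convention that a set bit is stored as 255 is an assumption made for illustration.

```python
def pack_masks(masks):
    """Pack up to four binary masks (rows of 0/1) into one RGBA8 image.

    Each output pixel is an (r, g, b, a) tuple. Assumption: a set mask
    bit is stored as 255 and a clear bit as 0.
    """
    if not 1 <= len(masks) <= 4:
        raise ValueError("expected between one and four masks")
    height, width = len(masks[0]), len(masks[0][0])
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            channels = [255 if m[y][x] else 0 for m in masks]
            channels += [0] * (4 - len(channels))  # unused channels stay black
            row.append(tuple(channels))
        image.append(row)
    return image


def unpack_mask(image, channel):
    """Recover one binary mask from a channel index (0 = R ... 3 = A)."""
    return [[1 if px[channel] else 0 for px in row] for row in image]
```

With this layout, one texture fetch returns all packed masks for a pixel at once, which is the practical appeal of storing one binary image per channel.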
The compute shader feature inside OpenGL ES 3.1 was used for each of the following steps, apart from Region Analysis which was done on the CPU. In previous versions of OpenGL ES, developers would have implemented the algorithms below using a traditional fragment shader. However, since compute shaders are now available in OpenGL ES 3.1, we have used them for efficiency reasons.
De-noise
When retrieving data from cameras, the images are often noisy. By using a 3×3 median filter in an OpenGL ES 3.1 compute shader, we can reduce the noisy appearance of the image. This step is essential as the subsequent algorithms (the threshold in particular) are very noise-sensitive. For more information see this article on Wikipedia.
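For readers unfamiliar with the technique, a 3×3 median filter can be sketched on the CPU as follows. This is a plain Python reference for a single channel, not the compute shader used in the demo; clamping coordinates at the image border is an assumption, as the original does not say how edges are handled.

```python
def median3x3(img):
    """Apply a 3x3 median filter to a single-channel image (rows of ints).

    Assumption: coordinates outside the image are clamped to the nearest
    edge pixel.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    window.append(img[yy][xx])
            window.sort()
            out[y][x] = window[4]  # middle of the nine sorted samples
    return out
```

A single bright outlier surrounded by uniform pixels never reaches the middle of the sorted window, so it is replaced by the surrounding value. For an RGBA image the same filter would be applied per channel.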
Segmentation (Threshold)
Once the image has been de-noised, pixels in the image that represent skin are classified into a binary image. A full-bright value represents skin and black is written for pixels that do not represent skin. This pass is done on the PowerVR Rogue GPU using a threshold compute shader. A similar procedure is done for lips; the proximity of skin and lip regions in the image will be a good indicator of a face when we later perform region analysis.
The image above shows the output from skin and lip segmentation
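The shape of such a threshold pass can be sketched in Python as below. The classification rules and every numeric threshold here are placeholders invented for illustration; the demo's actual skin and lip criteria are not given in this article. Writing skin to the red channel and lips to the green channel is likewise an assumption.

```python
def is_skin(r, g, b):
    # Hypothetical rule with placeholder thresholds, NOT the demo's classifier.
    return r > 95 and g > 40 and b > 20 and r > g and r > b and r - min(g, b) > 15


def is_lip(r, g, b):
    # Hypothetical rule: strongly red-dominant pixels. Placeholder thresholds.
    return r > 120 and r - g > 60 and r - b > 40


def segment(image):
    """Classify rows of (r, g, b) pixels into an RGBA8 image of binary masks.

    Assumption: skin goes to the red channel and lips to the green channel,
    with 255 for a match and 0 otherwise.
    """
    return [
        [(255 if is_skin(*px) else 0, 255 if is_lip(*px) else 0, 0, 0) for px in row]
        for row in image
    ]
```

Each output pixel depends only on the matching input pixel, which is why this step maps naturally onto one compute shader invocation per pixel.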
Downsample
After the segmentation step we are left with a binary image that is white for all skin pixels. However, because the threshold is not a perfect classifier we also have some false positives and false negatives. To remove these misclassified pixels, we can perform a sequence of morphological dilate and erode steps – which work in tandem to remove outliers. These are expensive operations so before we do them we downsample the image.
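As a rough illustration of the downsample step, the sketch below reduces a binary mask by an integer factor. Both the factor of 2 and the rule that an output pixel is white when at least half of its source block is white are assumptions; the article does not state which factor or reduction the demo uses.

```python
def downsample(mask, factor=2):
    """Reduce a binary mask (rows of 0/1) by an integer factor.

    Assumption: an output pixel is set when at least half of the pixels
    in its factor x factor source block are set.
    """
    height, width = len(mask) // factor, len(mask[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = sum(
                mask[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            row.append(1 if 2 * total >= factor * factor else 0)
        out.append(row)
    return out
```

Under the assumed factor of 2, a 640×480 mask becomes 320×240, so the later morphological passes touch a quarter as many pixels.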
Morphological Dilate and Erode
These two filters are implemented as OpenGL ES 3.1 compute shaders. They work by taking the max and the min for a group of neighbouring pixels. In the demo, erosion uses 14 neighbours that are sampled in a diamond pattern; dilate uses 16 neighbour samples. Running these filters over the image essentially blobs together dense speckles of white pixels and filters away sparse speckles. For more information see this article on Morphological Image Processing.
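The max/min idea can be sketched in Python as follows. The diamond here is a generic one defined by |dx| + |dy| <= radius and does not reproduce the demo's 14- and 16-sample patterns, which are not described in detail. The border handling (outside pixels count as black for dilate and white for erode) is also an assumption.

```python
def diamond(radius):
    """Offsets (dx, dy) inside a diamond, including the centre pixel."""
    return [
        (dx, dy)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if abs(dx) + abs(dy) <= radius
    ]


def _morph(mask, offsets, reduce_fn, border):
    height, width = len(mask), len(mask[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            samples = [
                mask[y + dy][x + dx]
                if 0 <= y + dy < height and 0 <= x + dx < width
                else border  # assumed value for out-of-image samples
                for dx, dy in offsets
            ]
            row.append(reduce_fn(samples))
        out.append(row)
    return out


def dilate(mask, offsets):
    """Each pixel becomes the max of its neighbourhood."""
    return _morph(mask, offsets, max, 0)


def erode(mask, offsets):
    """Each pixel becomes the min of its neighbourhood."""
    return _morph(mask, offsets, min, 1)
```

Eroding and then dilating removes isolated white pixels while roughly preserving larger blobs; dilating and then eroding fills small gaps inside dense regions.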
Region Analysis
The region analysis pass essentially identifies blobs of skin that are roughly face-shaped. This pass is performed on the CPU for simplicity; an optimal implementation could make use of the PowerVR GPU, but such an implementation was beyond our capability under the given time and development constraints.
After the final pass, we can correctly recognize David Duchovny’s face
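One simple way to approach region analysis on the CPU is to label connected blobs, take their bounding boxes and keep those with face-like proportions. The sketch below does exactly that; it is not the demo's implementation, and the 4-connectivity, the minimum size and the aspect-ratio limits in the hypothetical `looks_like_face` helper are all illustrative assumptions.

```python
from collections import deque


def find_regions(mask):
    """Return inclusive bounding boxes (x0, y0, x1, y1) of 4-connected blobs."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for sy in range(height):
        for sx in range(width):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill from this unvisited white pixel.
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:
                x, y = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def looks_like_face(box, min_side=3, min_ratio=1.0, max_ratio=2.0):
    """Hypothetical shape test: placeholder size and height/width limits."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0 + 1, y1 - y0 + 1
    return w >= min_side and h >= min_side and min_ratio <= h / w <= max_ratio
```

The boxes that pass the shape test are the candidates a renderer would outline. A fuller version could also check that a lip region lies inside or near each skin box, in line with the proximity cue mentioned in the segmentation step.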
Several of our customers have already publicly announced 32- and 64-bit application processors that use PowerVR. Here is a short list of what is available so far:
Our developer online store already offers a number of Android tablets with PowerVR Rogue GPUs which you can purchase right away; we promise to keep updating it with more devices as they become available.
The FaceDetect demo was first shown at GDC 2014, where we received very positive feedback from a lot of developers, all very eager to try compute shaders too. As a matter of fact, in a recent poll we conducted among PowerVR Insider members, this feature was the most popular among the new functionality offered in OpenGL ES 3.1, so we look forward to seeing more applications making use of it in the future.
To find out more about PowerVR GPUs, follow us on Twitter (@ImaginationTech) and keep coming back to our blog.
Editor’s Note
* PowerVR Rogue GPUs are based on published Khronos specifications and have passed the Khronos Conformance Testing Process. Current conformance status can be found at www.khronos.org/conformance.
OpenGL is a registered trademark and the OpenGL ES logo is a trademark of Silicon Graphics Inc. used by permission by Khronos.