A library for performing GPGPU (General-Purpose computing on GPU) computations on the BeagleBone Black using OpenGL ES.
Originally conceived during Google Summer of Code 2021 for beagleboard.org.
Although this README contains most of the necessary details, please visit the project blog for more in-depth descriptions and a development diary.
With this library you can accelerate your computations using the SGX GPU built into the BBB. Beware! GPU <=> CPU transfers add overhead, so choose your computations wisely!
The motivation for the project was the lack of heterogeneous computing on the BBB platform: most computations were done on the CPU, on the PRU, or on both. Meanwhile, the GPU on the SoC lay mostly untouched, apart from the rare occasions where rendering was required. This is unacceptable, and this project aims to change that!
Usually, one would simply use compute shaders and focus on writing efficient GPU computing code. However, we are limited to OpenGL ES 2.0, which does not support this kind of shader. Therefore, we must resort to a hack: we trick the GPU into believing that we want to render something, while our shaders actually just perform computations.
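To make the trick more concrete, here is a minimal CPU-side sketch in C of what happens conceptually during such a "render": a fragment function runs once per output texel, reads its inputs by coordinate, and produces one result. The names `Inputs`, `fragment_add` and `render_pass` are hypothetical and are not part of this library's API; on the GPU the same role is played by a GLSL ES fragment shader drawn over a full-screen quad.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU model of a render pass; NOT this library's API. */
typedef struct {
    size_t width;
    size_t height;
    const float *a;  /* first input "texture", width * height values */
    const float *b;  /* second input "texture", same size */
} Inputs;

/* Plays the role of a fragment shader: computes ONE output texel. */
typedef float (*fragment_fn)(const Inputs *in, size_t x, size_t y);

static float fragment_add(const Inputs *in, size_t x, size_t y)
{
    size_t i = y * in->width + x;
    return in->a[i] + in->b[i];
}

/* Plays the role of drawing a full-screen quad: the fragment function is
 * invoked once per output texel. On a GPU these invocations run in
 * parallel; here they run in a plain loop. */
static void render_pass(const Inputs *in, fragment_fn shader, float *out)
{
    assert(in != NULL && shader != NULL && out != NULL);
    for (size_t y = 0; y < in->height; ++y)
        for (size_t x = 0; x < in->width; ++x)
            out[y * in->width + x] = shader(in, x, y);
}
```

Swapping `fragment_add` for another function changes the computation without touching the "rendering" loop, which mirrors how a different shader changes the operation on the GPU.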
The targeted chips are the SGX530 (on the BBB) and the SGX544 (on the BBAI - TODO: yet untested).
Support for other platforms should follow quite easily (the BBB was the target of this project and is therefore the preferred platform), assuming the system has OpenGL ES and EGL libraries preinstalled and exposes the appropriate video devices (/dev/dri/render*).
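As a quick sanity check on a new platform, the presence of the render devices can be tested with a few lines of C. This is only a sketch using the POSIX `glob()` call; the helper name `count_device_nodes` is hypothetical and not part of this library.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>

/* Hypothetical helper: counts filesystem entries matching `pattern`.
 * Returns 0 when nothing matches and -1 on any other glob() error. */
static int count_device_nodes(const char *pattern)
{
    assert(pattern != NULL);

    glob_t matches;
    int rc = glob(pattern, 0, NULL, &matches);
    if (rc == GLOB_NOMATCH)
        return 0;
    if (rc != 0)
        return -1;

    int count = (int)matches.gl_pathc;
    globfree(&matches);
    return count;
}
```

Calling `count_device_nodes("/dev/dri/render*")` and getting 0 suggests that the required video devices are missing on that system.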
Caveat:
For the time being, you need a headless dummy plug (similar to this one) to simulate a connected display device. This is being worked on with Imagination engineers here.
This library allows you to call either:
- a single operation (single-shot)
- a chain of identical or different operations (chain)
Judging from experience, you will most likely call a chain of operations, which is offloaded to the GPU and whose results are collected only after all of them are done. Nonetheless, some operations may be heavy enough to be worth executing just once!
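Why a chain tends to pay off can be illustrated with a rough cost model. The sketch below assumes that every single-shot call pays for an upload, the computation and a download, while a chain uploads once, runs all operations and downloads once. This is a simplification made for illustration, not a measured property of the library, and the type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical cost model; all times are in milliseconds. */
typedef struct {
    double upload_ms;    /* CPU -> GPU transfer of the input data */
    double compute_ms;   /* GPU time for one operation */
    double download_ms;  /* GPU -> CPU transfer of the result */
} OpCost;

/* n_ops independent single-shot calls: each pays both transfers. */
static double single_shot_total_ms(OpCost c, unsigned n_ops)
{
    assert(c.upload_ms >= 0.0 && c.compute_ms >= 0.0 && c.download_ms >= 0.0);
    return n_ops * (c.upload_ms + c.compute_ms + c.download_ms);
}

/* One chain of n_ops operations: transfers are paid only once
 * (assumption: intermediate results stay on the GPU). */
static double chain_total_ms(OpCost c, unsigned n_ops)
{
    assert(c.upload_ms >= 0.0 && c.compute_ms >= 0.0 && c.download_ms >= 0.0);
    if (n_ops == 0)
        return 0.0;
    return c.upload_ms + n_ops * c.compute_ms + c.download_ms;
}
```

Under this model both variants cost the same for a single operation, and the chain's advantage grows with every additional operation because the transfer cost is no longer repeated.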
As mentioned earlier, this section guides you through the BBB-specific steps (noting which steps are common to other platforms).
In order to prepare your environment, you first need to install the BBB image containing the necessary libraries. Flashing is described in detail here.
You can also follow the steps listed here to move the system to on-board eMMC.
You can run the commands below either on the host or on the target, depending on where you want to run the library.
sudo apt update && sudo apt install cmake
Clone the repository:
git clone https://github.com/JDuchniewicz/GPGPU-with-GLES
Create the build folder and enter it:
mkdir GPGPU-with-GLES/cmake-build && cd GPGPU-with-GLES/cmake-build
Run the cmake command:
cmake ..
If running on your host, you need to specify the EGL and OpenGL ES include directories:
cmake .. -DEGL_INCLUDE_DIR=/usr/include/ -DGLES2_INCLUDE_DIR=/usr/include
Finally, build it:
make
Examples of how to use the library are in the examples directory and are compiled into the bin folder, from which they can be run.
This repository contains benchmarking code that you can extend to benchmark your own shaders or sequences of operations. The binary that runs these benchmarks is in the bin/benchmark directory.
The following two graphs show the performance of a 2D convolution with a 5x5 kernel when called using the chain API and repeated twice:
More benchmarks can be seen in the benchmarking post.
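For readers who want a CPU baseline to compare against, a 5x5 2D convolution can be sketched in plain C as follows. This is a reference implementation written for illustration, not the shader used by the library: the function name is hypothetical, the kernel is applied without flipping (strictly speaking a correlation), and texels outside the image are assumed to be zero.

```c
#include <assert.h>
#include <stddef.h>

#define KERNEL_SIZE 5

/* Hypothetical CPU reference for a 5x5 2D convolution.
 * Assumptions: row-major float data, kernel applied without flipping,
 * out-of-image texels count as zero. */
static void conv2d_5x5_reference(const float *in, float *out,
                                 size_t width, size_t height,
                                 const float kernel[KERNEL_SIZE * KERNEL_SIZE])
{
    assert(in != NULL && out != NULL && kernel != NULL);

    const long half = KERNEL_SIZE / 2;
    const long w = (long)width;
    const long h = (long)height;

    for (long y = 0; y < h; ++y) {
        for (long x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (long ky = -half; ky <= half; ++ky) {
                for (long kx = -half; kx <= half; ++kx) {
                    long sx = x + kx;
                    long sy = y + ky;
                    if (sx < 0 || sy < 0 || sx >= w || sy >= h)
                        continue;  /* zero padding at the borders */
                    acc += in[sy * w + sx] *
                           kernel[(ky + half) * KERNEL_SIZE + (kx + half)];
                }
            }
            out[y * w + x] = acc;
        }
    }
}
```

Each output element needs up to 25 multiply-accumulate operations, which is the kind of per-element workload that can help offset the transfer overhead mentioned earlier.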
Important! Refer to the benchmarks to assess whether your algorithm is suitable for GPGPU computing and, specifically, to choose a proper size for your data. Remember that texture sizes must be powers of 2, up to a maximum of 2048 (on the BBB). Feel free to submit a PR with your proposed algorithm and benchmarks for it :)
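The texture size constraint can be checked before any data is handed to the GPU. The sketch below picks the smallest power-of-two square side that fits a given number of elements, assuming one element per texel and a square texture. Both are assumptions made for illustration, and the helper names are hypothetical rather than part of this library.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TEXTURE_SIDE 2048u  /* limit stated for the BBB */

/* Returns 1 if v is a power of two, 0 otherwise (0 is not a power of two). */
static int is_power_of_two(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: smallest power-of-two side of a square texture
 * that holds n_elements (assumption: one element per texel).
 * Returns 0 if n_elements is 0 or does not fit within the limit. */
static unsigned texture_side_for(size_t n_elements)
{
    if (n_elements == 0)
        return 0;
    for (unsigned side = 1; side <= MAX_TEXTURE_SIDE; side *= 2) {
        assert(is_power_of_two(side));
        if ((size_t)side * side >= n_elements)
            return side;
    }
    return 0;  /* too large for a single texture */
}
```

For example, 1000 elements fit into a 32x32 texture, while anything above 2048 * 2048 elements would have to be split into several passes under these assumptions.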
All contributions are welcome! Most importantly, the library is still missing several operations:
- [ ] 1D convolution
- [ ] Matrix multiplication
- [ ] Your operation of choice :)