C++Now 2016 has ended
Friday, May 13 • 2:30pm - 4:00pm
HPX and GPU parallelized STL

The concept of executors, introduced by proposal N4406, makes it possible to choose the execution platform for STL algorithms flexibly and dynamically. HPX implements executors to support serial and parallel execution of algorithms on CPUs. However, current solutions do not utilize the power of GPUs. Support for GPGPU frameworks has become standard in modern graphics processors, but use of the computing power they provide is still far from common. The most frequent reasons include a less intuitive programming model, a more complex architecture of both memory and streaming processors, and the host-device paradigm, which enforces manual data transfer and synchronization. There is a need for parallelization that extends beyond CPUs and can be introduced into applications easily, even by programmers who are experienced in neither the architecture nor the programming languages of GPUs.
We have been working on bringing GPU parallelization into the HPX implementation of the parallel STL. For this task, we have chosen two industry standards for automatic parallelization of C++ into device code: C++ AMP, an open standard proposed by Microsoft, and SYCL, which has recently been introduced by the Khronos Group. Our approach is based on code generated for OpenCL, which makes it fully portable and platform-independent. Our solution extends the existing implementation with both synchronous and asynchronous GPU executors. This approach minimizes the work of the developer, who is not forced to write code responsible for data transfers. The templated design of the algorithms encapsulates all buffering and synchronization necessary for execution on the GPU. An additional configuration option allows performance to be tuned by choosing the block size for GPU threads.
HPX has been integrated and tested with two recent Clang-based compilers that are still under development: the open-source Kalmar, which implements the C++ AMP standard, and the commercial ComputeCpp, which provides support for Khronos SYCL. Although neither compiler is yet bug-free or ready for release, the results of our project show that these standards can be integrated with an existing, complex and mature C++ project.
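
The executor idea described above can be illustrated with a minimal sketch in standard C++. It is not HPX's API and it does not target a GPU: `sequential_executor`, `chunked_async_executor`, `bulk_execute` and `for_each_with` are hypothetical names chosen for this example, and CPU threads started with `std::async` stand in for the device. The sketch only shows how an algorithm templated on an executor can hide scheduling and synchronization from the caller, and how a configurable chunk size (playing the role of the block size mentioned in the abstract) could be exposed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical executor: runs every index on the calling thread.
struct sequential_executor {
    template <typename F>
    void bulk_execute(std::size_t n, F f) const {
        for (std::size_t i = 0; i < n; ++i) f(i);
    }
};

// Hypothetical executor: splits the index range into chunks and runs each
// chunk as a separate asynchronous task. CPU threads are an assumption made
// for this sketch; they stand in for a GPU back end.
struct chunked_async_executor {
    std::size_t chunk_size;  // illustrative analogue of a block size

    template <typename F>
    void bulk_execute(std::size_t n, F f) const {
        assert(chunk_size > 0);
        std::vector<std::future<void>> tasks;
        for (std::size_t begin = 0; begin < n; begin += chunk_size) {
            const std::size_t end = std::min(n, begin + chunk_size);
            tasks.push_back(std::async(std::launch::async, [=] {
                for (std::size_t i = begin; i < end; ++i) f(i);
            }));
        }
        // The executor owns the synchronization: wait for all chunks.
        for (auto& t : tasks) t.get();
    }
};

// Algorithm templated on the executor. The caller chooses where the work
// runs and never touches task creation or waiting.
template <typename Executor, typename T, typename F>
void for_each_with(const Executor& exec, std::vector<T>& data, F f) {
    T* p = data.data();
    exec.bulk_execute(data.size(), [p, f](std::size_t i) { f(p[i]); });
}
```

Under these assumptions, a call such as `for_each_with(chunked_async_executor{2}, v, [](int& x) { x *= 2; });` doubles every element of a `std::vector<int> v`, and swapping in `sequential_executor{}` changes only where the work runs, not the calling code.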

Marcin Copik

I am a final-year Master's student at the German Research School for Simulation Sciences/RWTH Aachen. I obtained my Bachelor's degree in Computer Science from the Silesian University of Technology. I worked with the STE||AR Group during Google Summer of Code 2015 and I came...

Friday May 13, 2016 2:30pm - 4:00pm MDT