The HPC team at RWTH Aachen University celebrated a major success in September, bringing home two Best Paper Awards. Semih Burak received the Rusty Lusk Award for the Best Paper at EuroMPI/Australia 2024 for his paper titled “SPMD IR: Unifying SPMD and Multi-value IR Showcased for Static Verification of Collectives”. At the International Workshop on OpenMP (IWOMP 2024), co-located with EuroMPI and also held in Perth, Australia, Jannis Klinkenberg took first place with his paper “Towards Locality-Aware Host-to-Device Offloading in OpenMP”. Both EuroMPI and IWOMP are well-established key events for MPI and OpenMP, two leading parallel programming paradigms that are widely used on High Performance Computing (HPC) clusters.
The paper by Jannis Klinkenberg et al., “Towards Locality-Aware Host-to-Device Offloading in OpenMP”, addresses the optimization of data transfers between host and device memory in heterogeneous computing systems programmed with OpenMP, such as CPU-GPU architectures. Today, these systems often comprise multiple sockets and multiple GPUs per compute node, which introduces performance variability because memory access costs differ across Non-Uniform Memory Access (NUMA) domains. Suboptimal offloading strategies and device selections therefore frequently cause non-local memory accesses and, with them, performance problems. Existing programming models, such as OpenMP, lack robust features to account for the locality between CPU cores, data, and devices, which limits their ability to make optimal choices for efficient processing.
The work examines offloading performance between CPU cores and GPUs and proposes OpenMP API extensions that prioritize nearby GPUs for faster data transfer. A prototype implementation within the LLVM OpenMP runtime and experiments on two recent heterogeneous architectures with NVIDIA and AMD GPUs confirm that the locality-aware approach significantly improves computational efficiency and performance in systems with multiple GPUs.
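To make the idea of preferring nearby GPUs more concrete, the following is a minimal C++ sketch of locality-aware device selection. It is an illustration only and not the API extension proposed in the paper or the logic of the LLVM prototype. The `GpuInfo` struct, the `pick_nearest_gpu` function, and the distance table are hypothetical names introduced here, and the sketch assumes that the NUMA domain of each GPU and the distances between NUMA domains are already known.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical description of one GPU: its device number and the
// NUMA domain it is attached to (assumed to be known in advance).
struct GpuInfo {
    int device_id;
    int numa_node;
};

// Return the device_id of the GPU whose NUMA domain is closest to the
// NUMA domain of the calling CPU core, or -1 if no GPU is available.
// numa_distance[a][b] is an assumed cost of reaching domain b from domain a.
int pick_nearest_gpu(int cpu_numa_node,
                     const std::vector<GpuInfo>& gpus,
                     const std::vector<std::vector<int>>& numa_distance) {
    // Precondition: the CPU's NUMA domain must be a valid row of the table.
    assert(cpu_numa_node >= 0 &&
           static_cast<std::size_t>(cpu_numa_node) < numa_distance.size());

    int best_device = -1;
    int best_distance = std::numeric_limits<int>::max();
    for (const GpuInfo& gpu : gpus) {
        const int distance = numa_distance[cpu_numa_node][gpu.numa_node];
        // Keep the first GPU with the strictly smallest distance.
        if (distance < best_distance) {
            best_distance = distance;
            best_device = gpu.device_id;
        }
    }
    return best_device;
}
```

In an OpenMP program, the returned number could then be passed to the standard `device()` clause of a `target` construct or to `omp_set_default_device()`, so that data transfers go to the closest GPU. For example, on a two-socket node with one GPU per socket and an illustrative distance table `{{10, 21}, {21, 10}}`, a thread running in NUMA domain 1 would select the GPU attached to domain 1.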