Presentation


Workshop: OpenMP Target Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and Selecting Better Grid Geometries

Session: Fifth Workshop on Accelerator Programming Using Directives (WACCPD)

Author/Presenters

Artem Chikin

Tyler Gobran

Jose N. Amaral

Event Type

Workshop

Time: Sunday, November 11th, 10:51am - 11:15am

Location: D175

Description: This paper presents three ideas for improving the execution of high-level parallel code on GPUs. The first addresses programs that include multiple parallel blocks within a single region of GPU code. A proposed compiler transformation splits such a region into multiple regions, so that a separate kernel is launched for each parallel region. Advantages include the opportunity to tailor the grid geometry of each kernel to the parallel region that it executes, and the elimination of the overheads imposed by a code-generation scheme meant to handle multiple nested parallel regions. The second is a code transformation that sets up a pipeline of kernel execution and asynchronous data transfer, which enables the overlap of communication and computation. The paper describes the intricate technical details that this transformation requires. The third idea is that the selection of a grid geometry for the execution of a parallel region must balance GPU occupancy against the potential saturation of memory throughput in the GPU. Adding this parameter to the geometry-selection heuristic can often yield better performance at lower occupancy levels.
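To make the first idea concrete, the following is a minimal hand-written sketch in C with OpenMP directives. It is not the authors' compiler transformation, only an illustration of the kind of source-level shape involved. The function names `scale_then_add_fused` and `scale_then_add_split` and the loop bodies are hypothetical. The first function places two parallel loops inside one target region; the second gives each loop its own target region, with an enclosing `target data` region so the arrays are assumed to stay on the device between the two kernels.

```c
/* Before: one target region containing two parallel loops.
   A single kernel has to serve both loops. */
void scale_then_add_fused(double *a, double *b, int n) {
    #pragma omp target map(tofrom: a[0:n], b[0:n])
    {
        #pragma omp parallel for
        for (int i = 0; i < n; i++) {
            a[i] *= 2.0;
        }

        #pragma omp parallel for
        for (int i = 0; i < n; i++) {
            b[i] += a[i];
        }
    }
}

/* After (hand-split, hypothetical): each loop is its own target region,
   so each can be launched as a separate kernel with its own geometry.
   The enclosing target data region keeps a and b mapped across both. */
void scale_then_add_split(double *a, double *b, int n) {
    #pragma omp target data map(tofrom: a[0:n], b[0:n])
    {
        #pragma omp target teams distribute parallel for
        for (int i = 0; i < n; i++) {
            a[i] *= 2.0;
        }

        #pragma omp target teams distribute parallel for
        for (int i = 0; i < n; i++) {
            b[i] += a[i];
        }
    }
}
```

Both functions compute the same result. When compiled without OpenMP support the directives are ignored and the loops run sequentially on the host, which makes the sketch easy to check before trying it with an offloading compiler.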

Program November 11–16, 2018

Exhibits November 12–15, 2018

KAY BAILEY HUTCHISON CONVENTION CENTER DALLAS

The International Conference for High Performance
Computing, Networking, Storage, and Analysis
