A C++ Library for Memory Layout and Performance Portability of Scientific Applications

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Abstract

We present a C++14 library for performance portability of scientific computing codes across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic, reusable algorithms like convolutions, sorting, prefix sum, reductions, and scan. The memory layout of the data structures is adapted at compile-time using tuples with optional memory mirroring between CPU and GPU. We combine this transparent memory mapping with generic algorithms under two alternative programming interfaces: a CUDA-like kernel interface for multi-core CPUs, Nvidia GPUs, and AMD GPUs, as well as a lambda interface. We validate and benchmark the presented library using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art.
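The idea of choosing a memory layout at compile time with tuples, and then running the same lambda-based algorithm over either layout, can be illustrated with a minimal C++14 sketch. This is not the presented library's API: the names particle_store, aos_layout, soa_layout, for_each_element, and scaled_sum are hypothetical and introduced only for illustration, and the sketch runs on the CPU only, without the memory mirroring or GPU back ends described in the abstract.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical layout tags (illustrative names, not the library's API).
struct aos_layout {};  // array of structures: one vector of tuples
struct soa_layout {};  // structure of arrays: one vector per property

template <typename Layout, typename... Props>
class particle_store;

// AoS: all properties of element i sit next to each other in memory.
template <typename... Props>
class particle_store<aos_layout, Props...> {
    std::vector<std::tuple<Props...>> data_;
public:
    explicit particle_store(std::size_t n) : data_(n) {}
    std::size_t size() const { return data_.size(); }

    // Access property P of element i.
    template <std::size_t P>
    auto& get(std::size_t i) {
        assert(i < data_.size());
        return std::get<P>(data_[i]);
    }
};

// SoA: each property is stored in its own contiguous array.
template <typename... Props>
class particle_store<soa_layout, Props...> {
    std::tuple<std::vector<Props>...> data_;
public:
    explicit particle_store(std::size_t n) : data_(std::vector<Props>(n)...) {}
    std::size_t size() const { return std::get<0>(data_).size(); }

    // Same accessor signature as the AoS variant, different storage.
    template <std::size_t P>
    auto& get(std::size_t i) {
        assert(i < std::get<P>(data_).size());
        return std::get<P>(data_)[i];
    }
};

// Lambda-style interface: apply f(store, i) to every element index.
// A serial loop stands in for a parallel back end here.
template <typename Store, typename F>
void for_each_element(Store& s, F f) {
    for (std::size_t i = 0; i < s.size(); ++i) f(s, i);
}

// Layout-agnostic user code: fill two properties, then reduce their product.
template <typename Layout>
double scaled_sum(std::size_t n) {
    particle_store<Layout, double, int> s(n);
    for_each_element(s, [](auto& st, std::size_t i) {
        st.template get<0>(i) = 0.5 * static_cast<double>(i);
        st.template get<1>(i) = static_cast<int>(i);
    });
    double total = 0.0;
    for_each_element(s, [&total](auto& st, std::size_t i) {
        total += st.template get<0>(i) * st.template get<1>(i);
    });
    return total;
}
```

Calling scaled_sum<aos_layout>(n) and scaled_sum<soa_layout>(n) runs identical user code against two different memory layouts. Only the template argument changes, so the layout decision is resolved entirely at compile time.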

Details

Original language: English
Title: Euro-Par 2022
Editors: Jeremy Singer, Yehia Elkhatib, Dora Blanco Heras, Patrick Diehl, Nick Brown, Aleksandar Ilic
Publisher: Springer Science and Business Media B.V.
Pages: 109-120
Number of pages: 12
ISBN (Print): 9783031312083
Publication status: Published - 2023
Peer-reviewed: Yes

Publication series

Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13835 LNCS
ISSN: 0302-9743

Conference

Title: 28th International European Conference on Parallel and Distributed Computing, Euro-Par 2022
Duration: 22 - 26 August 2022
City: Glasgow
Country: United Kingdom

External IDs

ORCID /0000-0003-4414-4340/work/159608271

Keywords

  • C++ tuples, generic algorithms, GPU, memory layout, multi-core, performance portability