A C++ Library for Memory Layout and Performance Portability of Scientific Applications
Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Abstract
We present a C++14 library for performance portability of scientific computing codes across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic, reusable algorithms like convolutions, sorting, prefix sum, reductions, and scan. The memory layout of the data structures is adapted at compile-time using tuples with optional memory mirroring between CPU and GPU. We combine this transparent memory mapping with generic algorithms under two alternative programming interfaces: a CUDA-like kernel interface for multi-core CPUs, Nvidia GPUs, and AMD GPUs, as well as a lambda interface. We validate and benchmark the presented library using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art.
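As an illustration of what adapting the memory layout at compile time using tuples can look like, here is a minimal C++14 sketch. It is not the library's API: the names `aos_layout`, `soa_layout`, `layout_vector`, `for_each_index`, and `scaled_sum` are hypothetical and chosen only for this example. The loop runs serially on the CPU, and memory mirroring between CPU and GPU is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical layout tags (assumed names, not taken from the library).
struct aos_layout {};  // array of structures: one tuple per element
struct soa_layout {};  // structure of arrays: one array per property

template <typename Layout, typename... Props>
class layout_vector;

// AoS: all properties of one element are adjacent in memory.
template <typename... Props>
class layout_vector<aos_layout, Props...> {
    std::vector<std::tuple<Props...>> data_;
public:
    explicit layout_vector(std::size_t n) : data_(n) {}
    std::size_t size() const { return data_.size(); }

    // Access property I of element i.
    template <std::size_t I>
    auto& get(std::size_t i) {
        assert(i < size());
        return std::get<I>(data_[i]);
    }
};

// SoA: each property is stored in its own contiguous array.
template <typename... Props>
class layout_vector<soa_layout, Props...> {
    std::tuple<std::vector<Props>...> data_;
public:
    explicit layout_vector(std::size_t n) : data_(std::vector<Props>(n)...) {}
    std::size_t size() const { return std::get<0>(data_).size(); }

    // Same accessor signature as the AoS version.
    template <std::size_t I>
    auto& get(std::size_t i) {
        assert(i < size());
        return std::get<I>(data_)[i];
    }
};

// Lambda-style interface: calls f(i) for each index (serial loop here).
template <typename F>
void for_each_index(std::size_t n, F&& f) {
    for (std::size_t i = 0; i < n; ++i) f(i);
}

// Layout-agnostic user code: fill property 0, scale it, and sum it.
template <typename Layout>
float scaled_sum(std::size_t n, float factor) {
    layout_vector<Layout, float, int> v(n);
    for_each_index(n, [&](std::size_t i) {
        v.template get<0>(i) = static_cast<float>(i);
        v.template get<1>(i) = static_cast<int>(i);
    });
    float sum = 0.0f;
    for_each_index(n, [&](std::size_t i) {
        v.template get<0>(i) *= factor;
        sum += v.template get<0>(i);
    });
    return sum;
}
```

In this sketch, `scaled_sum<aos_layout>(n, f)` and `scaled_sum<soa_layout>(n, f)` run the same user code. Only the template argument changes, and the compiler resolves the storage layout through the partial specializations.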
Details
Original language | English |
---|---|
Title | Euro-Par 2022 |
Editors | Jeremy Singer, Yehia Elkhatib, Dora Blanco Heras, Patrick Diehl, Nick Brown, Aleksandar Ilic |
Publisher | Springer Science and Business Media B.V. |
Pages | 109-120 |
Number of pages | 12 |
ISBN (Print) | 9783031312083 |
Publication status | Published - 2023 |
Peer-review status | Yes |
Publication series
Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13835 LNCS |
ISSN | 0302-9743 |
Conference
Title | 28th International European Conference on Parallel and Distributed Computing, Euro-Par 2022 |
---|---|
Duration | 22 - 26 August 2022 |
City | Glasgow |
Country | United Kingdom |
External IDs
ORCID | /0000-0003-4414-4340/work/159608271 |
---|
Keywords
- C++ tuples, generic algorithms, GPU, memory layout, multi-core, performance portability