A C++ Library for Memory Layout and Performance Portability of Scientific Applications
Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Contributors
Abstract
We present a C++14 library for performance portability of scientific computing codes across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic, reusable algorithms like convolutions, sorting, prefix sum, reductions, and scan. The memory layout of the data structures is adapted at compile-time using tuples with optional memory mirroring between CPU and GPU. We combine this transparent memory mapping with generic algorithms under two alternative programming interfaces: a CUDA-like kernel interface for multi-core CPUs, Nvidia GPUs, and AMD GPUs, as well as a lambda interface. We validate and benchmark the presented library using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art.
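The idea of adapting memory layout at compile time using tuples can be illustrated with a minimal C++14 sketch. This is not the library's API: the names `layout_aos`, `layout_soa`, `particle_store`, `sum_field`, and `demo` are hypothetical and chosen only for illustration. The sketch assumes a container whose storage is either a vector of tuples (array of structures) or a tuple of vectors (structure of arrays), selected by a template tag, so that the same generic algorithm runs unchanged on both layouts. CPU/GPU memory mirroring is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical layout tags (illustrative names, not taken from the library).
struct layout_aos {};  // array of structures
struct layout_soa {};  // structure of arrays

template <typename Layout, typename... Ts>
class particle_store;

// Array-of-structures: one vector whose elements are tuples.
template <typename... Ts>
class particle_store<layout_aos, Ts...> {
    std::vector<std::tuple<Ts...>> data_;
public:
    explicit particle_store(std::size_t n) : data_(n) {}
    std::size_t size() const { return data_.size(); }

    // Access field I of element i.
    template <std::size_t I>
    auto& get(std::size_t i) {
        assert(i < size());
        return std::get<I>(data_[i]);
    }
};

// Structure-of-arrays: one tuple whose members are vectors.
template <typename... Ts>
class particle_store<layout_soa, Ts...> {
    std::tuple<std::vector<Ts>...> data_;
    std::size_t n_;
public:
    explicit particle_store(std::size_t n)
        : data_(std::vector<Ts>(n)...), n_(n) {}
    std::size_t size() const { return n_; }

    // Same interface as the AoS variant; only the indexing order differs.
    template <std::size_t I>
    auto& get(std::size_t i) {
        assert(i < size());
        return std::get<I>(data_)[i];
    }
};

// A generic reduction that is independent of the chosen layout.
template <std::size_t I, typename Store>
double sum_field(Store& s) {
    double acc = 0.0;
    for (std::size_t i = 0; i < s.size(); ++i)
        acc += static_cast<double>(s.template get<I>(i));
    return acc;
}

// Fill a two-field store (float, int) and reduce both fields.
template <typename Layout>
double demo(std::size_t n) {
    particle_store<Layout, float, int> s(n);
    for (std::size_t i = 0; i < n; ++i) {
        s.template get<0>(i) = 0.5f * static_cast<float>(i);
        s.template get<1>(i) = static_cast<int>(i);
    }
    return sum_field<0>(s) + sum_field<1>(s);
}
```

Calling `demo<layout_aos>(n)` and `demo<layout_soa>(n)` exercises the same algorithm code on both layouts; switching layout is a one-token change at the call site, which is the property the abstract describes as transparent memory mapping.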
Details
| Original language | English |
|---|---|
| Title | Euro-Par 2022 |
| Editors | Jeremy Singer, Yehia Elkhatib, Dora Blanco Heras, Patrick Diehl, Nick Brown, Aleksandar Ilic |
| Publisher | Springer Science and Business Media B.V. |
| Pages | 109-120 |
| Number of pages | 12 |
| ISBN (Print) | 9783031312083 |
| Publication status | Published - 2023 |
| Peer-review status | Yes |
Publication series
| Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 13835 LNCS |
| ISSN | 0302-9743 |
Conference
| Title | 28th International European Conference on Parallel and Distributed Computing |
|---|---|
| Short title | Euro-Par 2022 |
| Event number | 28 |
| Duration | 22 - 26 August 2022 |
| Website | |
| Location | University of Glasgow |
| City | Glasgow |
| Country | United Kingdom |
External IDs
| ORCID | /0000-0003-4414-4340/work/159608271 |
|---|
Keywords
- C++ tuples, generic algorithms, GPU, memory layout, multi-core, performance portability