A C++ Library for Memory Layout and Performance Portability of Scientific Applications

Pietro Incardona; Aryaman Gupta; Serhii Yaskovets; Ivo F. Sbalzarini

doi:10.1007/978-3-031-31209-0_8

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Pietro Incardona - Chair of Scientific Computing for Systems Biology, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD) (Author)
Aryaman Gupta - Chair of Scientific Computing for Systems Biology, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD) (Author)
Serhii Yaskovets - Chair of Scientific Computing for Systems Biology, TUD Dresden University of Technology (Author)
Ivo F. Sbalzarini - Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI Dresden), Chair of Scientific Computing for Systems Biology, TUD Dresden University of Technology, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD) (Author)

Abstract

We present a C++14 library for performance portability of scientific computing codes across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic, reusable algorithms like convolutions, sorting, prefix sum, reductions, and scan. The memory layout of the data structures is adapted at compile-time using tuples with optional memory mirroring between CPU and GPU. We combine this transparent memory mapping with generic algorithms under two alternative programming interfaces: a CUDA-like kernel interface for multi-core CPUs, Nvidia GPUs, and AMD GPUs, as well as a lambda interface. We validate and benchmark the presented library using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art.
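
The idea of adapting memory layout at compile time with tuples can be illustrated with a minimal C++14 sketch. This is not the library's API: the names `layout_aos`, `layout_soa`, `prop_vector`, `weighted_sum`, and `demo` are hypothetical and chosen only for illustration, and the sketch covers CPU memory only (no CPU/GPU mirroring, no kernel or lambda interface). It shows how a single tag type can switch a container between array-of-structures and structure-of-arrays storage while the calling code stays unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical layout tags (assumed names, not taken from the library).
struct layout_aos {};  // array of structures: one tuple per element
struct layout_soa {};  // structure of arrays: one array per property

template <typename Layout, typename... Ts>
class prop_vector;

// AoS specialization: a single vector whose elements are whole tuples.
template <typename... Ts>
class prop_vector<layout_aos, Ts...> {
    std::vector<std::tuple<Ts...>> data_;
public:
    explicit prop_vector(std::size_t n) : data_(n) {}
    std::size_t size() const { return data_.size(); }

    // Access property I of element i.
    template <std::size_t I>
    typename std::tuple_element<I, std::tuple<Ts...>>::type& get(std::size_t i) {
        assert(i < size());
        return std::get<I>(data_[i]);
    }
};

// SoA specialization: a tuple holding one contiguous vector per property.
template <typename... Ts>
class prop_vector<layout_soa, Ts...> {
    std::tuple<std::vector<Ts>...> data_;
public:
    explicit prop_vector(std::size_t n) : data_(std::vector<Ts>(n)...) {}
    std::size_t size() const { return std::get<0>(data_).size(); }

    // Same interface as the AoS version; only the storage differs.
    template <std::size_t I>
    typename std::tuple_element<I, std::tuple<Ts...>>::type& get(std::size_t i) {
        assert(i < size());
        return std::get<I>(data_)[i];
    }
};

// Layout-agnostic algorithm: sum over property 0 times property 1.
template <typename Vec>
double weighted_sum(Vec& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        s += v.template get<0>(i) * v.template get<1>(i);
    return s;
}

// Fill a small container with the chosen layout and reduce it.
template <typename Layout>
double demo() {
    prop_vector<Layout, double, int> v(4);
    for (std::size_t i = 0; i < v.size(); ++i) {
        v.template get<0>(i) = 0.5 * static_cast<double>(i);
        v.template get<1>(i) = 2;
    }
    return weighted_sum(v);
}
```

In this sketch, `demo<layout_aos>()` and `demo<layout_soa>()` run the same source code over two different memory layouts and return the same value; the layout choice is resolved entirely by template specialization at compile time.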

Details

Original language	English
Title of host publication	Euro-Par 2022
Editors	Jeremy Singer, Yehia Elkhatib, Dora Blanco Heras, Patrick Diehl, Nick Brown, Aleksandar Ilic
Publisher	Springer Science and Business Media B.V.
Pages	109-120
Number of pages	12
ISBN (print)	9783031312083
Publication status	Published - 2023
Peer-reviewed	Yes

Publication series

Series	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13835 LNCS
ISSN	0302-9743

Conference

Title	28th International European Conference on Parallel and Distributed Computing
Abbreviated title	Euro-Par 2022
Conference number	28
Duration	22 - 26 August 2022
Website	https://2022.euro-par.org/
Location	University of Glasgow
City	Glasgow
Country	United Kingdom

External IDs

ORCID	/0000-0003-4414-4340/work/159608271

Keywords

C++ tuples, generic algorithms, GPU, memory layout, multi-core, performance portability
