Performance analysis of multi-level parallelism: inter-node, intra-node and hardware accelerators
Publication: Other publication › Other › Contributed › Peer-reviewed
Contributors
Abstract
The advent of multi-core processors has made parallel computing techniques mandatory on mainstream systems. With the recent rise of hardware accelerators, hybrid parallelism adds yet another dimension of complexity to software development. The inner workings of a parallel program are usually difficult to understand and verify. This paper presents a tool for graphical program flow analysis of hardware-accelerated parallel programs. It monitors hybrid program execution to record and visualize many performance-relevant events along the way. Representative real-world applications written for both IBM's Cell processor and NVIDIA's CUDA API are studied as examples. With our combined monitoring and visualization approach for hardware-accelerated multi-core and multi-node systems, we take the next step in tool evolution towards a highly improved level of detail, precision, and completeness. The content of this paper is of interest to developers of hardware-accelerated applications as well as performance tool architects.
Details
Original language | English |
---|---|
Number of pages | 11 |
Volume | 24 |
Publication status | Published - 2012 |
Peer-review status | Yes |
External IDs
Scopus | 84855220688 |
---|---|
ORCID | /0000-0002-8491-770X/work/141543265 |
Keywords
- performance analysis, hardware