HDEEM: High Definition Energy Efficiency Monitoring

Research output: Contribution to book/Conference proceedings/Anthology/Report > Chapter in book/Anthology/Report > Contributed

Abstract

Accurate and fine-grained power measurements of computing systems are essential for energy-aware performance optimizations of HPC systems and applications. Although cluster-wide instrumentation options are available, fine spatial granularity and temporal resolution are not supported by the system vendors, and extra hardware is needed to capture the power consumption information. We introduce the High Definition Energy Efficiency Monitoring (HDEEM) infrastructure, a sophisticated approach towards system-wide and fine-grained power measurements that enable energy-aware performance optimizations of parallel codes. Our approach is targeted at instrumenting multiple HPC racks with power sensors that have a sampling rate of about 8 kSa/s as well as finer spatial granularity, e.g., for per-CPU measurements. We specifically focus on the correctness of power measurement samples and of the energy consumption calculations based on these samples. We also discuss scalable and low-overhead or overhead-free options for online and offline (post-mortem) processing of power measurement data.
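
The abstract refers to energy consumption calculations based on power samples. As a rough illustration only, the sketch below integrates uniformly spaced power samples with the trapezoidal rule. It is not the HDEEM interface: the function name `energy_joules` and the test data are hypothetical, and the default rate of 8000 samples per second is simply the "about 8 kSa/s" figure from the abstract, assumed here to be exact and uniform.

```python
def energy_joules(power_watts, sample_rate_hz=8000.0):
    """Estimate energy (J) from uniformly sampled power readings (W).

    Hypothetical helper, not part of HDEEM. Uses the trapezoidal rule
    and assumes a constant interval of 1 / sample_rate_hz between samples.
    """
    if len(power_watts) < 2:
        return 0.0  # a single sample spans no time interval
    dt = 1.0 / sample_rate_hz
    total = 0.0
    # Sum the area of each trapezoid between consecutive samples.
    for p0, p1 in zip(power_watts, power_watts[1:]):
        total += 0.5 * (p0 + p1) * dt
    return total


# One second of a constant 100 W load: 8001 samples span 8000 intervals.
samples = [100.0] * 8001
energy = energy_joules(samples)  # approximately 100 J
```

With real sensor data the sample timestamps may not be perfectly uniform, in which case the per-interval `dt` would have to come from the timestamps rather than from a nominal rate.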

Details

Original language: English
Title of host publication: Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing
Pages: 1-10
Number of pages: 10
Publication status: Published - 2014
Peer-reviewed: No

External IDs

ORCID /0000-0002-8491-770X/work/141543273
ORCID /0009-0003-0666-4166/work/151475566
ORCID /0000-0002-5437-3887/work/154740494

Keywords

  • HDEEM, supercomputing