An Experimental Setup to Evaluate RAPL Energy Counters for Heterogeneous Memory

Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › peer-review

Contributors

Abstract

Power consumption of the main memory in modern heterogeneous high-performance computing (HPC) systems constitutes a significant part of the total power consumption of a node. This motivates energy-efficient solutions targeting the memory domain as well. Practitioners need reliable energy measurement techniques for analyzing the energy and power consumption of applications and performance optimizations. Running Average Power Limit (RAPL) is a common choice, as it provides uncomplicated access to energy measurements. While RAPL's accuracy has been studied and validated on homogeneous memory platforms, no work we are aware of has investigated its accuracy on heterogeneous memory platforms, specifically with high-capacity memory (HCM). This paper describes in detail the process of measuring the memory power consumption externally using riser cards. We validate RAPL's accuracy by comparing results obtained from an Intel Ice Lake-SP system equipped with DDR4 DRAM and Intel Optane Persistent Memory Modules (PMM). In addition, we verify the accuracy of our instrumentation setup by comparing the results from an older Broadwell system with the results in the literature. We show that the RAPL values on a heterogeneous memory system exhibit a larger offset from the reference measurements. The difference is more pronounced at lower memory load for all memory types. We also find that RAPL readings are inconsistent across sockets and over time. Based on the evaluated scenarios, we conclude that RAPL overestimates the actual power consumption on heterogeneous memory systems, and we discuss the possible causes of this effect.

Details

Original language: English
Title of host publication: ICPE 2024 - Proceedings of the 15th ACM/SPEC International Conference on Performance Engineering
Publisher: Association for Computing Machinery, Inc
Pages: 71-82
Number of pages: 12
ISBN (electronic): 9798400704444
Publication status: Published - 7 May 2024
Peer-reviewed: Yes

Publication series

Series: ICPE: ACM/SPEC International Conference on Performance Engineering

Conference

Title: 15th ACM/SPEC International Conference on Performance Engineering
Abbreviated title: ICPE 2024
Conference number: 15
Duration: 7 - 11 May 2024
Degree of recognition: International event
Location: Imperial College London
City: London
Country: United Kingdom

External IDs

ORCID: /0000-0002-5437-3887/work/167217166

Keywords

  • Energy Efficiency, Heterogeneous Memory, HPC, Intel Optane Persistent Memory, Running Average Power Limit