Cache Coherence Protocol and Memory Performance of the Intel Haswell-EP Architecture

Research output: Contribution to conferences, Paper, Contributed, peer-review


A major challenge in the design of contemporary microprocessors is the increasing number of cores in conjunction with the persistent need for cache coherence. To maintain coherence at this scale, the memory subsystem steadily gains complexity, which has evolved to levels beyond the comprehension of most application performance analysts. The Intel Haswell-EP architecture is one such example. It includes considerable advancements regarding memory hierarchy, on-chip communication, and cache coherence mechanisms compared to the previous generation. We have developed sophisticated benchmarks that allow us to perform in-depth investigations with full control over memory location and coherence state. Using these benchmarks, we investigate performance data and architectural properties of the Haswell-EP micro-architecture, including important memory latency and bandwidth characteristics as well as the cost of core-to-core transfers. This allows us to further the understanding of such complex designs by documenting implementation details that are either not publicly available at all, or only indirectly documented through patents.


Original language: English
Publication status: Published - 2015

External IDs

ORCID /0000-0002-8491-770X/work/141543274
ORCID /0009-0003-0666-4166/work/151475567



  • Benchmarking, cache coherency, NUMA, snoop filter