Comparing Cache Architectures and Coherency Protocols on x86-64 Multicore SMP Systems
Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › peer-review
Contributors
Abstract
Across a broad range of applications, multicore technology is the most important factor driving today's microprocessor performance improvements. Closely coupled with this is the growing complexity of memory subsystems, whose several cache levels must be exploited efficiently to achieve optimal application performance. Many important implementation details of these memory subsystems are undocumented. We therefore present a set of sophisticated benchmarks for latency and bandwidth measurements to arbitrary locations in the memory subsystem. We take the coherency state of cache lines into account in order to analyze the cache coherency protocols and their performance impact. The potential of our approach is demonstrated with an in-depth comparison of ccNUMA multiprocessor systems with AMD (Shanghai) and Intel (Nehalem-EP) quad-core x86-64 processors, both of which feature integrated memory controllers and coherent point-to-point interconnects. Using our benchmarks, we present fundamental memory performance data and architectural properties of both processors. Our comparison reveals in detail how strongly the microarchitectural differences affect the performance of the memory subsystem.
Details
Original language | English |
---|---|
Title of host publication | 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) |
Publisher | ACM Press |
Pages | 413-422 |
Number of pages | 10 |
Publication status | Published - 2009 |
Peer-reviewed | Yes |
External IDs
Scopus | 76749126627 |
---|---|
ORCID | /0000-0002-8491-770X/work/141543298 |