Main Memory and Cache Performance of Intel Sandy Bridge and AMD Bulldozer
Research output: Contribution to conferences › Paper › Contributed › peer-review
Contributors
Abstract
Application performance on multicore processors is seldom constrained by the speed of floating-point or integer units. Much more often, limitations are caused by the memory subsystem, particularly shared resources such as last-level caches or memory controllers. Measuring, predicting and modeling memory performance becomes a steeper challenge with each new processor generation due to the growing complexity and core count. We tackle the important aspect of measuring and understanding undocumented memory performance numbers in order to create valuable insight into microprocessor details. For this, we build upon a set of sophisticated benchmarks that support latency and bandwidth measurements to arbitrary locations in the memory subsystem. These benchmarks are extended to support AVX instructions for bandwidth measurements and to integrate the coherence states (O)wned and (F)orward. We then use these benchmarks to perform an in-depth analysis of current ccNUMA multiprocessor systems with Intel (Sandy Bridge-EP) and AMD (Bulldozer) processors. Using our benchmarks, we present fundamental memory performance data and illustrate performance-relevant architectural properties of both designs.
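To make the idea of a latency measurement concrete, the following is a minimal sketch of a pointer-chasing loop, a common technique for this kind of measurement. It is not the benchmark suite described in the abstract: the function names `build_chain`, `chase` and `ns_per_load` are hypothetical, and the sketch assumes a POSIX system with `clock_gettime`. It does not control coherence states, thread pinning or NUMA placement, all of which the paper's benchmarks address.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime (assumes a POSIX system) */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: link n slots into one random cycle (Sattolo's
   algorithm), so that following next[] visits every slot exactly once
   before returning to the start. A random order makes the access pattern
   hard for hardware prefetchers to predict. rand() is used for brevity;
   its range may be small on some platforms, which limits the shuffle
   quality for very large n but still yields a single cycle. */
static void build_chain(size_t *next, size_t n, unsigned seed)
{
    assert(next != NULL);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    if (n < 2)
        return;
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;  /* j in [0, i-1], never i itself */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads. Each load address depends
   on the previous result, so the loads cannot overlap. */
static size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    for (size_t s = 0; s < steps; s++)
        p = next[p];
    return p;
}

/* Average wall-clock time per dependent load in nanoseconds. The final
   index is written to *sink so the compiler cannot discard the loop. */
static double ns_per_load(const size_t *next, size_t steps, size_t *sink)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    *sink = chase(next, 0, steps);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)steps;
}
```

Calling `build_chain` on a buffer that fits in a given cache level and then `ns_per_load` with many steps approximates the load-to-use latency of that level; a buffer much larger than the last-level cache approximates main-memory latency instead. In this sketch each slot is a single `size_t`, so several slots share a cache line; a more careful version would likely pad each slot to a full line.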
Details
Original language | English |
---|---|
Pages | 261-270 |
Number of pages | 10 |
Publication status | Published - 2014 |
Peer-reviewed | Yes |
External IDs
Scopus | 84904540843 |
---|---|
ORCID | /0000-0002-8491-770X/work/141543271 |
ORCID | /0009-0003-0666-4166/work/151475564 |
Keywords
- memory, performance, Intel, AMD