Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System
Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Contributors
Abstract
Today's microprocessors have complex memory subsystems with several cache levels. The efficient use of this memory hierarchy is crucial to gain optimal performance, especially on multicore processors. Unfortunately, many implementation details of these processors are not publicly available. In this paper we present such fundamental details of the newly introduced Intel Nehalem microarchitecture with its integrated memory controller, QuickPath Interconnect, and ccNUMA architecture. Our analysis is based on sophisticated benchmarks to measure the latency and bandwidth between different locations in the memory subsystem. Special care is taken to control the coherency state of the data to gain insight into performance-relevant implementation details of the cache coherency protocol. Based on these benchmarks we present undocumented performance data and architectural properties.
Details
Original language | German |
---|---|
Title | 2009 18th International Conference on Parallel Architectures and Compilation Techniques |
Publisher | IEEE Computer Society, Washington |
Pages | 261-270 |
Number of pages | 10 |
ISBN (Print) | 978-0-7695-3771-9 |
Publication status | Published - 16 Sept. 2009 |
Peer-review status | Yes |
Conference
Title | 2009 18th International Conference on Parallel Architectures and Compilation Techniques |
---|---|
Duration | 12 - 16 September 2009 |
Location | Raleigh, NC, USA |
External IDs
Scopus | 70449643566 |
---|---|
ORCID | /0000-0002-8491-770X/work/141543288 |
ORCID | /0009-0003-0666-4166/work/151475588 |
Keywords
- Multiprocessing systems, Microarchitecture, Bandwidth, Delay, Multicore processing, Protocols, High performance computing, Performance gain, Scalability, Yarn