Improving the Performance of Block-based DRAM Caches Via Tag-Data Decoupling

Publication: Contribution to journal, research article, contributed, peer-reviewed

Abstract

In-package DRAM-based Last-Level-Caches (LLCs) that cache data in small chunks (i.e., blocks) are promising for improving system performance due to their efficient main memory bandwidth utilization. However, in these high-capacity DRAM caches, managing metadata (i.e., tags) at low cost is challenging. Storing the tags in SRAM has the advantage of quick tag access but is impractical due to a large area overhead. Storing the tags in DRAM reduces the area overhead but incurs tag serialization latency for an associative LLC design, which is inevitable for achieving high cache hit rate. To address the area and latency overhead problem, we propose a block-based DRAM LLC design that decouples tag and data into two regions in DRAM. Our design stores the tags in a latency-optimized DRAM region as the tags are accessed more often than the data. In contrast, we optimize the data region for area efficiency and map spatially-adjacent cache blocks to the same DRAM row to exploit spatial locality. Our design mitigates the tag serialization latency of existing associative DRAM LLCs via selective in-DRAM tag comparison, which overlaps the latency of tag and data accesses. This efficiently enables LLC bypassing via a novel DRAM Absence Table (DAT) that not only provides fast LLC miss detection but also reduces in-package bandwidth requirements. Our evaluation using SPEC2006 benchmarks shows that our tag-data decoupled LLC improves system performance by 11.7 percent compared to a state-of-the-art direct-mapped LLC design and by 7.2 percent compared to an existing associative LLC design.
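
The lookup flow outlined in the abstract can be pictured with a small toy model. This is only a sketch under stated assumptions and not the paper's implementation: the class `ToyDecoupledCache`, its sizes, the LRU replacement, and the way the absence table is filled (on eviction) are all hypothetical choices made for illustration. It shows three ideas from the abstract: a tag region consulted separately from the data, spatially adjacent blocks sharing a data row, and an absence table that reports a miss without a tag lookup so the request can bypass the cache.

```python
from dataclasses import dataclass, field

# All constants are illustrative assumptions, not values from the paper.
BLOCK_SIZE = 64       # bytes per cache block
BLOCKS_PER_ROW = 32   # adjacent blocks mapped to one row of the data region
NUM_SETS = 4          # toy number of sets in the tag region
WAYS = 2              # toy associativity


@dataclass
class ToyDecoupledCache:
    # Tag region: per set, resident block numbers ordered from LRU to MRU.
    tags: list = field(default_factory=lambda: [[] for _ in range(NUM_SETS)])
    # Hypothetical absence table: block numbers known not to be cached.
    absent: set = field(default_factory=set)

    def data_row(self, block):
        # Spatially adjacent blocks land in the same data-region row.
        return block // BLOCKS_PER_ROW

    def access(self, addr):
        block = addr // BLOCK_SIZE
        ways = self.tags[block % NUM_SETS]
        if block in self.absent:
            # Known miss: skip the tag comparison and go straight to memory.
            outcome = "bypass"
            self.absent.discard(block)
        elif block in ways:
            # Tag match: refresh the LRU position and serve from the data region.
            ways.remove(block)
            ways.append(block)
            return "hit"
        else:
            # Miss detected only after comparing tags.
            outcome = "miss"
        # Fill the block; an evicted victim is recorded as absent.
        if len(ways) == WAYS:
            self.absent.add(ways.pop(0))
        ways.append(block)
        return outcome
```

In this toy model a block that was evicted earlier is reported as `"bypass"` on its next access, while a block never seen before still needs a tag comparison and is reported as `"miss"`. A real design would bound the table's size and decide separately whether a bypassed block is filled at all.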

Details

Original language: English
Pages (from - to): 1914-1927
Number of pages: 14
Journal: IEEE Transactions on Computers
Volume: 70
Issue number: 11
Publication status: Published - 1 Nov. 2021
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-5007-445X/work/141545516

Keywords

  • associative cache, cache bypassing, Die-stacked DRAM cache, direct mapped cache, last level cache (LLC), metadata management