Improving the Performance of Block-based DRAM Caches Via Tag-Data Decoupling

Research output: Contribution to journal > Research article > Contributed > peer-review

Abstract

In-package DRAM-based Last-Level-Caches (LLCs) that cache data in small chunks (i.e., blocks) are promising for improving system performance due to their efficient main memory bandwidth utilization. However, in these high-capacity DRAM caches, managing metadata (i.e., tags) at low cost is challenging. Storing the tags in SRAM has the advantage of quick tag access but is impractical due to a large area overhead. Storing the tags in DRAM reduces the area overhead but incurs tag serialization latency in an associative LLC design, and associativity is necessary for achieving a high cache hit rate. To address the area and latency overhead problem, we propose a block-based DRAM LLC design that decouples tag and data into two regions in DRAM. Our design stores the tags in a latency-optimized DRAM region as the tags are accessed more often than the data. In contrast, we optimize the data region for area efficiency and map spatially-adjacent cache blocks to the same DRAM row to exploit spatial locality. Our design mitigates the tag serialization latency of existing associative DRAM LLCs via selective in-DRAM tag comparison, which overlaps the latency of tag and data accesses. This efficiently enables LLC bypassing via a novel DRAM Absence Table (DAT) that not only provides fast LLC miss detection but also reduces in-package bandwidth requirements. Our evaluation using SPEC2006 benchmarks shows that our tag-data decoupled LLC improves system performance by 11.7 percent compared to a state-of-the-art direct-mapped LLC design and by 7.2 percent compared to an existing associative LLC design.
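
The abstract does not describe how the DRAM Absence Table is organized, so the following is only an illustrative sketch of the general idea of fast miss detection with bypassing, not the paper's design. It assumes a small table of per-region counters: a counter of zero means no block of that region is cached, so the request can skip the in-package DRAM lookup entirely. The names `AbsenceTable`, `access`, `BLOCK_SIZE`, and `REGION_BLOCKS`, and all sizes, are hypothetical.

```python
BLOCK_SIZE = 64      # bytes per cache block (assumed value)
REGION_BLOCKS = 32   # blocks tracked per table entry (assumed value)


class AbsenceTable:
    """Hypothetical counter table; not the DAT organization from the paper."""

    def __init__(self, num_entries=1024):
        self.counts = [0] * num_entries

    def _index(self, addr):
        # Several regions may share one entry, which keeps the table small
        # and makes the answer conservative rather than exact.
        region = addr // (BLOCK_SIZE * REGION_BLOCKS)
        return region % len(self.counts)

    def on_fill(self, addr):
        self.counts[self._index(addr)] += 1

    def on_evict(self, addr):
        i = self._index(addr)
        assert self.counts[i] > 0, "evicting a block that was never filled"
        self.counts[i] -= 1

    def definitely_absent(self, addr):
        # Zero cached blocks in this entry: the block cannot be in the cache.
        return self.counts[self._index(addr)] == 0


def access(addr, table, cache):
    """Classify one request; `cache` is a set of cached block numbers."""
    if table.definitely_absent(addr):
        return "bypass"   # go to main memory without a tag lookup
    return "hit" if addr // BLOCK_SIZE in cache else "miss"
```

In this sketch a "bypass" answer is always safe, because a zero counter can never coincide with a cached block, while a non-zero counter still requires the normal tag check.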

Details

Original language: English
Pages (from-to): 1914-1927
Number of pages: 14
Journal: IEEE Transactions on Computers
Volume: 70
Issue number: 11
Publication status: Published - 1 Nov 2021
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-5007-445X/work/141545516

Keywords

  • associative cache, cache bypassing, Die-stacked DRAM cache, direct mapped cache, last level cache (LLC), metadata management