ROLLED: Racetrack Memory Optimized Linear Layout and Efficient Decomposition of Decision Trees

Publikation: Beitrag in FachzeitschriftForschungsartikelBeigetragenBegutachtung

Beitragende

  • Christian Hakert - , Technische Universität (TU) Dortmund (Autor:in)
  • Asif Ali Khan - , Professur für Compilerbau (cfaed), Technische Universität (TU) Dortmund (Autor:in)
  • Kuan Hsun Chen - , Technische Universität (TU) Dortmund (Autor:in)
  • Fazal Hameed - , Institute of Space Technology (Autor:in)
  • Jeronimo Castrillon - , Professur für Compilerbau (cfaed), Technische Universität (TU) Dortmund (Autor:in)
  • Jian Jia Chen - , Technische Universität (TU) Dortmund (Autor:in)

Abstract

Modern low power distributed systems tend to integrate machine learning algorithms. In resource-constrained setups, the execution of the models has to be optimized for performance and energy consumption. Racetrack memory (RTM) promises to achieve these goals by offering unprecedented integration density, smaller access latency, and reduced energy consumption. However, to access data in RTM, it needs to be shifted to the access port first. We investigate decision trees and develop placement strategies to reduce the total number of shifts in RTM. Decision trees allow profiling during training, resulting in tree paths' access probabilities. We map tree nodes to RTM so that the total number of shifts is minimal. Concretely, we present two different placement approaches: 1) where tree nodes are closely packed and placed uniformly in a single RTM location and 2) where decision tree nodes are decomposedto separate RTM blocks. We discuss theoretical cost models for both approaches, we formally prove an upper bound of 4× for the unified and an upper bound of 12× for the decomposed organization towards the optimal placement. We conduct a thorough experimental evaluation to compare our algorithms to the state-of-the-art placement strategies Our experimental evaluations show that the unified and decomposed solutions reduce the number of shifts by 58.1 and 80.1, respectively, leading to a 53.8% and 46.3% reduction in the overall runtime and 52.6 and 61.7% reduction in the energy consumption, compared to a naive baseline.

Details

OriginalspracheEnglisch
Seiten (von - bis)1488-1502
Seitenumfang15
FachzeitschriftIEEE Transactions on Computers
Jahrgang72
Ausgabenummer5
PublikationsstatusVeröffentlicht - 1 Mai 2023
Peer-Review-StatusJa

Externe IDs

ORCID /0000-0002-5007-445X/work/160049127

Schlagworte

Ziele für nachhaltige Entwicklung

Schlagwörter

  • decision tree, Non-volatile memory, optimal linear ordering, racetrack memory