Quantifying Performance and Scalability of the Distributed Monitoring Infrastructure SLAte

Publication: Contribution to journal, conference article, contributed, peer-reviewed

Abstract

Job-centric monitoring makes it possible to observe the execution of programs and services (so-called jobs) on remote and local computing resources. Large installations such as Grids, Clouds and HPC systems with many thousands of jobs in particular can benefit greatly from intelligent visualisations of recorded monitoring data and from semi-automatic analyses. The latter can reveal misbehaving jobs or non-optimal job execution and enable future optimisations that make more efficient use of the allocated resources. The challenge for job-centric monitoring infrastructures is to store, search and access the data collected on huge installations. We address this challenge with a distributed, layer-based architecture that provides a uniform view of all monitoring data. This paper presents the concept of this infrastructure, called SLAte, a performance evaluation, and the consequences for scalability.

Details

Original language: English
Journal: PARS-Mitteilungen
Volume: 32
Issue number: 1
Publication status: Published - 2015
Peer review status: Yes
