Quantifying Performance and Scalability of the Distributed Monitoring Infrastructure SLAte
Research output: Contribution to journal › Conference article › Contributed › peer-review
Abstract
Job-centric monitoring makes it possible to observe the execution of programs and services (so-called jobs) on remote and local computing resources. Large installations in particular, such as Grids, Clouds and HPC systems running many thousands of jobs, can benefit greatly from intelligent visualisations of recorded monitoring data and from semi-automatic analyses. Such analyses can reveal misbehaving jobs or non-optimal job execution and enable future optimisations that make more efficient use of the allocated resources. The challenge for job-centric monitoring infrastructures is to store, search and access the data collected on huge installations. We address this challenge with a distributed, layer-based architecture that provides a uniform view of all monitoring data. This paper presents the concept of this infrastructure, called SLAte, together with a performance evaluation and its consequences for scalability.
Details
Original language | English |
---|---|
Journal | PARS-Mitteilungen |
Volume | 32 |
Issue number | 1 |
Publication status | Published - 2015 |
Peer-reviewed | Yes |