Assessing Anonymized System Logs Usefulness for Behavioral Analysis in RNN Models

Publication: Contribution to book/conference proceedings/anthology/report, Contribution to conference proceedings, Contributed, Peer-reviewed

Abstract

System logs are a common source of monitoring data for analyzing the behavior of computing systems. Due to the complexity of modern computing systems and the large volume of collected monitoring data, automated analysis mechanisms are required. Numerous machine learning and deep learning methods have been proposed to address this challenge. However, because system logs contain sensitive data, their analysis and storage raise serious privacy concerns. Anonymization methods could be used to clean the monitoring data before analysis. In general, though, anonymized system logs do not provide adequate usefulness for most behavioral analyses. Content-aware anonymization mechanisms such as PaRS preserve the correlation of system logs even after anonymization. This work evaluates the usefulness of system logs taken from the Taurus HPC cluster and anonymized using PaRS for behavioral analysis via recurrent neural network models.

Details

Original language: English
Title: D2R2 2022: Data-driven Resilience Research 2022
Editors: Natanael Arndt, Sabine Gründer-Fahrer, Julia Holze, Michael Martin, Sebastian Tramp
Publisher: RWTH Aachen
Number of pages: 12
Publication status: Published - 2 Dec. 2022
Peer-reviewed: Yes

Publication series

Series: CEUR Workshop Proceedings
Volume: 3376
ISSN: 1613-0073

Workshop

Title: 2022 International Workshop on Data-Driven Resilience Research
Short title: D2R2 2022
Duration: 6 July 2022
Location: Neues Rathaus & online
City: Leipzig
Country: Germany

External IDs

ArXiv: http://arxiv.org/abs/2212.01101v1

Keywords

  • Data usefulness, System log analysis, Time series analysis