Mining large distributed log-data in near real-time
Research output: Contribution to conferences › Paper › Contributed
Contributors
Abstract
Analyzing huge amounts of log data is often a difficult task, especially if it has to be done in real time (e.g., fraud detection) or when large amounts of stored data are required for the analysis. Graphs are a data structure often used in log analysis. Examples are clique analysis and communities of interest (COI). However, little attention has been paid to large distributed graphs that allow a high throughput of updates with very low latency.
In this paper, we present a distributed graph mining system that is able to process around 39 million log entries per second on a 50-node cluster while providing processing latencies below 10 ms. We validate our approach by presenting two example applications, namely telephony fraud detection and internet attack detection. A thorough evaluation proves the scalability and near real-time properties of our system.
Details
Original language | English |
---|---|
Pages | 1-8 |
Publication status | Published - 2011 |
Peer-reviewed | No |
Conference
Title | Managing Large-Scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques (SLAML/SOSP) (SLAML '11), ACM, 2011 |
---|---|
Abbreviated title | SLAML '11 |
Duration | 23 October 2011 |
Degree of recognition | International event |
Location | |
City | Cascais |
Country | Portugal |
Keywords
- Log processing, distributed graphs, COI