Monitoring and Failure Prediction of Cloud Systems

The quality of service provided by cloud-deployed online applications is often affected by faults in the underlying cloud platform and infrastructure. To uncover the causes and effects of application failures, a cloud monitoring system must be in place. The sheer volume of monitoring data produced calls for smart, automated handling to find the patterns that can be used for fault management. In this paper we present an open-source, cloud-native, lightweight cloud monitoring system and a data analytics pipeline that efficiently processes the gathered data and can discover useful relationships between infrastructure-level and application-level metrics. We apply time series clustering steps within the pipeline to compress the collected data for fast and lightweight data mining. We demonstrate the capabilities of the proposed system in a reactive and a proactive use case. The results show that the proposed system provides valuable insights for root-cause analysis and for proactive fault management frameworks of cloud applications.

Toka László, Bíró József, Rahimian Pegah, Szalay Márk, Haja Dávid, Dobreff Gergely, Kecskés-Solymosi Zsófia, Tárnok Balázs, Fodor Balázs

March 7, 2021

Sponsor: Ericsson
