Automated IT Service Fault Diagnosis Based on Event Correlation Techniques

Description

17 years ago
In recent years, a paradigm shift has taken place in IT service
management. IT management no longer deals only with the network,
end systems, or applications, but is increasingly concerned with IT
services. This is caused by the need of organizations to monitor
the efficiency of internal IT departments and to be able to
subscribe to IT services from external providers. This trend has
raised new challenges in the area of IT
service management, especially with respect to service level
agreements laying down the quality of service to be guaranteed by a
service provider. Fault management is also facing new challenges
which are related to ensuring compliance with these service level
agreements. For example, high utilization of network links in the
infrastructure can increase the delay in the delivery of services
beyond the agreed time constraints. Such
relationships have to be detected and treated in a service-oriented
fault diagnosis which therefore does not deal with faults in a
narrow sense, but with service quality degradations. This thesis
aims to provide a concept for service fault diagnosis, which is an
important part of IT service fault management. First, the need for
further examination of this issue is motivated by an analysis of
services offered by a large IT service provider. A generalization
of the scenario forms
the basis for the specification of requirements which are used for
a review of related research work and commercial products. Even
though some solutions for particular challenges have already been
provided, a general approach for service fault diagnosis is still
missing. To address this issue, the main part of this thesis
presents a framework with an event correlation component at its
core. Event correlation techniques that have been successfully
applied to fault management in network and systems management are
adapted and extended accordingly. Guidelines for applying the
framework to a given scenario are then provided. To show their
feasibility in a real-world scenario, they are applied to both
example services referenced earlier.
