Author(s): Henriques, Taigo Almeida
Date: 2012
Persistent ID: http://hdl.handle.net/10451/13864
Origin: Repositório da Universidade de Lisboa
Subject(s): system; alarm; failure; CI; monitoring
All information systems are prone to failures, whether human, infrastructural, or application-related, and therefore require constant monitoring to prevent service unavailability from affecting the business. Sputnik is a platform for the graphical and intuitive representation of infrastructure and application monitoring. It can represent circuits, tables, and graphs, or, for example, check the availability of a server or the accumulation of records in a table. The platform allows problems to be detected and visualized in real time and supports their efficient resolution and escalation. The Checklist is a centralized platform where knowledge and information are organized and structured as CIs (Configuration Items) and the relationships between them. It is intended as a framework for searching any type of relevant information, including the relationships between existing CIs. Asterisk Gateway is a pilot (proof-of-concept) platform designed to trigger automatic phone calls, replacing the need for human action and thus speeding up the escalation process. In the context of this PEI, calls are triggered when a CI fails, alerting the respective support team. In this PEI I intended not only to improve and develop alarm platforms over the PTSI systems, but also to improve my technical knowledge, management, and effective input in the monitoring and alarm process, minimizing human action and supporting people when necessary.