Document details

Fault-tolerant Stochastic Distributed Systems

Author(s): Silvestre, Daniel

Date: 2017

Persistent ID:

Origin: Camões - Repositório Institucional da Universidade Autónoma de Lisboa

Subject(s): Fault-tolerant; Distributed Systems; Networked Control Systems; Set-valued Observers; Event-triggered Systems; Self-triggered Systems


The present doctoral thesis discusses the design of fault-tolerant distributed systems, placing emphasis in addressing the case where the actions of the nodes or their interactions are stochastic. The main objective is to detect and identify faults to improve the resilience of distributed systems to crash-type faults, as well as detecting the presence of malicious nodes in pursuit of exploiting the network. The proposed analysis considers malicious agents and computational solutions to detect faults. Crash-type faults, where the affected component ceases to perform its task, are tackled in this thesis by introducing stochastic decisions in deterministic distributed algorithms. Prime importance is placed on providing guarantees and rates of convergence for the steady-state solution. The scenarios of a social network (state-dependent example) and consensus (time- dependent example) are addressed, proving convergence. The proposed algorithms are capable of dealing with packet drops, delays, medium access competition, and, in particular, nodes failing and/or losing network connectivity. The concept of Set-Valued Observers (SVOs) is used as a tool to detect faults in a worst-case scenario, i.e., when a malicious agent can select the most unfavorable sequence of communi- cations and inject a signal of arbitrary magnitude. For other types of faults, it is introduced the concept of Stochastic Set-Valued Observers (SSVOs) which produce a confidence set where the state is known to belong with at least a pre-specified probability. It is shown how, for an algorithm of consensus, it is possible to exploit the structure of the problem to reduce the computational complexity of the solution. The main result allows discarding interactions in the model that do not contribute to the produced estimates. The main drawback of using classical SVOs for fault detection is their computational burden. By resorting to a left-coprime factorization for Linear Parameter-Varying (LPV) systems, it is shown how to reduce the computational complexity. By appropriately selecting the factorization, it is possible to consider detectable systems (i.e., unobservable systems where the unobservable component is stable). Such a result plays a key role in the domain of Cyber-Physical Systems (CPSs). These techniques are complemented with Event- and Self-triggered sampling strategies that enable fewer sensor updates. Moreover, the same triggering mechanisms can be used to make decisions of when to run the SVO routine or resort to over-approximations that temporarily compromise accuracy to gain in performance but maintaining the convergence characteristics of the set-valued estimates. A less stringent requirement for network resources that is vital to guarantee the applicability of SVO-based fault detection in the domain of Networked Control Systems (NCSs).

Document Type Doctoral thesis
Language English
Advisor(s) Silvestre, Carlos Jorge Ferreira; Hespanha, Joao Pedro Cordeiro Pereira Botelho
Contributor(s) Silvestre, Daniel
facebook logo  linkedin logo  twitter logo 
mendeley logo