Publication

Scalable trace analysis of distributed systems: finding data races

Bibliographic details
Abstract:	Distributed systems and protocols are widely employed in the infrastructure that supports the Internet and online services such as streaming and social networks. At the same time, they are notoriously hard to implement correctly, even for experienced programmers. Consequently, distributed systems are prone to distributed concurrency bugs, which are a frequent source of significant service outages. It is therefore of the utmost importance to ensure that widely used distributed systems are reliable and free of this kind of bug. Formal verification looks like a promising way to achieve this. However, we argue that currently available techniques require too large an investment to verify the correctness of implementations of complex distributed systems. Instead, we advocate the use of clever testing techniques and tools in all but the most critical contexts. In this dissertation, we present one such tool, Spider, designed to automatically detect data races in traced executions of distributed systems. Data races arise when two accesses to the same memory location occur concurrently, and they have been shown to be a major source of concurrency bugs in distributed systems. Unfortunately, data races are often triggered by non-deterministic event orderings that are hard to detect when testing complex distributed systems. Spider encodes the causal relations between the events in the trace as a symbolic constraint model, which is then fed into an SMT solver to check for the presence of conflicting concurrent accesses. To reduce constraint-solving time, Spider employs a pruning technique that removes redundant portions of the trace. Our experiments with multiple benchmarks show that Spider is effective at detecting data races in distributed executions in a practical amount of time, providing evidence of its usefulness as a testing tool.
Main authors:	Pereira, João Carlos Mendes
Subject:	Distributed systems; Satisfiability modulo theories; Software testing; Offline monitoring; Dynamic race detection
Year:	2020
Country:	Portugal
Document type:	master's dissertation
Access type:	open access
Associated institution:	Universidade do Minho
Language:	English
Source:	RepositóriUM - Universidade do Minho
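
The abstract describes checking a trace for conflicting accesses that are not causally ordered. The sketch below illustrates that idea only; it is not Spider's implementation. In place of a symbolic constraint model and an SMT solver, it uses an explicit transitive closure of the happens-before relation, and it applies no pruning. The trace format, the event kinds, and the function names are hypothetical, and the requirement that at least one access be a write is the usual convention for "conflicting" accesses, assumed here rather than taken from the record.

```python
from itertools import combinations

# Hypothetical trace format: (event_id, thread, kind, target).
# kind is "r"/"w" for memory accesses and "snd"/"rcv" for messages;
# target is a variable name for accesses and a message id for messages.
TRACE = [
    (0, "A", "w", "x"),
    (1, "A", "snd", "m1"),
    (2, "B", "rcv", "m1"),
    (3, "B", "r", "x"),
    (4, "C", "w", "x"),
]


def happens_before(trace):
    """Return the set of (a, b) pairs where event a causally precedes event b."""
    hb = set()
    last_in_thread = {}
    sends = {}
    for eid, thread, kind, target in trace:
        # Program order: consecutive events of the same thread are ordered.
        if thread in last_in_thread:
            hb.add((last_in_thread[thread], eid))
        last_in_thread[thread] = eid
        if kind == "snd":
            sends[target] = eid
    for eid, _thread, kind, target in trace:
        # Message order: a send precedes its matching receive.
        if kind == "rcv" and target in sends:
            hb.add((sends[target], eid))
    # Transitive closure (Warshall's algorithm) over the event ids.
    ids = [event[0] for event in trace]
    for k in ids:
        for i in ids:
            if (i, k) in hb:
                for j in ids:
                    if (k, j) in hb:
                        hb.add((i, j))
    return hb


def find_races(trace):
    """Return pairs of conflicting accesses that happens-before leaves unordered."""
    hb = happens_before(trace)
    accesses = [event for event in trace if event[2] in ("r", "w")]
    races = []
    for a, b in combinations(accesses, 2):
        same_location = a[3] == b[3]
        different_threads = a[1] != b[1]
        has_write = "w" in (a[2], b[2])
        unordered = (a[0], b[0]) not in hb and (b[0], a[0]) not in hb
        if same_location and different_threads and has_write and unordered:
            races.append((a[0], b[0]))
    return races


if __name__ == "__main__":
    print(find_races(TRACE))
```

In this toy trace, the write by A and the read by B are ordered through message m1, so they are not reported, while the write by C is unordered with both and is reported against each. An explicit closure like this grows quickly with trace length, which is the kind of cost a solver-based encoding with pruning would aim to keep manageable.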
