Publication
Scalable trace analysis of distributed systems: finding data races
| Abstract: | Distributed systems and protocols are widely employed in the infrastructure that supports the Internet and in online services such as streaming platforms and social networks. At the same time, they are well known for being hard to implement correctly, even by experienced programmers. Consequently, distributed systems are prone to distributed concurrency bugs, which are a frequent source of significant service outages. It is therefore of the utmost importance to ensure that widely used distributed systems are reliable and free of this kind of bug. Formal verification looks like a promising way to achieve this. However, we argue that the currently available techniques require too large an investment to verify the correctness of implementations of complex distributed systems. Instead, we defend the use of clever testing techniques and tools for all but the most critical contexts. In this dissertation, we present one such tool, Spider, designed to automatically detect data races from traced executions of distributed systems. Data races arise when two accesses to the same memory location occur concurrently, and they have been shown to be a major source of concurrency bugs in distributed systems. Unfortunately, data races are often triggered by non-deterministic event orderings that are hard to detect when testing complex distributed systems. Spider encodes the causal relations between the events in the trace as a symbolic constraint model, which is then fed into an SMT solver to check for the presence of conflicting concurrent accesses. To reduce the constraint-solving time, Spider employs a pruning technique aimed at removing redundant portions of the trace. Our experiments with multiple benchmarks show that Spider is effective in detecting data races in distributed executions in a practical amount of time, providing evidence of its usefulness as a testing tool. |
|---|---|
| Main authors: | Pereira, João Carlos Mendes |
| Subject: | Distributed systems; Satisfiability modulo theories; Software testing; Offline monitoring; Dynamic race detection |
| Year: | 2020 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |
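
The abstract describes checking a trace for conflicting accesses that are not causally ordered. This record does not give Spider's actual encoding, so the following is only a minimal sketch of the underlying idea, under several assumptions: it replaces the SMT solver with an explicit transitive closure of the happens-before relation, it treats two accesses as conflicting only when at least one is a write (the usual definition), and the event fields (`id`, `thread`, `kind`, `loc`) and function names are hypothetical, not Spider's trace format or API.

```python
from itertools import combinations


def happens_before(events, messages):
    """Return all pairs (a, b) such that event a causally precedes event b.

    events   -- list of dicts in trace order (hypothetical format)
    messages -- list of (send_id, receive_id) pairs
    """
    succ = {e["id"]: set() for e in events}

    # Program order: consecutive events of the same thread are ordered.
    last_on_thread = {}
    for e in events:
        t = e["thread"]
        if t in last_on_thread:
            succ[last_on_thread[t]].add(e["id"])
        last_on_thread[t] = e["id"]

    # Message order: a send precedes the matching receive.
    for send_id, recv_id in messages:
        succ[send_id].add(recv_id)

    # Transitive closure via a depth-first search from every event.
    hb = set()
    for start in succ:
        stack, seen = list(succ[start]), set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            hb.add((start, node))
            stack.extend(succ[node])
    return hb


def find_races(events, messages):
    """Report pairs of conflicting accesses with no causal order between them."""
    hb = happens_before(events, messages)
    accesses = [e for e in events if e["kind"] in ("read", "write")]
    races = []
    for a, b in combinations(accesses, 2):
        same_location = a["loc"] == b["loc"]
        has_write = "write" in (a["kind"], b["kind"])
        unordered = (a["id"], b["id"]) not in hb and (b["id"], a["id"]) not in hb
        if same_location and has_write and unordered:
            races.append((a["id"], b["id"]))
    return races


# Toy trace: t1 writes x and then messages t2, which reads x; t3 writes x
# without any synchronization with the other two threads.
trace = [
    {"id": "e1", "thread": "t1", "kind": "write", "loc": "x"},
    {"id": "e2", "thread": "t1", "kind": "send", "loc": None},
    {"id": "e3", "thread": "t2", "kind": "recv", "loc": None},
    {"id": "e4", "thread": "t2", "kind": "read", "loc": "x"},
    {"id": "e5", "thread": "t3", "kind": "write", "loc": "x"},
]
messages = [("e2", "e3")]

print(find_races(trace, messages))
```

In this toy trace, the write `e1` and the read `e4` are ordered through the message, so they are not reported; the unsynchronized write `e5` is reported against both. An explicit closure like this grows quickly with trace length, which gives some intuition for why the abstract mentions pruning redundant portions of the trace before solving.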