Publication
QUERY PROCESSING IN CLOUD DATABASES WITH PARTIAL REPLICATION
| Abstract: | Distributed databases are an essential component of large-scale applications where data is stored durably. Given the size of these systems, such applications often face network partitions, which can cause considerable downtime that degrades user experience and brings significant economic losses to the companies that rely on such solutions. Because of this, application developers are increasingly opting for highly available, low-latency solutions that eschew strong consistency properties, which require replica coordination and penalize performance, especially when application and database instances are geo-replicated across multiple datacenters. In this type of application it also becomes difficult to have data fully replicated at each site: servers might not have enough physical resources, and full replication may become very expensive. Partial replication solutions, where a replica need not hold all the data, are an alternative that is still able to provide fault tolerance and low latency. PotionDB is a database that follows the partial replication model, and this work extends it with support for queries. This contribution includes a description and study of the internal structures used to get records that satisfy a query's condition, along with a study of different algorithms used in query processing to get objects that a database instance does not possess locally, given the partial replication of the data. |
|---|---|
| Main authors: | Martins, João Gonçalves |
| Subject: | Available Databases; Weak Consistency; Geo-replication; Partitioning; Database Indexes; Query Processing |
| Year: | 2023 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |