Publication

QUERY PROCESSING IN CLOUD DATABASES WITH PARTIAL REPLICATION

Bibliographic details
Abstract: Distributed databases are an essential component of large-scale applications where data is stored durably. Given the size of these systems, such applications often face network partitions, which can cause considerable downtime that degrades user experience and brings significant economic losses to the companies that rely on such solutions. Because of this, application developers are increasingly opting for highly available, low-latency solutions that eschew strong consistency properties, which require replica coordination and penalize performance, especially when application and database instances are geo-replicated across multiple datacenters. In this type of application it also becomes difficult to fully replicate data at each site: servers might not have enough physical resources, and doing so can be very expensive. Partial replication solutions, where not every replica holds all the data, are an alternative that can still provide fault tolerance and low latency. PotionDB is a database that follows the partial replication model, and this work extends it with support for queries. The contribution includes a description and study of the internal structures used to retrieve records that satisfy a query's condition, along with a study of different algorithms used in query processing to fetch objects that a database instance does not hold locally because the data is partially replicated.
Main authors: Martins, João Gonçalves
Subject: Available Databases; Weak Consistency; Geo-replication; Partitioning; Database Indexes; Query Processing
Year: 2023
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL