Publication

Towards a transactional and analytical data management system for Big Data

Bibliographic details
Abstract:Hybrid database systems are on the verge of making Big Data analytics a reality. This new class of database systems bypasses the traditional methods used to update data on the analytical processing engine, computing analytics directly on top of production data instead. Building a unified database engine that achieves scalable analytics while keeping a steady operational capacity requires overcoming some current system hurdles, namely the Extract, Transform and Load (ETL) process. By eschewing that process, hybrid database engines are poised to reduce implementation, management and storage costs and, ultimately, to enable real-time Big Data analytics. This dissertation addresses hybrid database systems, tackling some of the inherent functional and non-functional challenges associated with providing real-time analytics. It does so by focusing on a particular class of analytical functions known as Window Functions. We used this class of analytical functions as a vehicle to understand and address the low-latency requirements of hybrid systems, taking a highly scalable, cloud-based operational database as the foundation. We equipped that database with the ability to compute analytical functions and developed new algorithms to account for the highly distributed scenario. We devised a new metric and evaluation system specifically targeted at assessing hybrid database systems, and showed that the resulting prototype is able to meet current requirements. Each of these achievements is presented as a novel contribution that addresses the proposed challenges and paves the way for a real-time analytics database.
Main authors:Coelho, Fábio André Castanheira Luís
Subject:Engineering and Technology::Electrical Engineering, Electronics and Informatics
Year:2018
Country:Portugal
Document type:doctoral thesis
Access type:open access
Associated institution:Universidade do Minho
Language:English
Source:RepositóriUM - Universidade do Minho