Publication
Towards a transactional and analytical data management system for Big Data
| Abstract: | Hybrid database systems are on the verge of making Big Data analytics a reality. This new class of database systems bypasses the traditional methodologies used to update data on the analytical processing engine, so that such processing is computed directly on top of production data. Building a unified database engine that can achieve scalable analytics while simultaneously keeping a steady operational capacity requires overcoming some current system hurdles, namely the Extract, Transform and Load (ETL) process. By eschewing that process, hybrid database engines are poised to reduce implementation, management and storage costs and, ultimately, to enable real-time Big Data analytics. This dissertation addresses hybrid database systems, tackling in particular some of the inherent functional and non-functional challenges associated with the provision of real-time analytics. This was achieved by focusing on a particular class of analytical functions known as Window Functions. We used this class of analytical functions as a vehicle to understand and address the low-latency requirements of hybrid systems, taking a highly scalable, cloud-based operational database as the foundation. As we equipped it with the ability to compute analytical functions, new algorithms were developed to account for the highly distributed scenario. We devised a new metric and evaluation system specifically targeted at assessing hybrid database systems, showing that the resulting prototype is able to meet current requirements. Each of these achievements is presented as a novel contribution that addresses the proposed challenges and paves the way for a real-time analytics database. |
|---|---|
| Main authors: | Coelho, Fábio André Castanheira Luís |
| Subject: | Engineering and Technology::Electrical Engineering, Electronic Engineering, Information Engineering |
| Year: | 2018 |
| Country: | Portugal |
| Document type: | doctoral thesis |
| Access type: | open access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |