Publicação
Implementing a linear algebra approach to data processing
| Resumo: | Data analysis is among the main strategies of our time for enterprises to take advantage of the vast amounts of data their systems generate and store everyday. Thus the standard relational database model is challenged everyday to cope with quantitative operations over a traditionally qualitative, relational model. A novel approach to the semantics of data is based on (typed) linear algebra (LA), rather than relational algebra, bridging the gap between data dimensions and data measures in a unified way. Also, this bears the promise of increased parallelism, as most operations in LA admit a ‘divide & conquer’ implementation. This paper presents a first experiment in implementing such a typed linear algebra approach and testing its performance on a data distributed system. It presents solutions to some theoretical limitations and evaluates the overall performance. |
|---|---|
| Autores principais: | Pontes, Rogério |
| Outros Autores: | Matos, Miguel Ângelo Marques; Oliveira, José Nuno Fonseca; Pereira, José |
| Assunto: | Big data Formal methods Hive Linear algebra Map reduce |
| Ano: | 2017 |
| País: | Portugal |
| Tipo de documento: | comunicação em conferência |
| Tipo de acesso: | acesso restrito |
| Instituição associada: | Universidade do Minho |
| Idioma: | inglês |
| Origem: | RepositóriUM - Universidade do Minho |
| Resumo: | Data analysis is among the main strategies of our time for enterprises to take advantage of the vast amounts of data their systems generate and store everyday. Thus the standard relational database model is challenged everyday to cope with quantitative operations over a traditionally qualitative, relational model. A novel approach to the semantics of data is based on (typed) linear algebra (LA), rather than relational algebra, bridging the gap between data dimensions and data measures in a unified way. Also, this bears the promise of increased parallelism, as most operations in LA admit a ‘divide & conquer’ implementation. This paper presents a first experiment in implementing such a typed linear algebra approach and testing its performance on a data distributed system. It presents solutions to some theoretical limitations and evaluates the overall performance. |
|---|