Document details

Mapper: An Efficient Data Transformation Operator

Author(s): Carreira, Paulo J.F.

Date: 2008

Persistent ID: http://hdl.handle.net/10451/14295

Origin: Repositório da Universidade de Lisboa

Subject(s): Relational Algebra; Data Transformation; Data Integration; Data Cleaning; Data Warehousing


Description

Data transformations are fundamental operations in legacy data migration, data integration, data cleaning, and data warehousing. These operations are often implemented as relational queries in order to leverage the optimization capabilities of most DBMSs. However, relational query languages like SQL are not expressive enough to specify one-to-many data transformations, an important class of data transformations that produce several output tuples for a single input tuple. These transformations are required for resolving several types of data heterogeneity, such as those that occur when the source data represents aggregations of the target data. This thesis proposes a new relational operator, named data mapper, as an extension to the relational algebra to address one-to-many data transformations, and focuses on its optimization. It also provides algebraic rewriting rules for logical optimization and execution algorithms for physical optimization. As a result, queries may be expressed as a combination of standard relational operators and mappers. The proposed optimizations have been experimentally validated, and the key factors that influence the obtained performance gains have been identified.
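To make the notion of a one-to-many transformation concrete, the following is a minimal sketch in Python. It is not the thesis's mapper operator or its formal definition; the helper `apply_mapper`, the rule `split_into_months`, and the field names `account`, `yearly_total`, `month`, and `amount` are all hypothetical, chosen only to illustrate a source row that holds an aggregate (a yearly total) being expanded into several target rows (one per month).

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def apply_mapper(rows: Iterable[Row],
                 fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply fn to each input row and yield every output row it produces.

    Hypothetical helper: one input row may yield zero, one, or many rows.
    """
    for row in rows:
        yield from fn(row)


def split_into_months(row: Row) -> List[Row]:
    """Hypothetical rule: expand one yearly total into 12 equal monthly rows."""
    monthly = row["yearly_total"] / 12
    return [
        {"account": row["account"], "month": m, "amount": monthly}
        for m in range(1, 13)
    ]


# Illustrative source data: each row aggregates a whole year.
source = [
    {"account": "A", "yearly_total": 1200.0},
    {"account": "B", "yearly_total": 600.0},
]

target = list(apply_mapper(source, split_into_months))
print(len(target))  # 24: two input rows, twelve output rows each
```

Here the number of output rows per input row is fixed at twelve; in general it could depend on the contents of each input row, which is part of what makes such transformations awkward to express with standard relational operators alone.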

Document Type: Doctoral thesis
Language: Portuguese
Advisor(s): Galhardas, Helena Isabel de Jesus; Silva, Mário Jorge Costa Gaspar da
Contributor(s): Repositório da Universidade de Lisboa
