Distributed data warehouse resource monitoring

Martins, Pedro; Sá, Filipe; Caldeira, Filipe; Abbasi, Maryam

http://hdl.handle.net/10400.19/7847

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
Artigo_Conf_010.pdf		620.77 KB	Adobe PDF

Abstract

In this paper, we investigate the problem of providing scalability (scale-out and scale-in) to the Extraction, Transformation, Load (ETL) and Querying (Q) processes of data warehouses, referred to together as ETL+Q. In general, data loading, transformation, and integration are heavy tasks that are performed only periodically, in batches, rather than row by row. Parallel architectures and mechanisms can optimize the ETL process by speeding up each part of the pipeline as more performance is needed. We propose parallelization solutions for each part of ETL+Q and integrate them into a framework that enables automatic scalability and data freshness for any data warehouse and ETL+Q process. Our results show that the proposed system's algorithms can scale to provide the desired processing speed in both big-data and small-data scenarios.

Keywords

Scalability; monitoring; actuate; ETL; data warehouse

URI

http://hdl.handle.net/10400.19/7847

Collections

ESTGV - DI - Paper in international scientific conference proceedings
