DataCalc: Ad-hoc Analyses on Heterogeneous Data Sources

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Abstract

Storing and processing data at different locations, using a heterogeneous set of formats and data management systems, is state of the art in many organizations. However, data analyses can often provide better insight when data from several sources is integrated into a combined perspective. In this paper, we present an overview of our data integration system, DataCalc. DataCalc is an extensible integration platform that executes ad-hoc analytical queries on a set of heterogeneous data processors. Our novel platform uses an expressive function shipping interface that promotes local computation and reduces data movement between processors. We discuss the overall architecture and the main components of DataCalc. Moreover, we discuss the cost of integrating additional processors and evaluate the overall performance of the platform.
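
The function shipping idea mentioned in the abstract can be illustrated with a minimal sketch. It is not DataCalc's actual interface, which this record does not describe: the `Processor` classes, the `execute` method, and the sample data are all hypothetical. The sketch only shows the general principle that a small aggregation function travels to each data source and only compact partial results travel back.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Totals = Dict[str, float]


class Processor:
    """Hypothetical data processor that keeps its data in a native format."""

    def rows(self) -> Iterator[Row]:
        raise NotImplementedError

    def execute(self, fn: Callable[[Iterable[Row]], Totals]) -> Totals:
        # Function shipping: run fn next to the data and return only its result.
        return fn(self.rows())


class CsvProcessor(Processor):
    """Holds its data as CSV text (assumed format, for illustration only)."""

    def __init__(self, text: str) -> None:
        self.text = text

    def rows(self) -> Iterator[Row]:
        for rec in csv.DictReader(io.StringIO(self.text)):
            yield {"region": rec["region"], "amount": float(rec["amount"])}


class RecordProcessor(Processor):
    """Holds its data as in-memory records (assumed format, for illustration only)."""

    def __init__(self, records: List[Row]) -> None:
        self.records = records

    def rows(self) -> Iterator[Row]:
        return iter(self.records)


def partial_sum_by_region(rows: Iterable[Row]) -> Totals:
    # The shipped function: aggregates locally, so raw rows never leave the source.
    totals: Totals = {}
    for row in rows:
        region = str(row["region"])
        totals[region] = totals.get(region, 0.0) + float(row["amount"])
    return totals


def merge_totals(partials: Iterable[Totals]) -> Totals:
    # The coordinator only combines the small per-source results.
    merged: Totals = {}
    for partial in partials:
        for region, amount in partial.items():
            merged[region] = merged.get(region, 0.0) + amount
    return merged


def total_by_region(processors: Iterable[Processor]) -> Totals:
    return merge_totals(p.execute(partial_sum_by_region) for p in processors)


csv_source = CsvProcessor("region,amount\nnorth,10.5\nsouth,4.0\nnorth,1.5\n")
record_source = RecordProcessor(
    [{"region": "south", "amount": 6.0}, {"region": "east", "amount": 2.0}]
)

result = total_by_region([csv_source, record_source])
print(result)
```

In this toy setup each source returns one entry per region instead of every row, which is the kind of reduction in data movement the abstract refers to. A real platform would additionally have to translate the shipped function into each processor's native query language.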

Details

Original language: English
Title: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: IEEE, New York [and others]
Pages: 463-468
Number of pages: 6
ISBN (electronic): 9781728108582
Publication status: Published - Dec. 2019
Peer-review status: Yes

Publication series

Series: 2019 IEEE International Conference on Big Data (Big Data)

Conference

Title: 2019 IEEE International Conference on Big Data, Big Data 2019
Duration: 9 - 12 December 2019
City: Los Angeles
Country: USA/United States

External IDs

Scopus: 85081362423
ORCID: /0000-0001-8107-2775/work/142253464