DataCalc: Ad-hoc Analyses on Heterogeneous Data Sources
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Contributors
Abstract
Storing and processing data at different locations using a heterogeneous set of formats and data management systems is state-of-the-art in many organizations. However, data analyses can often provide better insight when data from several sources is integrated into a combined perspective. In this paper we present an overview of our data integration system DataCalc. DataCalc is an extensible integration platform that executes ad-hoc analytical queries on a set of heterogeneous data processors. Our novel platform uses an expressive function shipping interface that promotes local computation and reduces data movement between processors. In this paper, we provide a discussion of the overall architecture and the main components of DataCalc. Moreover, we discuss the cost of integrating additional processors and evaluate the overall performance of the platform.
Details
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 |
| Editors | Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Pages | 463-468 |
| Number of pages | 6 |
| ISBN (electronic) | 9781728108582 |
| Publication status | Published - Dec 2019 |
| Peer-reviewed | Yes |
Publication series
| Series | IEEE International Conference on Big Data |
|---|---|
Conference
| Title | 2019 IEEE International Conference on Big Data |
|---|---|
| Abbreviated title | IEEE Big Data 2019 |
| Duration | 9 - 12 December 2019 |
| City | Los Angeles |
| Country | United States of America |
External IDs
| Scopus | 85081362423 |
|---|---|
| ORCID | /0000-0001-8107-2775/work/142253464 |