Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Contributors

  • Elham Azhir, Islamic Azad University, Mobile Telecommunication Company of Iran (Author)
  • Mehdi Hosseinzadeh, Duy Tan University, University of Human Development (Author)
  • Faheem Khan, Gachon University (Author)
  • Amir Mosavi, Technische Universität Dresden, Óbuda University, Slovak University of Technology (Author)

Abstract

Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this method, the query optimizer divides the query space into clusters. However, traditional clustering algorithms require significant execution time to cluster large query datasets. The MapReduce distributed computing model provides efficient solutions for storing and processing vast quantities of data. In the present investigation, the Apache Spark and Apache Hadoop frameworks are used to cluster query datasets of different sizes in the MapReduce-based access plan recommendation method. Performance is evaluated based on execution time. The experimental results demonstrate the effectiveness of parallel query clustering in achieving high scalability. Furthermore, Apache Spark outperformed Apache Hadoop, reaching an average speedup of 2x.
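
To make the idea of MapReduce-style query clustering concrete, the following is a minimal single-machine sketch of one clustering iteration split into a map phase and a reduce phase. It assumes k-means as the clustering algorithm and small numeric feature vectors as the query representation; the abstract names neither, so both are illustrative assumptions, and the function names (`map_phase`, `reduce_phase`, `kmeans_step`) are hypothetical rather than taken from the paper, Hadoop, or Spark.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for each feature vector."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    updated = list(centroids)  # a centroid with no assigned points is left unchanged
    for idx, members in groups.items():
        updated[idx] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return updated


def kmeans_step(points, centroids):
    """One k-means iteration expressed as map followed by reduce."""
    return reduce_phase(map_phase(points, centroids), centroids)


# Hypothetical 2-D feature vectors standing in for encoded queries.
queries = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = kmeans_step(queries, [(0.0, 0.0), (10.0, 10.0)])
```

In a distributed setting the map phase would run in parallel over partitions of the query dataset and the reduce phase would aggregate per cluster, which is the part of the workload that frameworks such as Hadoop and Spark parallelize.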

Details

Original language: English
Article number: 3517
Number of pages: 11
Journal: Mathematics
Volume: 10
Issue number: 19
Publication status: Published - 26 Sept. 2022
Peer-review status: Yes

Subject terms

Keywords

  • access plan recommendation, Apache Hadoop, Apache Spark, artificial intelligence, big data, cloud computing, data science, MapReduce, parallel processing, soft computing

Library subject terms