A case study on machine learning for synthesizing benchmarks

Research output: Contribution to book/conference proceedings/anthology/report > Conference contribution > Contributed > peer-review

Contributors

Abstract

Good benchmarks are hard to find because keeping them representative of the constantly changing challenges of a particular field requires substantial effort. Synthetic benchmarks are a common way to address this, and machine learning methods are natural candidates for synthetic benchmark generation. In this paper, we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method that uses the raw data directly. We investigate the reasons for this and provide further insights into the challenges the problem could pose for future generators.

Details

Original language: English
Title of host publication: MAPL 2019 - Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, co-located with PLDI 2019
Editors: Tim Mattson, Abdullah Muzahid, Armando Solar-Lezama
Publisher: Association for Computing Machinery (ACM), New York
Pages: 38-46
Number of pages: 9
ISBN (electronic): 9781450367196
Publication status: Published - 22 Jun 2019
Peer-reviewed: Yes

Publication series

Series: PLDI: Programming Language Design and Implementation

Conference

Title: 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, MAPL 2019, co-located with PLDI 2019
Duration: 22 June 2019
City: Phoenix
Country: United States of America

External IDs

ORCID /0000-0002-5007-445X/work/141545540

Keywords

  • Benchmarking, CLGen, Generative models, Machine Learning, Synthetic program generation