A case study on machine learning for synthesizing benchmarks

Andrés Goens; Chris Cummins; Alexander Brauckmann; Hugh Leather; Sebastian Ertel; Jeronimo Castrillon

doi:10.1145/3315508.3329976

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Andrés Goens, Chair of Compiler Construction (cfaed) (Author)
Chris Cummins, University of Edinburgh (Author)
Alexander Brauckmann, Chair of Compiler Construction (cfaed) (Author)
Hugh Leather, University of Edinburgh (Author)
Sebastian Ertel, Chair of Compiler Construction (cfaed) (Author)
Jeronimo Castrillon, Chair of Compiler Construction (cfaed) (Author)

Abstract

Good benchmarks are hard to find because they require substantial effort to keep them representative of the constantly changing challenges of a particular field. Synthetic benchmarks are a common approach to deal with this, and methods from machine learning are natural candidates for synthetic benchmark generation. In this paper we investigate the usefulness of machine learning in the prominent CLgen benchmark generator. We re-evaluate CLgen by comparing the benchmarks generated by the model with the raw data used to train it. This re-evaluation indicates that, for the use case considered, machine learning did not yield additional benefit over a simpler method using the raw data. We investigate the reasons for this and provide further insights into the challenges the problem could pose for potential future generators.

Details

Original language	English
Title of host publication	MAPL 2019 - Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, co-located with PLDI 2019
Editors	Tim Mattson, Abdullah Muzahid, Armando Solar-Lezama
Publisher	Association for Computing Machinery (ACM), New York
Pages	38-46
Number of pages	9
ISBN (electronic)	9781450367196
Publication status	Published - 22 Jun 2019
Peer-reviewed	Yes

Publication series

Series	PLDI: Programming Language Design and Implementation

Workshop

Title	3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages
Abbreviated title	MAPL 2019
Conference number	3
Description	co-located with PLDI 2019
Duration	22 June 2019
Website	https://pldi19.sigplan.org/home/mapl-2019
Degree of recognition	International event
Location	Phoenix Convention Center
City	Phoenix
Country	United States of America

External IDs

ORCID	/0000-0002-5007-445X/work/141545540

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

ASJC Scopus subject areas

Software

Keywords

Benchmarking, CLGen, Generative models, Machine Learning, Synthetic program generation
