A Comparative Study on the Accuracy and the Speed of Static and Dynamic Program Classifiers

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Abstract

Classifying programs based on their tasks is essential in fields such as plagiarism detection, malware analysis, and software auditing. Traditionally, two classification approaches exist: static classifiers analyze program syntax, while dynamic classifiers observe program execution. Although dynamic analysis is regarded as more precise, it is often considered impractical due to its high overhead, which has led the research community to largely dismiss it. In this paper, we revisit this perception by comparing static and dynamic analyses using the same classification representation: opcode histograms. We show that dynamic histograms, generated from the instructions actually executed, are only marginally (4-5%) more accurate than static histograms in non-adversarial settings. However, if an adversary is allowed to obfuscate programs, the accuracy of the dynamic classifier is twice that of the static one, because the dynamic classifier never observes dead code. Obtaining dynamic histograms with a state-of-the-art Valgrind-based tool incurs an 85x slowdown; however, once we account for the time needed to produce the representations for static analysis of executables, the overall slowdown drops to 4x, a result significantly lower than previously reported in the literature.
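To make the shared representation concrete, the following minimal sketch builds an opcode histogram from a list of instructions and contrasts a static view with a dynamic one. It is an illustration only, not the authors' tool: the toy instruction listing, the trace, and the helper name `opcode_histogram` are all hypothetical, and a real dynamic histogram would come from an instrumentation tool such as the Valgrind-based one mentioned in the abstract.

```python
from collections import Counter


def opcode_histogram(instructions):
    """Count how often each opcode (the first token of an instruction) occurs."""
    return Counter(ins.split()[0] for ins in instructions if ins.strip())


# Hypothetical toy program as a static tool would see it: every instruction
# in the binary is counted, including code that can never run.
static_listing = [
    "mov eax, 1",
    "add eax, 2",
    "jmp end",
    "xor ebx, ebx",  # dead code: skipped by the jump above
    "mul ebx",       # dead code
    "ret",
]

# Hypothetical execution trace of the same program: only the instructions
# that actually ran are counted, so the dead code leaves no footprint.
dynamic_trace = [
    "mov eax, 1",
    "add eax, 2",
    "jmp end",
    "ret",
]

static_hist = opcode_histogram(static_listing)
dynamic_hist = opcode_histogram(dynamic_trace)

print("static: ", dict(static_hist))
print("dynamic:", dict(dynamic_hist))
```

In this toy case the `xor` and `mul` opcodes appear only in the static histogram. This is the kind of discrepancy an obfuscator could exploit by inserting dead code, which is consistent with the abstract's explanation of why the dynamic classifier is more robust in adversarial settings.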

Details

Original language: English
Title of host publication: CC 2025 - Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction
Editors: Daniel Kluss, Sara Achour, Jens Palsberg
Publisher: Association for Computing Machinery, Inc
Pages: 13-24
Number of pages: 12
ISBN (electronic): 9798400714078
Publication status: Published - 25 Feb 2025
Peer-reviewed: Yes

Conference

Title: 34th ACM SIGPLAN International Conference on Compiler Construction
Abbreviated title: CC 2025
Conference number: 34
Description: co-located with CGO, PPoPP and HPCA
Duration: 1 - 2 March 2025
Location: Westin Las Vegas
City: Las Vegas
Country: United States of America

External IDs

ORCID /0000-0002-5007-445X/work/190572579

Keywords

  • Binary Diffing, Classification, Valgrind