Thread-parallel Java applications received a substantial boost in the research field of High Performance Computing over the past years. In order to efficiently execute multithreaded Java applications, an analysis of their performance is indispensable. This requires a scalable runtime performance measurement infrastructure due to the high number of used threads. The established, open-source tracing framework Score-P provides such an infrastructure. However, it only provides basic support for multi-threaded Java applications so far. In this paper, we present a more sophisticated instrumentation approach based on Java bytecode transformations and implement the approach in Score-P. The approach allows to trace an application without requiring users to modify their source code. In contrast to instrumentation approaches based on the Java Virtual Machine Tool Interface JVMTI, it does not need filter checking and thus, has a comparable low run-time overhead. However, if needed, function and thread related events of the application can be filtered at runtime. Additionally, class files can be selected such that only those classes of an application are recorded, which users are interested in. We apply the proposed instrumentation approach to the LU kernel of the established Java benchmark suite SPECjvm2008 at a modern HPC shared-memory machine and show the quality of the implementation by determining the tracing overheads of the instrumented versions for different test scenarios using varying numbers of Java threads.
|Publication status||Published - 1 May 2017|
Research priority areas of TU Dresden
- instrumentation, Java, JVMTI, Multi-Threaded, performance analysis, profiling, Score-P, tracing