The HOPSA Workflow and Tools

Research output: Contribution to book/conference proceedings/anthology/report; Conference contribution; Contributed; peer-review

Contributors

  • Bernd Mohr, Jülich Research Centre (Author)
  • Vladimir Voevodin, Lomonosov Moscow State University (Author)
  • Judit Giménez, Barcelona Supercomputing Center (Author)
  • Erik Hagersten, Rogue Wave Software AB (RWAB) (Author)
  • Andreas Knüpfer, Center for Information Services and High Performance Computing (ZIH) (Author)
  • Dmitry A. Nikitenko, Lomonosov Moscow State University (Author)
  • Mats Nilsson, Rogue Wave Software AB (RWAB) (Author)
  • Harald Servat, Barcelona Supercomputing Center (Author)
  • Aamer Shah, German Research School for Simulation Sciences (Author)
  • Frank Winkler, Center for Information Services and High Performance Computing (ZIH) (Author)
  • Felix Wolf, German Research School for Simulation Sciences (Author)
  • Ilya Zhukov (Author)

Abstract

To maximise the scientific output of a high-performance computing system, different stakeholders pursue different strategies. While individual application developers are trying to shorten the time to solution by optimising their codes, system administrators are tuning the configuration of the overall system to increase its throughput. Yet, the complexity of today's machines with their strong interrelationship between application and system performance presents serious challenges to achieving these goals. The HOPSA project (HOlistic Performance System Analysis) therefore sets out to create an integrated diagnostic infrastructure for combined application and system-level tuning -- with the former provided by the EU and the latter by the Russian project partners. Starting from system-wide basic performance screening of individual jobs, an automated workflow routes findings on potential bottlenecks either to application developers or system administrators with recommendations on how to identify their root cause using more powerful diagnostic tools. Developers can choose from a variety of mature performance-analysis tools developed by our consortium. Within this project, the tools will be further integrated and enhanced with respect to scalability, depth of analysis, and support for asynchronous tasking, a node-level paradigm playing an increasingly important role in hybrid programs on emerging hierarchical and heterogeneous systems.

Details

Original language: Undefined
Title of host publication: Tools for High Performance Computing 2012
Editors: Alexey Cheptsov, Steffen Brinkmann, José Gracia, Michael M. Resch, Wolfgang E. Nagel
Place of publication: Berlin, Heidelberg
Publisher: Springer, Berlin [u. a.]
Pages: 127-146
Number of pages: 20
ISBN (print): 978-3-642-37349-7
Publication status: Published - 2013
Peer-reviewed: Yes