BWoS: Formally Verified Block-based Work Stealing for Parallel Processing

Jiawei Wang; Bohdan Trach; Ming Fu; Diogo Behrens; Jonathan Schwender; Yutao Liu; Jitang Lei; Viktor Vafeiadis; Hermann Härtig; Haibo Chen

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Jiawei Wang - Professor (rtd.) of Operating Systems, Chair of Computer Architecture, Dresden Research Lab Huawei Technologies (Author)
Bohdan Trach - Chair of Systems Engineering, Dresden Research Lab Huawei Technologies (Author)
Ming Fu - Dresden Research Lab Huawei Technologies (Author)
Diogo Behrens - Dresden Research Lab Huawei Technologies (Author)
Jonathan Schwender - Dresden Research Lab Huawei Technologies (Author)
Yutao Liu - Dresden Research Lab Huawei Technologies (Author)
Jitang Lei - Dresden Research Lab Huawei Technologies (Author)
Viktor Vafeiadis - Max Planck Institute for Software Systems (Author)
Hermann Härtig - Professor (rtd.) of Operating Systems (Author)
Haibo Chen - Dresden Research Lab Huawei Technologies, Shanghai Jiao Tong University (Author)

Abstract

Work stealing is a widely used scheduling technique for parallel processing on multicore systems. Each core owns a queue of tasks and avoids idling by stealing tasks from other queues. Prior work mostly focuses on balancing workload among cores, disregarding whether stealing may adversely impact the owner’s performance or hinder synchronization optimizations. Real-world industrial runtimes for parallel processing rely heavily on work-stealing queues for scalability, and such queues can become performance bottlenecks. We present Block-based Work Stealing (BWoS), a novel and pragmatic design that splits per-core queues into multiple blocks. Thieves and owners rarely operate on the same blocks, which largely removes interference and enables aggressive optimizations of the owner’s synchronization with thieves. Furthermore, BWoS enables a novel probabilistic stealing policy that guarantees thieves steal from longer queues with higher probability. In our evaluation, BWoS improves performance by up to 1.25x in the Renaissance macrobenchmark when applied to Java G1GC, provides an average 1.26x speedup in JSON processing when applied to the Go runtime, and improves the maximum throughput of the Hyper HTTP server by 1.12x when applied to the Rust Tokio runtime. In microbenchmarks, it provides 8-11x better performance than state-of-the-art designs. We have formally verified and optimized BWoS on weak memory models with a model-checking-based framework.
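
The idea of a stealing policy that favors longer queues can be illustrated with a small sketch. This is not taken from the paper and should not be read as the BWoS policy itself: it assumes a simple "sample two victims, keep the longer one" rule, and the function name `pick_victim` and the plain list of queue lengths are hypothetical, chosen only for illustration. Real runtimes would read (possibly approximate) queue sizes concurrently rather than from a list.

```python
import random


def pick_victim(queue_lengths, thief, rng=random):
    """Pick a queue index to steal from, or None if nothing can be stolen.

    Illustrative two-choice rule (an assumption, not the paper's policy):
    sample two distinct non-empty queues other than the thief's own and
    return the longer of the two, so longer queues are chosen more often.
    """
    candidates = [
        i for i, length in enumerate(queue_lengths)
        if i != thief and length > 0
    ]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    a, b = rng.sample(candidates, 2)
    # Ties go to the first sampled queue.
    return a if queue_lengths[a] >= queue_lengths[b] else b
```

For example, with queue lengths `[0, 1, 5, 20]` and core 0 acting as the thief, the queue of length 20 wins every pair it appears in, the queue of length 5 wins only against the queue of length 1, and the shortest queue is never selected.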

Details

Original language	English
Title of host publication	Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2023
Publisher	USENIX Association
Pages	833-850
Number of pages	18
ISBN (electronic)	9781939133342
Publication status	Published - 2023
Peer-reviewed	Yes

Publication series

Series	USENIX Annual Technical Conference (ATC)

Conference

Title	17th USENIX Symposium on Operating Systems Design and Implementation
Abbreviated title	OSDI 2023
Conference number	17
Duration	10 - 12 July 2023
Website	https://www.usenix.org/conference/osdi23
Location	Sheraton Boston Hotel
City	Boston
Country	United States of America

Research Portal of the TU Dresden
