fix(bthread): refactor sharded priority queue with lock-free MPSC inb…#3270
yannan-wyn wants to merge 1 commit into apache:master from
Conversation
Pull request overview
Refactors the bthread “global priority” scheduling path to avoid unsafe multi-producer use of WorkStealingQueue by introducing a sharded design with an inbound lock-free queue per shard, plus adds new unit tests and a microbenchmark.
Changes:
- Introduces a `PriorityShard` per tag and routes priority submissions via shard inbound queues with an owner bind/unbind lifecycle.
- Updates scheduling/stealing logic to prefer owner shards, steal across shards, and provide fallback behavior during teardown.
- Adds unit tests for correctness/concurrency/owner changes and a microbenchmark suite for the new primitives.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| test/bthread_priority_queue_unittest.cpp | Adds correctness + concurrent-producer tests for priority queue behavior. |
| test/bthread_priority_queue_owner_unittest.cpp | Adds tests around dynamic worker/owner binding behavior under load. |
| test/bthread_priority_queue_benchmark.cpp | Adds microbenchmarks for inbound enqueue/dequeue, flush, steal, and baseline comparisons. |
| src/bthread/task_group.h | Adds _priority_shard_index for O(1) lookup of a TaskGroup’s owned shard. |
| src/bthread/task_control.h | Adds PriorityShard definition and TaskControl helper APIs for shard management. |
| src/bthread/task_control.cpp | Implements sharded priority queue logic, owner lifecycle, flushing, stealing, and fallback enqueue behavior. |
@wwbmmm
LGTM
#2819 #3078
@yannan-wyn Could you provide some performance test data, like in #2819?
Diff context:

```cpp
if (_priority_queues[tag].steal(tid)) {
    return true;
// priority queue: owner-first, then steal from other shards
if (_enable_priority_queue && !_priority_shards[tag].empty()) {
```
Wouldn't it be better to encapsulate the priority_queue-related logic in a separate function?
Hi @chenBright, I ran the e2e benchmark and attached the results below. Params:
Summary: Latency (us) — 10k QPS
Latency (us) — 50k QPS
Latency (us) — 100k QPS
perf stat (50k QPS, 30s)
Force-pushed 8c6f4a6 to 8c434ca
Each EventDispatcher gets its own WorkStealingQueue, making concurrent push from multiple EDs naturally SPMC-safe without spinlocks.
Force-pushed 8c434ca to 886cc48



What problem does this PR solve?
Problem Summary:
The original priority queue implementation misuses `WorkStealingQueue`: multiple producers call `push()` concurrently, but `push()` is designed for single-owner use only. This leads to potential data races under contention.

See also: #2819, #3055, #3078, #3096
What is changed and the side effects?
Changed:
Why per-ED WSQ instead of sharded MPSC?
Side effects:
`FLAGS_enable_bthread_priority_queue` (default: false).