Skip to content

Dev runtime partition prune#62589

Open
BiteTheDDDDt wants to merge 11 commits intomasterfrom
dev_runtime_partition_prune
Open

Dev runtime partition prune#62589
BiteTheDDDDt wants to merge 11 commits intomasterfrom
dev_runtime_partition_prune

Conversation

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor

@BiteTheDDDDt BiteTheDDDDt commented Apr 17, 2026

-- =============================================================================
-- PART A — RANGE-partitioned fact table
--   4096 partitions, ~50M rows
-- =============================================================================

DROP TABLE IF EXISTS rf_perf_fact;
CREATE TABLE rf_perf_fact (
    id        BIGINT       NOT NULL,
    part_col  INT          NOT NULL,
    payload   VARCHAR(64)  NOT NULL
)
DUPLICATE KEY(id)
PARTITION BY RANGE(part_col) (
    -- 4096 buckets of width 1024 covering [0, 4096*1024)
    FROM (0) TO (4194304) INTERVAL 1024
)
DISTRIBUTED BY HASH(id) BUCKETS 4
PROPERTIES ("replication_num" = "1");

-- 50M rows; part_col uniformly spread across the 4096 partitions
INSERT INTO rf_perf_fact
SELECT
    number                              AS id,
    CAST(number % 4194304 AS INT)       AS part_col,
    CONCAT('p', CAST(number AS STRING)) AS payload
FROM numbers("number" = "50000000");

DROP TABLE IF EXISTS rf_perf_dim;
CREATE TABLE rf_perf_dim (
    dim_key  INT NOT NULL
)
DUPLICATE KEY(dim_key)
DISTRIBUTED BY HASH(dim_key) BUCKETS 1
PROPERTIES ("replication_num" = "1");

-- 5 keys, all in [0, 1024)  => all hit partition #0
INSERT INTO rf_perf_dim VALUES (1), (7), (42), (100), (999);

ANALYZE TABLE rf_perf_fact WITH SYNC;
ANALYZE TABLE rf_perf_dim  WITH SYNC;

-- =============================================================================
-- Baseline: feature OFF — BE must scan every partition and filter row by row
-- =============================================================================
SET enable_runtime_filter_partition_prune = false;

-- warm up file cache / page cache
SELECT COUNT(*) FROM rf_perf_fact f JOIN rf_perf_dim d ON f.part_col = d.dim_key;

SELECT /*+ SET_VAR(runtime_filter_type='IN_OR_BLOOM_FILTER') */
       'rf_prune_OFF' AS tag,
       COUNT(*)       AS matched_rows
FROM rf_perf_fact f
JOIN rf_perf_dim  d ON f.part_col = d.dim_key;

-- =============================================================================
-- Optimized: feature ON — BE drops 1022 of 1024 partitions before scanning
-- =============================================================================
SET enable_runtime_filter_partition_prune = true;

-- warm up under the new path too
SELECT COUNT(*) FROM rf_perf_fact f JOIN rf_perf_dim d ON f.part_col = d.dim_key;

SELECT /*+ SET_VAR(runtime_filter_type='IN_OR_BLOOM_FILTER') */
       'rf_prune_ON' AS tag,
       COUNT(*)      AS matched_rows
FROM rf_perf_fact f
JOIN rf_perf_dim  d ON f.part_col = d.dim_key;

before:

- ExecTime: 468.760ms
- NumScanners: 2.048K (2048)
- TabletNum: 2.048K (2048)

after:

- ExecTime: 43.52ms
- TabletsPrunedByRuntimeFilter: 2.047K (2047)
- NumScanners: 1
Fragment RPC Phase1 Time: 247ms -> 335ms
Wait and Fetch Result Time: 541ms ->35ms
Total: 868ms -> 450ms

BiteTheDDDDt and others added 2 commits April 17, 2026 11:36
fix

fix

[improvement](be) Add thrift definitions for BE-side runtime filter partition pruning

Problem Summary: Add thrift structures needed for BE-side runtime filter
partition pruning: partition_boundaries container in TOlapScanNode,
TTargetExprMonotonicity enum, and target_expr_monotonicity field in
TRuntimeFilterDesc. Also add detailed comments to existing
TPartitionBoundary and partiton_to_tablets fields.

Key design decisions reflected in this thrift change:
- No is_default_partition field: FE omits partitions it does not want pruned
- No is_range_partition field: partition type inferred from TPartitionBoundary
  field presence (range_start/range_end → Range, list_values → List)
- Single TTargetExprMonotonicity enum (NON_MONOTONIC / MONOTONIC_INCREASING /
  MONOTONIC_DECREASING) instead of two separate booleans

None

- Test: No need to test (thrift definition only, no runtime behavior change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 17, 2026 09:29
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds BE-side runtime-filter-based partition pruning for OLAP scans by having FE send partition boundary metadata to BE, then using runtime filter values to mark partitions (and associated tablets/scanners) as prunable during scan execution. It also introduces a dedicated regression test suite to validate pruning counters across range/list partitions and multiple data types.

Changes:

  • Extend Thrift plan structures to carry partition boundary descriptors (and related RF metadata) from FE to BE.
  • FE: populate partition boundary descriptors in OlapScanNode thrift; annotate runtime filter targets with a monotonicity hint.
  • BE: add a reusable runtime-filter partition pruner, integrate it into scan local state and scanner scheduling, and expose profile counters for pruned/total partitions.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
regression-test/suites/query_p0/runtime_filter/rf_partition_pruning.groovy New regression tests asserting partition pruning via profile counters across partition types and expressions.
regression-test/data/query_p0/runtime_filter/rf_partition_pruning.out Golden output for the new regression suite queries.
regression-test/conf/regression-conf.groovy Changes default JDBC/HTTP ports used by regression framework configuration.
gensrc/thrift/PlanNodes.thrift Adds TPartitionBoundary, adds scan-node fields for partition pruning, and adds TTargetExprMonotonicity + RF descriptor field.
fe/fe-core/src/main/java/org/apache/doris/planner/RuntimeFilter.java Sets target_expr_monotonicity for eligible RF targets.
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java Populates partition_boundaries on TOlapScanNode based on selected partitions.
be/src/exec/scan/scanner_scheduler.cpp Stops scanner work early when its partition has been pruned after late RF processing.
be/src/exec/scan/scanner.h Adds a virtual hook for scanners to report partition-pruned state.
be/src/exec/scan/olap_scanner.{h,cpp} Implements partition-pruned checks for OLAP scanners.
be/src/exec/runtime_filter/runtime_filter_partition_pruner.{h,cpp} New BE component to parse boundaries and prune partitions based on RF conjuncts.
be/src/exec/operator/scan_operator.{h,cpp} Integrates pruner into scan local state and adds profile counters.
be/src/exec/operator/olap_scan_operator.{h,cpp} Parses FE-provided boundaries and filters tablets whose partitions are pruned.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +941 to +947
// Partition-id → tablet-id list mapping; used together with partition_boundaries
// for BE-side runtime filter partition pruning.
26: optional map<Types.TPartitionId, list<Types.TTabletId>> partiton_to_tablets
// Partition boundary descriptors for BE-side runtime filter partition pruning.
// Only partitions that are candidates for pruning are included; partitions FE
// does not want pruned (e.g. default catch-all) are omitted from this list.
27: optional list<TPartitionBoundary> partition_boundaries
Comment on lines +1425 to +1426
// When the target expression is a plain SlotRef (identity), FE may omit this
// field; BE treats an absent value as NON_MONOTONIC (conservative).
Comment on lines +293 to +316
for (RuntimeFilterTarget target : targets) {
if (target.expr instanceof SlotRef && target.node instanceof OlapScanNode) {
SlotRef slotRef = (SlotRef) target.expr;
Column col = slotRef.getColumn();
if (col != null) {
OlapScanNode scanNode = (OlapScanNode) target.node;
OlapTable table = scanNode.getOlapTable();
PartitionType partType = table.getPartitionInfo().getType();
if (partType == PartitionType.RANGE || partType == PartitionType.LIST) {
for (Column partCol : table.getPartitionInfo().getPartitionColumns()) {
if (partCol.getName().equalsIgnoreCase(col.getName())) {
tFilter.setTargetExprMonotonicity(
TTargetExprMonotonicity.MONOTONIC_INCREASING);
break;
}
}
}
}
}
// Only check the first target; if there are multiple targets on
// different tables, monotonicity may differ. We conservatively
// set it based on the first target only.
break;
}
Comment on lines +193 to +197
for (const auto& pb : boundaries_it->second) {
if (_pruned_partition_ids.contains(pb.partition_id) ||
newly_pruned.contains(pb.partition_id)) {
continue;
}
Comment on lines +27 to +28
jdbcUrl = "jdbc:mysql://127.0.0.1:9333/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round"
targetJdbcUrl = "jdbc:mysql://127.0.0.1:9333/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round"
feSyncerPassword = ""

feHttpAddress = "127.0.0.1:8030"
feHttpAddress = "127.0.0.1:8333"
@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

run buildall

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

OpenCode automated review failed and did not complete.

Error: Review step was skipped (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/24560451334

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

@hello-stephen
Copy link
Copy Markdown
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.48% (1798/2291)
Line Coverage 64.12% (32283/50345)
Region Coverage 65.03% (16235/24967)
Branch Coverage 55.54% (8675/15620)

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 84.54% (257/304) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.71% (27513/37328)
Line Coverage 57.47% (298118/518704)
Region Coverage 54.54% (247670/454115)
Branch Coverage 56.05% (107208/191288)

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 77.32% (75/97) 🎉
Increment coverage report
Complete coverage report

@github-actions
Copy link
Copy Markdown
Contributor

OpenCode automated review failed and did not complete.

Error: Review step was failure (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/24567893050

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is not ready to merge yet.

Key blocking issues:

  1. be/src/exec/operator/olap_scan_operator.cpp: the OLAP path ignores the existing enable_runtime_filter_partition_prune session variable, so SET enable_runtime_filter_partition_prune = false no longer disables runtime-filter partition pruning for OLAP scans.
  2. fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java: multi-column RANGE partitions can be pruned incorrectly because FE serializes only the first partition column as an independent range, but Doris range partition membership is lexicographic over the full PartitionKey.
  3. be/src/exec/runtime_filter/runtime_filter_partition_pruner.cpp: nullable LIST partitions are unsafe because FE now serializes NULL_LITERAL boundary values, while BE converts every boundary literal into a concrete CppType instead of preserving NULL semantics.

Critical checkpoint conclusions:

  • Goal of task: add BE-side runtime-filter partition pruning for OLAP scans. The simple single-column non-null cases are covered, but the implementation is not correct for all supported partition shapes and it regresses the existing disable switch.
  • Modification size/focus: mostly focused, but it also adds thrift/API surface (partition_to_tablets, planId_to_target_monotonicity) that is not consumed end-to-end in this PR.
  • Concurrency: no new deadlock or lock-order issue stood out in the reviewed path; the new pruning state uses the existing _conjuncts_lock plus a shared_mutex.
  • Lifecycle/static initialization: no special lifecycle or SIOF issue found.
  • Configuration items: the existing enable_runtime_filter_partition_prune behavior is regressed on the OLAP path.
  • Compatibility: the new thrift fields are optional, so wire compatibility looks acceptable.
  • Functionally parallel code paths: external/file scans still honor the session flag, while OLAP scans do not.
  • Special conditional checks: the multi-column RANGE handling is not safe as implemented.
  • Test coverage: good single-column happy-path coverage, but no multi-column RANGE coverage, no nullable LIST-partition coverage, and the suite avoids late-arrival timing by forcing runtime_filter_wait_infinitely=true and parallel_pipeline_task_num=1.
  • Test result file changes: the new .out file looks consistent with the cases the suite actually covers.
  • Observability: the added counters are useful; no blocking observability issue found.
  • Transaction/persistence/data-write checkpoints: not applicable here.
  • FE-BE variable passing: the new partition-boundary payload is wired through, but its semantics are incomplete for multi-column and nullable partition cases.
  • Performance: no separate blocking perf issue beyond the unexpected extra work caused by ignoring the disable flag.
  • Other issues: none beyond the blockers above.

Please fix the blockers above before approval.

@@ -91,6 +95,9 @@ Status OlapScanLocalState::init(RuntimeState* state, LocalStateInfo& info) {

RETURN_IF_ERROR(Base::init(state, info));
RETURN_IF_ERROR(_sync_cloud_tablets(state));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new OLAP path bypasses the existing enable_runtime_filter_partition_prune session flag. FileScanner still gates runtime-filter partition pruning on query_options().enable_runtime_filter_partition_prune, but here we always parse partition boundaries and later run _do_partition_pruning_by_rf() unconditionally from _process_conjuncts() / _on_runtime_filter_update(). As a result, SET enable_runtime_filter_partition_prune = false no longer disables the optimization for OLAP scans, which is a user-visible behavior regression.

}

private void addRangeBoundaries(List<TPartitionBoundary> boundaries, long partitionId,
RangePartitionItem rangeItem, List<Column> partColumns,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not safe for multi-column RANGE partitions. Doris range partition membership is based on lexicographic PartitionKey comparison, not on an independent range over the first column. For example, a legal partition like [(1, 1), (1, 5)) becomes a first-column boundary [1, 1), so BE will see an empty interval and can prune the partition even when k1 = 1 should still match rows inside it. Unless FE can serialize a provably correct projection, this optimization needs to be disabled for multi-column RANGE partitions.

bool is_range = tb.__isset.range_start || tb.__isset.range_end; \
if (!is_list && !is_range) break; \
ColumnValueRange<TYPE_##NAME> cvr(slot->col_name(), is_nullable, precision, scale); \
auto parse_texpr_node = [&](const TExprNode& node) -> std::pair<CppType, ColumnPtr> { \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE now emits NULL_LITERAL entries for nullable LIST partitions (OlapScanNode.addListBoundaries()), but this parser assumes every boundary literal can be converted into a concrete CppType. For a NULL_LITERAL, VLiteral builds a ColumnNullable; get_data_at(0) returns {nullptr, 0}, so the non-string branch dereferences a null pointer and the string branch collapses NULL to "". That makes nullable LIST partitions unsafe here. We need to propagate NULL through ColumnValueRange::contain_null() (or skip NULL boundaries for now) instead of treating it as an ordinary fixed value.

BiteTheDDDDt and others added 4 commits April 20, 2026 10:19
…time_filter_partition_prune

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: The new internal-table runtime filter partition pruning
feature on this branch was always emitting partition boundaries
(OlapScanNode) and per-target monotonicity (RuntimeFilter) into the
thrift plan, regardless of the existing session variable
`enable_runtime_filter_partition_prune`. This made the switch
ineffective for turning the new feature off.

Gate both FE-side hooks on the session variable. When the switch is
off, the BE receives no partition_boundaries and no
plan_id_to_target_monotonicity, so the BE-side pruner stays empty and
all pruning code paths short-circuit naturally. No BE change required.

### Release note

None

### Check List (For Author)

- Test: No need to test (gating-only change; behavior covered by
  existing rf_partition_pruning regression suite which already toggles
  the variable)
- Behavior changed: Yes — `enable_runtime_filter_partition_prune=false`
  now actually disables the new internal-table feature.
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… RANGE and nullable LIST partitions

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

1. Multi-column RANGE: Doris RANGE partition membership is lexicographic
   PartitionKey comparison across all partition columns, not an
   independent per-column range. Projecting a multi-column range like
   [(1,1),(1,5)) onto its first column yields the empty interval [1,1),
   which the BE pruner would then use to wrongly prune the partition
   even though k1=1 rows still match. Until we can serialize a provably
   correct projection, only emit RANGE boundaries when there is exactly
   one partition column.

2. Nullable LIST partitions: addListBoundaries() emitted NULL_LITERAL
   into list_values, but the BE parser builds a typed CppType from each
   literal via VLiteral::get_column_ptr()->get_data_at(0). For a NULL
   literal that returns {nullptr, 0}, so the non-string branch
   dereferences a null pointer and the string branch silently collapses
   NULL into the empty string. Skip the whole column boundary for any
   partition whose values for that column include NULL — leaving the
   partition unpruned is always safe; emitting an incorrect boundary is
   not.

### Release note

None

### Check List (For Author)

- Test: No need to test (correctness fix; existing rf_partition_pruning
  regression suite covers single-column RANGE and non-null LIST cases
  that remain enabled)
- Behavior changed: Yes — multi-column RANGE and nullable LIST
  partitions are no longer pruned by RF (previously incorrect).
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e NULL in LIST runtime filter partition pruning

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Two correctness issues in the runtime filter partition pruning path:

1. Multi-column RANGE partitions previously projected the partition's
   first column as the half-open interval [L1, U1), which is unsafe.
   For example, the partition [(1,1), (1,5)) only contains rows whose
   first key equals 1, so projecting to [1, 1) would mark the range
   as empty and incorrectly prune the partition. The mathematically
   safe projection is the closed interval [L1, U1]. A new optional
   thrift field 'range_end_inclusive' on TPartitionBoundary is
   introduced so the FE can tell BE to treat range_end as inclusive
   for multi-column RANGE; single-column RANGE keeps the original
   half-open semantics.

2. LIST partitions can legally contain NULL. The previous code sent a
   string literal for NULL, which caused the BE parser to read from a
   null pointer (UB / crash) and silently treat it as empty string.
   The FE now emits a real NULL_LITERAL TExprNode, and the BE parser
   recognises it, marks the column-value-range as 'contain_null', and
   the partition pruner conservatively skips any partition whose
   boundary contains NULL until the pruner is taught to reason about
   null_aware runtime filters. This is always safe (never prunes a
   partition that could match).

### Release note

None

### Check List (For Author)

- Test:
    - Manual test (BE single-file compile verified; full regression
      pending)
- Behavior changed: No (gated by enable_runtime_filter_partition_prune
  which defaults to true but the new logic is strictly more correct
  than before)
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tition pruner

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Previously the BE pruner conservatively skipped any partition whose
boundary contained NULL (set via FE-emitted NULL_LITERAL on LIST keys).
That was correct but pessimistic: a NULL row in a partition can match
the runtime filter only when the RF itself accepts NULL, which is
already encoded as HybridSet::contain_null() (FilterBase folds in the
filter's null_aware bit).

This change uses that signal to enable pruning in two more cases:

1. "Only-NULL" LIST partitions (e.g. PARTITION pNull VALUES IN
   ((NULL))): tracked by a new ParsedBoundary::only_null flag. Pruned
   when the RF does not accept NULL (typical EQ / MinMax / IN-without-
   null), kept otherwise.

2. Mixed LIST partitions (NULL plus concrete values): if the RF does
   not accept NULL, the NULL rows can never match, so we ignore the
   contain_null bit and let the regular value intersection decide
   prunability. Still kept if the RF accepts NULL.

For BINARY_PRED runtime filters (=, <, >, <=, >=) NULL never compares
true, so they are treated as not accepting NULL.

### Release note

None

### Check List (For Author)

- Test:
    - Manual test (BE single-file compile verified)
- Behavior changed: No (gated by enable_runtime_filter_partition_prune;
  strictly more partitions can be pruned, never fewer, vs. the previous
  conservative behaviour)
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

/review

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

run buildall

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is much closer, but one blocking correctness issue remains on the current head.

Blocking issue:

  1. be/src/exec/runtime_filter/runtime_filter_partition_pruner.cpp: mixed NULL+value LIST partitions are encoded as NULL-only. parse_boundaries() first intersects the concrete list values into cvr, but ColumnValueRange::set_contain_null(true) then calls set_empty_value_range() and drops those fixed values. For a partition like {NULL, 5}, the BE stores only the NULL marker, so a later RF on 5 can incorrectly prune the partition.

Critical checkpoint conclusions:

  • Goal: add FE->BE runtime-filter partition pruning for OLAP scans. The current head now wires the main flow correctly and the earlier config-gating / multi-column RANGE issues look addressed, but mixed-null LIST pruning is still incorrect.
  • Scope/focus: still reasonably focused; the remaining bug is localized to BE LIST-boundary parsing.
  • Concurrency: reviewed the pruner integration again; _conjuncts_lock serializes pruning, _prune_mutex protects shared state, and I did not find a new lock-order or lifecycle issue.
  • Lifecycle/static initialization: no new lifecycle or static-init problem found.
  • Configuration: enable_runtime_filter_partition_prune is now gated correctly on the FE side.
  • Compatibility: the new thrift fields remain optional; no new rolling-upgrade issue stood out.
  • Parallel code paths: the OLAP path now matches the session-variable behavior of the other scan paths.
  • Special conditions: the mixed-null LIST case needs a dedicated representation because set_contain_null(true) is only safe for NULL-only predicates, not NULL plus fixed values.
  • Tests: the regression suite is broad for single-column happy paths, but it still lacks a LIST partition case containing both NULL and concrete values; that missing negative case is what let this bug through.
  • Test results: the new .out file looks consistent with the cases the suite actually covers.
  • Observability: the added counters are sufficient for diagnosis.
  • Transaction/persistence/data-write checkpoints: not applicable here.
  • FE-BE variable passing: no additional wiring gap found on the current head.
  • Performance: no separate blocking performance issue found on the current head.
  • Other issues: none beyond the blocker above.

Please fix the mixed NULL LIST handling and add coverage for a LIST partition that contains both NULL and concrete values.

cvr.intersection(empty_cvr); \
} \
if (list_has_null && is_nullable) { \
cvr.set_contain_null(true); \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a LIST boundary contains both NULL and concrete values, cvr.intersection(empty_cvr) first materializes the concrete fixed-value set, but set_contain_null(true) immediately calls set_empty_value_range() and discards those fixed values. That turns a partition like {NULL, 5} into a NULL-only boundary, so _try_prune_by_single_rf() can later prune it even when the runtime filter contains 5. Please preserve both the concrete values and the NULL marker here instead of resetting the CVR.

BiteTheDDDDt and others added 2 commits April 20, 2026 11:41
…ns in RF pruner

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

When a LIST boundary contained both NULL and concrete values (e.g.
PARTITION p VALUES IN ((NULL),(5))), the parser called
cvr.intersection(empty_cvr) to materialize the concrete fixed-value
set and then cvr.set_contain_null(true). The latter unconditionally
calls set_empty_value_range() (see ColumnValueRange::set_contain_null),
which discards the fixed values that were just inserted. The boundary
collapsed to a NULL-only CVR, so a non-null-aware RF that does NOT
contain 5 would still be considered "empty intersection" and the
partition would be wrongly pruned even though row 5 actually matches
the RF.

Track NULL membership separately on ParsedBoundary (`contains_null`)
and let the CVR carry only the concrete fixed values; the pruner
already consults the new flag instead of cvr.contain_null().

### Release note

None

### Check List (For Author)

- Test:
    - Manual test (BE single-file compile verified)
- Behavior changed: No (RF partition pruning is gated by
  enable_runtime_filter_partition_prune; this fixes a correctness bug
  introduced in the previous commit on the same path)
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… more types in RF partition pruning suite

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Extends regression-test/suites/query_p0/runtime_filter/rf_partition_pruning.groovy
with additional cases that exercise the recently-added correctness fixes
in the runtime filter partition pruner:

- NULL handling on LIST partitions
  - only-NULL partition pruned when RF does not accept NULL
  - mixed (NULL + concrete value) partition: concrete value must
    survive (regression test for the set_contain_null destructive
    overwrite bug)
  - non-matching RF on mixed partition still prunes (NULL row alone
    does not save the partition)
  - null-safe <=> join on mixed partition (smoke test for null_aware
    RF semantics)
- Multi-column RANGE projection
  - partition [(1,1),(1,5)) keeps rows when RF probes value 1, validating
    the closed [L1, U1] projection of the first partition column
- Type coverage for RANGE partitions: TINYINT, SMALLINT, LARGEINT,
  DECIMAL(10,2), DATE, DATETIME, VARCHAR
- Switch coverage: enable_runtime_filter_partition_prune=false should
  disable the new path completely (no boundaries reported, no pruning)

### Release note

None

### Check List (For Author)

- Test:
    - Regression test (added; .out to be regenerated by running the
      suite)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

run buildall

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking issue:

  1. be/src/exec/runtime_filter/runtime_filter_partition_pruner.cpp: NULL-aware runtime filters are still treated as !rf_contains_null unless they are direct-IN filters. get_set_func() is null for bloom filters and for null-aware min/max predicates, so pb.only_null partitions are pruned even though <=> semantics should keep NULL probe rows. This can return wrong results for LIST partitions like VALUES IN ((NULL)) when the runtime filter is realized as BLOOM or NULL_AWARE_BINARY_PRED.

Critical checkpoint conclusions:

  • Goal of task: add BE-side runtime-filter partition pruning for OLAP scans. The main path is wired and the regression suite covers many single-column/non-null cases, but the current head is still incorrect for NULL-only LIST partitions under null-aware non-IN runtime filters, so the goal is not fully met.
  • Modification size/focus: The PR stays reasonably focused on planner metadata, BE pruning integration, and regression coverage.
  • Concurrency: I reviewed the new state flow and lock usage. _conjuncts_lock serializes pruning updates and _prune_mutex protects shared pruned-partition state; I did not find a blocking deadlock or lock-order issue.
  • Lifecycle/static initialization: No special lifecycle or static-initialization issue found.
  • Configuration items: The FE-side enable_runtime_filter_partition_prune gating now looks correct on the current head.
  • Compatibility/incompatible changes: The new thrift fields are optional, so wire compatibility looks acceptable.
  • Functionally parallel code paths: The OLAP path is wired, and I did not find another remaining FE/BE path mismatch on the current head.
  • Special conditional checks: The NULL-handling shortcut in the pruner is still incorrect for null-aware non-IN filters.
  • Test coverage: The suite is broad for single-column and non-null cases, but it does not force the null-safe <=> case down a real BLOOM or null-aware MIN/MAX path, so the remaining bug is uncovered.
  • Test result changes: The new .out file looks consistent with the queries the suite actually runs.
  • Observability: The added partition-pruning counters are useful and sufficient.
  • Transaction/persistence-related modifications: Not applicable.
  • Data writes/modifications: Not applicable.
  • FE-BE variable passing: partition_boundaries and null_aware are passed through, but BE does not yet interpret NULL-matching semantics correctly for all RF implementations.
  • Performance: No separate blocking performance issue found on the current head.
  • Other issues: None beyond the blocker above.

// contain_null() (see FilterBase::contain_null = _null_aware && _contain_null).
// For BINARY_PRED predicates (=, <, >, <=, >=) NULL never compares true,
// so they never match a NULL probe row -- treat as not containing NULL.
auto hybrid_set_for_null = impl->get_set_func();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rf_contains_null is only derived from get_set_func(), which means it works for direct-IN filters but stays false for the other null-aware RF implementations. In particular, BLOOM_PRED carries its NULL bit on get_bloom_filter_func()->contain_null(), and null-aware min/max filters arrive as NULL_AWARE_BINARY_PRED rather than BINARY_PRED. With the current logic, a LIST partition like VALUES IN ((NULL)) is pruned by the pb.only_null branch for a null-safe <=> join as soon as the RF is realized as bloom or null-aware min/max, even though NULL probe rows should match and the partition must be kept.

…artition pruner

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

The previous version determined whether a runtime filter accepts a NULL
probe row only via `impl->get_set_func()->contain_null()`. That covers
the IN flavour, but real RFs ship in two more shapes:

  * BLOOM_PRED:   NULL bit lives on get_bloom_filter_func()->contain_null()
  * NULL_AWARE_BINARY_PRED: a Min/Max RF built for null-safe equal joins;
    the NULL semantic is encoded in the node type itself (see
    runtime_filter/utils.cpp::create_vbin_predicate).

For these shapes, `rf_contains_null` was always false, so a LIST
partition like `VALUES IN ((NULL))` would be incorrectly pruned by the
`pb.only_null` branch even when a null-safe `<=>` join with a NULL on
the build side should keep the partition. Detect both shapes so the
NULL-aware semantics are honoured.

Also extend the value-pruning branch to accept NULL_AWARE_BINARY_PRED
so we still benefit from concrete-bound min/max pruning on partitions
that do not contain NULL.

### Release note

None

### Check List (For Author)

- Test:
    - Manual test (BE single-file compile verified). Existing regression
      suite covers null-safe join smoke path; pruning counts for the
      bloom path will surface in profile-based assertions when run.
- Behavior changed: No (gated by enable_runtime_filter_partition_prune;
  the change is strictly more conservative against false pruning)
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

The suite asserts on FE query-profile counters
(TotalPartitionsForRFPruning / PartitionsPrunedByRuntimeFilter). Even
though each query embeds a unique UUID token used to locate its own
profile, the FE profile list is bounded (max_query_profile_num) and
shared across sessions. Heavy parallel regression traffic can evict
our profile before the poller (15 attempts, 500ms apart) finds it,
causing flaky failures with total=0/pruned=0. Move the suite into the
nonConcurrent group so profile-counter assertions are deterministic.

### Release note

None

### Check List (For Author)

- Test:
    - No need to test (regression metadata only)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

/review

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

run buildall

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings:

  1. regression-test/data/query_p0/runtime_filter/rf_partition_pruning.out is missing the five new golden sections added by the suite, so the new regression case does not pass as submitted.
  2. be/src/exec/runtime_filter/runtime_filter_partition_pruner.cpp does not handle BLOOM_PRED, so internal-table partition pruning becomes a no-op once an IN_OR_BLOOM_FILTER consumer materializes a real Bloom filter, or when runtime_filter_type='BLOOM_FILTER' is used directly.
  3. regression-test/suites/query_p0/runtime_filter/rf_partition_pruning.groovy still treats profile not found within 7.5s as zero counters, which leaves the suite timing-sensitive on slow FE/CI runs.

Critical checkpoint conclusions:

  • Goal of the task: partially met. The PR adds OLAP-side runtime-filter partition pruning and valuable NULL/multi-column RANGE fixes, but the regression artifact is incomplete and one RF shape (Bloom) is still unhandled.
  • Scope/focus: mostly focused on planner/scan/pruner/test changes; no blocking scope creep beyond the new optional thrift fields.
  • Concurrency: reviewed. Late-arrival RF updates are serialized by _conjuncts_lock, and the shared-mutex use around pruned-partition lookups is acceptable; no blocking race found.
  • Lifecycle/static init: reviewed. Scanner/pruner lifetimes and literal-column ownership look intentional; no blocking lifecycle issue found.
  • Config changes: no new config was added. Existing enable_runtime_filter_partition_prune gating is correctly honored on the FE side.
  • Compatibility: optional thrift additions look backward-compatible; no blocking rolling-upgrade issue found.
  • Parallel/related code paths: not fully covered. The external/file-scan partition-pruning path can evaluate actual RF conjuncts, but the new OLAP pruner still skips the Bloom RF shape.
  • Special condition checks: the new NULL and multi-column RANGE guards are well-motivated and conservative.
  • Test coverage: insufficient as submitted. The new ordered-query cases are not all reflected in the golden file, and the profile polling helper remains timing-sensitive.
  • Modified test results: incorrect/incomplete because the .out file is missing new sections.
  • Observability: the new partition counters are useful; no blocking observability issue beyond the flaky profile lookup in tests.
  • Transaction/persistence/data-write impacts: not applicable here; no blocking issue found.
  • Performance: Bloom RF partition pruning currently provides no benefit in the OLAP path, which leaves a real runtime-filter shape behind.

Requesting changes until the missing regression outputs are added and the remaining unsupported/flaky paths are addressed.

using CvrType = std::decay_t<decltype(boundary_cvr)>;
using CppType = typename CvrType::CppType;

auto hybrid_set = impl->get_set_func();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VBloomPredicate falls through this pruning logic today. In the real Bloom case (runtime_filter_type='BLOOM_FILTER', or IN_OR_BLOOM_FILTER after the wrapper switches past runtime_filter_max_in_num), impl->node_type()==BLOOM_PRED, get_set_func()==nullptr, and we never enter the MinMax branch either. The result is that the new OLAP partition-pruning path becomes a no-op for Bloom RFs, so LIST-partition scans lose pruning entirely once the filter stops being an IN set.

// ---- Profile utilities ----
def profileAction = new ProfileAction(context)

def getProfileByToken = { String token ->
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper still turns profile was not visible yet into total=0/pruned=0: if the profile list does not contain the token within 15 * 500ms, getProfileByToken() returns "" and extractCounterSum() quietly reads that as zero. On a slow or loaded FE that makes the suite fail even though pruning worked. Please fail explicitly when the profile is still missing, or keep polling until it appears / the query is known to be gone.

-- !span_minmax --
13 250 m
3 50 c

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suite now defines five more ordered result checks, but this golden file still stops at span_minmax. The missing sections are list_only_null_pruned, list_only_null_minmax, list_mixed_value_kept, list_mixed_nullsafe, and range_multi_first_col, so run-regression-test.sh -d query_p0 -s rf_partition_pruning will fail at the first new order_qt_* case.

@hello-stephen
Copy link
Copy Markdown
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.02% (1842/2361)
Line Coverage 64.60% (32919/50958)
Region Coverage 65.22% (16326/25034)
Branch Coverage 55.76% (8707/15616)

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 22.33% (23/103) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 1.42% (5/353) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.19% (20328/38216)
Line Coverage 36.74% (191335/520768)
Region Coverage 33.02% (148832/450786)
Branch Coverage 33.99% (65012/191278)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 81.59% (288/353) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.81% (27625/37429)
Line Coverage 57.53% (298704/519189)
Region Coverage 54.62% (248515/454969)
Branch Coverage 56.06% (107555/191868)

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 57.89% (77/133) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants