Skip to content

Add polymorphic window functions#18169

Open
yashmayya wants to merge 1 commit intoapache:masterfrom
yashmayya:polymorphic-window-functions
Open

Add polymorphic window functions#18169
yashmayya wants to merge 1 commit intoapache:masterfrom
yashmayya:polymorphic-window-functions

Conversation

@yashmayya
Copy link
Copy Markdown
Contributor

  • Resolve TODO in WindowValueAggregatorFactory by adding type-specific window value aggregator implementations
  • Add SumLongWindowValueAggregator for INT/LONG to avoid precision loss when summing large long values (> 2^53)
  • Add SumBigDecimalWindowValueAggregator for BIG_DECIMAL to preserve full decimal precision
  • Add primitive MinIntWindowValueAggregator / MaxIntWindowValueAggregator using fastutil IntArrayFIFOQueue
  • Add primitive MinLongWindowValueAggregator / MaxLongWindowValueAggregator using fastutil LongArrayFIFOQueue
  • Add MinComparableWindowValueAggregator / MaxComparableWindowValueAggregator as fallback for types like BIG_DECIMAL
  • Factory now dispatches based on columnDataType.getStoredType() instead of always using double-based aggregators
  • Add 56 unit tests covering factory dispatch, all new aggregators, null handling, removal support, and precision preservation

@yashmayya yashmayya requested a review from Copilot April 11, 2026 05:06
@yashmayya yashmayya added enhancement Improvement to existing functionality multi-stage Related to the multi-stage query engine window-functions Related to SQL window functions on the multi-stage query engine labels Apr 11, 2026
@yashmayya yashmayya force-pushed the polymorphic-window-functions branch from e81fbd0 to de1f0c3 Compare April 11, 2026 05:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds type-specific window value aggregators (and corresponding factory dispatch) so window aggregations preserve precision for integral and BIG_DECIMAL types, and reduces boxing for MIN/MAX on primitive types in pinot-query-runtime.

Changes:

  • Update WindowValueAggregatorFactory to dispatch SUM/MIN/MAX aggregators based on columnDataType.getStoredType().
  • Add new polymorphic implementations: SumLong*, SumBigDecimal*, primitive Min/Max(Int|Long)*, and Min/MaxComparable* fallbacks; rename existing double-based implementations for clarity.
  • Add a comprehensive WindowValueAggregatorTest suite covering factory dispatch, null/removal behavior, and precision.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/WindowValueAggregatorFactory.java Dispatch to type-specific SUM/MIN/MAX implementations using stored type.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/SumLongWindowValueAggregator.java New long-accumulating SUM to avoid double precision loss for large integral values.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/SumDoubleWindowValueAggregator.java Rename/refocus existing SUM implementation as double-based.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/SumBigDecimalWindowValueAggregator.java New BigDecimal SUM implementation intended to preserve decimal precision.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MinIntWindowValueAggregator.java New primitive INT MIN with fastutil deque for sliding windows.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MaxIntWindowValueAggregator.java New primitive INT MAX with fastutil deque for sliding windows.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MinLongWindowValueAggregator.java New primitive LONG MIN with fastutil deque (precision-safe vs double).
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MaxLongWindowValueAggregator.java New primitive LONG MAX with fastutil deque (precision-safe vs double).
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MinDoubleWindowValueAggregator.java Rename existing MIN implementation as double-based.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MaxDoubleWindowValueAggregator.java Rename existing MAX implementation as double-based.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MinComparableWindowValueAggregator.java New Comparable-based MIN fallback to preserve non-double types.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/MaxComparableWindowValueAggregator.java New Comparable-based MAX fallback to preserve non-double types.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/BoolAndWindowValueAggregator.java Rename class to match file name and window-aggregator naming.
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/aggregate/BoolOrWindowValueAggregator.java Rename class to match file name and window-aggregator naming.
pinot-query-runtime/src/test/java/org/apache/pinot/query/runtime/operator/window/aggregate/WindowValueAggregatorTest.java New unit tests covering factory dispatch and aggregator behaviors/precision.

if (value instanceof BigDecimal) {
return (BigDecimal) value;
}
return BigDecimal.valueOf(((Number) value).doubleValue());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, right?

Comment on lines +470 to +475
@Test(expectedExceptions = UnsupportedOperationException.class)
public void testComparableMinRemovalUnsupported() {
WindowValueAggregator<Object> agg = new MinComparableWindowValueAggregator(false);
agg.addValue(1);
agg.removeValue(1);
}
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 84.78261% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.16%. Comparing base (dcbe2ae) to head (de1f0c3).
⚠️ Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
...window/aggregate/MaxLongWindowValueAggregator.java 65.51% 5 Missing and 5 partials ⚠️
...window/aggregate/MinLongWindowValueAggregator.java 68.96% 5 Missing and 4 partials ⚠️
.../window/aggregate/MaxIntWindowValueAggregator.java 79.31% 2 Missing and 4 partials ⚠️
.../window/aggregate/MinIntWindowValueAggregator.java 86.20% 1 Missing and 3 partials ⚠️
.../aggregate/MaxComparableWindowValueAggregator.java 92.30% 0 Missing and 2 partials ⚠️
.../aggregate/MinComparableWindowValueAggregator.java 92.30% 0 Missing and 2 partials ⚠️
.../aggregate/SumBigDecimalWindowValueAggregator.java 95.00% 0 Missing and 1 partial ⚠️
...window/aggregate/SumLongWindowValueAggregator.java 94.11% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18169      +/-   ##
============================================
+ Coverage     63.03%   63.16%   +0.13%     
+ Complexity     1617     1616       -1     
============================================
  Files          3202     3221      +19     
  Lines        194717   195950    +1233     
  Branches      30046    30275     +229     
============================================
+ Hits         122739   123775    +1036     
- Misses        62250    62286      +36     
- Partials       9728     9889     +161     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.11% <84.78%> (+0.10%) ⬆️
java-21 63.14% <84.78%> (+0.14%) ⬆️
temurin 63.16% <84.78%> (+0.13%) ⬆️
unittests 63.16% <84.78%> (+0.13%) ⬆️
unittests1 55.41% <84.78%> (-0.15%) ⬇️
unittests2 34.73% <0.00%> (+1.29%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yashmayya yashmayya marked this pull request as ready for review April 12, 2026 03:15

@Override
public void addValue(@Nullable Object value) {
if (value != null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not introduced in this PR, but how do we handle null first/last?

Comment on lines +53 to +55
while (!_deque.isEmpty() && _deque.lastLong() > longValue) {
_deque.dequeueLastLong();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for me to understand: It looks like in _deque we have N copies of the min value. It should be more efficient to just keep the min and the number of repetitions we had, right? I guess I'm missing something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Improvement to existing functionality multi-stage Related to the multi-stage query engine window-functions Related to SQL window functions on the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants