
[Draft] Add compression ratio calculation and per-column compression stats (#18184)#18185

Draft
johnsolomonj wants to merge 16 commits into apache:master from johnsolomonj:feature/compression-stats-tracking

Conversation

johnsolomonj (Contributor) commented Apr 13, 2026

Labels: feature, release-notes, observability

Summary

Draft implementation for the PEP proposed in #18184. Kept as draft pending design review on the issue.

Adds compression ratio tracking and per-column compression stats to Pinot's existing table size and metadata APIs:

  • Track uncompressed forward index sizes at write time in all raw column writers (BaseChunkForwardIndexWriter subclasses, VarByteChunkForwardIndexWriterV4/V5/V6, CLPForwardIndexCreatorV2)
  • Persist uncompressed size and compression codec to metadata.properties per column
  • Expose compressionStats, columnCompressionStats, and storageBreakdown on both GET /tables/{table}/size and GET /tables/{table}/metadata
  • Add TABLE_COMPRESSION_RATIO_PERCENT and TABLE_TIERED_STORAGE_SIZE controller gauges with tier lifecycle management
  • Gated by table-level indexingConfig.compressionStatsEnabled flag (default: off, zero overhead when disabled)
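As an illustration of the new fields, a response from either endpoint might look roughly like the sketch below. Field names are taken from this PR's description and commit messages; the exact nesting and all values are hypothetical pending design review:

```json
{
  "compressionStats": {
    "rawForwardIndexSizePerReplicaInBytes": 1073741824,
    "compressedForwardIndexSizePerReplicaInBytes": 268435456,
    "compressionRatio": 4.0,
    "segmentsWithStats": 42,
    "totalSegments": 42,
    "isPartialCoverage": false
  },
  "columnCompressionStats": [
    {
      "column": "logLine",
      "uncompressedSizeInBytes": 536870912,
      "compressedSizeInBytes": 134217728,
      "compressionRatio": 4.0,
      "codec": "ZSTANDARD",
      "hasDictionary": false
    }
  ],
  "storageBreakdown": {
    "default": {"segmentCount": 40, "sizeInBytes": 268435456},
    "coldTier": {"segmentCount": 2, "sizeInBytes": 13421772}
  }
}
```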

Design document

See #18184 for the full PEP including motivation, prior art, API response structure, and known corner cases.

Key design decisions

  • Per-value tracking: Uncompressed size tracked at individual put*() callsites, capturing raw ingested data size without chunk headers or alignment padding
  • Shared codec resolution: ForwardIndexType.resolveCompressionType() handles CLP codec variants, used by both BaseSegmentCreator and ForwardIndexHandler
  • Dict columns out of scope: Dictionary-encoded columns report -1 for uncompressed size since writers only see dictionary IDs
  • Backward compatible: New metadata fields are additive; old segments gracefully return defaults
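The per-value tracking decision can be sketched as follows. This is a minimal illustration of the idea only; the class and method names are hypothetical, not Pinot's actual writer API:

```java
// Hypothetical sketch: accumulate raw value lengths at put*() call sites,
// so chunk headers and alignment padding are never counted.
public class UncompressedSizeTracker {
  private long _uncompressedBytes;

  // Var-byte path: called from putBytes(byte[] value) with the raw value.
  public void trackBytes(byte[] value) {
    _uncompressedBytes += value.length;
  }

  // Fixed-byte path: called from putInt/putLong/putFloat/putDouble with
  // the fixed width of the value type (e.g. 4 for int, 8 for long).
  public void trackFixedWidth(int widthInBytes) {
    _uncompressedBytes += widthInBytes;
  }

  public long getUncompressedSize() {
    return _uncompressedBytes;
  }
}
```

Tracking at the call site, rather than from `buffer.remaining()` at chunk-write time, is what keeps chunk-format overhead out of the number.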

Test plan

  • Unit tests for writer uncompressed size tracking (fixed-byte, var-byte V1-V3, V4/V5/V6)
  • Unit tests for CLP V2 sub-stream size aggregation
  • Unit tests for ForwardIndexType.resolveCompressionType() codec resolution
  • Unit tests for ForwardIndexHandler compression stats persistence on reload
  • Controller aggregation tests (dict sentinel preservation, negative ratio guards, partial coverage)
  • Integration test for end-to-end compression stats API response
  • Verify zero overhead when compressionStatsEnabled = false

…to table size API

This feature enables tracking and reporting of forward index compression
effectiveness across Pinot segments. When `compressionStatsEnabled` is set
in table config's indexing config, segment creation records uncompressed
forward index sizes and compression codec in metadata.properties. The
server-side table size endpoint now returns per-segment and per-column
raw/compressed forward index sizes. The controller aggregates these into
table-level compression ratio metrics (raw/compressed), with partial
coverage tracking for mixed-version clusters. Three new ControllerGauge
metrics (TABLE_COMPRESSION_RATIO_PERCENT, TABLE_RAW_FORWARD_INDEX_SIZE_PER_REPLICA,
TABLE_COMPRESSED_FORWARD_INDEX_SIZE_PER_REPLICA) are emitted for monitoring.
ForwardIndexHandler is updated to persist compression metadata during
segment reload operations (compression type change and dict-to-raw conversion).
…feature

- Add 6 new test files covering writer-level tracking, segment creation,
  corner cases, ForwardIndexHandler reload, and integration tests for both
  offline and realtime (Kafka) ingestion paths
- Merge redundant dual-loop in TableSizeReader into a single pass over
  server info, improving performance during table size aggregation
- Fix offline integration test teardown to properly wait for table data
  manager removal before stopping servers
- Wrap second table cleanup in offline test in finally block to prevent
  resource leaks on assertion failure
…tier breakdown, and stale metadata cleanup

- Wrap flat compression fields in nested CompressionStats DTO with @JsonInclude(NON_NULL)
- Add StorageBreakdown with per-tier segment count and size (always reported)
- Add per-column ColumnCompressionDetail with aggregated sizes, ratio, and codec (MIXED when codecs differ across segments)
- Gate compressionStats on tableConfig.indexingConfig.compressionStatsEnabled; suppress from JSON when OFF
- Fix isPartialCoverage: now correctly returns true when 0 segments have stats but non-missing segments exist
- Clear stale forwardIndex.compressionCodec and forwardIndex.uncompressedSizeBytes on raw-to-dict reload
- Support null values in SegmentMetadataUtils.updateMetadataProperties to clear properties
- Add TABLE_TIERED_STORAGE_SIZE gauge; emit tier metrics always; clear compression+tier gauges when flag OFF
- Add testRawToDictClearsCompressionStats, testCompressionStatsNullWhenFlagOff, per-column/tier assertions
- Update integration tests for nested compressionStats JSON structure
…API, and comprehensive tests

- Gate uncompressed size tracking in forward index writers via compressionStatsEnabled flag
  (ForwardIndexCreatorFactory, ForwardIndexHandler, all raw index creators)
- Add per-column compression stats aggregation to server TablesResource and
  ServerSegmentMetadataReader with MIXED codec detection
- Extend TableMetadataInfo DTO with columnCompressionStats field (NON_NULL suppression)
- Fix integration test schema name mismatch for disabled-stats table
- Add 7 new test classes: IndexingConfigCompressionFlagTest, SegmentGeneratorConfigPropagationTest,
  CLPForwardIndexCreatorV2StatsTest, ServerTableSizeReaderRawBytesTest,
  TableMetadataReaderCompressionTest, TableMetadataInfoCompressionTest,
  ForwardIndexHandlerCompressionStatsTest updates
…feature flag

- Replace Map<String, ColumnCompressionStatsInfo> with List<ColumnCompressionStatsInfo> array
  containing all required fields: column, uncompressedSizeInBytes, compressedSizeInBytes,
  compressionRatio, codec, hasDictionary, indexes
- Gate columnCompressionStats in both server endpoints (TablesResource metadata,
  TableSizeResource size) on compressionStatsEnabled feature flag
- Add controller-side suppression in PinotTableRestletResource for safety against
  old servers that may still emit stats when flag is off
- Fix forward index size accumulation: use getIndexSizeFor(StandardIndexes.forward())
  directly per segment instead of cumulative variable
- Sort columnCompressionStats array by column name for deterministic output
- Update all tests and DTOs for the new array schema
…, and metadata endpoint gaps

Resolve default compression codec (LZ4/PASS_THROUGH) in BaseSegmentCreator and
ForwardIndexHandler when table config leaves chunkCompressionType null. Include
dictionary-encoded columns in columnCompressionStats with hasDictionary=true.
Clear stale controller compression metrics when no segments report stats. Suppress
zeroed compressionStats for dict-only tables. Add compressionStats summary to the
metadata endpoint aggregated from per-column data. Add tests for all fixes.
…ession

- Move columnCompressionStats to top-level field on TableSubTypeSizeDetails
  instead of nesting inside CompressionStats inner class
- Remove unused ColumnCompressionDetail inner class; use shared
  ColumnCompressionStatsInfo DTO from pinot-common
- Fix metadata endpoint field names to match size endpoint:
  rawForwardIndexSizePerReplicaInBytes, compressedForwardIndexSizePerReplicaInBytes
- Suppress compressionStats and columnCompressionStats for dict-only tables
  and when feature flag is OFF
- Dictionary columns report -1 for uncompressedSizeInBytes to distinguish
  from zero-size raw columns on both size and metadata endpoints
- Use LinkedHashSet for index deduplication in per-column aggregation
…adata endpoint

- Create CompressionStatsSummary DTO in pinot-common with
  rawForwardIndexSizePerReplicaInBytes, compressedForwardIndexSizePerReplicaInBytes,
  and compressionRatio
- Create StorageBreakdownInfo DTO in pinot-common with per-tier count and size
- Add both as @JsonInclude(NON_NULL) fields on TableMetadataInfo with
  backward-compatible constructors
- Server computes compressionStats summary and storageBreakdown during segment
  iteration and includes them in TableMetadataInfo response
- Controller aggregates both fields across servers in ServerSegmentMetadataReader
  (divides by numReplica like other fields)
- Remove manual addCompressionStatsSummary() JSON manipulation from
  PinotTableRestletResource; controller now just strips fields when flag is OFF
- Fix use-after-release: tier accumulation moved inside try block before
  segments are released in the finally block
…racking accuracy

- Remove ObjectNode.remove() pattern from PinotTableRestletResource; pass
  compressionStatsEnabled flag through TableMetadataReader and
  ServerSegmentMetadataReader so DTOs are constructed with null at creation
  time when the flag is OFF (storageBreakdown always preserved)
- Add segmentsWithStats, totalSegments, isPartialCoverage to
  CompressionStatsSummary so metadata endpoint has identical JSON schema
  to the size endpoint; populate during server and controller aggregation
- Fix VarByteChunkForwardIndexWriterV4 uncompressed size tracking: track
  raw value byte lengths in putBytes() instead of buffer.remaining() in
  write() which included chunk-format header overhead
- Move segment stats counting inside the try block in TablesResource to
  avoid accessing segment metadata after segments are released
- Add test for compression stats suppression when flag is disabled
Old segments lacking uncompressed size metadata (pre-flag segments) were
contributing their compressed forward index size to the denominator while
adding nothing to the numerator, deflating the compression ratio. Now
only dictionary-encoded columns enter the compressed-only accumulation
path; raw columns on old segments without codec/uncompressed data are
skipped entirely from both numerator and denominator.
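The exclusion logic described above can be sketched roughly as below, assuming the ratio is expressed as a percentage of raw over compressed size; all names are illustrative, not the actual aggregation code:

```java
public class CompressionRatioAggregator {
  // Aggregates a table-level compression ratio. Entries without real
  // uncompressed stats (sentinel -1 or zero, e.g. pre-flag segments) are
  // skipped from BOTH totals, so they cannot deflate the ratio by
  // contributing to the denominator alone.
  public static double ratioPercent(long[] uncompressed, long[] compressed) {
    long rawTotal = 0;
    long compressedTotal = 0;
    for (int i = 0; i < uncompressed.length; i++) {
      if (uncompressed[i] <= 0 || compressed[i] <= 0) {
        continue;  // skip -1 sentinels and segments lacking stats
      }
      rawTotal += uncompressed[i];
      compressedTotal += compressed[i];
    }
    if (rawTotal <= 0 || compressedTotal <= 0) {
      return -1;  // no usable coverage
    }
    return 100.0 * rawTotal / compressedTotal;
  }
}
```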
- Track uncompressed size per value in putInt/putLong/putFloat/putDouble
  (FixedByteChunkForwardIndexWriter) and putBytes (VarByteChunkForwardIndexWriter)
  instead of using chunk buffer remaining bytes which overcounts due to
  chunk-internal offset tables
- Gate server size endpoint compression field collection on feature flag
  so no metadata access or accumulation occurs when flag is OFF
- Exclude old raw segments lacking a persisted compression codec from
  per-column stats to prevent sentinel values leaking into aggregation
- Guard Math.max in per-column accumulation against INDEX_NOT_FOUND (-1)
  sentinel values
- Preserve per-column compression stats for dict-only tables even when
  no segments have raw forward index data (segmentsWithStats == 0)
- Rename TABLE_COMPRESSION_RATIO_HUNDREDTHS gauge to
  TABLE_COMPRESSION_RATIO_PERCENT for consistency
- Hoist IndexService.getInstance() outside per-column inner loop
- Fix negative compression ratio for dict columns in metadata endpoint:
  require both uncompressed > 0 and compressed > 0 before computing
  ratio in ServerSegmentMetadataReader (per-column and summary level)
- Track emitted tier gauge keys per table in TableSizeReader so stale
  tier-suffixed gauges (tableName.tierKey) are removed when tiers
  disappear or table is deleted via SegmentStatusChecker cleanup
- Resolve CLP V2 actual compression type (ZSTANDARD) from the
  CompressionCodec before falling back to ForwardIndexConfig's
  chunkCompressionType, which maps all CLP variants to PASS_THROUGH
Extract resolveCompressionType() as a shared utility in ForwardIndexType
that correctly maps CLP codec variants to their actual compression types
(CLPV2/CLPV2_ZSTD → ZSTANDARD, CLPV2_LZ4 → LZ4, CLP → PASS_THROUGH).
This fixes ForwardIndexHandler using incorrect compression types during
codec changes and dict-to-raw conversions. BaseSegmentCreator now uses the
same shared method, handling nullable fieldType for schema evolution cases.

Also document CLPForwardIndexCreatorV2.getUncompressedSize() semantics:
returns pre-compression sub-stream byte total, not original UTF-8 length.
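A rough sketch of the codec mapping described above; the enum members mirror the CLP variants named in the commit message, and the real shared helper is `ForwardIndexType.resolveCompressionType()`:

```java
public class CodecResolution {
  enum CompressionCodec { PASS_THROUGH, LZ4, ZSTANDARD, SNAPPY, CLP, CLPV2, CLPV2_ZSTD, CLPV2_LZ4 }
  enum ChunkCompressionType { PASS_THROUGH, LZ4, ZSTANDARD, SNAPPY }

  // Maps CLP codec variants to the compression type actually applied to
  // their sub-streams, per the mapping in the commit message above.
  static ChunkCompressionType resolve(CompressionCodec codec) {
    switch (codec) {
      case CLPV2:
      case CLPV2_ZSTD:
        return ChunkCompressionType.ZSTANDARD;
      case CLPV2_LZ4:
        return ChunkCompressionType.LZ4;
      case CLP:
        return ChunkCompressionType.PASS_THROUGH;
      default:
        // Non-CLP codecs map to the chunk compression type of the same name.
        return ChunkCompressionType.valueOf(codec.name());
    }
  }
}
```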
…umns

Size endpoint flag-OFF path now uses the 5-arg SegmentSizeInfo constructor
to pass through tier information, fixing storageBreakdown flattening all
segments into the "default" tier when compression stats are disabled.

Metadata endpoint aggregation now skips columns from old raw segments that
have no persisted compression codec and no dictionary, preventing zero-filled
entries from appearing in per-column compression stats.
Dictionary columns report -1 as uncompressed forward index size. When
aggregating across replicas, skip accumulation for negative sentinels
(using >= 0 guard) and reconstruct -1 in the output for dict columns
that have no real uncompressed data.

Also guard compressionStatsSummary construction on segmentsWithStats > 0
after replica division, avoiding a degenerate summary when integer
division rounds the count to zero.
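The sentinel handling across replicas can be sketched as follows (illustrative names, not the actual aggregation code):

```java
public class ReplicaAggregation {
  // Sums per-replica uncompressed sizes for one column, skipping the -1
  // sentinel reported for dictionary columns (the >= 0 guard), and
  // reconstructing -1 in the output when no replica had real data.
  static long aggregateUncompressed(long[] perReplicaSizes) {
    long total = 0;
    boolean sawData = false;
    for (long size : perReplicaSizes) {
      if (size >= 0) {
        total += size;
        sawData = true;
      }
    }
    return sawData ? total : -1;
  }
}
```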
…stats

Tier gauge emission was passing the tier-suffixed key (e.g.
"myTable_OFFLINE.coldTier") to isLeaderForTable(), which expects the
base table name. Hoist the leader check to use the canonical
tableNameWithType before the tier loop, then emit gauges directly.

On the server metadata endpoint, move computeIfAbsent for per-column
compression accumulators inside the conditional branches so old raw
segments without a persisted codec no longer create zero-filled entries.
codecov-commenter commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 57.60135% with 251 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.34%. Comparing base (34d3fd6) to head (2563a62).
⚠️ Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
...che/pinot/server/api/resources/TablesResource.java 31.50% 42 Missing and 8 partials ⚠️
...t/controller/util/ServerSegmentMetadataReader.java 51.25% 27 Missing and 12 partials ⚠️
.../apache/pinot/controller/util/TableSizeReader.java 78.34% 18 Missing and 16 partials ⚠️
.../pinot/server/api/resources/TableSizeResource.java 8.33% 29 Missing and 4 partials ⚠️
...ent/creator/impl/fwd/CLPForwardIndexCreatorV2.java 0.00% 19 Missing ⚠️
...mon/restlet/resources/CompressionStatsSummary.java 0.00% 14 Missing ⚠️
...segment/spi/index/metadata/ColumnMetadataImpl.java 0.00% 14 Missing ⚠️
...oller/api/resources/PinotTableRestletResource.java 0.00% 8 Missing ⚠️
...ment/index/forward/ForwardIndexCreatorFactory.java 68.18% 4 Missing and 3 partials ⚠️
...local/segment/creator/impl/BaseSegmentCreator.java 79.16% 0 Missing and 5 partials ⚠️
... and 10 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18185      +/-   ##
============================================
+ Coverage     63.13%   63.34%   +0.20%     
- Complexity     1610     1627      +17     
============================================
  Files          3213     3230      +17     
  Lines        195730   197198    +1468     
  Branches      30240    30520     +280     
============================================
+ Hits         123583   124916    +1333     
+ Misses        62281    62255      -26     
- Partials       9866    10027     +161     
Flag Coverage Δ
custom-integration1 100.00% <ø> (?)
integration 100.00% <ø> (+100.00%) ⬆️
integration1 100.00% <ø> (?)
integration2 0.00% <ø> (ø)
java-11 63.29% <57.60%> (+0.18%) ⬆️
java-21 63.30% <57.60%> (+0.20%) ⬆️
temurin 63.34% <57.60%> (+0.20%) ⬆️
unittests 63.34% <57.60%> (+0.20%) ⬆️
unittests1 55.26% <40.50%> (-0.11%) ⬇️
unittests2 35.08% <57.43%> (+0.31%) ⬆️

Flags with carried forward coverage won't be shown.

