Fix condition for using parquet metadata cache#1631

Open
arthurpassos wants to merge 3 commits into antalya-26.1 from arthurpassos-patch-11

Conversation

@arthurpassos
Collaborator

Apache Iceberg queries were not hitting the parquet metadata cache because `object_info->getFileFormat()` resolves to `IcebergDataObjectInfo::getFileFormat`, which takes its return value from `IcebergObjectSerializableInfo`. That field is filled from the Apache Iceberg manifest file, where the format name is upper case by default ("PARQUET"), so it failed ClickHouse's case-sensitive check for parquet metadata cache usage.
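The failing comparison can be sketched in isolation. This is an illustrative sketch, not the actual createReader code: `toLowerCopy` is a stand-in for Poco::toLower, and the two guard functions are hypothetical names for the pre- and post-fix conditions.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Stand-in for Poco::toLower (illustrative only).
std::string toLowerCopy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Pre-fix guard: exact, case-sensitive match against "Parquet".
bool cacheGuardOld(const std::string & format) { return format == "Parquet"; }

// Post-fix guard: case-insensitive match, so Iceberg's "PARQUET" qualifies.
bool cacheGuardNew(const std::string & format) { return toLowerCopy(format) == "parquet"; }
```

With these, `cacheGuardOld("PARQUET")` is false (the bug), while `cacheGuardNew` accepts both the DDL spelling "Parquet" and the manifest spelling "PARQUET".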

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix Apache Iceberg queries not hitting the parquet metadata cache

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@github-actions

github-actions Bot commented Apr 9, 2026

Workflow [PR], commit [ac188dc]

@ianton-ru

test test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py::test_read_constant_columns_optimization is broken (#1629), ignore it.

ianton-ru previously approved these changes Apr 20, 2026

@ianton-ru left a comment


LGTM

@arthurpassos arthurpassos added the port-antalya PRs to be ported to all new Antalya releases label Apr 20, 2026
zvonand pushed a commit that referenced this pull request Apr 21, 2026
@Selfeer
Collaborator

Selfeer commented Apr 23, 2026

AI audit note: This review comment was generated by AI (gpt-5.3-codex).

Audit update for PR #1631 (Fix condition for using parquet metadata cache):

Confirmed defects:

No confirmed defects in reviewed scope.

Coverage summary:

Scope reviewed: `StorageObjectStorageSource::createReader` parquet-metadata-cache gating path (`src/Storages/ObjectStorage/StorageObjectStorageSource.cpp`) plus dependent format-resolution/case-insensitive lookup flow (`FormatFactory::getInputWithMetadata/getInput`, Iceberg object format source in `IcebergDataObjectInfo`).
Categories failed: none.
Categories passed: call-graph/transition review, branch coverage (cache-enabled and fallback path), error-contract consistency (same parser path for `Parquet`/`PARQUET`), C++ bug-class checks in scope (lifetime, invalidation, race/deadlock, exception-safety, integer/signedness, UB) with no confirmed defect.
Assumptions/limits: static audit only (no runtime fault-injection execution); analysis limited to this 1-line PR delta and directly affected call paths.

@Selfeer
Collaborator

Selfeer commented Apr 23, 2026

PR #1631 CI Failure Verification — Fix condition for using parquet metadata cache

| Field | Value |
| --- | --- |
| PR | #1631 |
| Title | Fix condition for using parquet metadata cache |
| Commit | c7fa2310ea7f062cada9f4eba488faaceb989c01 |
| Workflow run | 24205243072 |
| CI report | ci_run_report.html |
| Verified on | 2026-04-23 |

Code change under test

Single-line case-insensitivity fix in the gating condition for the Parquet metadata cache:

// src/Storages/ObjectStorage/StorageObjectStorageSource.cpp
- && (object_info->getFileFormat().value_or(configuration->getFormat()) == "Parquet")
+ && (Poco::toLower(object_info->getFileFormat().value_or(configuration->getFormat())) == "parquet")

Semantics:

  • For callers where the format string was already "Parquet" (e.g. format = Parquet in DDL/args) the condition is unchanged.
  • For callers where the format string is upper/mixed case (e.g. Iceberg's IcebergDataObjectInfo::getFileFormat() returning "PARQUET" from manifest files) the condition now evaluates to true whereas previously it was false. This makes Iceberg reads start hitting the parquet metadata cache.

Failure-by-failure verdict

| # | Failure | Job(s) | Verdict | Related to PR? |
| --- | --- | --- | --- | --- |
| 1 | test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py::test_read_constant_columns_optimization[False-s3] | Integration tests: amd_asan, db disk, old analyzer, 4/6; amd_binary, 4/5; arm_binary, distributed plan, 2/4 | PR-related; test reference needs update | Yes |
| 2 | test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py::test_read_constant_columns_optimization[True-s3] | Same 3 integration jobs | PR-related; test reference needs update | Yes |
| 3 | 04034_parquet_v3_metadata_cache_no_query_context | Stateless tests: amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel | Pre-existing timing flake; not caused by PR | No |
| 4 | /s3/minio/export tests/export part/concurrent alter/during minio interruption | Regression release s3_export_part | Pre-existing unrelated flaky | No |
| 5 | /swarms/feature/node failure/initiator out of disk space | Regression release swarms; Regression aarch64 swarms | Pre-existing known issue | No |
| 6 | could not bring up docker-compose cluster | Regression release parquet_aws_s3 | Infrastructure/transient | No |
| 7 | Grype high/critical CVE gate | GrypeScanKeeper, GrypeScanServer (-alpine) | Image-security gate | No |

Evidence and reasoning per item

1–2) Iceberg test_read_constant_columns_optimization[*-s3] — PR-related, requires test update

Failure signature (both False-s3 and True-s3):

assert 15 == ((7 * 3) + 1)     # expected 22, got 15  (diff = -7)
assert 18 == ((7 * 3) + 4)     # expected 25, got 18  (diff = -7)

The failing assertion is on an event counter (check_events at
test_read_constant_columns_optimization.py:190). The iceberg table has 7 data
files (per the test's own comment), and the asserted value is expected * 3
(3 queries) plus a cluster/non-cluster constant.

Why this is PR-related:

  • The test uses an Iceberg S3 table. Before the PR, Iceberg's getFileFormat()
    returned "PARQUET", which failed the == "Parquet" check — so the parquet
    metadata cache was bypassed, and per-file metadata events (the ones being
    counted) fired on every read.
  • After the PR, the case-insensitive check returns true for "PARQUET", so the
    metadata cache is hit on second/third queries. Exactly 7 fewer events
    per affected code path (one per file) is precisely what the test now sees.
  • Azure shards of the same test still pass ([False-azure], [True-azure]),
    because the PR does not affect Azure's object-info format string in the same
    way; that further localizes the effect to the parquet metadata cache path
    this PR targets.
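The off-by-7 arithmetic above can be checked directly. The constants below are taken from the failure signature and the test's own comment (7 data files, 3 queries); the names are illustrative, not the test's variables:

```cpp
#include <cassert>

// Constants from the failure signature and the test's comment.
constexpr int data_files = 7;      // files in the Iceberg table
constexpr int queries = 3;         // queries contributing to each counter
constexpr int plain_offset = 1;    // non-cluster constant in the assertion
constexpr int cluster_offset = 4;  // cluster constant in the assertion

// Pre-fix expectation: metadata events fire for every file on every query.
constexpr int expected_plain = data_files * queries + plain_offset;      // 22
constexpr int expected_cluster = data_files * queries + cluster_offset;  // 25

// Post-fix observation: the cache absorbs one full pass over the files,
// i.e. exactly data_files fewer events.
constexpr int observed_plain = expected_plain - data_files;              // 15
constexpr int observed_cluster = expected_cluster - data_files;          // 18
```

The computed 22/25 versus 15/18 values match the `assert 15 == ((7 * 3) + 1)` and `assert 18 == ((7 * 3) + 4)` failure signatures exactly, with diff = -7 in both cases.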

Historical data (play.clickhouse.com): test_read_constant_columns_optimization
has zero recorded failures across all of upstream CI history. Three builds on
three different architectures failing the exact same way at once is deterministic,
not flake.

Conclusion: The PR changes observable behavior (cache is now actually used for
Iceberg-on-S3 parquet reads — which is the intended fix). The test's expected
event count was calibrated for the bug (cache always bypassed). The test's
reference constants must be adjusted to reflect the fixed behavior. This is a
test update required by the fix, not a functional regression.

3) 04034_parquet_v3_metadata_cache_no_query_context — pre-existing timing flake

Failure signature:

Reason: result differs with reference:
--- .reference
+++ .stdout
-100	4950
+0	0

The test creates an S3Queue table with format = Parquet and a MV into a
destination table, then polls for up to 30 seconds for the destination to
reach 100 rows. It got 0.

Why this is not PR-related:

  • The test uses the literal string 'Parquet' in the engine argument. For that
    value, old_cond == "Parquet" is true and Poco::toLower(...) == "parquet"
    is also true. The PR is a no-op on this code path.
  • The failure is "no rows processed in 30s" — a classic S3Queue scheduling /
    startup timing issue, not a logic regression.
  • Historical data on this test in upstream CI DB:
    • Most recent pre-PR failure (PR 86892, amd_tsan, flaky check,
      2026-04-07) was Test runs too long (> 180s) — same slowness family as
      this failure.
    • Older fails cluster on specific unrelated PRs (96130, 99163, 99274,
      100433), each tied to issues in those PRs (Python ValueError in test
      runner, Code: 499), not to format-case handling.
  • From 2026-03-27 onward, the test has 0 failures out of many thousands of
    runs in main — stable overall, with an occasional timing flake.

Conclusion: Pre-existing timing flake, not caused by this PR.

4) s3_export_part "concurrent alter / during minio interruption"

  • Fault-injection scenario under MinIO network interruption; asserts that a
    specific destination part (5_9_9_0) is present in part_log_matches_destination.
  • PR #1631 does not touch the export-part flow, scheduler, or part-log
    matching logic. Failure path is entirely orthogonal to parquet metadata
    cache gating.
  • Corroborated as pre-existing by prior triage docs in this repo
    (pr-1448-report.md, pr-1455-verification.md, pr-1517-verification.md,
    pr-1580-verification.md).

Conclusion: Pre-existing flaky / unrelated.

5) swarms / node failure / initiator out of disk space

  • Fails with UNKNOWN_DATABASE; job also reports many unrelated xfails
    (FULL OUTER JOIN ... #89996).
  • Present in prior triage docs (pr-1493-verification.md,
    pr-1517-verification.md, pr-1579-verification.md,
    pr-1593-verification.md) for completely unrelated PRs.
  • Unrelated to parquet metadata cache gating.

Conclusion: Pre-existing known issue.

6) parquet_aws_s3 "could not bring up docker-compose cluster"

  • Environment-level failure that occurs before the test payload runs.
  • No causal path to StorageObjectStorageSource::createReader.

Conclusion: Infrastructure / transient.

7) Grype image CVE scan

  • High/critical vulnerability gate failures on base server/keeper images.
  • Purely image-level security scan, independent of code changes in this PR.

Conclusion: Infrastructure / security gate.

Validation against the AI audit note

The provided audit note states:

"No confirmed defects in reviewed scope."
(scope: StorageObjectStorageSource::createReader parquet-metadata-cache
gating path, format-resolution/case-insensitive lookup flow, and Iceberg
object format source.)

CI evidence supports the correctness of this note's static review:

  • The fix does what it intends (enables parquet metadata cache for Iceberg-on-S3
    reads).
  • No C++ bug-class defect is observed in CI (no crashes, no UB, no use-after-free,
    no lifetime issue, no deadlock). The failing tests do not report fatal
    messages or cascades.

However, the audit note does not note the downstream test-reference impact:

  • The Iceberg test_read_constant_columns_optimization[*-s3] test encodes
    reference event counts that were calibrated assuming the metadata cache was
    not hit for Iceberg. This PR intentionally flips that behavior, and the
    test now reports a deterministic off-by-7 mismatch across three build configs.

So the note's conclusion is correct as a defect review, but incomplete as a
behavior-change review: a test-reference update is required as part of this
PR.

Summary for the PR author

  • Failures caused by this PR: 2 tests (same test, two parameterizations)
    • test_read_constant_columns_optimization[False-s3]
    • test_read_constant_columns_optimization[True-s3]
    • Cause: intended behavior change (cache now hit for Iceberg). Test's
      reference event counts need to be updated.
  • Failures NOT caused by this PR: 5 items
    • 04034_parquet_v3_metadata_cache_no_query_context — pre-existing timing
      flake; the PR is a no-op on the code path this test exercises.
    • s3_export_part MinIO-interruption — pre-existing flaky, unrelated.
    • swarms / node failure / initiator out of disk space — pre-existing known.
    • parquet_aws_s3 docker-compose bring-up — infra/transient.
    • Grype CVE gate — infra/security gate.

Recommendations

  1. Update the reference counts in
    tests/integration/test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py
    (function check_events) to reflect the correct cached behavior. Expected
    drop is one event per file for the cached-path queries; specifically the
    asserted values should decrease from (7*3 + 1) / (7*3 + 4) to the
    observed 15 / 18 (or compute the new formula from first principles).
  2. Re-run the three failing integration shards after updating the reference.
  3. Retry the 04034_parquet_v3_metadata_cache_no_query_context stateless shard
    to confirm the timing flake clears; no code change needed for it in this PR.
  4. s3_export_part, swarms, parquet_aws_s3, and Grype failures do not
    block this PR. Track them under their existing issues.

Update assertions to reflect changes in metadata caching behavior between versions 25.8 and 26.1.
@Selfeer
Collaborator

Selfeer commented Apr 24, 2026

PR #1631 CI Triage — Stateless & Integration Failures

Verdict

None of the stateless or integration failures are caused by the PR. All are pre-existing upstream flaky tests, cluster-startup infrastructure cascades, or a fuzzer random-query crash unrelated to the modified code path.

PR Scope (very narrow)

Two files, +19/-6:

  1. src/Storages/ObjectStorage/StorageObjectStorageSource.cpp — one-line fix: lowercase the format string when checking the parquet-metadata-cache guard, so Iceberg's uppercase "PARQUET" (read from the Iceberg manifest) matches.

    // before
    && (object_info->getFileFormat().value_or(configuration->getFormat()) == "Parquet")
    // after
    && (Poco::toLower(object_info->getFileFormat().value_or(configuration->getFormat())) == "parquet")
  2. tests/integration/test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py — adjusts the per-file object-get counter (S3 now hits the cache, Azure doesn't because etag is empty) and adds a clarifying comment.

The modified code path is gated by use_parquet_metadata_cache && use_native_reader_v3 && format==Parquet && !etag.empty() inside StorageObjectStorageSource::createReader. Nothing outside this narrow Iceberg/parquet read path is touched.
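Taken together, the gate described above can be sketched as a single predicate. This is a paraphrase of the condition as the triage describes it, not the actual createReader code; parameter names are assumptions, and std::transform/std::tolower stand in for Poco::toLower:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Illustrative sketch of the parquet-metadata-cache gate described above.
// Names are paraphrased from the PR description, not copied from createReader.
bool shouldUseParquetMetadataCache(
    bool use_parquet_metadata_cache,  // cache setting enabled
    bool use_native_reader_v3,        // native parquet reader v3 in use
    std::string format,               // per-object format, falling back to the table format
    const std::string & etag)         // object ETag; part of the cache key
{
    // Case-insensitive format check (the PR's fix).
    std::transform(format.begin(), format.end(), format.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return use_parquet_metadata_cache
        && use_native_reader_v3
        && format == "parquet"
        && !etag.empty();             // empty ETag (e.g. Azure here) skips the cache
}
```

Under this sketch, Iceberg-on-S3 objects reporting "PARQUET" now pass the gate, while Azure objects with an empty ETag still skip the cache, matching the test adjustment described above.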

Stateless Tests

| Test | Job | Verdict |
| --- | --- | --- |
| 03377_object_storage_list_objects_cache | Stateless (amd_binary, ParallelReplicas, s3 storage, sequential) | Flaky, unrelated |

Evidence:

  • Failed on first run with ProfileEvents['ObjectStorageListObjectsCacheMisses'] > 0 returning 0 instead of 1 (event-counter race via system.query_log after SYSTEM FLUSH LOGS).
  • Passed on automatic retry: [ OK ] 1.80 sec (same commit, same job; see stateless_job.log:1865).
  • Tests ObjectStorageListObjectsCache (list-objects cache) — not the parquet metadata cache modified by the PR. The PR's guarded branch (use_parquet_metadata_cache + etag + format=="parquet") is never exercised by this test.

Integration Tests

All cross-referenced against upstream ClickHouse CI (play.clickhouse.com, default.checks, last 90 days).

| Test | Job | Upstream fails / PRs | Verdict |
| --- | --- | --- | --- |
| test_replicated_database/test.py::test_sync_replica | Integration (amd_asan, db disk, old analyzer, 2/6) | 135 fails / 72 PRs (Jan 26 → Apr 16, 2026) | Pre-existing upstream flake |
| test_arrowflight_interface/test.py::test_doput_cmd_insert_invalid_format | Integration (arm_binary, distributed plan, 3/4) | 53 fails / 34 PRs; also passed on retry in this run | Pre-existing upstream flake |
| test_storage_kafka/test_batch_slow_0.py::test_kafka_formats_with_broken_message[generate_old_create_table_query] | Integration (amd_asan, db disk, old analyzer, 1/6) | 8 fails / 6 PRs (Mar 17 → Apr 24, 2026); generate_new_create_table_query variant passed in the same run | Pre-existing upstream flake |
| test_s3_cluster/test.py::test_ambiguous_join, test_cluster_default_expression, ~23 other test_s3_cluster tests | Integration (amd_asan, db disk, old analyzer, 2/6) | All fail with identical Exception: Timed out while waiting for instance 'c2.s0_0_1' with ip 172.16.1.9 to start | Infrastructure cascade: one Docker cluster-startup timeout; every test in the file errors at fixture setup |

Upstream flake history query

SELECT
    test_name,
    count() AS fails,
    min(check_start_time) AS first_fail,
    max(check_start_time) AS last_fail,
    uniq(pull_request_number) AS unique_prs
FROM default.checks
WHERE test_name IN (
  '03377_object_storage_list_objects_cache',
  'test_storage_kafka/test_batch_slow_0.py::test_kafka_formats_with_broken_message[generate_old_create_table_query]',
  'test_replicated_database/test.py::test_sync_replica',
  'test_arrowflight_interface/test.py::test_doput_cmd_insert_invalid_format'
)
AND test_status = 'FAIL'
AND check_start_time > now() - INTERVAL 90 DAY
GROUP BY test_name
ORDER BY fails DESC

Result:

test_replicated_database/test.py::test_sync_replica                                                               135 fails   72 PRs   2026-01-26 → 2026-04-16
test_arrowflight_interface/test.py::test_doput_cmd_insert_invalid_format                                           53 fails   34 PRs   2026-01-27 → 2026-03-20
test_storage_kafka/test_batch_slow_0.py::test_kafka_formats_with_broken_message[generate_old_create_table_query]    8 fails    6 PRs   2026-03-17 → 2026-04-24

03377_object_storage_list_objects_cache returns 0 upstream hits — it is an Altinity-specific test, but it passed on retry in this run, confirming it's flaky.

Other Job Failures (not stateless/integration, for completeness)

  • AST fuzzer (arm_asan): Failures: 1/1, FAIL [Unknown error] — Lost connection to server.
    Crash in server.log:

    BaseDaemon: Received signal 6 (Aborted)
    IColumn::assertTypeEquality   (IColumn.h:837)
    FunctionIf::executeGeneric    (if.cpp:828, insertFrom)
    FunctionArrayRemove::executeImpl (arrayRemove.cpp:127)
    

    Fuzzer random mutation on arrayRemove + mixed-type arrays. Upstream has the same signature: Logical error: Filter column for arrayRemove was not evaluated as ColumnUInt8 (7 upstream fails) and Unknown error in AST fuzzer (20 fails across 12 PRs). Unrelated — nothing near StorageObjectStorageSource/parquet.

  • Integration tests (amd_asan, db disk, old analyzer, 4/6 and 5/6): Internal error in pytest or a plugin (exit status: 2). Pytest/runner-level error; no tests executed. Infrastructure.

Summary Table

| Category | Count | Details |
| --- | --- | --- |
| PR-caused regressions | 0 | |
| Pre-existing upstream flakes | 3 | test_sync_replica, test_doput_cmd_insert_invalid_format, test_kafka_formats_with_broken_message[generate_old_create_table_query] |
| Altinity-only flake (passed on retry) | 1 | 03377_object_storage_list_objects_cache |
| Infrastructure cascade (Docker startup) | ~25 | All test_s3_cluster::* in 2/6 |
| Infrastructure (pytest runner error) | 2 jobs | 4/6 and 5/6 |
| Unrelated fuzzer crash (known upstream pattern) | 1 | arrayRemove / FunctionIf::executeGeneric |

The PR's stateless and integration test signal is clean modulo pre-existing flakiness and infra. Safe to merge from a CI-correctness standpoint; optionally rerun the affected jobs to clear the red UI.

@Selfeer Selfeer added the verified Approved for release label Apr 24, 2026

Labels

antalya antalya-26.1 antalya-26.1.11.20001 bugfix port-antalya PRs to be ported to all new Antalya releases verified Approved for release

4 participants