Fix condition for using parquet metadata cache #1631
arthurpassos wants to merge 3 commits into antalya-26.1
test |
AI audit note: This review comment was generated by AI (gpt-5.3-codex).
Audit update for PR #1631 (Fix condition for using parquet metadata cache):
Confirmed defects: No confirmed defects in reviewed scope.
Coverage summary:
PR #1631 CI Failure Verification — Fix condition for using parquet metadata cache
Code change under test
Single-line case-insensitivity fix in the gating condition for the Parquet metadata cache:
// src/Storages/ObjectStorage/StorageObjectStorageSource.cpp
- && (object_info->getFileFormat().value_or(configuration->getFormat()) == "Parquet")
+ && (Poco::toLower(object_info->getFileFormat().value_or(configuration->getFormat())) == "parquet")
Semantics:
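The effect of the one-line change can be illustrated with a minimal standalone sketch. It is not the ClickHouse code: `toLowerAscii` is a hypothetical stand-in for `Poco::toLower`, the two helper functions are invented for illustration, and the upper-case value `"PARQUET"` is an assumed example of what an Iceberg manifest might supply.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>

// Hypothetical stand-in for Poco::toLower: returns an ASCII-lowercased copy.
std::string toLowerAscii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Shape of the old check: exact, case-sensitive match against "Parquet".
// file_format models object_info->getFileFormat(); configured_format models
// configuration->getFormat(). Both parameter names are illustrative.
bool isParquetCaseSensitive(const std::optional<std::string> & file_format,
                            const std::string & configured_format)
{
    return file_format.value_or(configured_format) == "Parquet";
}

// Shape of the new check: lowercase the resolved format before comparing.
bool isParquetCaseInsensitive(const std::optional<std::string> & file_format,
                              const std::string & configured_format)
{
    return toLowerAscii(file_format.value_or(configured_format)) == "parquet";
}
```

Under these assumptions, `isParquetCaseSensitive(std::string("PARQUET"), "Parquet")` is false while `isParquetCaseInsensitive(std::string("PARQUET"), "Parquet")` is true. When no per-object format is present, both checks fall back to the configured format and agree.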
Failure-by-failure verdict
Evidence and reasoning per item
1–2) Iceberg
Update assertions to reflect changes in metadata caching behavior between versions 25.8 and 26.1.
PR #1631 CI Triage — Stateless & Integration Failures
Verdict
None of the stateless or integration failures are caused by the PR. All are pre-existing upstream flaky tests, cluster-startup infrastructure cascades, or a fuzzer random-query crash unrelated to the modified code path.
PR Scope (very narrow)
Two files,
The modified code path is gated by
Stateless Tests
Evidence:
Integration Tests
All cross-referenced against upstream ClickHouse CI (
Upstream flake history query
SELECT
test_name,
count() AS fails,
min(check_start_time) AS first_fail,
max(check_start_time) AS last_fail,
uniq(pull_request_number) AS unique_prs
FROM default.checks
WHERE test_name IN (
'03377_object_storage_list_objects_cache',
'test_storage_kafka/test_batch_slow_0.py::test_kafka_formats_with_broken_message[generate_old_create_table_query]',
'test_replicated_database/test.py::test_sync_replica',
'test_arrowflight_interface/test.py::test_doput_cmd_insert_invalid_format'
)
AND test_status = 'FAIL'
AND check_start_time > now() - INTERVAL 90 DAY
GROUP BY test_name
ORDER BY fails DESC
Result:
Other Job Failures (not stateless/integration, for completeness)
Summary Table
The PR's stateless and integration test signal is clean modulo pre-existing flakiness and infra. Safe to merge from a CI-correctness standpoint; optionally rerun the affected jobs to clear the red UI.
Apache Iceberg queries were not hitting the Parquet metadata cache because object_info->getFileFormat() resolves to IcebergDataObjectInfo::getFileFormat, which gets its return value from IcebergObjectSerializableInfo. This field is filled with the value from the Apache Iceberg manifest file, which is upper case by default, so it fails the ClickHouse check for Parquet metadata cache usage.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix Apache Iceberg queries not hitting the Parquet metadata cache
Documentation entry for user-facing changes
...
CI/CD Options
Exclude tests:
Regression jobs to run: