Export partition to Apache Iceberg #1618
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b0e833565
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```cpp
{
    auto * object_storage = dynamic_cast<StorageObjectStorageCluster *>(dest_storage.get());

    auto * iceberg_metadata = dynamic_cast<IcebergMetadata *>(object_storage->getExternalMetadata(query_context));
```
Guard IcebergMetadata casts with USE_AVRO
IcebergMetadata is defined only under #if USE_AVRO in IcebergMetadata.h, but this new dynamic_cast<IcebergMetadata *> is compiled unconditionally. In non-AVRO builds (USE_AVRO=0), this translation unit (and the same pattern added in StorageReplicatedMergeTree.cpp) cannot compile, so the change breaks optional-AVRO build configurations.
| const String sidecar_path = replaceFileExtensionWithAvro( | ||
| filename_generator.convertMetadataPathToStoragePath(path)); |
Use storage paths directly when reading export sidecars
The export path list is populated from filename.path_in_storage, but commit now treats each entry as a metadata path and calls convertMetadataPathToStoragePath before reading sidecars. With write_full_path_in_iceberg_metadata=1, table_dir is an URI prefix (for example s3://...) while these entries are plain storage paths (/...), so the conversion throws and EXPORT PARTITION cannot commit.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6c6194814d
```cpp
context_copy->makeQueryContextForExportPart();
context_copy->setCurrentQueryId(manifest.query_id);
context_copy->setSetting("output_format_parallel_formatting", manifest.parallel_formatting);
context_copy->setSetting("output_format_parquet_parallel_encoding", manifest.parquet_parallel_encoding);
```
Got a failing test trying to use `output_format_parquet_compression_method` with ZSTD and SNAPPY. Looks like this setting isn't propagated here?
Yeah, this is the reason, but only for partition export; for part export that should be working.
I still need to find the best approach for persisting these settings.
mkmkme
left a comment
There's no way I could finish this review properly in time. From reading the code diagonally, it looks good. We discussed some of the AI findings on the call; they are probably worth addressing in future PRs related to this feature.
Approving.
- **Type**: `Bool`
- **Default**: `false`
- **Description**: Ignore existing partition export and overwrite the ZooKeeper entry. Allows re-exporting a partition to the same destination before the manifest expires. **IMPORTANT:** this is dangerous because it can lead to duplicated data; use it with caution.
TODO (arthur): check the key-value pair string, because it might be duplicated.
@arthurpassos For anyone upgrading CH from a version that didn't have export to one that does, existing/old tables' ZooKeeper trees are missing the `exports/` node. I have a test for this failing when it tries to run EXPORT PARTITION on a table whose ZooKeeper tree doesn't have the `exports/` folder yet: CH tries to create a file inside that folder, but ZooKeeper refuses because the parent folder doesn't exist, and the EXPORT command fails with a "no such node" error.
This is weird; I should have it covered. Let's discuss this internally.
@arthurpassos Truncate and export seem to disagree on the REST catalog. Basically, when I EXPORT PARTITION, CH writes the manifest-list path as a bucket-relative path.

This looks like roughly the same issue I mentioned before, which should be fixed by ClickHouse#100420. So I guess we leave it for now?
How does regular insert behave?
Discussed this internally, and it was an error on the test side. A restart is required, as the nodes are created upon initialization.
Insert is bucket-relative too, so same as export. Looks like truncate is the outlier.
So, here's the deal: I trusted the current upstream Iceberg writes implementation. It will write full paths for the manifest files (not data files) if, and only if, `write_full_path_in_iceberg_metadata` is enabled. The truncate path behaves differently, and this is where the inconsistency lives. The question is: who is right?
Tests are pretty much done and passing (with the exception of some issues discussed above with Iceberg writes) and cover the REST and Glue catalogs. I did one more AI audit pass and it came back with one issue. @arthurpassos Thoughts?
This is VERY unlikely; it can only happen if the ZooKeeper data got corrupted in between calls, and for those cases the task timeout should do its job. Regardless, the idea of treating those cases as errors and bumping `commit_retry_attempts` is a good one. I will implement it.
Should be fixed by af3dda1 |
We don't have time to iron out everything, but I think we've covered the main user scenarios. Since we're looking to release very soon, I'll put down a few more of the AI's nitpicks that we can address if there is time. Otherwise, we can leave them for the next release.
Export partition mechanics changes:
- `export_merge_tree_partition_system_table_prefer_remote_information` is `false` by default (I am considering removing it completely)
- Use `getContextCopyWithTaskSettings` to avoid code duplication
- Renamed `enable_experimental_export_merge_tree_partition_feature` to `allow_experimental_export_merge_tree_partition`
- Fail if `allow_experimental_insert_into_iceberg` is not enabled

Apache Iceberg specifics:
- Store `write_full_path_in_iceberg_metadata` in the ZooKeeper task
- Add `f_clickhouse_export_partition_transaction_id` to the Apache Iceberg manifest so we can check it before committing twice

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Make changes to the export partition background engine and support experimental exports to Apache Iceberg
Documentation entry for user-facing changes
...
CI/CD Options
Exclude tests:
Regression jobs to run: