[opt](memory) Move schemaHash from Replica to LocalReplica#62122
[opt](memory) Move schemaHash from Replica to LocalReplica#62122dataroaring wants to merge 3 commits intoapache:masterfrom
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
Moves schemaHash storage from the base Replica class into LocalReplica to reduce per-instance memory usage for CloudReplica (which does not need schemaHash).
Changes:
- Remove
schemaHashfield fromReplica; makegetSchemaHash()return-1andsetSchemaHash()a no-op in the base class. - Add
schemaHashfield toLocalReplicaand overridegetSchemaHash()/setSchemaHash()to store and return the real value. - Update
Replica.toString()to callgetSchemaHash()to preserve polymorphic behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
fe/fe-core/src/main/java/org/apache/doris/catalog/Replica.java |
Removes schemaHash storage from the base type and stubs schema-hash accessors for non-local replicas. |
fe/fe-core/src/main/java/org/apache/doris/catalog/LocalReplica.java |
Adds schemaHash storage and overrides accessors so local replicas still expose the actual schema hash. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public long getVersion() { | ||
| return this.version; | ||
| } | ||
|
|
||
| public int getSchemaHash() { | ||
| return schemaHash; | ||
| return -1; | ||
| } | ||
|
|
||
| // for compatibility | ||
| public void setSchemaHash(int schemaHash) { | ||
| this.schemaHash = schemaHash; | ||
| // no-op in base class; overridden in LocalReplica | ||
| } |
There was a problem hiding this comment.
Replica schemaHash behavior has changed (base returns -1/no-op; LocalReplica stores the value). Please add/adjust unit tests (e.g., in ReplicaTest) to assert that LocalReplica.getSchemaHash() returns the constructor value and that a CloudReplica (or any non-LocalReplica Replica) reports -1, so regressions are caught during refactors.
There was a problem hiding this comment.
Addressed in commit 7f5e6d3. The existing test tried to instantiate abstract Replica directly — merged the base-class no-op assertion into the CloudReplica test, which inherits the same stubs. All three schemaHash polymorphism tests now compile and cover the requested scenarios.
— ThinkOps 🤖
|
run buildall |
TPC-H: Total hot run time: 28801 ms |
TPC-DS: Total hot run time: 178496 ms |
FE UT Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
No blocking findings.
Critical checkpoints:
- Goal: The change is narrowly aimed at removing
schemaHashstorage from the baseReplicaobject while preserving real schema-hash state forLocalReplica. ReviewingReplica.java,LocalReplica.java,CloudReplica.java,MetadataViewer.java,LocalTabletInvertedIndex.java,EnvFactory.java,CloudEnvFactory.java,OlapTable.java,CloudInternalCatalog.java,CloudRestoreJob.java, andGsonUtils.java, the goal is met.LocalReplicastill stores and returns the real value; cloud replicas now consistently expose-1through the base implementation. - Minimality: The patch is small and focused to the two touched classes. No extra protocol, config, or storage-format changes were introduced.
- Concurrency: No new concurrent state, lock ordering, or thread-safety concerns were introduced.
- Lifecycle / static initialization: No special lifecycle or static-initialization risk is involved.
- Config: No configuration changes.
- Compatibility: Safe.
schemaHashwas not part ofReplica's Gson-serialized state before this patch, so moving the live field fromReplicatoLocalReplicadoes not change FE image / replay payloads. No FE-BE protocol compatibility issue was introduced. - Parallel paths: I checked both local and cloud creation paths. Local-mode paths still populate schema hash in
LocalReplica; cloud-mode paths do not depend on per-replica schema-hash state. - Conditional checks: Existing
replica.getSchemaHash() != -1guards inMetadataViewerremain correct. - Test coverage: No targeted test was added for the new polymorphic behavior, so there is still a small coverage gap around
LocalReplica.get/setSchemaHash()versus the cloud sentinel behavior. - Observability: Cloud proc / debug paths that print
replica.getSchemaHash()will now consistently show-1. That is acceptable because schema hash is intentionally not meaningful forCloudReplica, but it is a small user-visible diagnostic change. - Transaction / persistence / data-write correctness: Not involved.
- FE-BE variable passing: Not involved.
- Performance: The intended memory optimization is plausible and I did not find any obvious hot-path regression.
- Other issues: None found.
Overall opinion: this looks good to merge.
|
Added unit tests for the schemaHash polymorphism in commit 8711734:
This addresses the review feedback requesting regression tests for the schemaHash behavior change. — ThinkOps 🤖 |
|
run buildall |
2 similar comments
|
run buildall |
|
run buildall |
|
Fixed the compile error in Merged the base-class no-op The final test coverage for the Copilot-requested schemaHash polymorphism:
— ThinkOps 🤖 |
|
run buildall |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
schemaHash in the base Replica class wastes 4 bytes per CloudReplica. Cloud storage paths don't use schema_hash, and callers like MetadataViewer already guard with getSchemaHash() != -1. This follows the same pattern as dbId removal in apache#62079. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add four tests to ReplicaTest covering the schemaHash behavior change: - testLocalReplicaGetSchemaHash: constructor value is returned - testLocalReplicaSetSchemaHash: setter updates the value - testBaseReplicaSchemaHashReturnsNegativeOne: base class returns -1, setter is no-op - testCloudReplicaSchemaHashReturnsNegativeOne: CloudReplica inherits -1 stub Generated by ThinkOps
…ract Replica The testBaseReplicaSchemaHashReturnsNegativeOne test tried to instantiate abstract class Replica directly, causing a compile error. Merged the no-op setSchemaHash assertion into the existing CloudReplica test, which exercises the same base-class code paths since CloudReplica inherits getSchemaHash()/setSchemaHash() without overriding them. Generated by ThinkOps
7f5e6d3 to
6083321
Compare
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
|
OpenCode automated review failed and did not complete. Error: Review step was failure (possibly timeout or cancelled) Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
FE Regression Coverage ReportIncrement line coverage |
Summary
schemaHashfield from baseReplicaclass toLocalReplica, saving 4 bytes perCloudReplicainstanceReplica.getSchemaHash()now returns-1(stub),setSchemaHash()is a no-opLocalReplicaoverrides both methods with actual storagedbIdremoval in [opt](memory) Remove unused dbId field from CloudReplica #62079Details
Cloud storage paths don't use
schema_hash, and callers likeMetadataVieweralready guard withgetSchemaHash() != -1, soCloudReplicareturning-1is safe and correct.Changes:
Replica.java: RemoveschemaHashfield;getSchemaHash()returns-1;setSchemaHash()is no-op;toString()usesgetSchemaHash()for polymorphismLocalReplica.java: AddschemaHashfield; overridegetSchemaHash()/setSchemaHash(); set in constructorCloudReplica.java: No changes needed (schemaHash param harmlessly ignored by super constructor)Test plan
CloudReplica.getSchemaHash()returns -1LocalReplica.getSchemaHash()returns actual valueMetadataViewerSCHEMA_ERROR check unaffected (guards with!= -1)🤖 Generated with Claude Code