feat(encryption): Stage 6E-1b — EnableRaftEnvelope admin RPC + elastickv-admin CLI#907
Conversation
…ckv-admin CLI subcommand Mirror of the 6D-6 EnableStorageEnvelope path for the raft variant: proto definition, server method, CLI subcommand, tests. ## What lands - proto/encryption_admin.proto — new `EnableRaftEnvelope` RPC + `EnableRaftEnvelopeRequest` / `EnableRaftEnvelopeResponse` messages. The response omits the storage variant's `cutover_index_unknown` defensive field because `RaftEnvelopeCutoverIndex != 0` IS the active sentinel (no separate bool to defend against, and 6E-1a already fail-closes on `raftIdx == 0` before any sidecar mutation). - adapter/encryption_admin.go — `EnableRaftEnvelope` server method with the same semaphore / pre-check / fan-out / propose / post-check sequence as EnableStorageEnvelope. Differences: target Purpose=PurposeRaft, source slot Active.Raft, active-sentinel field RaftEnvelopeCutoverIndex. - cmd/elastickv-admin — `enable-raft-envelope` subcommand (parser, runner, printer). Output prefixes (`enabled` / `already-active`) mirror the storage subcommand for script compatibility; no warning line because the variant has no defensive cutover_index_unknown field. - Tests: 5 server tests covering raft-specific branches (zero proposer ID, oversized epoch, not-bootstrapped, idempotent retry, happy path) + 5 CLI tests + 1 dispatch test. - docs/design/2026_05_31_proposed_6e_enable_raft_envelope.md → partial; 6E-1 (both 6E-1a + 6E-1b sub-milestones) is now complete. 6E-2 and 6E-3 remain. ## Protoc version skew (documented) This PR was generated against locally-installed `libprotoc 34.0` because the pinned `libprotoc 29.3` is not available in the dev environment. Empirical comparison (no-op regen on the unmodified proto file with both versions) showed the only difference is the `// protoc vX.Y.Z` version comment in the file header — the generated rawDesc bytes and all code are byte-identical. The version comment in this PR has been reverted to the pinned 29.3 output so the diff is purely the new additive content. CI/release should re-run `make gen` with the pinned 29.3 toolchain to confirm. ## Self-review 1. Data loss — no replication / FSM apply path changes; new server method only proposes through the existing proposeEncryptionEntry seam. 2. Concurrency — semaphore + acquireCutoverSemaphore reused verbatim from EnableStorageEnvelope; ctx-aware wait, pre-canceled short-circuit, and serialization properties inherited. 3. Performance — one extra map entry in proto type registry; no hot-path impact. 4. Data consistency — proposal shape (Purpose=PurposeRaft, DEKID=Active.Raft, len(Wrapped)==0) re-validated at apply in 6E-1a's `validateEnableRaftEnvelopePayload`; the server and applier agree on the constraints. 5. Test coverage — 5 new server tests + 5 new CLI tests + 1 dispatch test. Operator-inertness guarantee preserved: 6E-1 is a no-op until 6E-2 ships engine unwrap + coord wrap.
|
Warning Review limit reached
More reviews will be available in 43 minutes and 23 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThis PR implements the ChangesRaft Envelope Cutover RPC Implementation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements Stage 6E-1b of the Phase-2 cutover, introducing the EnableRaftEnvelope admin RPC and its corresponding elastickv-admin enable-raft-envelope CLI subcommand. This allows operators to trigger the transition from cleartext Raft proposals to envelope-encrypted proposals. The changes include the server-side orchestration, request validation, idempotent retry handling, CLI integration, and comprehensive unit tests. A review comment points out that the check for appliedIndex == 0 in freshRaftCutoverResponse is unreachable dead code because the caller already verifies that RaftEnvelopeCutoverIndex is non-zero, suggesting a simplification of the function.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| appliedIndex := sc.RaftEnvelopeCutoverIndex | ||
| if appliedIndex == 0 { | ||
| appliedIndex = proposedIdx | ||
| } | ||
| return &pb.EnableRaftEnvelopeResponse{ | ||
| AppliedIndex: appliedIndex, | ||
| CapabilitySummary: projectCapabilityVerdicts(fanoutResult.Verdicts), | ||
| WasAlreadyActive: false, | ||
| } |
There was a problem hiding this comment.
In freshRaftCutoverResponse, the check if appliedIndex == 0 is unreachable dead code. This is because freshRaftCutoverResponse is only called from raftCutoverPostcheck after verifying that postSidecar.RaftEnvelopeCutoverIndex != 0 (lines 1151-1154). Since RaftEnvelopeCutoverIndex is the sole sentinel for the active state in the raft variant (unlike the storage variant which has a separate boolean flag), appliedIndex is guaranteed to be non-zero here. We should simplify this function by removing the redundant check.
return &pb.EnableRaftEnvelopeResponse{
AppliedIndex: sc.RaftEnvelopeCutoverIndex,
CapabilitySummary: projectCapabilityVerdicts(fanoutResult.Verdicts),
WasAlreadyActive: false,
}References
- Remove dead or unreachable code paths, such as asynchronous handlers for tasks that are always performed synchronously, to improve code clarity and maintainability.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 67e8826e3a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if err != nil { | ||
| return nil, err | ||
| } | ||
| proposedIdx, err := s.proposeRaftCutoverEntry(ctx, preSidecar, req) |
There was a problem hiding this comment.
Gate raft cutover until wrapping support ships
When this RPC is invoked before the 6E-2 wrap/unwrap/barrier path is active, it persists RaftEnvelopeCutoverIndex=N while all subsequent Raft entries are still proposed in cleartext. The planned apply hook keys off entry.Index > sidecar.RaftEnvelopeCutoverIndex, so after upgrading/restarting with 6E-2 any cleartext entries committed at indexes greater than N will be treated as envelopes and replay/apply can halt. The marker should not be proposed until wrapping is active, or the activation scheme must ensure no cleartext entries can exist above the recorded cutover index.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/elastickv-admin/encryption.go (1)
47-47:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUsage and error text omit the newly registered
enable-raft-envelope.The dispatch table at Line 68 now serves
enable-raft-envelope, but the help string (Line 47) and the unknown-subcommand error (Line 53) still enumerate the old set only. Operators runningencryption helpor hitting a typo won't discover the new subcommand.📝 Proposed fix
- _, err := fmt.Fprintln(os.Stdout, "usage: elastickv-admin encryption <subcommand> [flags]\n\nsubcommands:\n status\n rotate-dek\n register-writer\n bootstrap\n enable-storage-envelope\n probe-node-id") + _, err := fmt.Fprintln(os.Stdout, "usage: elastickv-admin encryption <subcommand> [flags]\n\nsubcommands:\n status\n rotate-dek\n register-writer\n bootstrap\n enable-storage-envelope\n enable-raft-envelope\n probe-node-id")- return errors.Errorf("encryption: unknown subcommand %q (supported: status, rotate-dek, register-writer, bootstrap, enable-storage-envelope, probe-node-id)", sub) + return errors.Errorf("encryption: unknown subcommand %q (supported: status, rotate-dek, register-writer, bootstrap, enable-storage-envelope, enable-raft-envelope, probe-node-id)", sub)Also applies to: 53-53
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/elastickv-admin/encryption.go` at line 47, Update the hard-coded usage/help and the unknown-subcommand error message to include the new subcommand name "enable-raft-envelope": change the string printed by the fmt.Fprintln(os.Stdout, ...) usage call and the corresponding unknown-subcommand error output so they list "enable-raft-envelope" alongside the other subcommands (the dispatch table that handles subcommands already routes "enable-raft-envelope"). Ensure both messages match the dispatch table (the fmt.Fprintln usage string and the unknown-subcommand error string in encryption.go).
🧹 Nitpick comments (1)
cmd/elastickv-admin/encryption.go (1)
28-33: 💤 Low valueStale doc comment.
This narrative now describes the current PR's work as future: it attributes the diff to "6D-6b (this PR)" adding
enable-storage-envelopeand saysenable-raft-envelope"lands in Stage 6E". Since this PR (Stage 6E-1b) actually wiresenable-raft-envelope, the comment reads as inaccurate.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/elastickv-admin/encryption.go` around lines 28 - 33, The block comment referencing PR stages is stale; update the narrative to accurately reflect that this PR wires enable-raft-envelope (not defers it to Stage 6E) and adjust the history line to indicate which PR added enable-storage-envelope versus enable-raft-envelope; specifically edit the doc comment that mentions "6D-6b (this PR) adds `enable-storage-envelope`" and the later "lands in Stage 6E" so it correctly lists `enable-storage-envelope`, `enable-raft-envelope`, and previously added symbols (`status`, `rotate-dek`, `register-writer`, `bootstrap`, `probe-node-id`, `ResyncSidecar`) in their proper order/PRs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/design/2026_05_31_partial_6e_enable_raft_envelope.md`:
- Around line 11-25: The status table correctly marks 6E-2 and 6E-3 as not
started but later prose incorrectly claims engine unwrap-on-apply, coordinator
wrap-on-propose and the §7.1 quiescence barrier have landed; update the later
narrative to defer those behaviors to milestone 6E-2 by replacing any
present-tense or "landed"/"implemented" wording with phrasing that they are
"planned in 6E-2" or "not yet implemented; planned in 6E-2", and ensure
references to engine unwrap-on-apply, coordinator wrap-on-propose and §7.1
proposal-quiescence barrier explicitly state they are pending work in 6E-2
rather than already shipped.
- Line 16: The table entry for "6E-1b — `EnableRaftEnvelope` admin RPC +
`elastickv-admin enable-raft-envelope` CLI subcommand" currently uses the
placeholder "this PR"; replace that placeholder with the concrete PR identifier
(and commit hash if available) following the same style used in the 6E-1a row so
the row reads with a durable reference (e.g., "PR `#1234` (commit abcdef0)").
---
Outside diff comments:
In `@cmd/elastickv-admin/encryption.go`:
- Line 47: Update the hard-coded usage/help and the unknown-subcommand error
message to include the new subcommand name "enable-raft-envelope": change the
string printed by the fmt.Fprintln(os.Stdout, ...) usage call and the
corresponding unknown-subcommand error output so they list
"enable-raft-envelope" alongside the other subcommands (the dispatch table that
handles subcommands already routes "enable-raft-envelope"). Ensure both messages
match the dispatch table (the fmt.Fprintln usage string and the
unknown-subcommand error string in encryption.go).
---
Nitpick comments:
In `@cmd/elastickv-admin/encryption.go`:
- Around line 28-33: The block comment referencing PR stages is stale; update
the narrative to accurately reflect that this PR wires enable-raft-envelope (not
defers it to Stage 6E) and adjust the history line to indicate which PR added
enable-storage-envelope versus enable-raft-envelope; specifically edit the doc
comment that mentions "6D-6b (this PR) adds `enable-storage-envelope`" and the
later "lands in Stage 6E" so it correctly lists `enable-storage-envelope`,
`enable-raft-envelope`, and previously added symbols (`status`, `rotate-dek`,
`register-writer`, `bootstrap`, `probe-node-id`, `ResyncSidecar`) in their
proper order/PRs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c14a1241-356e-415a-a633-2ae2d2fe47e2
⛔ Files ignored due to path filters (2)
proto/encryption_admin.pb.gois excluded by!**/*.pb.goproto/encryption_admin_grpc.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (7)
adapter/encryption_admin.goadapter/encryption_admin_test.gocmd/elastickv-admin/encryption.gocmd/elastickv-admin/encryption_mutators.gocmd/elastickv-admin/encryption_test.godocs/design/2026_05_31_partial_6e_enable_raft_envelope.mdproto/encryption_admin.proto
… ships codex P1 (cluster-bricking hazard): accepting cutover proposals now records RaftEnvelopeCutoverIndex=N while subsequent Raft entries remain cleartext (6E-2's wrap-on-propose path is not yet deployed). On a later 6E-2 upgrade, the engine apply-hook dispatches entry.Index > sidecar.RaftEnvelopeCutoverIndex through the unwrap path, so all the cleartext entries committed at indexes > N get treated as wrapped envelopes and halt apply cluster-wide. Fail-closed gate: package-level const raftEnvelopeWrapEnabled starts at false. EnableRaftEnvelope refuses fresh proposals with FailedPrecondition. The pre-check (leader gate, bootstrap gate, idempotent-retry short-circuit) still fires for fast feedback; only the propose path is refused. 6E-2 flips the const atomically with the wrap/unwrap/barrier wiring. gemini MEDIUM: removed the dead 'if appliedIndex == 0' defensive branch in freshRaftCutoverResponse. The raft variant uses the cutover index itself as the active sentinel, so raftCutoverPostcheck already refuses earlier on a zero index -- reaching freshRaftCutoverResponse guarantees non-zero. The storage variant's hand-edit hedge has no analogue here. Test changes: - Replaced HappyPath with GatedUntil6E2 -- pins the refusal, proposer.calls stays empty, and the sidecar's RaftEnvelopeCutoverIndex remains 0. - Dropped the orphaned assertRaftCutoverProposalShape helper (the HappyPath test was its only caller). - The other 4 server tests (zero proposer / oversized epoch / not-bootstrapped / idempotent retry) exercise branches that fire BEFORE the gate; their behavior is unchanged. Caller audit (semantic-change directive): EnableRaftEnvelope's sole production caller is the auto-generated gRPC dispatcher; the CLI wraps any error with errors.Wrap(err, "EnableRaftEnvelope"), which surfaces the gate message cleanly. No other callers exist. Design doc: added §3.3 documenting the gate, the codex P1 rationale, and the 6E-2 atomic-flip contract.
|
@claude review Round-1 fixes in 1facaa1:
The orphaned |
|
Claude finished @bootjp's task in 3m 21s —— View job Code Review — Round-1 fixes (commit
|
… tweaks - Major (already auto-marked addressed by coderabbit for 1facaa1): §0 "Why this slice exists" prose still read as if a single PR landed engine unwrap / coord wrap / barrier end-to-end. Clarify that the design is sliced across 6E-1a / 6E-1b / 6E-2 / 6E-3, with engine unwrap + coord wrap + barrier deferred to 6E-2 (planned, not shipped). - Minor: replaced "this PR" placeholder on the 6E-1b status row with the durable #907 reference, matching the 6E-1a row style. No code changes; doc-only.
claude review round-1 flagged three stale references in cmd/elastickv-admin/encryption.go now that enable-raft-envelope has landed (6E-1b): 1. The block doc comment for encryptionMain said "the §7.1 Phase 2 enable-raft-envelope lands in Stage 6E" -- no longer accurate. Replaced the shipping-order narrative with a concise list noting 6E-1b is gated until 6E-2 ships wrap-on-propose. 2. The -h/--help usage string listed six subcommands; added enable-raft-envelope between enable-storage-envelope and probe-node-id. 3. The unknown-subcommand error's "supported:" list missed enable-raft-envelope. Same insertion point. No code paths change -- just three text edits in the help / error surfaces. Existing TestEncryptionMain_RejectsUnknownSubcommand continues to pass (it asserts "unknown subcommand" substring; the "supported:" list is auxiliary).
|
@claude review Round-2 follow-up (7c4403b + cafe9bf):
All claude round-1 findings now closed:
Tests pass on cmd/elastickv-admin (0.340s); lint clean. |
|
Claude finished @bootjp's task in 1m 47s —— View job Code Review — Round-2 follow-up (commits
|
Summary
Stage 6E-1b — operator-facing trigger for the Phase-2 raft cutover, mirroring the 6D-6 EnableStorageEnvelope path. The FSM apply machinery shipped in #899 (Stage 6E-1a); this PR adds the proto RPC, server method, and
elastickv-admin enable-raft-envelopesubcommand so operators can actually invoke the cutover.Operator-inert by design: with only 6E-1 (a + b) deployed, the cutover entry advances the sidecar's
RaftEnvelopeCutoverIndexbut no Raft proposal is wrapped and no apply is unwrapped — those land atomically in 6E-2.What changed
proto/encryption_admin.proto— newEnableRaftEnvelopeRPC +EnableRaftEnvelopeRequest/EnableRaftEnvelopeResponsemessages. The response omits the storage variant'scutover_index_unknowndefensive field becauseRaftEnvelopeCutoverIndex != 0IS the active sentinel — there's no separate bool to defend against, and 6E-1a already fail-closes onraftIdx == 0before any sidecar mutation.adapter/encryption_admin.go—EnableRaftEnvelopeserver method. Same semaphore / pre-check / fan-out / propose / post-check sequence asEnableStorageEnvelope. Variant-specific differences: targetPurpose=PurposeRaft, source slotActive.Raft, active-sentinel fieldRaftEnvelopeCutoverIndex.cmd/elastickv-admin/—enable-raft-envelopesubcommand (parser/runner/printer). Output prefixes (enabled/already-active) match the storage subcommand for script compatibility. No warning line — no defensivecutover_index_unknownfield.docs/design/2026_05_31_proposed_6e_enable_raft_envelope.md→partial: milestone 6E-1 (both sub-milestones) is now complete. 6E-2 and 6E-3 remain.Protoc version skew (documented)
The pinned protoc toolchain (
libprotoc 29.3) is not available in this dev environment; locally installed islibprotoc 34.0. Empirical comparison: a no-op regen against the unmodified proto file with both versions produces byte-identical output except the// protoc vX.Y.Zversion comment in the file header. The version comment in this PR has been reverted to the pinned 29.3 output, so the diff is purely the new additive content. CI/release should re-runmake genwith the pinned 29.3 toolchain to confirm the generated artifacts.Test plan
go test ./adapter/ ./cmd/elastickv-admin/ ./proto/ ./internal/encryption/(without-race): all pass.-run 'EnableStorageEnvelope|EnableRaftEnvelope'race-tests in adapter: pass (0.571s).go test ./kv/...(FSM apply chain): pass.golangci-lint --config=.golangci.yaml runon all impacted packages: 0 issues.go test -race ./adapter/...times out at 10 min on this hardware due to existing slow integration tests in adapter; not specific to this PR. The new tests themselves are quick unit tests.Self-review (per CLAUDE.md)
proposeEncryptionEntryseam (same asEnableStorageEnvelope).acquireCutoverSemaphorereused verbatim from the storage variant; ctx-aware wait, pre-canceled short-circuit, and serialization properties inherited (those branches are tested by the storage suite, which exercises the same code paths).Purpose=PurposeRaft,DEKID=Active.Raft,len(Wrapped)==0) re-validated at apply by 6E-1a'svalidateEnableRaftEnvelopePayload. Server and applier agree on the constraints.Caller audit (semantic-change directive)
No semantic changes to existing functions — purely new additions. The new
EnableRaftEnvelopeserver method has a single transport-layer caller (gRPC dispatch), and the new CLI subcommand has a single dispatch entry inencryptionSubcommands. Both are exercised by the new tests.Summary by CodeRabbit
Release Notes
New Features
enable-raft-envelopeCLI command enabling Raft envelope encryption cutover.Documentation