Skip to content

security(H16): vault crate primitives for WS3 public-key state machine (re-target of #79 after #78 merged)#81

Merged
tolgaergin merged 1 commit into
mainfrom
security/h8-h16-ws3-vault-crate-v2
May 21, 2026
Merged

security(H16): vault crate primitives for WS3 public-key state machine (re-target of #79 after #78 merged)#81
tolgaergin merged 1 commit into
mainfrom
security/h8-h16-ws3-vault-crate-v2

Conversation

@tolgaergin

Copy link
Copy Markdown
Contributor

Re-opened from #79 after #78 merged. Identical content rebased onto main with the test-module conflict resolved (both #78's format_push_error tests and #79's get_my_public_key/classify_public_key_state/upload_public_key tests preserved). 39 sync tests pass.

…chine

Workstream 3 PR #3 — the rust-client half of the public-key write
hardening. Server-side WS3 (paired PR #30 in a-package-manager)
gates POST /api/users/me/public-key on a step-up proof and refuses
silent overwrites; this PR teaches the vault crate to (a) classify
the local-vs-server key state without mutating, (b) propagate
server errors honestly instead of collapsing them to "no key", and
(c) carry the step-up proof in the X-LPM-Step-Up-Proof header on
the upload path.

The CLI command-layer migration (rotate-sharing-key, pending-key
promotion, replacing every silent `ensure_public_key(...)` caller
with explicit classification + reauth UX) is PR #4 of this
workstream, intentionally split so the crate API change can be
reviewed independently of the CLI UX work.

Concrete changes in `crates/lpm-vault/src/sync.rs`:

- `get_my_public_key()` no longer collapses non-2xx to `Ok(None)`.
  401, 403, and 5xx now propagate as `Err(...)` carrying the HTTP
  status, so a transient outage or an expired token can no longer
  trick the caller into the silent-overwrite path. 2xx with
  `publicKey: null` (or the field absent) is the explicit "no key
  on server" signal.

- New `LocalPublicKeyState` struct + `PublicKeyRegistrationState`
  enum capture the three outcomes the CLI must branch on:
  `Matches`, `NeedsInitialSet`, `RotationRequired`. The
  `RotationRequired` variant is the one the prior
  `ensure_public_key` path silently rolled through — the new
  classifier surfaces it so the CLI command layer can refuse and
  direct the user to the explicit rotation flow.

- New `classify_public_key_state(registry_url, auth_token)` runs
  the local-keypair load + server lookup + classification. Pure
  classifier — never writes.

- `upload_public_key()` signature gains `step_up_proof:
  Option<&str>`, emits the proof in the `X-LPM-Step-Up-Proof`
  header when provided, and parses the WS3 server response shape
  (`{ok, status, fingerprintPrefix, previousFingerprintPrefix,
  invalidatedWrappedKeys, affectedOrgs}`). The new
  `UploadPublicKeyResponse` struct deserializes every field as
  optional so pre-WS3 servers (returning `{status: "saved"}`) still
  round-trip without error.

- Local keypair load extracted into `load_local_public_key_state()`
  so `ensure_public_key` and `classify_public_key_state` share the
  storage / generation / canonical-Base64-encoding policy.

- `CLI_STEP_UP_HEADER_NAME` constant mirrors the server-side
  header name from `lib/auth/cli-step-up.js`, so the upload site
  and any future caller never drift.

- `ensure_public_key` updated to the new `upload_public_key`
  signature with `proof: None`. Against the WS3-hardened server
  the no-proof upload will surface as `step_up_required` for the
  set/rotate cohorts — that's the correct failure mode (silent
  overwrite was the security bug WS3 closes). Three existing
  push_org_with_keys tests had to swap their 404 "missing key"
  mocks for the truthful 200-with-null-publicKey shape.

Tests (10 new):

- get_my_public_key returns Err on non-2xx (401 cohort)
- get_my_public_key returns Some on happy 2xx with key
- get_my_public_key returns None on 2xx with publicKey: null
- upload_public_key sends X-LPM-Step-Up-Proof when provided AND
  omits the header entirely when None passed (regression guard
  against spurious empty headers)
- upload_public_key propagates server error envelope on non-2xx
  with step_up_required code preserved
- classify_public_key_state returns Matches when keys equal
- classify_public_key_state returns NeedsInitialSet on null
- classify_public_key_state returns RotationRequired on mismatch
  (the critical regression guard against silent overwrite)
- classify_public_key_state propagates server errors instead of
  misclassifying outages as NeedsInitialSet

Workspace gate: 7362/7362 nextest passed (4 new). cargo fmt
clean. cargo clippy --workspace --all-targets clean.

Branched off main since WS3 server PRs are forward-compatible:
old servers ignore the new proof header, new servers refuse the
old proof-less calls with a structured error the CLI can
surface — both behaviors are correct for their respective
deployments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tolgaergin tolgaergin merged commit d994b4d into main May 21, 2026
8 checks passed
@tolgaergin tolgaergin deleted the security/h8-h16-ws3-vault-crate-v2 branch May 21, 2026 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant