feat(policy): per-tenant password complexity + session timeouts#275
feat(policy): per-tenant password complexity + session timeouts#275iarunsaragadam wants to merge 1 commit into
Conversation
Extend LoginPolicy with four additive fields so an org can tighten its members' authentication beyond the global defaults: - password_min_length / password_require_classes — enforced at password signup, reset, and accept-invitation via a per-tenant StrengthPolicy resolved from the governance plane (domain → tenant → policy). A tenant can only ever raise the floor, never loosen below the global baseline; ungoverned users keep the global rules. - session_idle_timeout_seconds / session_absolute_timeout_seconds — enforced on the RefreshToken hot path against the refresh row's LastUsedAt / CreatedAt, returning the new ErrSessionExpired and deleting the dead session. Unset (0) preserves current behavior. Add postgres migration 0263 (additive columns, default 0/false), wire the columns through the LoginPolicyStore upsert/scan, extend the store smoke test, and update the schema doc. The domain→tenant→policy walk is factored into resolveLoginPolicy so password and session enforcement reuse the same fail-safe lookup as method enforcement. Tests cover the strength policy (pkg/passwords), per-tenant min-length at signup, idle/absolute timeout on the refresh path, and fail-safe fallback for ungoverned users. Deferred: tenant-scoped ExportAuditEvents RPC and exposing LoginPolicy via proto/admin RPCs (depends on the RBAC/policy-API issues); ProfileService ChangePassword stays on the global baseline until governance is wired into that service. Closes #263
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30a99e354c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if timeoutUser == nil { | ||
| return nil, "", "", fmt.Errorf("%w: user not found", ErrNotFound) | ||
| } | ||
| if err := s.enforceSessionTimeout(ctx, timeoutUser.Email, s.nowMs(), record.CreatedAt, record.LastUsedAt); err != nil { |
There was a problem hiding this comment.
Preserve original session age for absolute timeout
When a tenant sets SessionAbsoluteTimeoutSeconds, this check compares against record.CreatedAt from the current refresh-token row. RefreshToken then rotates by calling issueTokens, which writes each replacement refresh token with CreatedAt: now, so a client that refreshes more often than the absolute window keeps resetting the baseline and never hits the absolute session timeout. Use a stable session start timestamp for the absolute check instead of the rotated token's creation time.
Useful? React with 👍 / 👎.
| // ErrSessionExpired is returned when a still-valid refresh token is | ||
| // rejected because the owning tenant's LoginPolicy idle or absolute | ||
| // session timeout has elapsed. | ||
| ErrSessionExpired = errors.New("session expired") |
There was a problem hiding this comment.
Map session expiry to a client auth error
When a tenant idle or absolute timeout is exceeded, RefreshToken returns the new ErrSessionExpired, but internal/connect/errors.go::toConnectError does not map this sentinel. In that scenario the public RefreshToken RPC falls through to CodeInternal instead of an authentication/expiry response, so clients see a server error and cannot reliably route the user back to sign-in.
Useful? React with 👍 / 👎.
| @@ -0,0 +1,8 @@ | |||
| -- Per-tenant password complexity + session timeout policy fields. | |||
There was a problem hiding this comment.
Keep migration versions contiguous
The repository currently has migrations only through 0016, so adding this as version 0263 advances golang-migrate's recorded schema version to 263 on upgraded databases. A later normal migration numbered 0017 through 0262 would be lower than the recorded version and would not be applied, leaving those deployments missing schema changes unless every future migration intentionally starts at 0264.
Useful? React with 👍 / 👎.
iarunsaragadam
left a comment
There was a problem hiding this comment.
PR review gate: ❌ BLOCKED
| Reviewer | Effective verdict | Surviving blockers |
|---|---|---|
| Product | REQUEST_CHANGES | 1 |
| Security | APPROVE | 0 |
| Performance | APPROVE | 0 |
| Maintainability | REQUEST_CHANGES | 1 |
| Correctness | APPROVE | 0 |
| Contract & Migration | APPROVE | 0 |
Conditional reviewers: Contract & Migration — ran (proto/API/schema/migration touched); Accessibility & UX — skipped (no UI changes).
The gate is decided solely by surviving confirmed blockers. Security and Contract & Migration each raised a REQUEST_CHANGES finding that was refuted on adversarial re-verification, so their effective verdict is APPROVE and they do not block. Two confirmed blockers remain.
Confirmed blockers
Product — Absolute session timeout never fires for an active user (CreatedAt resets on every rotation)
- Location:
internal/service/login_policy_enforce.go(enforceSessionTimeout) +internal/service/auth.go:1058-1109(issueTokens) - Detail:
session_absolute_timeout_secondsis documented (login_policy.go and docs/redesign/schema.md) as "invalidate a session older than this regardless of activity." Enforcement comparesnow - record.CreatedAtagainst the absolute window, but refresh tokens rotate on everyRefreshTokencall andissueTokensstamps a freshCreatedAt=nowon each rotated row with no carry-forward of the original session start. For any user refreshing more often than the absolute window,CreatedAtis perpetually recent and the absolute timeout never triggers — the SOC2/enterprise "force re-auth" guarantee is not met. The idle path is fine (LastUsedAttracks last refresh). Tests mask the gap:TestRefreshToken_TenantTimeout_FreshSessionPassesonly asserts a fresh session passes, andTestEnforceSessionTimeout_Absolutecalls the helper directly with a hand-set oldCreatedAt— neither exercises absolute expiry across a real rotation. - Fix: Carry a session-origin timestamp forward across rotations (persist/propagate the original
CreatedAtor asession_started_aton the rotated refresh row) and compare the absolute timeout against that origin, not the per-rotationCreatedAt. Add aRefreshToken-path test that rotates at least once and assertsErrSessionExpiredonce the absolute window since session start has elapsed. If carrying the origin is out of scope for this PR, drop thesession_absolute_timeout_secondsfield + docs and note it under Deferred rather than shipping a control that silently does nothing.
Maintainability — password_require_classes / RequireClasses wired end-to-end but never affects validation (inert governance field)
- Location:
pkg/passwords/strength.go:55-62(RequireClasses) andinternal/service/login_policy_enforce.go:18-26(passwordStrengthPolicyFor) - Detail:
StrengthPolicy.RequireClassesis added with a long doc comment, persisted as thepassword_require_classesPostgres column (migration 0263, login_policy_store.go scan+upsert), exposed asLoginPolicy.PasswordRequireClasses, and mapped intoStrengthPolicy. ButValidateStrengthWithPolicynever readspolicy.RequireClasses— the upper/lower/digit/special checks are unconditional, sopassword_require_classes=trueand=falseproduce byte-identical validation. This is an inert field/column threaded through the whole stack, which AGENTS.md §2 (no dead code) and the "no half-finished implementations" rule prohibit, yet the PR markets it as a working per-tenant complexity control.TestPasswordStrengthPolicyForonly asserts the field mapping because there is no behavioral difference to assert. - Fix: Either (a) drop
RequireClasses,PasswordRequireClasses, and thepassword_require_classescolumn from this PR until per-class control is actually implemented (an additive migration is easy to add later), or (b) make it load-bearing now — gate the four class checks behind it so the zero policy still enforces them (preserving the global default) but the flag genuinely tightens. Do not ship a column whose value never changes behavior. If kept, the doc comment must describe present behavior (not aspirational future per-class relaxation) and a test must assert an actual behavioral effect.
Refuted (raised but verification cleared them)
- Security — "Absolute session timeout resets on every refresh": Same underlying observation as the confirmed Product blocker; retained as the Product finding, so the duplicate Security blocker was dropped (not a separate defect).
- Contract & Migration — "New ErrSessionExpired unmapped → leaks as CodeInternal (HTTP 500)": Re-verification cleared this;
ErrSessionExpireddoes not fall through toCodeInternalon theRefreshTokenwire path as claimed.
Non-blocking findings
- Product (minor):
PasswordRequireClassesis surfaced/documented but has no enforcement effect today; schema.md's "Demands all four char classes" overstates current behavior (classes are always enforced globally). Soften the wording or make the flag load-bearing. (Overlaps the Maintainability blocker.) - Product (nit): No CHANGELOG/release-note entry for the new tenant-facing governance controls, migration 0263, or the new session-expired outcome on
RefreshToken. - Performance (minor):
RefreshTokenadds 3 uncached governance round-trips (domain→tenant→policy) per refresh for governed tenants; all index-backed point reads, but worth folding behind the existing short-TTL cache (project_resolver_cache.go) in a follow-up. - Performance (nit): No load/bench coverage for the added refresh-path governance lookups.
- Maintainability (nit):
RequireClassesdoc comment explains future intent rather than present behavior (AGENTS.md §5). - Contract & Migration (minor): Migration numbering jumps 0016 → 0256 → 0263; harmless under golang-migrate's numeric ordering, but document whether versions are sequential or issue-derived to avoid future collisions.
Recommendation
Block merge until the two confirmed blockers are resolved. Both are "half-finished control" issues that the PR advertises as working: the absolute session timeout silently never fires for active users, and password_require_classes is inert across the entire stack. For each, either make the control load-bearing (carry a session-origin timestamp; gate class checks on the flag) with a test that proves the behavioral difference, or remove the field/column/docs and defer it explicitly. The rest of the PR — tighten-only password min-length, fail-safe policy resolution, the additive defaulted migration, and the idle-timeout path — is sound and well-tested. Address the non-blocking items (changelog, schema.md wording, governance-walk caching) opportunistically.
Generated by the PR review gate (AGENTS.md §11) — 5 always-on reviewers + conditional Contract/Accessibility, with adversarial verification.
Closes #263
What
Additively extends
LoginPolicyso an enterprise tenant can tighten its members' authentication beyond the global defaults — addressing the SOC2/enterprise asks in #263 (per-org password complexity + session idle/absolute timeouts).New
LoginPolicyfields (all default 0/false = "use global behavior")password_min_length,password_require_classes— per-tenant password complexity.session_idle_timeout_seconds,session_absolute_timeout_seconds— per-tenant session timeouts.Enforcement
validatePasswordStrengthForEmailresolves the owning tenant's policy (domain → tenant → policy, fail-safe) and validates via a newpasswords.StrengthPolicy. Wired into password signup, password reset, and accept-invitation. A tenant can only tighten (min-length below the global floor is clamped up); ungoverned users keep the global rules. A global baseline check still runs before any token lookup so behavior/info-leak is unchanged.enforceSessionTimeoutruns on theRefreshTokenhot path against the refresh row'sLastUsedAt(idle) andCreatedAt(absolute). On breach it returns the newErrSessionExpiredand deletes the dead refresh row. Unset timeouts and ungoverned users impose no restriction.Storage / docs
LoginPolicyStoreupsert/scan carry the fields; store smoke test extended to round-trip them and assert overwrite-on-upsert.docs/redesign/schema.mdlogin_policiestable updated.resolveLoginPolicy, reused by method, password, and session enforcement.Backward compatibility
Purely additive. No proto change. No existing RPC/field touched. LoginPolicy is postgres-only governance state (memory/sqlite have no policy plane), so no conformance change is required. Existing
ListAuditEventsis untouched.Tests
pkg/passwords:ValidateStrengthWithPolicytightening + clamp-to-floor.internal/service: per-tenant min-length at signup, tenant-scoped (other-domain unaffected), idle + absolute timeout on the realRefreshTokenpath (incl. dead-row deletion), and fail-safe fallback.login_policy_storesmoke round-trips the new columns.Deferred (noted for #263 follow-up)
ExportAuditEventsRPC and exposingLoginPolicyvia proto/admin RPCs — depend on the RBAC / policy-API issues that must emit audit events.ProfileService.ChangePasswordstays on the global baseline until the governance plane is injected into that service (it has no governance dependency today).🤖 Generated with Claude Code