
feat(server): add management-auth seam and providers (8/8) #195

Draft
abhinav-galileo wants to merge 3 commits into abhi/rfc-1-1-pr7-natural-key-attach from abhi/rfc-1-1-pr8-management-auth

Collaborator

@abhinav-galileo abhinav-galileo commented Apr 21, 2026

Summary

Introduces a pluggable request-auth framework. Agent Control speaks a generic Operation vocabulary on its endpoints; a provider translates those operations onto whatever auth system a deployment uses. This PR wires the management family (4 endpoints); the framework is designed to absorb the runtime, observability, and public families in follow-ups -- see TODO markers in auth/core.py.

Also adds a local end-to-end harness for validating the full stack.

Two providers ship in-tree:

  • HeaderAuthProvider -- OSS / single-tenant default. Reads X-Tenant-Id, falls back to DEFAULT_TENANT_ID. Enforces the legacy OSS API-key / admin-key check per-operation via an OSS_AUTH_LEVELS map (PUBLIC / AUTHENTICATED / ADMIN), preserving existing behavior verbatim.
  • HttpUpstreamAuthProvider -- forwards caller credentials (API key, JWT, cookie) to a configurable upstream URL. Adds a configurable service-to-service token. Maps upstream statuses onto 401/403/404. Fails closed: network errors and upstream 5xx responses both surface as 503.
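
As a rough illustration of the header provider's behavior described above, the sketch below resolves a tenant from X-Tenant-Id (trimming whitespace and falling back to a default) and checks a per-operation access level. The value of DEFAULT_TENANT_ID, the entries in the map, the "unknown operations require ADMIN" rule, and the helper names resolve_tenant_id and is_allowed are all assumptions made for this example; the actual policy is whatever auth/providers/header.py defines.

```python
from enum import Enum
from typing import Mapping

DEFAULT_TENANT_ID = "default"  # assumed value; the PR does not state it


class LocalAccessLevel(Enum):
    PUBLIC = 0
    AUTHENTICATED = 1
    ADMIN = 2


# Hypothetical entries: the real OSS_AUTH_LEVELS map may assign other levels.
OSS_AUTH_LEVELS = {
    "controls.read": LocalAccessLevel.AUTHENTICATED,
    "target_bindings.read": LocalAccessLevel.AUTHENTICATED,
    "target_bindings.write": LocalAccessLevel.ADMIN,
}


def resolve_tenant_id(headers: Mapping[str, str]) -> str:
    """Return the trimmed X-Tenant-Id value, or the default when absent/blank.

    A plain dict is case-sensitive; a real request's header mapping is not.
    """
    tenant = headers.get("X-Tenant-Id", "").strip()
    return tenant or DEFAULT_TENANT_ID


def is_allowed(operation: str, caller_level: LocalAccessLevel) -> bool:
    """Compare the caller's level with the level the operation requires.

    Operations missing from the map require ADMIN (an assumption, chosen so
    that an unmapped operation is never accidentally open).
    """
    required = OSS_AUTH_LEVELS.get(operation, LocalAccessLevel.ADMIN)
    return caller_level.value >= required.value
```

Keeping the policy in one map means an operation's required level can be read or changed in a single place, which matches the "single source of truth" intent stated for OSS_AUTH_LEVELS.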

Tracks: sc-63146, sc-63147.

Module layout

  • auth/core.py -- framework: Operation (incl. reserved runtime/observability/public members), Principal, RequestAuthorizer Protocol, require_operation(op, context_builder) FastAPI dependency factory, global authorizer registry.
  • auth/local.py -- legacy OSS credential check (API key + session cookie). Still used by non-management endpoints pending follow-up migration. Also exposes authenticate_request() as a reusable helper for providers.
  • auth/providers/header.py -- OSS default, with LocalAccessLevel enum and OSS_AUTH_LEVELS map (single source of truth for OSS access policy).
  • auth/providers/http_upstream.py -- upstream HTTP adapter.
  • auth/config.py -- configure_auth_from_env() reads env vars at startup.
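
To make the shape of auth/core.py easier to picture, here is a minimal, framework-free sketch of the pieces listed above. It is not the PR's code: the authorize method name and signature, the set_authorizer / get_authorizer names, and the StaticAuthorizer stub are assumptions, and the request is typed as Any so the example runs without FastAPI installed.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Any, Awaitable, Callable, Mapping, Optional, Protocol


class Operation(str, Enum):
    # Only the three management operations wired in this PR are listed here.
    CONTROLS_READ = "controls.read"
    TARGET_BINDINGS_READ = "target_bindings.read"
    TARGET_BINDINGS_WRITE = "target_bindings.write"


@dataclass(frozen=True)
class Principal:
    tenant_id: str
    subject_id: Optional[str] = None


class RequestAuthorizer(Protocol):
    # Method name and signature are assumptions for this sketch.
    async def authorize(
        self, request: Any, operation: Operation, context: Mapping[str, Any]
    ) -> Principal: ...


_authorizer: Optional[RequestAuthorizer] = None


def set_authorizer(authorizer: RequestAuthorizer) -> None:
    global _authorizer
    _authorizer = authorizer


def get_authorizer() -> RequestAuthorizer:
    if _authorizer is None:
        raise RuntimeError("no request authorizer installed")
    return _authorizer


ContextBuilder = Callable[[Any], Mapping[str, Any]]


def require_operation(
    operation: Operation, context_builder: Optional[ContextBuilder] = None
) -> Callable[[Any], Awaitable[Principal]]:
    """Build a dependency that asks the installed authorizer about `operation`."""

    async def dependency(request: Any) -> Principal:
        context = context_builder(request) if context_builder else {}
        # Resolve the authorizer per call so tests can swap providers.
        return await get_authorizer().authorize(request, operation, context)

    return dependency


class StaticAuthorizer:
    """Hypothetical stub: allows everything and records what it was asked."""

    def __init__(self, tenant_id: str) -> None:
        self.tenant_id = tenant_id
        self.calls: list = []

    async def authorize(self, request, operation, context) -> Principal:
        self.calls.append((operation, dict(context)))
        return Principal(tenant_id=self.tenant_id)
```

In a FastAPI app the returned callable would presumably be passed to Depends(...) with the request parameter annotated as a Request. Outside a framework it can be exercised directly, for example with asyncio.run(require_operation(Operation.CONTROLS_READ)(request)) after set_authorizer(StaticAuthorizer("acme")).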

P0 fix: management endpoints accept upstream credentials

Previously the 4 management endpoints were wrapped with the router-level Depends(require_api_key) AND the endpoint-level Depends(require_admin_key), which rejected the Galileo bearer token that http_upstream mode expects to forward. They now live on dedicated management_router objects mounted in main.py without the legacy gate; the provider alone owns authn+authz.

Non-management endpoints keep their legacy gates (tagged with TODO(auth-framework) markers) and continue to work exactly as before.

Env vars

  • AGENT_CONTROL_MANAGEMENT_AUTH_MODE=header (default) or http_upstream
  • When http_upstream:
    • AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_URL (required)
    • AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_CHECK_PATH (default /internal/agent_control/auth/check_management_access)
    • AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_SERVICE_TOKEN (optional)
    • AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_SERVICE_TOKEN_HEADER (default X-Agent-Control-Service-Token)
    • AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_TIMEOUT_SECONDS (default 5.0)
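
The variables above could be read into a settings object along the following lines. This is a sketch only: ManagementAuthSettings and load_settings are hypothetical names, and details such as lowercasing the mode or raising ValueError on a missing URL are assumptions rather than the documented behavior of configure_auth_from_env().

```python
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "AGENT_CONTROL_MANAGEMENT_AUTH_"
DEFAULT_CHECK_PATH = "/internal/agent_control/auth/check_management_access"
DEFAULT_TOKEN_HEADER = "X-Agent-Control-Service-Token"


@dataclass(frozen=True)
class ManagementAuthSettings:
    mode: str = "header"
    upstream_url: Optional[str] = None
    check_path: str = DEFAULT_CHECK_PATH
    service_token: Optional[str] = None
    service_token_header: str = DEFAULT_TOKEN_HEADER
    timeout_seconds: float = 5.0


def load_settings(env: Mapping[str, str]) -> ManagementAuthSettings:
    """Parse the management-auth variables from an environment mapping."""
    mode = env.get(PREFIX + "MODE", "header").strip().lower()
    if mode == "header":
        # The upstream variables are ignored in header mode.
        return ManagementAuthSettings()
    if mode != "http_upstream":
        raise ValueError(f"unsupported management auth mode: {mode!r}")

    url = env.get(PREFIX + "UPSTREAM_URL", "").strip()
    if not url:
        raise ValueError(PREFIX + "UPSTREAM_URL is required in http_upstream mode")

    return ManagementAuthSettings(
        mode=mode,
        upstream_url=url,
        check_path=env.get(PREFIX + "UPSTREAM_CHECK_PATH", DEFAULT_CHECK_PATH),
        # Treat an empty token the same as an unset one.
        service_token=env.get(PREFIX + "UPSTREAM_SERVICE_TOKEN") or None,
        service_token_header=env.get(
            PREFIX + "UPSTREAM_SERVICE_TOKEN_HEADER", DEFAULT_TOKEN_HEADER
        ),
        timeout_seconds=float(env.get(PREFIX + "UPSTREAM_TIMEOUT_SECONDS", "5.0")),
    )
```

Passing the environment in as a mapping (for example load_settings(os.environ)) keeps the parser easy to test with a plain dict.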

Wired endpoints

  • GET /api/v1/controls -> controls.read
  • GET /api/v1/targets/{target_type}/{external_id}/controls -> target_bindings.read
  • PUT /api/v1/targets/{target_type}/{external_id}/controls/{control_id} -> target_bindings.write
  • DELETE /api/v1/targets/{target_type}/{external_id}/controls/{control_id} -> target_bindings.write

Each endpoint uses Depends(require_operation(...)) and reads tenant_id from the returned Principal. For target-binding routes a _target_binding_context builder extracts target_type and external_id from request.path_params.
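
A context builder of the kind described for the target-binding routes might look like the sketch below. The keys of the returned dict are an assumption about what a provider would want to see, and a SimpleNamespace stands in for the request object so the example runs on its own; Starlette's Request exposes the matched route parameters as request.path_params.

```python
from types import SimpleNamespace
from typing import Any, Dict


def _target_binding_context(request: Any) -> Dict[str, str]:
    """Pull the natural-key parts of the target out of the matched route."""
    params = request.path_params
    return {
        "target_type": params["target_type"],
        "external_id": params["external_id"],
    }


# Stand-in for a request to PUT /api/v1/targets/agent/ext-42/controls/7
fake_request = SimpleNamespace(
    path_params={"target_type": "agent", "external_id": "ext-42", "control_id": "7"}
)
context = _target_binding_context(fake_request)
```

Only target_type and external_id are forwarded here; control_id is left out because the PR text names just those two fields.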

OSS-safe

No Galileo or Cerbos nouns in auth/core.py or auth/providers/header.py. Provider-specific logic lives only in providers/http_upstream.py. Any HTTP service that speaks the documented operation contract can be an upstream.

End-to-end harness

tools/management_vertical_harness/harness.py -- reproducible validation script for a locally-running stack of Agent Control, Galileo api, and Galileo authz. Walks the happy path and three optional deny cases. Useful for manual integration validation without a UI.

See tools/management_vertical_harness/README.md for usage.

Tests

  • tests/test_auth_framework.py -- 9 unit tests:
    • Header provider: resolve from header, fall back to default, trim whitespace (with api_key_enabled=False so the credential check passes through).
    • HTTP upstream provider: credential forwarding, service-token injection, 401/403/404 mapping, 5xx fail-closed, network-error fail-closed, malformed principal -> 502.
    • require_operation dependency factory routes operation + context to the installed authorizer.
  • conftest.py autouse fixture installs HeaderAuthProvider so isolated test subsets (no lifespan) still have an authorizer.
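
The upstream status handling covered by these tests can be summarized in a small pure function. The sketch below is an assumption-laden illustration, not the provider's code: AuthError and interpret_upstream are hypothetical names, a status of None stands for a network error, and the treatment of unexpected non-200 statuses (mapped to 503 here) is a guess.

```python
from typing import Any, Dict, Optional


class AuthError(Exception):
    """Carries the client-facing HTTP status for a failed auth check."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


def interpret_upstream(status: Optional[int], body: Any) -> Dict[str, Any]:
    """Turn an upstream auth response into a principal dict, or raise.

    `status` is None when the upstream could not be reached at all.
    """
    if status is None:
        raise AuthError(503, "auth upstream unreachable")  # fail closed
    if status in (401, 403, 404):
        raise AuthError(status, "denied by auth upstream")  # pass through
    if status >= 500:
        raise AuthError(503, "auth upstream error")  # fail closed
    if status != 200:
        # Assumption: anything unexpected is also treated as unavailable.
        raise AuthError(503, f"unexpected upstream status {status}")

    tenant_id = body.get("tenant_id") if isinstance(body, dict) else None
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise AuthError(502, "malformed principal from auth upstream")
    return {"tenant_id": tenant_id, "subject_id": body.get("subject_id")}
```

Separating this mapping from the HTTP call keeps each branch testable without a network client or a mocked transport.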

Test plan

  • Full server suite -- 616 passed
  • mypy clean
  • ruff clean on touched files

Stacking

Based on abhi/rfc-1-1-pr7-natural-key-attach (PR7). Pairs with rungalileo/authz#145 and rungalileo/api#6350 for the Galileo-hosted deployment mode.

Introduces a pluggable management-auth layer. Agent Control speaks a
generic operation vocabulary on its management endpoints; a provider
translates those operations onto whatever upstream auth system a
deployment uses. Two providers ship in-tree.

Module layout:

- server/src/agent_control_server/authz/base.py
  - ManagementOperation enum (controls.{read,create,update,delete},
    target_bindings.{read,write}, runtime.use)
  - ManagementPrincipal model (tenant_id, optional subject_id)
  - ManagementAuthorizer Protocol
  - require_management_auth(operation, context_builder) FastAPI
    dependency factory
  - set_management_authorizer / get_management_authorizer registry
- server/src/agent_control_server/authz/providers/header.py
  - HeaderManagementAuthorizer: reads X-Tenant-Id, falls back to
    DEFAULT_TENANT_ID. OSS / single-tenant default. Allow-all.
- server/src/agent_control_server/authz/providers/http_upstream.py
  - HttpUpstreamManagementAuthorizer: forwards caller credentials
    (Authorization, Cookie, X-Galileo-API-Key, X-API-Key, plus any
    configured extras) to a configured upstream URL. Adds a
    configurable service-to-service token header. Maps upstream
    status codes onto client-facing codes (401/403/404/503/etc).
    Fail-closed on network errors / upstream 5xx -> 503.
- server/src/agent_control_server/authz/config.py
  - configure_management_auth_from_env() reads
    AGENT_CONTROL_MANAGEMENT_AUTH_MODE and related env vars and
    installs the appropriate provider at startup.

Wiring:

- main.py lifespan now calls configure_management_auth_from_env() on
  startup. Default mode is header; existing OSS setups need no config.
- endpoints/controls.py list_controls: replaces the get_tenant_id
  dependency with require_management_auth(ManagementOperation.controls_read).
- endpoints/targets.py: the three natural-key endpoints
  (GET / PUT / DELETE /targets/{target_type}/{external_id}/controls[/*])
  now use require_management_auth with a _target_binding_context builder
  that reads target_type and external_id from request.path_params.

OSS-safe: AC core speaks only generic operation names; all upstream-
specific logic lives in the http_upstream provider. No Galileo, Cerbos,
or RFC nouns in the core authz module.

Tests:

- tests/test_management_authz.py: 9 unit tests covering the header
  provider (resolve, fallback, empty/whitespace), the HTTP upstream
  provider (credential forwarding, deny-code mapping, fail-closed
  behavior, malformed principal), and the require_management_auth
  dependency wiring (operation + context routed through correctly).
- conftest.py autouse fixture installs HeaderManagementAuthorizer for
  every test so isolated test subsets (which don't exercise the
  lifespan) still have an authorizer available.

TS SDK regenerated for the updated endpoint docstrings.

Depends on PR7 (natural-key endpoints) for the target-binding routes
it wires. Pairs with rungalileo/authz#145 and rungalileo/api#6350 for
the Galileo-hosted deployment mode.

Reproducible validation harness for the management-vertical authz flow.
Walks the happy path from plan §9 (list controls, read bindings,
attach, detach) and three deny cases (cross-org, cross-project,
insufficient permissions) against a locally-running AC+api+authz stack.

Exits non-zero on first failed assertion. Deny cases are optional and
skipped when their tokens are not set, so the harness is useful even
in minimal single-user dev setups.
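
The control flow described here (stop at the first failure, skip deny cases whose tokens are unset) could be sketched as below. The run_checks function, the tuple layout, and the HARNESS_CROSS_ORG_TOKEN variable name are hypothetical and are not taken from harness.py.

```python
from typing import Callable, List, Mapping, Optional, Tuple

# (case name, env var that must be set or None if always run, check function)
Check = Tuple[str, Optional[str], Callable[[], bool]]


def run_checks(checks: List[Check], env: Mapping[str, str]) -> int:
    """Run checks in order and return a process exit code."""
    for name, token_var, check in checks:
        if token_var is not None and not env.get(token_var):
            print(f"SKIP {name} ({token_var} not set)")
            continue
        if not check():
            print(f"FAIL {name}")
            return 1  # stop at the first failed assertion
        print(f"PASS {name}")
    return 0


example_checks: List[Check] = [
    ("list controls", None, lambda: True),
    # Hypothetical variable name for an optional deny case.
    ("cross-org deny", "HARNESS_CROSS_ORG_TOKEN", lambda: False),
]
```

A script would typically finish with sys.exit(run_checks(example_checks, os.environ)), so an unset optional token produces a skip rather than a failure.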

Tracks: sc-63147.

- rename authz/ -> auth/; split old auth.py into auth/local.py
- auth/core: Operation (public/management/runtime/observability), Principal,
  RequestAuthorizer Protocol, require_operation dep factory
- header provider: LocalAccessLevel + OSS_AUTH_LEVELS map preserving
  legacy api-key / admin-key asymmetry per operation
- http_upstream provider: renamed, updated default check_path to
  /internal/agent_control/auth/check_management_access
- split management endpoints (list controls + 3 target-binding ops) onto
  dedicated routers mounted without the legacy require_api_key gate so
  http_upstream providers can accept Galileo bearer tokens (P0 fix)
- harness: drop meaningless fail-closed check (it's a unit-test concern)
- TODO markers on legacy gates + unwired Operation members so the
  framework's full scope is discoverable from the code