feat(server): add management-auth seam and providers (8/8) #195
Draft
abhinav-galileo wants to merge 3 commits into abhi/rfc-1-1-pr7-natural-key-attach
Introduces a pluggable management-auth layer. Agent Control speaks a
generic operation vocabulary on its management endpoints; a provider
translates those operations onto whatever upstream auth system a
deployment uses. Two providers ship in-tree.
Module layout:
- server/src/agent_control_server/authz/base.py
  - ManagementOperation enum (controls.{read,create,update,delete},
    target_bindings.{read,write}, runtime.use)
  - ManagementPrincipal model (tenant_id, optional subject_id)
  - ManagementAuthorizer Protocol
  - require_management_auth(operation, context_builder) FastAPI
    dependency factory
  - set_management_authorizer / get_management_authorizer registry
- server/src/agent_control_server/authz/providers/header.py
  - HeaderManagementAuthorizer: reads X-Tenant-Id, falls back to
    DEFAULT_TENANT_ID. OSS / single-tenant default. Allow-all.
- server/src/agent_control_server/authz/providers/http_upstream.py
  - HttpUpstreamManagementAuthorizer: forwards caller credentials
    (Authorization, Cookie, X-Galileo-API-Key, X-API-Key, plus any
    configured extras) to a configured upstream URL. Adds a
    configurable service-to-service token header. Maps upstream
    status codes onto client-facing codes (401/403/404/503/etc).
    Fail-closed on network errors / upstream 5xx -> 503.
- server/src/agent_control_server/authz/config.py
  - configure_management_auth_from_env() reads
    AGENT_CONTROL_MANAGEMENT_AUTH_MODE and related env vars and
    installs the appropriate provider at startup.
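A minimal sketch of the mode selection that configure_management_auth_from_env() performs might look like the following. It only parses settings; installing the provider is left out. The env var names and the 5.0 default come from this PR's env-var list, while `AuthSettings`, `load_auth_settings`, and the error raised on an unknown mode are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

MODE_VAR = "AGENT_CONTROL_MANAGEMENT_AUTH_MODE"
UPSTREAM_URL_VAR = "AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_URL"
TIMEOUT_VAR = "AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_TIMEOUT_SECONDS"


@dataclass(frozen=True)
class AuthSettings:
    """Hypothetical settings container; not a type from the PR."""

    mode: str
    upstream_url: Optional[str] = None
    timeout_seconds: float = 5.0


def load_auth_settings(env: Mapping[str, str]) -> AuthSettings:
    """Parse auth settings from an environment-like mapping."""
    # An unset or blank mode falls back to the OSS default, "header".
    mode = env.get(MODE_VAR, "").strip().lower() or "header"
    if mode == "header":
        return AuthSettings(mode="header")
    if mode == "http_upstream":
        url = env.get(UPSTREAM_URL_VAR, "").strip()
        if not url:
            # The upstream URL is required in http_upstream mode.
            raise ValueError(f"{UPSTREAM_URL_VAR} is required when {MODE_VAR}=http_upstream")
        timeout = float(env.get(TIMEOUT_VAR, "5.0"))
        return AuthSettings(mode=mode, upstream_url=url, timeout_seconds=timeout)
    # Assumption: an unrecognized mode fails startup instead of silently
    # falling back to the allow-all header provider.
    raise ValueError(f"unknown {MODE_VAR}: {mode!r}")
```

Passing a mapping instead of reading `os.environ` directly keeps the parser easy to unit-test; at startup the caller would pass `os.environ`.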
Wiring:
- main.py lifespan now calls configure_management_auth_from_env() on
startup. Default mode is header; existing OSS setups need no config.
- endpoints/controls.py list_controls: replaces the get_tenant_id
dependency with require_management_auth(ManagementOperation.controls_read).
- endpoints/targets.py: the three natural-key endpoints
(GET / PUT / DELETE /targets/{target_type}/{external_id}/controls[/*])
now use require_management_auth with a _target_binding_context builder
that reads target_type and external_id from request.path_params.
OSS-safe: AC core speaks only generic operation names; all upstream-
specific logic lives in the http_upstream provider. No Galileo, Cerbos,
or RFC nouns in the core authz module.
Tests:
- tests/test_management_authz.py: 9 unit tests covering the header
provider (resolve, fallback, empty/whitespace), the HTTP upstream
provider (credential forwarding, deny-code mapping, fail-closed
behavior, malformed principal), and the require_management_auth
dependency wiring (operation + context routed through correctly).
- conftest.py autouse fixture installs HeaderManagementAuthorizer for
every test so isolated test subsets (which don't exercise the
lifespan) still have an authorizer available.
TS SDK regenerated for the updated endpoint docstrings.
Depends on PR7 (natural-key endpoints) for the target-binding routes
it wires. Pairs with rungalileo/authz#145 and rungalileo/api#6350 for
the Galileo-hosted deployment mode.
Reproducible validation harness for the management-vertical authz flow. Walks the happy path from plan §9 (list controls, read bindings, attach, detach) and three deny cases (cross-org, cross-project, insufficient permissions) against a locally-running AC+api+authz stack. Exits non-zero on first failed assertion. Deny cases are optional and skipped when their tokens are not set, so the harness is useful even in minimal single-user dev setups. Tracks: sc-63147.
- rename authz/ -> auth/; split old auth.py into auth/local.py
- auth/core: Operation (public/management/runtime/observability), Principal, RequestAuthorizer Protocol, require_operation dep factory
- header provider: LocalAccessLevel + OSS_AUTH_LEVELS map preserving legacy api-key / admin-key asymmetry per operation
- http_upstream provider: renamed, updated default check_path to /internal/agent_control/auth/check_management_access
- split management endpoints (list controls + 3 target-binding ops) onto dedicated routers mounted without the legacy require_api_key gate so http_upstream providers can accept Galileo bearer tokens (P0 fix)
- harness: drop meaningless fail-closed check (it's a unit-test concern)
- TODO markers on legacy gates + unwired Operation members so the framework's full scope is discoverable from the code
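To illustrate the per-operation access map mentioned for the header provider, here is a rough sketch. The `LocalAccessLevel` member names follow the PUBLIC / AUTHENTICATED / ADMIN levels named in the PR description; the specific operation-to-level assignments, the `is_allowed` helper, and the treatment of admin keys as a superset of API keys are all assumptions, not the PR's actual policy.

```python
from enum import Enum
from typing import Dict, Optional, Set


class LocalAccessLevel(Enum):
    PUBLIC = "public"
    AUTHENTICATED = "authenticated"
    ADMIN = "admin"


# Illustrative assignments only; the real map lives in auth/providers/header.py.
OSS_AUTH_LEVELS: Dict[str, LocalAccessLevel] = {
    "controls.read": LocalAccessLevel.AUTHENTICATED,
    "target_bindings.read": LocalAccessLevel.AUTHENTICATED,
    "target_bindings.write": LocalAccessLevel.ADMIN,
}


def is_allowed(
    operation: str,
    presented_key: Optional[str],
    api_keys: Set[str],
    admin_keys: Set[str],
) -> bool:
    """Check a presented key against the level required for an operation."""
    # Assumption: operations missing from the map get the strictest level.
    level = OSS_AUTH_LEVELS.get(operation, LocalAccessLevel.ADMIN)
    if level is LocalAccessLevel.PUBLIC:
        return True
    if presented_key is None:
        return False
    if level is LocalAccessLevel.AUTHENTICATED:
        # Assumption: an admin key also satisfies the plain API-key level.
        return presented_key in api_keys or presented_key in admin_keys
    return presented_key in admin_keys
```

Keeping the policy in one dictionary means a reviewer can audit which operations need which key by reading a single table.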
Summary
Introduces a pluggable request-auth framework. Agent Control speaks a generic `Operation` vocabulary on its endpoints; a provider translates those operations onto whatever auth system a deployment uses. This PR wires the management family (4 endpoints); the framework is designed to absorb the runtime, observability, and public families in follow-ups -- see TODO markers in `auth/core.py`.

Also adds a local end-to-end harness for validating the full stack.
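The operation/provider split in the framework can be sketched without FastAPI as follows. The operation names, the `Principal` fields, and the `require_operation(op, context_builder)` shape follow this PR; the `authorize` method name, `set_authorizer`, and `AllowAllAuthorizer` are hypothetical, and the real dependency receives a FastAPI `Request` where this sketch accepts any object.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Any, Awaitable, Callable, Dict, Optional, Protocol


class Operation(str, Enum):
    CONTROLS_READ = "controls.read"
    TARGET_BINDINGS_READ = "target_bindings.read"
    TARGET_BINDINGS_WRITE = "target_bindings.write"


@dataclass(frozen=True)
class Principal:
    tenant_id: str
    subject_id: Optional[str] = None


class RequestAuthorizer(Protocol):
    # Method name and signature are assumptions for this sketch.
    async def authorize(
        self, request: Any, operation: Operation, context: Dict[str, Any]
    ) -> Principal: ...


ContextBuilder = Callable[[Any], Dict[str, Any]]
_authorizer: Optional[RequestAuthorizer] = None  # process-wide registry


def set_authorizer(authorizer: RequestAuthorizer) -> None:
    """Install the provider that every dependency will consult."""
    global _authorizer
    _authorizer = authorizer


def require_operation(
    operation: Operation,
    context_builder: Optional[ContextBuilder] = None,
) -> Callable[[Any], Awaitable[Principal]]:
    """Return a dependency that authorizes one operation per request."""

    async def dependency(request: Any) -> Principal:
        if _authorizer is None:
            raise RuntimeError("no authorizer installed")
        context = context_builder(request) if context_builder else {}
        return await _authorizer.authorize(request, operation, context)

    return dependency


class AllowAllAuthorizer:
    """Toy provider: allows every operation for a single tenant."""

    async def authorize(self, request, operation, context):
        return Principal(tenant_id="default")


if __name__ == "__main__":
    set_authorizer(AllowAllAuthorizer())
    dep = require_operation(Operation.CONTROLS_READ)
    print(asyncio.run(dep(None)))
```

The endpoint only names an operation and reads `tenant_id` from the returned `Principal`, so swapping providers does not touch endpoint code.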
Two providers ship in-tree:

- `HeaderAuthProvider` -- OSS / single-tenant default. Reads `X-Tenant-Id`, falls back to `DEFAULT_TENANT_ID`. Enforces the legacy OSS API-key / admin-key check per-operation via an `OSS_AUTH_LEVELS` map (`PUBLIC` / `AUTHENTICATED` / `ADMIN`), preserving existing behavior verbatim.
- `HttpUpstreamAuthProvider` -- forwards caller credentials (API key, JWT, cookie) to a configurable upstream URL. Adds a configurable service-to-service token. Maps upstream statuses onto 401/403/404. Fail-closed on network errors and upstream 5xx -> 503.

Tracks: sc-63146, sc-63147.
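The fail-closed status mapping described for the upstream provider can be sketched as a pure function. This is an illustration under stated assumptions: `client_status_for` is a hypothetical name, `None` stands for a network error or timeout, and mapping every unexpected non-5xx status to 503 goes beyond what the PR states.

```python
from typing import Optional

# Deny codes that are passed through to the caller unchanged.
PASS_THROUGH_DENY_CODES = frozenset({401, 403, 404})


def client_status_for(upstream_status: Optional[int]) -> int:
    """Map an upstream check result to the status returned to the caller.

    `None` means no response was received (network error or timeout).
    """
    if upstream_status is None:
        return 503  # fail closed: upstream unreachable
    if 200 <= upstream_status < 300:
        return 200  # allowed; the request proceeds to the endpoint
    if upstream_status in PASS_THROUGH_DENY_CODES:
        return upstream_status
    # Upstream 5xx -> 503 per the PR. Treating any other unexpected
    # status the same way is an assumption of this sketch.
    return 503
```

The important property is that no failure path results in access being granted: anything other than an explicit 2xx from the upstream ends in a deny or a 503.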
Module layout
- `auth/core.py` -- framework: `Operation` (incl. reserved runtime/observability/public members), `Principal`, `RequestAuthorizer` Protocol, `require_operation(op, context_builder)` FastAPI dependency factory, global authorizer registry.
- `auth/local.py` -- legacy OSS credential check (API key + session cookie). Still used by non-management endpoints pending follow-up migration. Also exposes `authenticate_request()` as a reusable helper for providers.
- `auth/providers/header.py` -- OSS default, with `LocalAccessLevel` enum and `OSS_AUTH_LEVELS` map (single source of truth for OSS access policy).
- `auth/providers/http_upstream.py` -- upstream HTTP adapter.
- `auth/config.py` -- `configure_auth_from_env()` reads env vars at startup.

P0 fix: management endpoints accept upstream credentials
Previously the 4 management endpoints were wrapped with the router-level `Depends(require_api_key)` AND the endpoint-level `Depends(require_admin_key)`, which rejected the Galileo bearer token that `http_upstream` mode expects to forward. They now live on dedicated `management_router` objects mounted in `main.py` without the legacy gate; the provider alone owns authn+authz.

Non-management endpoints keep their legacy gates (tagged with `TODO(auth-framework)` markers) and continue to work exactly as before.

Env vars
- `AGENT_CONTROL_MANAGEMENT_AUTH_MODE=header` (default) or `http_upstream`
- `http_upstream`:
  - `AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_URL` (required)
  - `AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_CHECK_PATH` (default `/internal/agent_control/auth/check_management_access`)
  - `AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_SERVICE_TOKEN` (optional)
  - `AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_SERVICE_TOKEN_HEADER` (default `X-Agent-Control-Service-Token`)
  - `AGENT_CONTROL_MANAGEMENT_AUTH_UPSTREAM_TIMEOUT_SECONDS` (default `5.0`)

Wired endpoints
- `GET /api/v1/controls` -> `controls.read`
- `GET /api/v1/targets/{target_type}/{external_id}/controls` -> `target_bindings.read`
- `PUT /api/v1/targets/{target_type}/{external_id}/controls/{control_id}` -> `target_bindings.write`
- `DELETE /api/v1/targets/{target_type}/{external_id}/controls/{control_id}` -> `target_bindings.write`

Each endpoint uses `Depends(require_operation(...))` and reads `tenant_id` from the returned `Principal`. For target-binding routes a `_target_binding_context` builder extracts `target_type` and `external_id` from `request.path_params`.

OSS-safe
No Galileo or Cerbos nouns in `auth/core.py` or `auth/providers/header.py`. Provider-specific logic lives only in `providers/http_upstream.py`. Any HTTP service that speaks the documented operation contract can be an upstream.

End-to-end harness
`tools/management_vertical_harness/harness.py` -- reproducible validation script for a locally-running stack of Agent Control, Galileo `api`, and Galileo `authz`. Walks the happy path and three optional deny cases. Useful for manual integration validation without a UI.

See `tools/management_vertical_harness/README.md` for usage.

Tests
`tests/test_auth_framework.py` -- 9 unit tests:

- `api_key_enabled=False` so the credential check passes through).
- `require_operation` dependency factory routes operation + context to the installed authorizer.

`conftest.py` autouse fixture installs `HeaderAuthProvider` so isolated test subsets (no lifespan) still have an authorizer.

Test plan
Stacking
Based on `abhi/rfc-1-1-pr7-natural-key-attach` (PR7). Pairs with rungalileo/authz#145 and rungalileo/api#6350 for the Galileo-hosted deployment mode.