feat(k8s): deploy smartem-frontend across dev/staging/production #205
Merged
Adds Kubernetes manifests for the smartem-frontend image
produced by smartem-frontend#94 (v0.2.0, now on GHCR). The
image is environment-agnostic: it ships a placeholder
config.json with dev defaults and proxies /api/ to the
backend service via its own nginx, deferring DNS to request
time.
What lands per environment (k8s/environments/<env>/
smartem-frontend.yaml):
- ConfigMap smartem-frontend-config carries the runtime
config.json (Keycloak URL/realm/clientId + authEnabled).
The Deployment subPath-mounts this onto
/usr/share/nginx/html/config.json, overriding the
placeholder shipped in the image.
- Deployment smartem-frontend pulls
ghcr.io/diamondlightsource/smartem-frontend:latest, sets
BACKEND_HOST=smartem-http-api-service so the SPA pod's
nginx proxies /api/ to the backend service.
- Service smartem-frontend-service: NodePort 30100 for
development (next free in the 30000s range; matches the
existing smartem-http-api / Keycloak / RabbitMQ / Postgres
/ Adminer pattern), ClusterIP for staging and production.
Per-environment config.json values:
- development: keycloak.url http://localhost:30090 (the
Keycloak mock NodePort - browser-reachable), authEnabled
false to match KEYCLOAK_AUTH_REQUIRED=false on the dev
backend. Flip to true to exercise the full login chain.
- staging: identity-test.diamond.ac.uk, authEnabled true.
- production: identity.diamond.ac.uk, authEnabled true.
Ingress (staging and production only - dev keeps the
NodePort pattern):
- k8s/environments/{staging,production}/ingress.yaml route
a single host to smartem-frontend-service. The SPA pod's
nginx handles /api/ proxying to the backend internally,
so one route covers everything. Hostnames are placeholders
(smartem-staging.example.com / smartem.example.com) and
flagged TODO until real values are decided.
scripts/k8s/dev-k8s.sh: print the new
http://localhost:30100 (frontend) and http://localhost:30090
(Keycloak, missed when #198 landed) in the access-URLs
section.
Verified locally: kubectl kustomize build is clean for all
three environments. End-to-end browser flow (SPA login,
authenticated /api call) will be exercised on the user's
local k3s after merge.
smartem-decisions#285 removed the KEYCLOAK_AUTH_REQUIRED flag from the backend; Bearer-token validation now runs unconditionally. The dev frontend ConfigMap needs authEnabled: true to match, otherwise the SPA skips the login ceremony and every /api/ call comes back 401.
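As an illustration of what the per-environment ConfigMap payload might look like once `authEnabled` is true everywhere, here is a minimal Python sketch that renders a `config.json` body. The Keycloak URLs, realm (`dls`) and client ID (`SmartEM_User`) come from this PR; the nested key layout and the `render_config` helper are assumptions, since the exact schema the SPA reads is not shown here.

```python
import json

# URLs as listed for each environment in this PR. The nested layout
# ("keycloak" object plus a top-level "authEnabled") is an assumption.
KEYCLOAK_URLS = {
    "development": "http://localhost:30090",
    "staging": "https://identity-test.diamond.ac.uk",
    "production": "https://identity.diamond.ac.uk",
}


def render_config(env: str) -> str:
    """Return a config.json body for one environment (hypothetical helper)."""
    config = {
        "keycloak": {
            "url": KEYCLOAK_URLS[env],
            "realm": "dls",
            "clientId": "SmartEM_User",
        },
        # Backend auth is unconditional after smartem-decisions#285,
        # so every environment enables the SPA login flow.
        "authEnabled": True,
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config("development"))
```

Keeping the three environments in one mapping makes it easy to see that only `keycloak.url` differs between them.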
Fold the configmap and ingress cleanups that PR #205's frontend work exposed:
- KEYCLOAK_AUTH_REQUIRED was set in dev (false) and staging (true) configmaps but the backend stopped reading it after smartem-decisions#285 (commit 2ec937d, "remove KEYCLOAK_AUTH_REQUIRED flag, enforce azp allow-list"). Auth is always enforced; the entry is dead config and misleading. Removed from both.
- KEYCLOAK_CLIENT_ID="SmartEM" in both configmaps is also dead config for the backend. The backend's auth.py reads KEYCLOAK_ALLOWED_AZP (comma-separated list), not KEYCLOAK_CLIENT_ID. The SmartEM Agent reads KEYCLOAK_CLIENT_ID from its own local config file, not the cluster configmap. Removed from both backend configmaps.
- Staging gains KEYCLOAK_ALLOWED_AZP="SmartEM_User,SmartEM_Agent" so the azp allow-list is actually populated (was the intent of the old KEYCLOAK_CLIENT_ID line; now expressed in the var the backend reads). Dev stays permissive; a comment documents the env var if someone wants to restrict.
- production/ingress.yaml host changes from the smartem.example.com placeholder to the real smartem.diamond.ac.uk. Staging's host remains a placeholder pending the real value.
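The allow-list semantics described above (comma-separated `KEYCLOAK_ALLOWED_AZP`, permissive when unset) can be sketched roughly as follows. This is not the backend's actual `auth.py`; the function names are hypothetical and the sketch only models the membership check, not token signature validation.

```python
from typing import Optional


def parse_allowed_azp(raw: Optional[str]) -> frozenset:
    """Split a comma-separated allow-list, ignoring blanks and whitespace."""
    if not raw:
        return frozenset()
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def azp_permitted(token_azp: Optional[str], raw_allow_list: Optional[str]) -> bool:
    """Return True if the token's azp claim passes the allow-list.

    An unset or empty allow-list accepts any realm token, matching the
    permissive dev behaviour described in this PR.
    """
    allowed = parse_allowed_azp(raw_allow_list)
    if not allowed:
        return True
    return token_azp in allowed


if __name__ == "__main__":
    staging = "SmartEM_User,SmartEM_Agent"
    print(azp_permitted("SmartEM_User", staging))
    print(azp_permitted("SmartEM", staging))
    print(azp_permitted("SmartEM", None))
```

Under this model a token issued to the old `SmartEM` client would be rejected in staging but still accepted in dev, where the variable is left unset.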
nginx's explicit `resolver` directive in the SPA image doesn't consult /etc/resolv.conf's search list, so the short name `smartem-http-api-service` returns NXDOMAIN and the SPA's /api/ proxy 502s. Switch to the in-cluster FQDN per environment namespace.
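For reference, a Kubernetes Service is resolvable at `<service>.<namespace>.svc.<cluster-domain>`, which is the form that works without a search list. The sketch below builds that name; the namespace used in the example is a hypothetical placeholder (the real per-environment namespaces are not named here), and `cluster.local` is the Kubernetes default cluster domain, assumed unchanged.

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name for a Service.

    Unlike the short name, this form does not depend on the search list
    in /etc/resolv.conf.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # "example-namespace" is a placeholder, not a namespace from this repo.
    print(service_fqdn("smartem-http-api-service", "example-namespace"))
```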
The SmartEM_User client only listed Vite dev ports (5173/5174) in redirectUris/webOrigins. The k8s dev deploy serves the SPA at NodePort 30100, so Keycloak rejected the auth flow. Add 30100 alongside. The same realm file is mounted by both the kustomize ConfigMap (k3s local dev) and keycloak-mock/docker-compose.yml (frontend-only dev) - single source of truth, no mirroring needed.
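To see why the login flow failed before this change, here is a simplified model of redirect-URI matching with a trailing wildcard. It is only an approximation for illustration, not Keycloak's implementation, and the exact pattern strings for the Vite ports are assumed to follow the same `http://localhost:<port>/*` shape as the NodePort entry.

```python
from typing import List


def redirect_allowed(redirect_uri: str, patterns: List[str]) -> bool:
    """Check a redirect URI against exact patterns or trailing-* prefixes."""
    for pattern in patterns:
        if pattern.endswith("*"):
            if redirect_uri.startswith(pattern[:-1]):
                return True
        elif redirect_uri == pattern:
            return True
    return False


# Assumed pattern shapes; only the ports are stated in the text above.
BEFORE = ["http://localhost:5173/*", "http://localhost:5174/*"]
AFTER = BEFORE + ["http://localhost:30100/*"]

if __name__ == "__main__":
    print(redirect_allowed("http://localhost:30100/", BEFORE))
    print(redirect_allowed("http://localhost:30100/", AFTER))
```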
…re KEYCLOAK_ALLOWED_AZP
smartem-decisions#285 made backend auth unconditional and removed the
KEYCLOAK_AUTH_REQUIRED env var entirely. KEYCLOAK_CLIENT_ID was likewise
superseded by KEYCLOAK_ALLOWED_AZP (the azp allow-list) when the
backend ConfigMap was cleaned up earlier in this branch.
The YAMLs already dropped both keys, but dev-k8s.sh kept reading them
from .env and re-injecting them into smartem-config via
`kubectl create configmap --from-literal=...`, undoing the YAML cleanup
on every dev deploy. The env-examples advertised the same dead knobs.
Now:
- env-examples/.env.example.k8s.{development,staging}: drop
KEYCLOAK_AUTH_REQUIRED and KEYCLOAK_CLIENT_ID. Staging gains
KEYCLOAK_ALLOWED_AZP=SmartEM_User,SmartEM_Agent to mirror the YAML;
development leaves it commented (any valid realm token accepted).
- scripts/k8s/dev-k8s.sh: drop both vars from the override check,
defaults, log line, and `kubectl create configmap` args. Append
KEYCLOAK_ALLOWED_AZP only when explicitly set (preserves the
"unset = any realm token" semantics).
- k8s/environments/{staging,production}/smartem-frontend.yaml: comment
now points readers at KEYCLOAK_ALLOWED_AZP instead of the deleted
KEYCLOAK_CLIENT_ID.
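The "append only when explicitly set" behaviour can be modelled as below. The real script is bash; this Python sketch only mirrors the logic of assembling `kubectl create configmap --from-literal=...` arguments. The `configmap_args` helper and the `EXAMPLE_KEY` literal are hypothetical.

```python
import os
from typing import Dict, List, Optional


def configmap_args(
    name: str,
    literals: Dict[str, str],
    allowed_azp: Optional[str] = None,
) -> List[str]:
    """Assemble kubectl arguments for a ConfigMap built from literals."""
    args = ["kubectl", "create", "configmap", name]
    for key, value in literals.items():
        args.append(f"--from-literal={key}={value}")
    # Omit the key entirely when unset or empty, so the backend keeps its
    # "unset = any realm token" behaviour instead of seeing an empty list.
    if allowed_azp:
        args.append(f"--from-literal=KEYCLOAK_ALLOWED_AZP={allowed_azp}")
    return args


if __name__ == "__main__":
    azp = os.environ.get("KEYCLOAK_ALLOWED_AZP")
    print(" ".join(configmap_args("smartem-config", {"EXAMPLE_KEY": "value"}, azp)))
```

Leaving the key out, rather than injecting an empty string, avoids having to decide how the backend should treat an empty list.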
Local verification:
- kubectl kustomize k8s/environments/development - 926 lines, clean
- kubectl kustomize k8s/environments/staging - 579 lines, clean
- kubectl kustomize k8s/environments/production - 577 lines, clean
- grep KEYCLOAK_AUTH_REQUIRED in rendered output: 0 hits all three
- grep KEYCLOAK_CLIENT_ID in rendered output: 0 hits all three
- bash -n scripts/k8s/dev-k8s.sh: OK
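The grep checks above could be scripted along these lines. This is a sketch only: `render` shells out to `kubectl kustomize <dir>` and so needs `kubectl` on the PATH, while `dead_key_hits` counts matching lines in the same way `grep -c` does. Both function names are hypothetical.

```python
import subprocess
from typing import Dict

DEAD_KEYS = ("KEYCLOAK_AUTH_REQUIRED", "KEYCLOAK_CLIENT_ID")


def render(env_dir: str) -> str:
    """Render one environment with `kubectl kustomize` and return the YAML."""
    result = subprocess.run(
        ["kubectl", "kustomize", env_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def dead_key_hits(rendered: str) -> Dict[str, int]:
    """Count lines mentioning each dead key (equivalent to `grep -c`)."""
    lines = rendered.splitlines()
    return {key: sum(1 for line in lines if key in line) for key in DEAD_KEYS}


if __name__ == "__main__":
    for env in ("development", "staging", "production"):
        print(env, dead_key_hits(render(f"k8s/environments/{env}")))
```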
- k8s/environments/staging/ingress.yaml: smartem-staging.example.com -> smartem-staging.diamond.ac.uk. Production was already smartem.diamond.ac.uk (since becc1cd), so all hostnames are now real.
- k8s/ingress.yaml: deleted. Added in the initial k8s scaffolding (a5e88da, 2025-12-08) as an early sketch for exposing the backend HTTP API directly via ingress. Pre-dates the current architecture where the SPA pod's nginx reverse-proxies /api/ to the backend internally, so a single frontend ingress is sufficient for browser traffic. Was not referenced by any kustomization, lived in the dev namespace (which uses NodePort, not ingress), and carried a placeholder host. The remaining backend-facing ingress use case is the Windows agent's connectivity story, which will follow the existing per-env pattern under k8s/environments/<env>/ rather than this root-level file. Recoverable from git history if needed.
Local verification:
- kubectl kustomize k8s/environments/{development,staging,production} renders clean (926/579/577 lines)
- staging ingress now resolves to smartem-staging.diamond.ac.uk; production unchanged at smartem.diamond.ac.uk
…son + client rename
The smartem-frontend SPA stopped reading VITE_KEYCLOAK_* / VITE_AUTH_ENABLED build-time vars some time ago - main.tsx now fetches /config.json at boot and apps/smartem/src/auth/config.ts is the only consumer, with no fallback to import.meta.env. The smartem-frontend repo updated its own apps/smartem/.env.example to reflect this, but the docs in this repo (keycloak-mock/README.md, docs/development/local-keycloak.md) still instructed developers to edit a .env.local file with VITE_* keys that no code reads.
Other staleness in the same docs:
- Client name "SmartEM" predates the rename in smartem-devtools#198 to SmartEM_User (browser) + SmartEM_Agent (Windows agent). Both docs only mentioned the old single client.
- Redirect URIs listed 5173 + 5174 only - the realm now also allows http://localhost:30100/* for the SPA pod's k3s NodePort (added in this branch's earlier `fix(keycloak-mock): allow NodePort SPA redirect URI`).
- Seeded users list mentioned valuser/valpass; the realm only ships devuser now.
- The "Disabling auth entirely" section described VITE_AUTH_ENABLED=false as a clean opt-out, but smartem-decisions#285 made backend auth unconditional, so setting authEnabled:false in config.json just bypasses the SPA's login screen - every /api/ call still 401s. Only useful when paired with MSW (VITE_ENABLE_MOCKS=true) or for views that don't fetch from the backend. Reframed to call out the caveat explicitly.
Updated both docs to reflect the runtime-config.json mechanism (edit apps/smartem/public/config.json for `npm run dev:smartem`; the k8s ConfigMap mount overrides it in deploys), the two-client realm layout, the full redirect URI set, the actual seeded user, and the auth-disable caveat. No code or manifest changes.
vredchenko added a commit that referenced this pull request on May 22, 2026
Summary
Phase B of the smartem-frontend k8s deploy work. Adds Deployment + Service + ConfigMap for the frontend image produced by smartem-frontend#94 (v0.2.0, now on GHCR), plus Ingress for staging/production. The image from Phase A is environment-agnostic: it ships a placeholder `config.json` with dev defaults and reverse-proxies `/api/` to the backend service through its own nginx, so a single tag deploys to every environment with only ConfigMap and env-var differences.
Also bundles a small cleanup pass on dead Keycloak config that the backend stopped reading in smartem-decisions#285 (`KEYCLOAK_AUTH_REQUIRED`) and on this branch's earlier rename (`KEYCLOAK_CLIENT_ID` → `KEYCLOAK_ALLOWED_AZP` allow-list), removes a stale root-level `k8s/ingress.yaml` orphan from the original December scaffolding, and refreshes the local-dev Keycloak docs to match the SPA's current runtime-`config.json` mechanism.
What lands per environment
Each `k8s/environments/<env>/smartem-frontend.yaml` carries three documents:
- ConfigMap `smartem-frontend-config`: the runtime `config.json` (Keycloak URL/realm/clientId + `authEnabled`). The Deployment `subPath`-mounts this onto `/usr/share/nginx/html/config.json`, overriding the placeholder shipped in the image.
- Deployment `smartem-frontend`: pulls `ghcr.io/diamondlightsource/smartem-frontend:latest`, sets `BACKEND_HOST=smartem-http-api-service` so the SPA pod's nginx proxies `/api/` internally.
- Service `smartem-frontend-service`: NodePort `30100` in development (next free slot in the 30000s range used by smartem-http-api/Keycloak/RabbitMQ/Postgres/Adminer), ClusterIP in staging and production.
Per-environment config.json values
| Environment | `keycloak.url` | `authEnabled` |
| --- | --- | --- |
| development | `http://localhost:30090` (Keycloak mock NodePort, browser-reachable) | `true` |
| staging | `https://identity-test.diamond.ac.uk` | `true` |
| production | `https://identity.diamond.ac.uk` | `true` |
All three set `realm: dls`, `clientId: SmartEM_User` (post-rename in smartem-devtools#198), and `authEnabled: true`. The backend enforces Keycloak Bearer-token validation unconditionally on every non-exempt request since smartem-decisions#285; there is no opt-out, so the SPA must always complete the login ceremony to talk to `/api/`.
Ingress
Staging and production only; development keeps the NodePort pattern, consistent with everything else in `k8s/environments/development/`. Each `ingress.yaml` routes a single host to `smartem-frontend-service` on port 80. The SPA's nginx handles `/api/` proxying internally, so one route covers both the SPA and API traffic.
| Environment | Host |
| --- | --- |
| staging | `smartem-staging.diamond.ac.uk` |
| production | `smartem.diamond.ac.uk` |
dev-k8s.sh
The access-URLs section gains:
- `Keycloak (mock): http://localhost:30090` (was missed when smartem-devtools#198 landed)
- `SmartEM Frontend: http://localhost:30100`
Dead-config cleanup (`KEYCLOAK_AUTH_REQUIRED`, `KEYCLOAK_CLIENT_ID`)
The backend stopped reading `KEYCLOAK_AUTH_REQUIRED` in smartem-decisions#285 (auth is unconditional, no opt-out). This branch had already removed the key from `k8s/environments/{development,staging}/configmap.yaml`, but `scripts/k8s/dev-k8s.sh` was still reading it from `.env`, defaulting it to `"false"`, and re-injecting it into `smartem-config` on every dev deploy, silently undoing the YAML cleanup. Same story for `KEYCLOAK_CLIENT_ID`, which the staging ConfigMap replaced with `KEYCLOAK_ALLOWED_AZP="SmartEM_User,SmartEM_Agent"`.
Cleanup:
- `env-examples/.env.example.k8s.{development,staging}`: drop both dead keys. Staging gains `KEYCLOAK_ALLOWED_AZP=SmartEM_User,SmartEM_Agent` to mirror the YAML; development leaves it commented out (any valid realm token accepted in local dev).
- `scripts/k8s/dev-k8s.sh`: drop both vars from the override check, defaults, log line, and `kubectl create configmap` args. Append `KEYCLOAK_ALLOWED_AZP` only when explicitly set, preserving the "unset = any realm token" semantics.
- `smartem-frontend.yaml` comments now point readers at `KEYCLOAK_ALLOWED_AZP` instead of the deleted `KEYCLOAK_CLIENT_ID`.
Orphan ingress removal
`k8s/ingress.yaml` at the repo root was added in the initial k8s scaffolding (a5e88da, 2025-12-08) as an early sketch for exposing the backend HTTP API directly via ingress. It pre-dated the current SPA-pod-nginx-proxies-`/api/` architecture, was not referenced by any kustomization, lived in the dev namespace (which uses NodePort, not ingress), and carried a placeholder host. Deleted; recoverable from git history if needed. The remaining backend-facing ingress use case (Windows agent connectivity) is tracked in #206.
Frontend-dev Keycloak doc refresh
The SPA stopped reading `VITE_KEYCLOAK_*` / `VITE_AUTH_ENABLED` build-time vars; `apps/smartem/src/main.tsx` fetches `/config.json` at boot and `apps/smartem/src/auth/config.ts` is the only consumer (no fallback to `import.meta.env`). The smartem-frontend repo updated its own `apps/smartem/.env.example` to reflect this, but the local-dev docs in this repo still instructed developers to edit a `.env.local` file with `VITE_*` keys that no code reads.
Same docs also lagged on other realm changes from this branch's `feat: add SmartEM_Agent client, rename SmartEM to SmartEM_User` commit and `fix(keycloak-mock): allow NodePort SPA redirect URI`:
- Client name `SmartEM` predated the rename to `SmartEM_User` + `SmartEM_Agent`.
- Redirect URIs listed only `5173`/`5174`; the realm now also allows `http://localhost:30100/*`.
- Seeded users list mentioned `valuser`/`valpass` (not in the realm anymore; only `devuser` ships).
- The "Disabling auth entirely" section described `VITE_AUTH_ENABLED=false` as a clean opt-out, but #285 made backend auth unconditional, so setting `authEnabled:false` only bypasses the SPA's login screen; every `/api/` call still 401s. Reframed with the caveat (use with MSW `VITE_ENABLE_MOCKS=true` or for views that don't fetch from the backend).
Updated `docs/development/local-keycloak.md` and `keycloak-mock/README.md` to point at `apps/smartem/public/config.json` as the dev-time source of truth.
Local verification
Static rendering:
- `kubectl kustomize k8s/environments/development`: 926 lines, clean
- `kubectl kustomize k8s/environments/staging`: 579 lines, clean
- `kubectl kustomize k8s/environments/production`: 577 lines, clean
- `grep -c KEYCLOAK_AUTH_REQUIRED` in rendered output: `0` across all three envs
- `grep -c KEYCLOAK_CLIENT_ID` in rendered output: `0` across all three envs
- Staging ingress now resolves to `smartem-staging.diamond.ac.uk`; production unchanged at `smartem.diamond.ac.uk`
- `bash -n scripts/k8s/dev-k8s.sh`: syntax OK
- `grep` for `VITE_KEYCLOAK_` / `VITE_AUTH_ENABLED` / `valuser` in the repo: `0` hits (was 9 before this branch)
End-to-end auth loop, driven against the live local k3s cluster (backend `0.1.1rc48.dev0+g5bc8e22b3.d20260521`, post-#285):
- `curl http://localhost:30100/version`: 200, returns the SPA version JSON
- `curl http://localhost:30100/config.json`: 200, returns dev ConfigMap content with `authEnabled: true`, `clientId: SmartEM_User`, `keycloak.url: http://localhost:30090`
- `curl http://localhost:30100/api/health`: 200 (the `/health` path is in `EXEMPT_PATHS` in `smartem_backend/auth.py`, so it bypasses Bearer validation; the SPA pod's nginx strips `/api/` and proxies to the backend's `/health`)
- `curl http://localhost:30100/api/acquisitions` (no token): 401 with `www-authenticate: Bearer` and `{"detail":"Missing or malformed Authorization header"}`, which confirms unconditional auth on a real route
- Browser at `http://localhost:30100`: SPA renders the auth-gate sign-in screen (smartem-frontend#05b2c9d); SIGN IN redirects to `http://localhost:30090/realms/dls/protocol/openid-connect/auth?client_id=SmartEM_User&redirect_uri=…` (code + PKCE flow); login as `devuser`/`devpass` returns to `http://localhost:30100/`, header shows "Dev User", and the `/acquisitions` route fires an authenticated `GET /api/acquisitions` that returns 200
Out of scope / follow-ups
- … `smartem-http-api` NodePort in dev. Tracked in "Plan agent connectivity to the backend API from outside the cluster" (#206), to land as a per-env file under `k8s/environments/<env>/` once the route shape is decided.
Test plan
- `curl http://localhost:30100/version` returns `{"frontend": "0.2.0", ...}`
- `curl http://localhost:30100/config.json` returns the dev ConfigMap content (`authEnabled: true`)
- `curl http://localhost:30100/api/health` returns 200 (the `/health` path is in `EXEMPT_PATHS` and reaches the backend through the SPA pod's nginx)
- `curl http://localhost:30100/api/acquisitions` without a token returns 401 (auth is unconditional on non-exempt routes)
- Browser at `http://localhost:30100`: SPA redirects to the Keycloak mock login, sign-in as `devuser`/`devpass` returns to the SPA, and `/api/acquisitions` succeeds with the bearer token
- `./scripts/k8s/dev-k8s.sh down && ./scripts/k8s/dev-k8s.sh` rolls the `smartem-config` ConfigMap so the stale `KEYCLOAK_AUTH_REQUIRED` / `KEYCLOAK_CLIENT_ID` keys (left over from an older `dev-k8s.sh` run) disappear from the deployed state; functionally a no-op since the backend ignores them, but worth doing for hygiene
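
The status codes expected by the curl checks follow from two rules stated in this PR: the SPA pod's nginx strips `/api/` before proxying, and the backend skips Bearer validation only for paths in `EXEMPT_PATHS`. The sketch below models that reasoning. It is a simplification, not the backend or nginx code: only `/health` is confirmed as exempt, and non-API paths are assumed to be static files that exist.

```python
from typing import Optional

API_PREFIX = "/api/"
# Only /health is confirmed as exempt; the real set may contain more paths.
EXEMPT_PATHS = frozenset({"/health"})


def backend_path(request_path: str) -> Optional[str]:
    """Map a browser-facing path to the backend path, or None if not proxied."""
    if request_path.startswith(API_PREFIX):
        return "/" + request_path[len(API_PREFIX):]
    return None


def expected_status(request_path: str, has_valid_token: bool) -> int:
    """Predict the HTTP status for a request through the SPA pod."""
    path = backend_path(request_path)
    if path is None:
        return 200  # served by nginx directly (assumes the file exists)
    if path in EXEMPT_PATHS or has_valid_token:
        return 200
    return 401


if __name__ == "__main__":
    for p in ("/config.json", "/api/health", "/api/acquisitions"):
        print(p, expected_status(p, has_valid_token=False))
```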