Go-live: Tuesday 5 May 2026, 1pm–3pm BST
Scope: Laravel 10 + group tags deployed together to restarters.net
- Get outstanding PRs merged into `develop`
- Stand up `restarters.dev` from `develop`
- Set up auto-deploy for `develop` → `restarters-dev`: CircleCI deploy job added; add `FLY_API_TOKEN` to CircleCI project env vars to activate (token already generated)
- Set up queue and app monitoring: supervisord keeps processes up, but alerting is still needed if the queue backs up or the app goes unhealthy
- Tidy `production` branch: force-pushed from `develop`; no node_modules committed
- Activate auto-deploy for `production` → `restarters` (same CircleCI pattern as develop, once develop is working)
- Rebuild "yesterday" restore system
- Deploy Mailpit for dev: `restarters-dev-mail` live at `https://restarters-dev-mail.fly.dev`; navbar on `restarters-dev` links to it
- Set Fly secrets on `restarters`: all secrets deployed (mail, Discourse, Wiki, Drip, analytics, etc.)
- `fly.toml` updated for production: `APP_ENV=production`, `APP_URL=https://restarters.net`, `SESSION_DOMAIN=.restarters.net`, `SENTRY_ENVIRONMENT=production`, VM scaled to `shared-cpu-2x` / 4GB
- Deployed to `restarters` Fly app: healthy, production config live
- TLS cert for `restarters.net` issued (Let's Encrypt, RSA+ECDSA): pre-provisioned via `_acme-challenge` CNAME before DNS cutover; cert is active now
- Production DB synced from `restarters_mc2RhNw` via `fly-migrate.sh --db --images` (6m16s); row counts verified against TCP connection
- API compatibility check: no breaking changes for known consumers. All v1 stats endpoints, RepairTogether, Zapier triggers, and TRP.org widgets are unchanged. Tag visibility changes are intentional and fine.
- Check and renew `restarters.net` domain at iwantmyname (noted as due soon)
- Write Fly.io ops crib for Neil
- Re-check known-issues list (some may be fixed since Laravel 10 / group tags work)
- Tell network coordinators about 5 May 1–3pm window
- Put banner on site once Big Give finishes (~29 April)
fly secrets set \
MAIL_FROM_ADDRESS=noreply@mg.restarters.net \
MAIL_FROM_NAME="..." \
WP_XMLRPC_ENDPOINT="..." \
WP_XMLRPC_USER="..." \
WP_XMLRPC_PSWD="..." \
DRIP_CAMPAIGN_ID="..." \
-a restarters

Values from the production .env on restart-sp. MAIL_FROM_ADDRESS is changed (not copied), which fixes DMARC alignment.
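
To confirm that the secrets named in the command above actually landed on the app, a small check along these lines may help. It is a sketch only: the function name `missing_secrets` and the `EXPECTED_SECRETS` list are illustrative, and it assumes that each secret name appears as a whole word in the output of `fly secrets list`.

```shell
#!/bin/sh
# Names taken from the `fly secrets set` command above; extend as needed.
EXPECTED_SECRETS="MAIL_FROM_ADDRESS MAIL_FROM_NAME WP_XMLRPC_ENDPOINT WP_XMLRPC_USER WP_XMLRPC_PSWD DRIP_CAMPAIGN_ID"

# missing_secrets (hypothetical helper): reads a secrets listing on stdin
# and prints every expected name that does not appear in it as a whole word.
missing_secrets() {
  ms_listing="$(cat)"
  for ms_name in $EXPECTED_SECRETS; do
    if ! printf '%s\n' "$ms_listing" | grep -qw -- "$ms_name"; then
      echo "$ms_name"
    fi
  done
}

# Usage (needs flyctl and an authenticated session):
#   fly secrets list -a restarters | missing_secrets
```

Empty output means every expected name was found. The check only looks at names, since Fly does not reveal secret values in the listing.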
✅ Pre-done (before window): fly.toml updated, app deployed, TLS cert issued, DB synced.
| When | Action |
|---|---|
| 1:00pm | Tell network coordinators not to log devices |
| 1:02pm | Put old server in maintenance mode: `php artisan down --retry=60` on restart-sp |
| 1:04pm | Run final DB + image sync: `./fly-migrate.sh --db --images` on restart-sp (~6 min) |
| ~1:12pm | Switch DNS: point `restarters.net` A → `66.241.124.187`, AAAA → `2a09:8280:1::ce:b85f:0` |
| ~1:15pm | Run smoke tests (see below) |
| 3:00pm | Done, or roll back |
TLS cert is already issued — no delay on cutover. New server is NOT put in maintenance mode.
Also update (already done): `fly.toml` env before deploying
APP_ENV = "production"
APP_URL = "https://restarters.net"
FEATURE__DISCOURSE_INTEGRATION = "true"
SENTRY_ENVIRONMENT = "production"
SESSION_DOMAIN = ".restarters.net"

- Homepage loads, HTTPS works, HSTS header present
- Static assets and uploaded images load (verifies Tigris proxy)
- Login, session persistence
- Password reset email arrives (tests Mailgun)
- Create a repair event and log a device
- Upload an image — verify it persists after page reload
- Locale switch to French
- Discourse SSO — click Talk link, verify login works
- WordPress XML-RPC — create/edit event, verify it appears on therestartproject.org
- Network subdomains resolve correctly
- `GET /api/homepage_data`, `/api/party/{id}/stats`, `/api/group/{id}/stats` return valid data
- Queue worker running: `fly ssh console -a restarters -C "ps aux | grep queue:work"`
- No unexpected failed jobs: `fly ssh console -a restarters -C "php artisan queue:failed"`
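
Two of the checks above (HSTS header present, API endpoint responding) lend themselves to a quick script. The following is a rough sketch rather than a complete smoke suite: `BASE_URL`, `has_hsts`, `http_status`, and `smoke` are illustrative names, and it only tests for an HTTP 200 on `/api/homepage_data`, not that the payload is valid. The `{id}` endpoints are left out because the IDs to use are not given here.

```shell
#!/bin/sh
# BASE_URL is an assumption; override it to point at another environment.
BASE_URL="${BASE_URL:-https://restarters.net}"

# has_hsts: reads HTTP response headers on stdin and succeeds if a
# Strict-Transport-Security header is among them (case-insensitive).
has_hsts() {
  grep -qi '^strict-transport-security:'
}

# http_status: prints the HTTP status code for a path under BASE_URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$BASE_URL$1"
}

# smoke: runs both checks and prints one result line per check.
smoke() {
  if curl -sI "$BASE_URL/" | has_hsts; then
    echo "ok   HSTS header"
  else
    echo "FAIL HSTS header"
  fi
  if [ "$(http_status /api/homepage_data)" = "200" ]; then
    echo "ok   /api/homepage_data"
  else
    echo "FAIL /api/homepage_data"
  fi
}

# Usage:
#   smoke
```

The remaining items (login, image upload, Discourse SSO, WordPress XML-RPC) involve sessions and third-party state, so they are probably still best checked by hand.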
Switch DNS back to 139.59.184.196 and run php artisan up on restart-sp. With 1-hour TTL, propagation is fast. Any writes that landed on the new server during the window can be salvaged manually.
Roll back if (within the 2-hour window): site unreachable > 10 min, data loss, login/device-logging broken, email completely failed, or sustained error rate > 5% in Sentry.
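
The rollback criteria above can be written down as a small decision helper so that the call is made the same way under pressure. This is a sketch under assumptions: `should_roll_back` is a hypothetical function, the inputs are gathered by hand (for example from Sentry and an uptime check), and the error rate is passed as a whole-number percentage.

```shell
#!/bin/sh
# should_roll_back DOWNTIME_MIN ERROR_RATE_PCT CRITICAL_BROKEN
#   DOWNTIME_MIN     minutes the site has been unreachable (integer)
#   ERROR_RATE_PCT   sustained Sentry error rate in whole percent (integer)
#   CRITICAL_BROKEN  "yes" if data loss, broken login/device logging,
#                    or complete email failure has been observed
# Returns 0 and prints a reason when a rollback criterion is met, else returns 1.
should_roll_back() {
  if [ "$1" -gt 10 ]; then
    echo "roll back: unreachable for $1 min"
    return 0
  fi
  if [ "$2" -gt 5 ]; then
    echo "roll back: error rate $2%"
    return 0
  fi
  if [ "$3" = "yes" ]; then
    echo "roll back: critical function broken"
    return 0
  fi
  echo "stay: no rollback criterion met"
  return 1
}

# Usage:
#   should_roll_back 3 1 no
```

The thresholds are strict, matching the wording above: exactly 10 minutes of downtime or exactly 5% errors does not trigger a rollback on its own.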
- Verify hourly Google Drive backups are running
- Run "yesterday" restore to confirm backup/restore pipeline works
- Verify Discourse sync (`discourse:syncgroups`) and language sync are running on schedule
- Check Mailgun dashboard for delivery stats
- Scale down VMs if usage data supports it
- Restore DNS TTL to 3600s after 48 hours of stable operation
- Move Repair Directory (`map.restarters.net`) to Fly as a separate container in the same project, accessing the same DB
DNS records to change on go-live:
- `restarters.net` A → `66.241.124.187`
- `restarters.net` AAAA → `2a09:8280:1::ce:b85f:0`
Do not touch: talk.restarters.net, wiki.restarters.net, mg.restarters.net, map.restarters.net
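
After the DNS switch, the new records can be checked from a terminal. The sketch below assumes `dig` is installed and uses a hypothetical helper, `check_record`; it queries whatever resolver the local machine uses, so a cached answer may lag behind the authoritative one.

```shell
#!/bin/sh
# Expected values copied from the records listed above.
EXPECTED_A="66.241.124.187"
EXPECTED_AAAA="2a09:8280:1::ce:b85f:0"

# check_record NAME TYPE EXPECTED (hypothetical helper)
# Succeeds if any answer line from `dig +short` equals EXPECTED exactly.
check_record() {
  if dig +short "$1" "$2" | grep -qxF -- "$3"; then
    echo "ok   $1 $2 -> $3"
  else
    echo "WAIT $1 $2 not yet $3"
    return 1
  fi
}

# Usage:
#   check_record restarters.net A    "$EXPECTED_A"
#   check_record restarters.net AAAA "$EXPECTED_AAAA"
```

The same helper can be pointed at the old address during a rollback to confirm that the A record has reverted.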