Background
On 2026-06-17, shortly after deploying #463, Pulldasher went down with:
GET /orgs/iFixit/members/danielbeardsley - 403
Request quota exhausted for request GET /orgs/{org}/members/{username}
Root cause
The org-membership 403 is a symptom, not the cause. That call lives only in lib/authentication.js on the login callback — it failed because the whole hourly REST quota was already drained, so login couldn't complete (app appears "down").
What drained the quota was the deploy restart. On boot, app.js runs refresh.openPulls(), sending every open pull (across all repos) through git-manager.parse(). Each parse() is ~8–15 REST requests:
| call | cost |
| --- | --- |
| getReviews, getIssueComments, getPullReviewComments, getIssueEvents | paginated, 1+ each |
| getIssue, getCommit, getCombinedStatusForRef | 1 each |
| getAllJobRuns | 1 for workflow runs + 1 per workflow run for jobs |
N open pulls × ~10 calls, back-to-back, blew the 5000/hr quota. Any restart can trigger this; #463 was simply the deploy that tipped it over. #463's own code (models/signature.js parseReview) adds zero API calls.
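To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. The helper names are hypothetical and the open-pull count in the usage note is made up for illustration; only the 5000/hr quota and the ~8–15 calls per parse() come from the numbers above.

```javascript
// Hypothetical helpers, not part of Pulldasher.
const HOURLY_QUOTA = 5000; // primary REST quota cited above

// Total REST calls for a full startup resync.
function estimateResyncCalls(openPulls, callsPerParse) {
  return openPulls * callsPerParse;
}

// Largest number of open pulls a single resync can cover within the quota.
function maxPullsWithinQuota(callsPerParse, quota = HOURLY_QUOTA) {
  return Math.floor(quota / callsPerParse);
}

console.log(maxPullsWithinQuota(8));  // 625
console.log(maxPullsWithinQuota(10)); // 500
console.log(maxPullsWithinQuota(15)); // 333
```

At ~10 calls per pull, a single restart exhausts the hour's quota once there are more than 500 open pulls; an assumed 600 open pulls would cost `estimateResyncCalls(600, 10)`, i.e. 6000 calls. Webhook-driven parses during the same hour only lower that ceiling.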
Proposed paths (ranked)
1. ETag conditional requests: a conditional request returning 304 Not Modified (If-None-Match + Authorization) does not count against the primary rate limit (docs). Octokit does not cache by default. On restart almost every open pull is unchanged, so a per-URL ETag cache turns the startup resync from thousands of billed calls into ~0. Smallest change, biggest win. Draft PR incoming.
2. Trim getAllJobRuns: heaviest per-pull cost (workflow runs + one jobs call per run) and overlaps getCombinedStatusForRef. Switch to the Checks API or drop the job-level fetch and lean on the check_run webhook.
3. Don't full-resync on boot: pulls are already warmed from the DB (app.js:61-69); gate or pace the refresh.openPulls() cold-start spike.
4. Targeted webhook handling: issue_comment edited/deleted and pull_request_review edited/dismissed do a full ~10-call parse. Handle incrementally (e.g. a dismissed approval just invalidates its CR signature by comment_id).
5. GraphQL batch: one query replaces ~8 REST calls per pull. Bigger lift; only if 1–4 fall short.
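As a minimal sketch of the ETag path, assuming nothing about how the draft PR will actually wire it in: `createEtagCache` and `createFakeTransport` are hypothetical names, and `transport` stands in for whatever HTTP client is used (it is not Octokit's API). It takes `(url, headers)` and resolves to `{ status, etag, data }`.

```javascript
// Sketch only: wraps a transport with a per-URL ETag cache.
function createEtagCache(transport) {
  const cache = new Map(); // url -> { etag, data }

  return async function get(url) {
    const cached = cache.get(url);
    // Send the stored ETag so the server can answer 304 if nothing changed.
    const headers = cached ? { "If-None-Match": cached.etag } : {};
    const res = await transport(url, headers);

    if (res.status === 304 && cached) {
      return cached.data; // unchanged: reuse the stored body
    }
    if (res.etag) {
      cache.set(url, { etag: res.etag, data: res.data });
    }
    return res.data;
  };
}

// Fake transport for experimenting: counts responses that would be billed.
function createFakeTransport() {
  const state = { billed: 0 };
  const transport = async (url, headers) => {
    const etag = '"v1"'; // pretend the resource never changes
    if (headers["If-None-Match"] === etag) {
      return { status: 304 }; // not billed against the primary limit
    }
    state.billed += 1;
    return { status: 200, etag, data: { url } };
  };
  return { transport, state };
}
```

Calling `get` twice for the same URL against the fake transport bills one request; the second is answered with a 304 and served from the cache. Two caveats for a real integration: an in-memory `Map` is empty after a restart, so to help the boot case the entries would presumably need to be persisted (for example alongside the pulls in the DB), and each page of a paginated endpoint has its own URL and ETag.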
Minor: cache the login org-membership check (short TTL) so an exhausted quota doesn't lock everyone out of the dashboard.
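A short-TTL cache for the membership check could look roughly like this. `createTtlCache` is a hypothetical helper, `load` stands in for the existing org-membership call in lib/authentication.js, and the TTL value is left to the caller since none has been proposed.

```javascript
// Sketch only: caches the result of `load(key)` for `ttlMs` milliseconds.
// `now` is injectable so the expiry logic can be exercised without waiting.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, at }

  return async function cached(key, load) {
    const hit = entries.get(key);
    if (hit && now() - hit.at < ttlMs) {
      return hit.value; // still fresh: no API call
    }
    const value = await load(key); // a rejection propagates and is not cached
    entries.set(key, { value, at: now() });
    return value;
  };
}
```

With this shape, a user who logged in within the TTL window can log in again without spending a request, even when the quota is exhausted. A user with no fresh entry would still hit the 403, so this only softens the lockout; serving an expired entry on failure would go further but trades away how quickly a removed member loses access.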
Ask
Looking to align on the path forward. I'm starting with path 1 (ETag conditional requests) as a draft PR to evaluate.