Skip to content

feat(api): add runtime aggregator role toggle#309

Merged
MegaRedHand merged 10 commits intomainfrom
runtime-aggregator-toggle
Apr 21, 2026
Merged

feat(api): add runtime aggregator role toggle#309
MegaRedHand merged 10 commits intomainfrom
runtime-aggregator-toggle

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

@MegaRedHand MegaRedHand commented Apr 20, 2026

Summary

Ports leanSpec PR #636. Adds two admin endpoints that let operators rotate aggregation duties across nodes at runtime without a restart.

  • GET /lean/v0/admin/aggregator returns {"is_aggregator": bool}
  • POST /lean/v0/admin/aggregator with body {"enabled": bool} returns {"is_aggregator": <new>, "previous": <old>}

Returns 503 when the controller is not wired, 400 on malformed body or non-boolean enabled (JSON ints 0/1 are explicitly rejected, matching the spec).

Design

The CLI flag --is-aggregator now seeds a shared AggregatorController (a thin Arc<AtomicBool> in ethlambda-types) instead of a plain bool. The blockchain actor reads the flag fresh on every tick and every gossip attestation, so runtime toggles take effect from the next tick with no protocol-message churn.

axum::Extension is used to thread the controller into admin handlers so existing store-backed routes don't need to change their signatures.

The lean_is_aggregator metric is updated from the actor's tick read (with a cached last_seen_aggregator) rather than from the HTTP handler, so the gauge reflects what the actor actually acted on.

Scope limitations (intentional, matching spec)

Scope Status
Persistence across restart ❌ in-process only
Gossip subnet re-subscription ❌ frozen at startup
ENR advertisement ❌ no ENR lifecycle
Authentication ❌ consistent with rest of HTTP API

Operational model: standby aggregators must boot with --is-aggregator=true so gossip subscriptions are in place, then use the admin endpoint to rotate duties (hot-standby model). A node booted with --is-aggregator=false and toggled on later will have no extra subnets to aggregate. This is documented in the CLI flag's docstring and the CLAUDE.md gotcha section.

@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

Overall Assessment: Well-structured PR implementing runtime aggregator toggling with correct atomic semantics and clear documentation. The hot-standby model is appropriately constrained (static gossip subscriptions, mutable aggregation logic).

Critical Issues

  1. Security: Unauthenticated Admin Endpoint (crates/net/rpc/src/admin.rs:30-85)
    The POST /lean/v0/admin/aggregator endpoint lacks any authentication/authorization. Anyone with network access to the API can toggle the node's aggregator role, potentially causing denial-of-service (forced aggregation cessation) or resource exhaustion (forced aggregation start).

    Recommendation: Add Bearer token authentication or bind admin endpoints to loopback only (127.0.0.1) by default, with clear documentation that exposing this endpoint to untrusted networks is unsafe.

Code Quality & Correctness

  1. Unused Struct (crates/net/rpc/src/admin.rs:118-122)
    The ToggleRequest struct is defined but unused (handler parses serde_json::Value directly). While marked #[allow(dead_code)], this creates maintenance burden. Consider removing it or using it with serde_json::from_value after the initial Value extraction to validate schema.

  2. Metric Consistency (crates/blockchain/src/lib.rs:139-143)
    The last_seen_aggregator optimization only updates in on_tick, but on_gossip_attestation (line 501) also reads the flag fresh. If the actor is multi-threaded, the metric could temporarily diverge from behavior, but this is benign since metrics are best-effort. If strict alignment is required, move the metric update into a shared helper used by both call sites.

Rust Best Practices

  1. Atomic Ordering (crates/common/types/src/aggregator.rs:37-43)
    Current usage of Ordering::Acquire (load) and AcqRel (swap) is correct for the "single writer, multiple readers" pattern. However, given the low criticality (no safety invariants depend on this flag, only performance/liveness), Ordering::Relaxed would suffice and be slightly faster. No change required unless profiling shows contention.

  2. Error Handling (crates/net/rpc/src/admin.rs:60-85)
    The bad_request and service_unavailable helpers use &'static str bodies. For production APIs, consider structured JSON errors (e.g., {"error": "description"}) for easier client parsing, though this may be out of scope for leanSpec parity.

Consensus & Safety

  1. Fork Choice Safety
    Verified: The change only affects attestation production (aggregation) logic, not validation or fork choice weight calculations. The node continues to validate attestations and build blocks correctly regardless of the toggle state.

  2. State Transition Safety
    Verified: produce_attestations correctly snapshots the flag at tick start (line 136), ensuring consistent behavior within the slot processing window. The fresh read in on_gossip_attestation is appropriate for inter-tick events.

Documentation

  • Excellent inline documentation of the "hot-standby" invariant in crates/net/p2p/src/lib.rs and CLAUDE.md.
  • Clear explanation that gossip subscriptions are frozen at startup.

Minor Suggestions

  1. Test Coverage (crates/net/rpc/src/admin.rs:194-197)
    The post_returns_503_without_controller test verifies the error case, but the happy-path integration with the actual controller could verify the side effect propagates (though the unit tests in aggregator.rs cover the atomicity).

Approved with Reservations: Address the authentication concern on the admin endpoint before deploying to production networks. The consensus logic implementation is correct.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

  1. High: the new POST /lean/v0/admin/aggregator endpoint is completely unauthenticated and directly mutates consensus behavior. See crates/net/rpc/src/admin.rs:55, crates/net/rpc/src/lib.rs:24, bin/ethlambda/src/main.rs:53, and bin/ethlambda/src/main.rs:213. Because http_address is operator-configurable, any host that can reach the API can flip aggregation on/off. For this client that is not just an observability concern: it lets a remote caller suppress aggregate production or flap the role mid-slot, degrading attestation aggregation/liveness on subscribed subnets. This should be gated behind authentication, a separate localhost-only admin listener, or an explicit refusal to expose admin routes on non-loopback binds.

I didn’t see another obvious fork-choice or state-transition correctness regression in the diff itself. I couldn’t run the Rust tests here because the sandbox blocks rustup/toolchain setup (Read-only file system under /home/runner/.rustup/tmp).


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Here is the full review:


PR #309feat(api): add runtime aggregator role toggle

Overview: Ports leanSpec PR #636. Adds GET/POST /lean/v0/admin/aggregator endpoints backed by a new AggregatorController (Arc<AtomicBool>) shared between the blockchain actor and the HTTP server. The blockchain actor reads fresh on every tick and every gossip attestation, so toggles take effect without a restart.


What Works Well

  • Correct atomic ordering. Ordering::Acquire on reads and Ordering::AcqRel on swap are a sound pairing. The barrier test correctly ensures the main thread's assertion is sequenced strictly after the writer's store.
  • Tick-level consistency. Snapshotting is_aggregator once at the top of on_tick and passing it down to produce_attestations prevents mid-tick races between the admin API and the actor. The gossip path reads fresh, which is also correct — a toggle that lands between ticks should be observed on the very next gossip event.
  • Metric update is actor-side. Emitting lean_is_aggregator from the actor tick (guarded by last_seen_aggregator) rather than from the HTTP handler is the right call — the gauge reflects what the protocol layer actually did.
  • Graceful degradation. Option<Extension<AggregatorController>> on both handlers returns 503 cleanly when the controller isn't wired.
  • Strict boolean enforcement. Rejecting JSON integers 0/1 matches the spec's semantics and is explicitly tested.
  • Test coverage. 10 new HTTP tests covering all 200/400/503 paths, plus 4 unit tests on the controller itself including cross-thread visibility.

Issues

Dead code: ToggleRequest

crates/net/rpc/src/admin.rs, near the bottom:

#[derive(Debug, Deserialize)]
#[allow(dead_code)]
pub struct ToggleRequest {
    pub enabled: bool,
}

This struct is never used — not in the handler (which uses raw serde_json::Value) and not in any test (which use raw JSON strings). The comment says it's kept "for documentation/testing convenience," but it adds confusion ("is this the intended deserialization path?"). Remove it; the comment in the handler is sufficient documentation.


Error responses are plain text while success responses are JSON

crates/net/rpc/src/admin.rs:

fn bad_request(reason: &'static str) -> Response {
    (StatusCode::BAD_REQUEST, reason).into_response()
}
fn service_unavailable(reason: &'static str) -> Response {
    (StatusCode::SERVICE_UNAVAILABLE, reason).into_response()
}

These emit text/plain bodies. If callers assume all responses from this API are JSON and attempt JSON.parse() on a 400/503, they'll get a parse error on top of an error. Consider returning {"error": "<reason>"} with Content-Type: application/json for consistency, or at minimum document the discrepancy. Other endpoints in the codebase appear to return JSON on success only too, so this may be intentional — but worth confirming against the spec.


Metric staleness window not mentioned in CLAUDE.md gotcha

The new gotcha section in CLAUDE.md documents the hot-standby model and the gossip-subnet freeze, but doesn't mention that lean_is_aggregator lags until the next tick (up to 4 seconds). An operator monitoring this metric might see it remain at 0 briefly after POST-ing {"enabled": true}. Worth one sentence in the docs.


No test for POST without Content-Type: application/json

All POST tests set header("content-type", "application/json"). If a caller sends a request without Content-Type, axum may reject the Json extractor before the handler runs (returning a 415 or 422 rather than the handler's 400). This behavior may be acceptable, but it's worth a test to pin the contract, especially since the PR explicitly documents the 400-on-bad-body contract.


Ordering::AcqRel on swap is stronger than necessary (non-blocking)

AcqRel on an atomic swap provides both an acquire and a release fence. Since this flag IS the shared data (there's no other heap state synchronized through it), Relaxed on both reads and writes would be sufficient for correctness on x86/ARM. Acquire/Release is always safe and the overhead is negligible, so this is not a change request — just noting that the choice is conservative rather than optimal.


on_gossip_attestation reads fresh but the previous tick's snapshot may differ

crates/blockchain/src/lib.rs:

fn on_gossip_attestation(&mut self, attestation: &SignedAttestation) {
    let is_aggregator = self.aggregator.is_enabled();
    let _ = store::on_gossip_attestation(&mut self.store, attestation, is_aggregator)

A gossip event arriving between ticks uses the live value of the atomic. This is correct and documented in the diff comment. However, it means a toggle mid-slot affects gossip-path attestations immediately but only affects tick-path logic (interval 0/1/2/4) on the next tick. This asymmetry is benign here (the gossip path only stores signatures; the tick path does aggregation), but worth a note in the actor comment for future maintainers.


Verdict

Solid implementation. The architectural choices are sound, the test coverage is good, and the operational model is clearly documented. The two actionable items are removing ToggleRequest and considering JSON error bodies for consistency. The metric staleness window in CLAUDE.md is a minor doc gap. The rest are informational.


Automated review by Claude (Anthropic) · sonnet · custom prompt

Ports leanSpec PR #636: adds `GET /lean/v0/admin/aggregator` and
`POST /lean/v0/admin/aggregator` endpoints that let operators rotate
aggregation duties across nodes without restarting.

The `--is-aggregator` CLI flag now seeds a shared `AggregatorController`
(an `Arc<AtomicBool>` in `ethlambda-types`) rather than a plain bool.
The blockchain actor reads the flag fresh on every tick and gossip
attestation, so runtime toggles take effect from the next tick.

Subnet subscriptions remain frozen at startup, matching the spec's
explicit scope limitation: standby aggregators must still boot with
`--is-aggregator` so gossip subscriptions are in place, then use the
admin endpoint to rotate duties (hot-standby model).

The `lean_is_aggregator` metric is synced from the actor's read rather
than from the HTTP handler so the gauge reflects what the actor
actually acted on, not what was requested.

POST strictly rejects non-boolean `enabled` (incl. JSON 0/1) to match
the spec's semantics.
@MegaRedHand MegaRedHand force-pushed the runtime-aggregator-toggle branch from 2bbfe5a to 3a793b8 Compare April 20, 2026 18:11
Comment thread bin/ethlambda/src/main.rs
@MegaRedHand MegaRedHand marked this pull request as ready for review April 20, 2026 22:23
@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

The PR implements runtime aggregator toggling correctly with proper documentation and tests. A few minor observations:

Code Quality & Safety

  1. Memory Ordering (crates/common/types/src/aggregator.rs:18,22)
    Ordering::Relaxed is technically sufficient for a standalone flag, but consider Ordering::SeqCst for consensus-critical code to guarantee immediate cross-thread visibility without relying on architecture-specific guarantees. The performance difference is negligible for this use case.

  2. Admin API Authentication (crates/net/rpc/src/admin.rs)
    The endpoints lack authentication. While documented as an operational API, ensure the api_socket binds to localhost only by default (verify in main.rs), or add a warning in CLAUDE.md that exposing this endpoint to the network allows arbitrary toggling of aggregator duties.

Correctness

  1. Consistent Snapshot Pattern (crates/blockchain/src/lib.rs:124-127)
    The pattern of snapshotting is_aggregator once per tick is correct for consistency. However, note that metrics::set_is_aggregator is called with the tick-snapshot value, so metrics will lag behind the actual toggle if it happens between ticks. This is acceptable behavior but worth documenting if precise metric alignment is needed.

Style

  1. Test Helper (crates/net/rpc/src/admin.rs:165-179)
    The post helper function could be async fn post(controller: ..., body: &str) -> (Response, Value) to reduce boilerplate in test assertions, but the current explicit pattern is also fine.

Documentation

  1. CLAUDE.md Update (lines 347-353)
    The operational model documentation is excellent. Consider adding a warning log in main.rs if --is-aggregator=false is set at boot but the node later receives a toggle ON, since it will lack subnet subscriptions (the "frozen at startup" invariant).

The implementation correctly handles the hot-standby model limitations and the distinction between aggregation logic and gossip mesh subscriptions. The use of Option<Extension<_>> in axum handlers for graceful degradation is idiomatic.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Here is the code review for PR #309:


PR #309 — Runtime Aggregator Role Toggle

Overview

Ports leanSpec PR #636. Introduces AggregatorController (Arc<AtomicBool>) to replace the is_aggregator: bool in BlockChainServer, wiring runtime toggle endpoints (GET/POST /lean/v0/admin/aggregator) into the Axum API server. The blockchain actor snapshots the flag once per tick for consistency and reads fresh on gossip events. The design is clean, the scope limitations are well-documented, and the test coverage is solid.


Memory Ordering — aggregator.rs

pub fn is_enabled(&self) -> bool {
    self.flag.load(Ordering::Relaxed)
}

pub fn set_enabled(&self, enabled: bool) -> bool {
    self.flag.swap(enabled, Ordering::Relaxed)
}

Relaxed is technically sufficient here because the flag has no dependent shared data — there is no "once the flag is set, read this other shared buffer" pattern. On x86 it compiles identically to SeqCst. On weakly-ordered architectures (ARM, RISC-V), the worst case is the toggle being invisible for one extra tick, which is acceptable for an admin operation.

That said, the idiomatic Rust pattern for a producer/consumer flag is Release on write and Acquire on read, which establishes the expected happens-before relationship at zero cost on x86. Recommend:

pub fn is_enabled(&self) -> bool {
    self.flag.load(Ordering::Acquire)
}

pub fn set_enabled(&self, enabled: bool) -> bool {
    self.flag.swap(enabled, Ordering::Release)
}

Missing Tests in Diff — aggregator.rs

The new file is 26 lines with no embedded #[cfg(test)] block, but the PR description and CI output (4 passed) indicate unit tests exist. The gh pr diff output doesn't show them, which suggests they may be in a separate test file or the diff is incomplete. Worth double-checking — if they're inline they should appear in the file.


Error Response Format Inconsistency — admin.rs

Success responses are JSON:

json_response(StatusResponse { is_aggregator: controller.is_enabled() })

Error responses are plain text:

fn bad_request(reason: &'static str) -> Response {
    (StatusCode::BAD_REQUEST, reason).into_response()
}

This means Content-Type differs between 200 and 400/503 responses. Callers using a JSON client will get unexpected plain-text bodies on errors. Since this is a new endpoint, aligning errors with the success format (e.g., {"error": "..."}) would be cleaner, but this is a minor inconsistency consistent with what exists in the wider codebase.


Missing Test Coverage — Admin Handlers

The 10 tests are thorough for the documented cases, but two paths are untested:

  • POST with empty body: post_rejects_malformed_json covers "not json" but not a truly empty body (""). The Option<Json<Value>> extractor should handle this correctly but it's an easy test to add.
  • GET reflects value after POST: There is no test that POSTs a toggle and then GETs to confirm the value changed. The existing tests verify state via controller.is_enabled() directly, but end-to-end through HTTP is stronger.

Tick Snapshot vs. Gossip Fresh Read — Subtle Inconsistency

on_tick snapshots is_aggregator once and uses it throughout (including passing to produce_attestations for self-delivery at interval 1):

let is_aggregator = self.aggregator.is_enabled();
// ... passed through entire tick including produce_attestations

on_gossip_attestation reads fresh:

let is_aggregator = self.aggregator.is_enabled();
let _ = store::on_gossip_attestation(&mut self.store, attestation, is_aggregator)

This means if the flag is toggled between self-delivery (interval 1) and an incoming gossip attestation within the same tick, they'll use different values. This is intentional per the comment and is not a correctness issue — gossip events are logically independent of tick execution. The comment documents this clearly.


Option<Extension<T>> Usage — Correct Axum Pattern

pub async fn get_aggregator(controller: Option<Extension<AggregatorController>>) -> Response {

This is the correct approach to avoid axum's default 500 on a missing extension. The 503 fallback matches spec even though in normal operation the controller is always wired.


SwarmConfig Invariant Comment — Appropriate

The comment on SwarmConfig explaining why it holds a plain bool rather than the AggregatorController, and what to do if runtime P2P subscription is ever needed, is non-obvious and justified per the project's commenting guidelines.


Consensus Correctness

No concerns. The PR does not modify fork choice logic, attestation validation, state transition functions, justification/finalization, or signature verification. It only changes the read site for the aggregator flag from a struct field to an atomic. The snapshot-per-tick model preserves the existing behavior that all within-tick decisions see a consistent value.


Summary

  • Recommend: upgrade Ordering::Relaxed to Acquire/Release — low-risk, more idiomatic, no performance impact on x86.
  • Minor: error responses are plain text while success responses are JSON; consider aligning.
  • Minor: add an HTTP round-trip test (POST then GET) and an empty-body POST test.
  • Verify: confirm the 4 aggregator:: unit tests are present in the file (not visible in diff).

The core design is sound — Arc<AtomicBool>, tick-level snapshot for consistency, actor-side metric mirroring, and Option<Extension> for the 503 path are all correct choices. The scope limitations (no persistence, no subnet re-subscription, no auth) are well-documented and match the spec.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

  1. The new admin toggle is an unauthenticated state-changing endpoint on the same HTTP listener as the public API. If an operator binds --http-address to anything other than loopback, any remote client can enable or disable aggregation and directly affect attestation production/aggregation behavior. That is a security regression for node availability and operator control. The route is added unconditionally in crates/net/rpc/src/lib.rs:62, served on the general API socket in crates/net/rpc/src/lib.rs:19, and that socket is operator-configurable via bin/ethlambda/src/main.rs:53. This should be isolated behind a localhost-only admin listener or explicit auth/token gating.

  2. Toggling aggregation off does not clear already-buffered raw gossip signatures, and re-enabling later can aggregate and publish stale attestations from older slots. on_gossip_attestation stores signatures whenever the flag is on (crates/blockchain/src/store.rs:535, crates/blockchain/src/store.rs:579); when the flag is off, interval-2 aggregation is simply skipped (crates/blockchain/src/store.rs:508). Those buffered signatures are only pruned at finalization time (crates/storage/src/store.rs:763), and attestation validation only rejects future slots, not stale ones (crates/blockchain/src/store.rs:447). In practice, an operator can disable aggregation for a few slots, re-enable it, and emit delayed aggregates for old messages. I’d clear the gossip-signature buffer on set_enabled(false) or filter aggregation to a bounded freshness window.

The actor-side wiring itself looks clean: reading the flag per tick/event is straightforward, and the P2P comment about startup-only subnet subscriptions is clear. I couldn’t run the new tests locally because cargo/rustup is blocked in this environment by a read-only temp-path error.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 20, 2026

Greptile Summary

This PR ports leanSpec PR #636 to add two admin endpoints (GET/POST /lean/v0/admin/aggregator) that let operators toggle aggregation duties at runtime without a restart, backed by a shared Arc<AtomicBool> controller threaded into both the blockchain actor and the HTTP server via axum::Extension. The design is clean and well-scoped: the actor snapshots the flag once per tick, gossip handlers re-read it fresh, and the hot-standby model and its limitations are clearly documented.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style or coverage suggestions with no blocking correctness issues.

The core logic (AtomicBool toggle, Extension wiring, 503/400 error paths) is sound and backed by 10 admin-route tests plus the pre-existing suite. The only open items are a discrepancy between the PR description's claimed unit tests for AggregatorController and what's actually in the file, plus a style note on error-response content types — neither affects runtime behaviour.

crates/common/types/src/aggregator.rs — verify whether the claimed 4 unit tests were intentionally omitted or accidentally left out of the commit.

Important Files Changed

Filename Overview
crates/common/types/src/aggregator.rs New AggregatorController type wrapping Arc; correct Relaxed ordering for a periodic flag; claimed unit tests are not present in the file
crates/net/rpc/src/admin.rs GET/POST admin handlers with correct 503/400/200 paths; strict bool-only validation; 10 tests covering all branches; error responses are plain-text rather than JSON
crates/blockchain/src/lib.rs BlockChainServer now holds AggregatorController; flag snapshotted once per tick for consistency within a tick, re-read fresh on each gossip event; metric mirrored actor-side
crates/net/rpc/src/lib.rs AggregatorController threaded in via Extension layer; admin routes registered correctly; existing store-backed handlers unchanged
bin/ethlambda/src/main.rs Controller constructed from CLI flag, cloned into BlockChain::spawn and start_api_server; SwarmConfig intentionally keeps a plain bool with an explanatory comment
crates/common/types/src/lib.rs Exports the new aggregator module; straightforward addition
crates/net/p2p/src/lib.rs Adds INVARIANT doc comment on SwarmConfig.is_aggregator clarifying why the gossip-subscription flag stays a plain bool; no logic changes
CLAUDE.md New Runtime Aggregator Toggle gotcha section added; clearly documents the in-process-only scope and subnet subscription limitation

Sequence Diagram

sequenceDiagram
    participant Operator
    participant AdminAPI
    participant Controller
    participant Actor
    Operator->>AdminAPI: POST admin/aggregator
    AdminAPI->>Controller: set_enabled(bool)
    Controller-->>AdminAPI: previous
    AdminAPI-->>Operator: new and previous
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/common/types/src/aggregator.rs
Line: 1-26

Comment:
**Unit tests missing despite PR description claiming 4 passed**

The PR description's test plan explicitly states `cargo test -p ethlambda-types --lib aggregator:: — 4 passed` with a cross-thread visibility test using `Barrier`, but `aggregator.rs` contains no `#[cfg(test)]` block at all. The file ends at line 26 with only the struct and impls. Either the test module was accidentally omitted from the commit, or the description was written ahead of the actual work. Since `AggregatorController` is a thin `Arc<AtomicBool>` wrapper, its behavior is indirectly covered by the admin API tests, but the described standalone visibility tests (cross-thread `Barrier` proof) are absent.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/net/rpc/src/admin.rs
Line: 65-96

Comment:
**Missing `Content-Type` on error responses**

`bad_request` and `service_unavailable` return `(StatusCode, &str).into_response()`, which axum serialises as `text/plain`. Callers relying on `Content-Type: application/json` for all responses (e.g. generic error-parsing middleware) will need to special-case these. Consider returning a small JSON error body for consistency with the success responses:

```rust
fn bad_request(reason: &'static str) -> Response {
    (StatusCode::BAD_REQUEST, Json(serde_json::json!({"error": reason}))).into_response()
}

fn service_unavailable(reason: &'static str) -> Response {
    (StatusCode::SERVICE_UNAVAILABLE, Json(serde_json::json!({"error": reason}))).into_response()
}
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Merge branch 'main' into runtime-aggrega..." | Re-trigger Greptile

Comment thread crates/common/types/src/aggregator.rs
Comment on lines +65 to +96
pub async fn post_aggregator(
controller: Option<Extension<AggregatorController>>,
body: Option<Json<Value>>,
) -> Response {
let Some(Extension(controller)) = controller else {
return service_unavailable("Aggregator controller not available");
};

// Parsing happens through `Option<Json<Value>>` so we can distinguish
// "no body / malformed JSON" (None) from "valid JSON with wrong shape".
let Some(Json(payload)) = body else {
return bad_request("Invalid or missing JSON body");
};

let Some(enabled_value) = payload.get("enabled") else {
return bad_request("Missing 'enabled' field in body");
};

let Some(enabled) = enabled_value.as_bool() else {
return bad_request("'enabled' must be a boolean");
};

let previous = controller.set_enabled(enabled);
if previous != enabled {
info!(enabled, previous, "Aggregator role toggled via admin API");
}

json_response(ToggleResponse {
is_aggregator: enabled,
previous,
})
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing Content-Type on error responses

bad_request and service_unavailable return (StatusCode, &str).into_response(), which axum serialises as text/plain. Callers relying on Content-Type: application/json for all responses (e.g. generic error-parsing middleware) will need to special-case these. Consider returning a small JSON error body for consistency with the success responses:

fn bad_request(reason: &'static str) -> Response {
    (StatusCode::BAD_REQUEST, Json(serde_json::json!({"error": reason}))).into_response()
}

fn service_unavailable(reason: &'static str) -> Response {
    (StatusCode::SERVICE_UNAVAILABLE, Json(serde_json::json!({"error": reason}))).into_response()
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/net/rpc/src/admin.rs
Line: 65-96

Comment:
**Missing `Content-Type` on error responses**

`bad_request` and `service_unavailable` return `(StatusCode, &str).into_response()`, which axum serialises as `text/plain`. Callers relying on `Content-Type: application/json` for all responses (e.g. generic error-parsing middleware) will need to special-case these. Consider returning a small JSON error body for consistency with the success responses:

```rust
fn bad_request(reason: &'static str) -> Response {
    (StatusCode::BAD_REQUEST, Json(serde_json::json!({"error": reason}))).into_response()
}

fn service_unavailable(reason: &'static str) -> Response {
    (StatusCode::SERVICE_UNAVAILABLE, Json(serde_json::json!({"error": reason}))).into_response()
}
```

How can I resolve this? If you propose a fix, please make it concise.

@MegaRedHand MegaRedHand merged commit 0f7c4fe into main Apr 21, 2026
2 checks passed
@MegaRedHand MegaRedHand deleted the runtime-aggregator-toggle branch April 21, 2026 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants