Add on-device eval agent example (ZEDCLOUD-2462) #31

Open
adithya-zededa wants to merge 1 commit into zededa:main from adithya-zededa:feature/zedcloud-2462-ondevice-eval-agent

@adithya-zededa adithya-zededa commented May 13, 2026

Summary

Adds the ondevice-eval-agent application as an example under
edgeai/ondevice-eval-agent/. The agent
runs alongside an inference server (Triton / OpenVINO) at the edge and
exposes:

  • A Flask backend that proxies inference, performs model discovery, and
    surfaces ~16 MCP-style tools the LLM can call to introspect tensors,
    generate integration code, and run inference end-to-end.
  • A React SPA (Vite + Tailwind) for chat-driven model exploration and
    visualization of inference results.
  • A multi-provider LLM router (Anthropic, OpenAI, Google Gemini, Groq,
    Ollama, and any OpenAI-compatible server such as vLLM / LM Studio /
    TGI) with rate-limit handling, retries, and SSE streaming support.
  • A single multi-stage Dockerfile that builds the SPA natively and
    serves it together with the API on port 8080.
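For reviewers unfamiliar with the pattern, an MCP-style tool registry can be sketched as a name-to-handler map with a dispatcher. This is an illustration only, not the code in this PR: `TOOLS`, `tool`, `call_tool`, and the `describe_tensor` tool are hypothetical names chosen for the example.

```python
from typing import Any, Callable, Dict

# Hypothetical registry; the PR's actual tool names and signatures may differ.
TOOLS: Dict[str, Dict[str, Any]] = {}


def tool(name: str, description: str) -> Callable:
    """Register the decorated function as a callable tool under `name`."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = {"description": description, "handler": fn}
        return fn
    return decorator


@tool("describe_tensor", "Summarize a tensor's shape and element count.")
def describe_tensor(shape: list, dtype: str) -> dict:
    # Multiply the dimensions to get the total number of elements.
    count = 1
    for dim in shape:
        count *= dim
    return {"shape": shape, "dtype": dtype, "elements": count}


def call_tool(name: str, arguments: dict) -> dict:
    """Dispatch a tool call by name; unknown tools yield an error payload."""
    entry = TOOLS.get(name)
    if entry is None:
        return {"error": f"unknown tool: {name}"}
    return {"result": entry["handler"](**arguments)}
```

A call such as `call_tool("describe_tensor", {"shape": [1, 3, 224, 224], "dtype": "FP32"})` returns the handler's output wrapped in a `result` key, which is the kind of payload an LLM tool-call loop would feed back to the model.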

All credentials are read from environment variables; no secrets are
bundled. Internal-only docs (CODE_REVIEW_REPORT.md,
DEBUG-REQUIREMENT.md, FEATURES_PRESENTATION.md), build artifacts
(node_modules/, dist/, __pycache__/, .pytest_cache/,
tsconfig.tsbuildinfo, package-lock.json), and internal Jira ticket
references in code comments were removed before opening this PR.
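As a rough illustration of env-only credentials, the sketch below lists which providers have a key set. Only `ANTHROPIC_API_KEY` appears in this PR; the other variable names and the `configured_providers` helper are assumptions made for the example.

```python
import os
from typing import List, Mapping

# Only ANTHROPIC_API_KEY is named in this PR; the others are assumed names.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",  # assumed
    "groq": "GROQ_API_KEY",      # assumed
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return providers whose API key variable is set and non-blank."""
    return [
        provider
        for provider, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.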

Jira: ZEDCLOUD-2462

Test plan

  • cd edgeai/ondevice-eval-agent && docker build -t ondevice-eval-agent . succeeds.
  • docker run --rm -p 8080:8080 -e MODEL_SERVER_URL=http://triton:8000 -e ANTHROPIC_API_KEY=... ondevice-eval-agent starts and GET /agent/status returns 200.
  • SPA loads at http://localhost:8080/ and the chat panel reaches the backend.
  • Backend tests pass locally: pip install -r requirements.txt && pytest tests/.
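The status check in the second item can be scripted. The following is a minimal sketch that polls the endpoint until it answers 200; the `wait_for_status` helper and its timeout values are illustrative and not part of this PR.

```python
import time
import urllib.error
import urllib.request


def wait_for_status(url: str, timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Container still starting, or a non-2xx response; retry below.
            pass
        time.sleep(interval_s)
    return False
```

With the container from the `docker run` step up, `wait_for_status("http://localhost:8080/agent/status")` should return `True` once the backend is ready.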

Adds the ondevice-eval-agent application under edgeai/ as an example
that runs alongside an inference server (Triton / OpenVINO) at the
edge. It provides:

- Flask backend that proxies inference and exposes model discovery /
  introspection tools via an MCP-style tool registry
- React SPA for chat-driven model exploration, inference, and result
  visualization
- LLM router that supports Anthropic, OpenAI, Google, Groq, Ollama,
  and any OpenAI-compatible endpoint, with optional rate-limit and
  retry handling
- Single-image Dockerfile that builds the SPA and serves it together
  with the API on port 8080

Reads all API keys from environment variables; no secrets are bundled.
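
The rate-limit and retry handling mentioned for the LLM router could look roughly like the sketch below, which uses exponential backoff with jitter. It is an assumption about the general approach, not the PR's implementation; `RateLimited` and `call_with_retry` are hypothetical names.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error a provider client raises on HTTP 429."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts:
                raise  # Out of attempts; surface the error to the caller.
            # Delay doubles each attempt; jitter adds up to 50% on top.
            delay = base_delay_s * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    # Only reached when max_attempts < 1.
    raise RuntimeError("max_attempts must be at least 1")
```

The injectable `sleep` parameter is a design choice that lets tests record delays instead of actually waiting.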