Add on-device eval agent example (ZEDCLOUD-2462) by adithya-zededa · Pull Request #31 · zededa/examples

adithya-zededa · 2026-05-13T22:41:44Z

Summary

Publishes the ondevice-eval-agent application as an example under
edgeai/ondevice-eval-agent/. The agent
runs alongside an inference server (Triton / OpenVINO) at the edge and
exposes:

- A Flask backend that proxies inference, performs model discovery, and
  surfaces ~16 MCP-style tools the LLM can call to introspect tensors,
  generate integration code, and run inference end-to-end.
- A React SPA (Vite + Tailwind) for chat-driven model exploration and
  visualization of inference results.
- A multi-provider LLM router (Anthropic, OpenAI, Google Gemini, Groq,
  Ollama, and any OpenAI-compatible server such as vLLM / LM Studio /
  TGI) with rate-limit, retry, and SSE streaming support.
- A single multi-stage Dockerfile that builds the SPA natively and
  serves it together with the API on port 8080.
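For reviewers unfamiliar with the pattern, the MCP-style tool registry mentioned in the summary can be pictured roughly as below. This is a minimal sketch, not the code in this PR: the names `TOOLS`, `tool`, `call_tool`, and `describe_tensor` are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: maps a tool name to its callable and description.
TOOLS: Dict[str, Dict[str, Any]] = {}


def tool(name: str, description: str) -> Callable:
    """Register a function as a tool the LLM may call by name."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = {"fn": fn, "description": description}
        return fn
    return decorator


@tool("describe_tensor", "Summarize a tensor's shape and element count.")
def describe_tensor(shape: list, dtype: str) -> dict:
    # Multiply the dimensions to get the total number of elements.
    count = 1
    for dim in shape:
        count *= dim
    return {"shape": shape, "dtype": dtype, "elements": count}


def call_tool(name: str, arguments: dict) -> dict:
    """Dispatch a tool call and wrap the outcome in a uniform envelope."""
    entry = TOOLS.get(name)
    if entry is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": entry["fn"](**arguments)}
    except Exception as exc:  # report failures back to the model as data
        return {"ok": False, "error": str(exc)}
```

Returning errors as data rather than raising lets the LLM see a failed call and adjust its next request; whether the agent in this PR does the same is not stated in the description.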

All credentials are read from environment variables; no secrets are
bundled. Internal-only docs (CODE_REVIEW_REPORT.md,
DEBUG-REQUIREMENT.md, FEATURES_PRESENTATION.md), build artifacts
(node_modules/, dist/, __pycache__/, .pytest_cache/,
tsconfig.tsbuildinfo, package-lock.json), and internal Jira ticket
references in code comments were scrubbed before this PR.
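As an illustration of the environment-only credential handling described above, a settings loader might look like the following. This is a sketch under stated assumptions: only `MODEL_SERVER_URL` and `ANTHROPIC_API_KEY` appear in this PR's test plan; the other variable names and the `load_settings` helper are hypothetical.

```python
import os
from typing import Mapping, Optional

# Only ANTHROPIC_API_KEY is confirmed by the test plan; the rest are assumed.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",  # assumed variable name
    "groq": "GROQ_API_KEY",      # assumed variable name
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build runtime settings from environment variables only."""
    env = os.environ if env is None else env
    model_server_url = env.get("MODEL_SERVER_URL")
    if not model_server_url:
        raise RuntimeError("MODEL_SERVER_URL is not set")
    # Enable only the providers whose key is present and non-empty.
    providers = {
        name: env[var] for name, var in PROVIDER_KEYS.items() if env.get(var)
    }
    return {
        "model_server_url": model_server_url.rstrip("/"),
        "providers": providers,
    }
```

Passing `env` explicitly keeps the loader easy to unit-test without touching the real process environment.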

Jira: ZEDCLOUD-2462

Test plan

- `cd edgeai/ondevice-eval-agent && docker build -t ondevice-eval-agent .` succeeds.
- `docker run --rm -p 8080:8080 -e MODEL_SERVER_URL=http://triton:8000 -e ANTHROPIC_API_KEY=... ondevice-eval-agent` starts and `GET /agent/status` returns 200.
- SPA loads at http://localhost:8080/ and the chat panel reaches the backend.
- Backend tests pass locally: `pip install -r requirements.txt && pytest tests/`.
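The status check in the test plan could be scripted along these lines. This is a minimal sketch using only the standard library; the helper names `http_status` and `wait_until_ready` are hypothetical and not part of this PR.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, etc.


def wait_until_ready(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    ok = wait_until_ready("http://localhost:8080/agent/status")
    print("ready" if ok else "not ready")
```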

Adds the ondevice-eval-agent application under edgeai/ as an example that runs alongside an inference server (Triton / OpenVINO) at the edge. It provides:

- Flask backend that proxies inference and exposes model discovery / introspection tools via an MCP-style tool registry
- React SPA for chat-driven model exploration, inference, and result visualization
- LLM router that supports Anthropic, OpenAI, Google, Groq, Ollama, and any OpenAI-compatible endpoint, with optional rate-limit and retry handling
- Single-image Dockerfile that builds the SPA and serves it together with the API on port 8080

Reads all API keys from environment variables; no secrets are bundled.
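The retry handling mentioned in the commit message is not detailed here. One common approach is exponential backoff with jitter, sketched below as an assumption about the general technique rather than a description of the router in this PR; `RetryableError` and `call_with_retry` are hypothetical names.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for transient failures (e.g. HTTP 429 or 5xx)."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the ceiling each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random time in [0, ceiling].
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Jitter spreads out retries from concurrent requests so they do not hit a rate-limited provider at the same instant.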

adithya-zededa requested a review from cshari-zededa May 13, 2026 22:42

adithya-zededa wants to merge 1 commit into zededa:main from adithya-zededa:feature/zedcloud-2462-ondevice-eval-agent