feat(security): Spec 076 US2 — three soft checks + per-match pattern confidence (MCP-3577)#775

Open
Dumbris wants to merge 1 commit into main from 076-t3-soft-checks

@Dumbris (Member) commented Jun 26, 2026

Spec 076 · T3 / US2 (MCP-3577)

Adds the US2 false-positive-discriminating SOFT checks to the Spec-076 offline detect engine, plus per-match confidence on the reused secret matchers. Soft signals raise a finding for human review and never auto-quarantine.

Tasks

  • T013 checks/directive_imperative.go — prompt-injection directives (<IMPORTANT> tags, "do not tell the user", "ignore previous instructions", "before using this tool") matched over normalized text with position discounting (detect.ClassifyPosition), so a phrase quoted/illustrated in example-position ("detects prompts such as 'ignore previous instructions'") is suppressed below the emit floor.
  • T014 checks/capability_mismatch.go — declared-vs-implied capability gap (a compute/string tool that touches ~/.ssh, /etc/passwd, an external URL, or a shell) plus an unexplained data-sink param (sidenote). Declared category is anchored on the tool name + lead sentence, so a legitimate file/network tool that genuinely declares resource access is not flagged.
  • T015 internal/security/patterns/ — additive per-match confidence: WithConfidence builder + ConfidenceFor(match). Luhn-validated card → 0.95, generic bearer → 0.3, documented examples (AKIA…EXAMPLE) → 0.1, severity defaults otherwise. Existing Match/IsValid/Scan behavior is unchanged (new field/method only).
  • T016 checks/embedded_secret.go — wraps the patterns matchers with confidence + masked evidence (never echoes the full secret), skipping documented placeholders. All three soft checks are registered in the scanner detect-engine wiring.
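The per-match confidence and masked-evidence ideas in T015/T016 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `confidenceFor`, `maskEvidence`, the `kind` strings, and the "keep the first four characters" masking rule are all hypothetical, and the severity default is passed in by the caller because the actual defaults are not listed here. Only the three confidence values (0.95, 0.3, 0.1) come from the description above.

```go
package main

import "strings"

// luhnValid reports whether s passes the Luhn checksum.
// Spaces and dashes are skipped; any other non-digit fails the check.
func luhnValid(s string) bool {
	sum, digits := 0, 0
	double := false
	for i := len(s) - 1; i >= 0; i-- {
		c := s[i]
		if c == ' ' || c == '-' {
			continue
		}
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9
			}
		}
		sum += d
		double = !double
		digits++
	}
	// Assumption: require at least 12 digits to count as a card number.
	return digits >= 12 && sum%10 == 0
}

// confidenceFor is a hypothetical stand-in for a per-match confidence lookup.
// severityDefault is whatever the pattern's severity would otherwise imply.
func confidenceFor(kind, match string, severityDefault float64) float64 {
	// Documented placeholders (for example keys ending in EXAMPLE) score lowest.
	if strings.Contains(strings.ToUpper(match), "EXAMPLE") {
		return 0.1
	}
	switch kind {
	case "credit_card":
		if luhnValid(match) {
			return 0.95
		}
	case "bearer":
		return 0.3
	}
	return severityDefault
}

// maskEvidence keeps a short prefix and masks the rest, so a finding
// never echoes the full secret. The prefix length is an assumption.
func maskEvidence(secret string) string {
	const keep = 4
	r := []rune(secret)
	if len(r) <= keep {
		return strings.Repeat("*", len(r))
	}
	return string(r[:keep]) + strings.Repeat("*", len(r)-keep)
}
```

Under these assumptions, a caller would compute `confidenceFor(kind, match, def)` for each match, skip or down-rank anything at the placeholder level, and store only `maskEvidence(match)` in the finding.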

Testing (TDD)

Each check ships MUST-flag and hard-negative MUST-NOT-flag cases (per contracts/detect-engine.md). Verified locally:

  • go test ./internal/security/... -race — green
  • golangci-lint run --config .github/.golangci.yml ./internal/security/... — 0 issues
  • go vet ./... — clean
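To illustrate how a MUST-flag case and a hard-negative MUST-NOT-flag case can exercise the same check, here is a small sketch of position discounting for the directive check. It is not the PR's implementation: `inExamplePosition` is a hypothetical, much cruder stand-in for `detect.ClassifyPosition`, and the base score, discount, emit floor, and cue words are invented values chosen only to make the behavior visible.

```go
package main

import "strings"

// All three constants are assumptions for this sketch.
const (
	baseScore       = 0.8  // score for a directive addressed to the model
	exampleDiscount = 0.25 // multiplier when the phrase is only quoted or illustrated
	emitFloor       = 0.5  // findings below this score are suppressed
)

// Phrases taken from the PR description, lowercased to match normalized text.
var directivePhrases = []string{
	"<important>",
	"do not tell the user",
	"ignore previous instructions",
	"before using this tool",
}

// Hypothetical cues that suggest the phrase is being illustrated, not issued.
var exampleCues = []string{"such as", "for example", "e.g."}

// normalize lowercases and collapses whitespace. A real normalizer would
// likely do more (Unicode folding, invisible characters, and so on).
func normalize(s string) string {
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

// inExamplePosition looks at the 40 bytes before the match for an opening
// quote or an example cue.
func inExamplePosition(text string, idx int) bool {
	start := idx - 40
	if start < 0 {
		start = 0
	}
	window := text[start:idx]
	if strings.HasSuffix(window, "'") || strings.HasSuffix(window, "\"") {
		return true
	}
	for _, cue := range exampleCues {
		if strings.Contains(window, cue) {
			return true
		}
	}
	return false
}

// directiveScore returns the highest score over all phrases, considering
// only the first occurrence of each phrase.
func directiveScore(desc string) float64 {
	text := normalize(desc)
	best := 0.0
	for _, p := range directivePhrases {
		idx := strings.Index(text, p)
		if idx < 0 {
			continue
		}
		score := baseScore
		if inExamplePosition(text, idx) {
			score *= exampleDiscount
		}
		if score > best {
			best = score
		}
	}
	return best
}

// shouldEmit applies the emit floor.
func shouldEmit(desc string) bool {
	return directiveScore(desc) >= emitFloor
}
```

With these values, a description that opens with "Before using this tool, ..." is emitted, while "Detects prompts such as 'ignore previous instructions'" is discounted below the floor. A table-driven test can then pair each MUST-flag input with its closest hard negative.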

Coordination with US1 (#770)

detectEngineFindings in internal/security/scanner/inprocess.go is the single shared integration point between US1's hard checks (#770) and these US2 soft checks. This branch registers the three SOFT checks; the Checks slice is the documented merge point — when both land, combine into the full six-check set. Branched off origin/main (parent = T1 #769); the legacy phrase/secret heuristics are retained alongside (no regression).
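As a rough picture of the merge point, the sketch below combines two check lists and keeps soft findings out of any auto-quarantine decision. The types and function names are hypothetical and do not mirror `detectEngineFindings` or the real `Checks` slice. The soft check IDs are derived from the file names in this PR, the hard check IDs are left to the caller because they live in #770, and the rule that hard findings may quarantine is an assumption.

```go
package main

// Check is a hypothetical, simplified registration entry.
type Check struct {
	ID   string
	Soft bool // soft checks raise a finding for human review only
}

// Finding is a hypothetical result record.
type Finding struct {
	CheckID string
	Soft    bool
}

// softChecks lists the three US2 checks, named after their source files.
func softChecks() []Check {
	return []Check{
		{ID: "directive_imperative", Soft: true},
		{ID: "capability_mismatch", Soft: true},
		{ID: "embedded_secret", Soft: true},
	}
}

// mergeChecks concatenates check groups in order and drops duplicate IDs,
// so registering a check on both branches does not run it twice.
func mergeChecks(groups ...[]Check) []Check {
	seen := make(map[string]bool)
	var merged []Check
	for _, g := range groups {
		for _, c := range g {
			if seen[c.ID] {
				continue
			}
			seen[c.ID] = true
			merged = append(merged, c)
		}
	}
	return merged
}

// mayAutoQuarantine reports whether any finding came from a non-soft check.
// Soft findings alone never trigger it. Treating hard findings as
// quarantine-eligible is an assumption of this sketch.
func mayAutoQuarantine(findings []Finding) bool {
	for _, f := range findings {
		if !f.Soft {
			return true
		}
	}
	return false
}
```

When both branches land, the combined set would be something like `mergeChecks(hardChecks, softChecks())`, where `hardChecks` stands for whatever #770 registers.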

Closes MCP-3577.

…confidence (MCP-3577)

Adds the US2 false-positive-discriminating SOFT checks to the Spec-076
offline detect engine, plus per-match confidence on the reused secret
matchers. Soft signals raise a finding for review and never auto-quarantine.

- T013 checks/directive_imperative.go: prompt-injection directives
  (<IMPORTANT> tags, 'do not tell the user', 'ignore previous
  instructions', 'before using this tool') matched over NORMALIZED text
  with position discounting so example-position mentions are suppressed.
- T014 checks/capability_mismatch.go: declared-vs-implied capability gap
  (a compute/string tool touching ~/.ssh, /etc/passwd, a URL or shell)
  plus an unexplained data-sink param ('sidenote'); legitimate file/network
  tools are not flagged.
- T015 internal/security/patterns: additive per-match confidence
  (WithConfidence builder + ConfidenceFor) — Luhn-validated card 0.95,
  generic bearer 0.3, documented examples 0.1, severity defaults otherwise.
  Existing Match/IsValid/Scan behavior is unchanged.
- T016 checks/embedded_secret.go: wraps the patterns matchers with
  confidence and masked evidence, skipping documented placeholders; the
  three soft checks are registered in the scanner detect-engine wiring.
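
The declared-vs-implied gap described for T014 could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of capability_mismatch.go: the capability names, keyword lists, marker lists, and the "first sentence ends at the first period followed by a space" rule are all hypothetical, and substring matching is deliberately crude. The data-sink parameter part of the check is not covered.

```go
package main

import "strings"

type capability string

const (
	capFile    capability = "file"
	capNetwork capability = "network"
	capShell   capability = "shell"
)

// Fixed order keeps the result deterministic.
var allCaps = []capability{capFile, capNetwork, capShell}

// Hypothetical keywords that count as declaring a capability when they
// appear in the tool name or lead sentence.
var declaredKeywords = map[capability][]string{
	capFile:    {"file", "read", "write", "path"},
	capNetwork: {"fetch", "http", "url", "download"},
	capShell:   {"shell", "command", "exec"},
}

// Hypothetical markers that imply a capability anywhere in the description.
var impliedMarkers = map[capability][]string{
	capFile:    {"~/.ssh", "/etc/passwd"},
	capNetwork: {"http://", "https://"},
	capShell:   {"/bin/sh", "bash -c"},
}

func containsAny(text string, needles []string) bool {
	for _, n := range needles {
		if strings.Contains(text, n) {
			return true
		}
	}
	return false
}

// leadSentence returns the text up to the first ". " or the whole string.
func leadSentence(desc string) string {
	if i := strings.Index(desc, ". "); i >= 0 {
		return desc[:i]
	}
	return desc
}

// capabilityGaps returns capabilities the description implies but the
// tool name and lead sentence never declare.
func capabilityGaps(toolName, desc string) []capability {
	anchor := strings.ToLower(toolName + " " + leadSentence(desc))
	body := strings.ToLower(desc)
	var gaps []capability
	for _, c := range allCaps {
		if containsAny(body, impliedMarkers[c]) && !containsAny(anchor, declaredKeywords[c]) {
			gaps = append(gaps, c)
		}
	}
	return gaps
}
```

Anchoring the declared side on the name and lead sentence is what lets a tool such as `read_file` mention `/etc/passwd` in an example without being flagged, while an arithmetic tool that mentions `~/.ssh` later in its description is.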

TDD with MUST-flag and hard-negative MUST-NOT-flag cases for each check.

Coordination: detectEngineFindings in inprocess.go is the shared US1/US2
integration point; this branch registers the three SOFT checks and the
Checks slice is the merge point with US1's hard checks (#770).
@cloudflare-workers-and-pages

Deploying mcpproxy-docs with Cloudflare Pages

Latest commit: 3bee012
Status: ✅  Deploy successful!
Preview URL: https://2bceb451.mcpproxy-docs.pages.dev
Branch Preview URL: https://076-t3-soft-checks.mcpproxy-docs.pages.dev

@codecov-commenter

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 88.18182% with 26 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rnal/security/detect/checks/capability_mismatch.go | 84.21% | 10 Missing and 2 partials ⚠️ |
| internal/security/detect/checks/embedded_secret.go | 80.39% | 6 Missing and 4 partials ⚠️ |
| ...nal/security/detect/checks/directive_imperative.go | 92.00% | 1 Missing and 1 partial ⚠️ |
| internal/security/patterns/patterns.go | 90.47% | 2 Missing ⚠️ |

@github-actions

📦 Build Artifacts

Workflow Run: View Run
Branch: 076-t3-soft-checks

Available Artifacts

  • archive-darwin-amd64 (28 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (28 MB)
  • archive-windows-arm64 (25 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (19 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 28261692057 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

@mcpproxy-gatekeeper (bot) left a comment

Gatekeeper approval — review verdict: ACCEPT (model-diverse fallback; CodexReviewer stale).

CodexReviewer stale on 3bee012b27ed9f14968dacf575c6e9409fb6e682; verdict of record: KimiReviewer ACCEPT on 3bee012b27ed9f14968dacf575c6e9409fb6e682.

CodexReviewer was handed the current head but only re-posted a verdict on superseded code (stale), so model-diverse fallback reviewer KimiReviewer reviewed and ACCEPTed the current head — that stands in as the verdict of record (narrow exception to Codex-mandatory, MCP-3120). A fresh Codex verdict on the current head would always win. Author≠approver satisfied; QA + CI gates enforced separately.

Auto-approved per Model B (MCP-1249) + reviewer-fallback (MCP-3066/MCP-3120).
