feat(security): Spec 076 US2 — three soft checks + per-match pattern confidence (MCP-3577)#775
feat(security): Spec 076 US2 — three soft checks + per-match pattern confidence (MCP-3577)#775Dumbris wants to merge 1 commit into
Conversation
…confidence (MCP-3577)
Adds the US2 false-positive-discriminating SOFT checks to the Spec-076
offline detect engine, plus per-match confidence on the reused secret
matchers. Soft signals raise a finding for review and never auto-quarantine.
- T013 checks/directive_imperative.go: prompt-injection directives
(<IMPORTANT> tags, 'do not tell the user', 'ignore previous
instructions', 'before using this tool') matched over NORMALIZED text
with position discounting so example-position mentions are suppressed.
- T014 checks/capability_mismatch.go: declared-vs-implied capability gap
(a compute/string tool touching ~/.ssh, /etc/passwd, a URL or shell)
plus an unexplained data-sink param ('sidenote'); legitimate file/network
tools are not flagged.
- T015 internal/security/patterns: additive per-match confidence
(WithConfidence builder + ConfidenceFor) — Luhn-validated card 0.95,
generic bearer 0.3, documented examples 0.1, severity defaults otherwise.
Existing Match/IsValid/Scan behavior is unchanged.
- T016 checks/embedded_secret.go: wraps the patterns matchers with
confidence and masked evidence, skipping documented placeholders; the
three soft checks are registered in the scanner detect-engine wiring.
TDD with MUST-flag and hard-negative MUST-NOT-flag cases for each check.
Coordination: detectEngineFindings in inprocess.go is the shared US1/US2
integration point; this branch registers the three SOFT checks and the
Checks slice is the merge point with US1's hard checks (#770).
Deploying mcpproxy-docs with
|
| Latest commit: |
3bee012
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2bceb451.mcpproxy-docs.pages.dev |
| Branch Preview URL: | https://076-t3-soft-checks.mcpproxy-docs.pages.dev |
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
📦 Build ArtifactsWorkflow Run: View Run Available Artifacts
How to DownloadOption 1: GitHub Web UI (easiest)
Option 2: GitHub CLI gh run download 28261692057 --repo smart-mcp-proxy/mcpproxy-go
|
There was a problem hiding this comment.
✅ Gatekeeper approval — review verdict: ACCEPT (model-diverse fallback; CodexReviewer stale).
CodexReviewer stale on 3bee012b27ed9f14968dacf575c6e9409fb6e682; verdict of record: KimiReviewer ACCEPT on 3bee012b27ed9f14968dacf575c6e9409fb6e682.
CodexReviewer was handed the current head but only re-posted a verdict on superseded code (stale), so model-diverse fallback reviewer KimiReviewer reviewed and ACCEPTed the current head — that stands in as the verdict of record (narrow exception to Codex-mandatory, MCP-3120). A fresh Codex verdict on the current head would always win. Author≠approver satisfied; QA + CI gates enforced separately.
Auto-approved per Model B (MCP-1249) + reviewer-fallback (MCP-3066/MCP-3120).
Spec 076 · T3 / US2 (MCP-3577)
Adds the US2 false-positive-discriminating SOFT checks to the Spec-076 offline
detectengine, plus per-match confidence on the reused secret matchers. Soft signals raise a finding for human review and never auto-quarantine.Tasks
checks/directive_imperative.go— prompt-injection directives (<IMPORTANT>tags, "do not tell the user", "ignore previous instructions", "before using this tool") matched over normalized text with position discounting (detect.ClassifyPosition), so a phrase quoted/illustrated in example-position ("detects prompts such as 'ignore previous instructions'") is suppressed below the emit floor.checks/capability_mismatch.go— declared-vs-implied capability gap (a compute/string tool that touches~/.ssh,/etc/passwd, an external URL, or a shell) plus an unexplained data-sink param (sidenote). Declared category is anchored on the tool name + lead sentence, so a legitimate file/network tool that genuinely declares resource access is not flagged.internal/security/patterns/— additive per-match confidence:WithConfidencebuilder +ConfidenceFor(match). Luhn-validated card →0.95, generic bearer →0.3, documented examples (AKIA…EXAMPLE) →0.1, severity defaults otherwise. ExistingMatch/IsValid/Scanbehavior is unchanged (new field/method only).checks/embedded_secret.go— wraps thepatternsmatchers with confidence + masked evidence (never echoes the full secret), skipping documented placeholders. All three soft checks are registered in the scanner detect-engine wiring.Testing (TDD)
Each check ships MUST-flag and hard-negative MUST-NOT-flag cases (per
contracts/detect-engine.md). Verified locally:go test ./internal/security/... -race— greengolangci-lint run --config .github/.golangci.yml ./internal/security/...— 0 issuesgo vet ./...— cleanCoordination with US1 (#770)
detectEngineFindingsininternal/security/scanner/inprocess.gois the single shared integration point between US1's hard checks (#770) and these US2 soft checks. This branch registers the three SOFT checks; theChecksslice is the documented merge point — when both land, combine into the full six-check set. Branched offorigin/main(parent = T1#769); the legacy phrase/secret heuristics are retained alongside (no regression).Closes MCP-3577.