Skip to content

Latest commit

 

History

History
167 lines (84 loc) · 12.9 KB

File metadata and controls

167 lines (84 loc) · 12.9 KB

Issue Code Reference

This is the reference for all issues that can be detected via snyk-agent-scan.


Compromised MCP Servers

These issues highlight that the installed MCP server is acting maliciously. This yields a very important security threat, as it effectivly exposes a successful supply chain attack

E001 | critical MCP Prompt injection in tool description

Detected a prompt injection in the tool description. The tool should be deactivated immediately.

This is a form of tool poisoning where a malicious MCP server embeds hidden adversarial instructions inside its tool descriptions to hijack the agent. These injections are designed to be invisible to the user while manipulating the agent's behavior.

E002 | high MCP Cross-server tool reference (tool shadowing)

A tool from one MCP server references a tool belonging to a different server. MCP servers should be self-contained, when a server's tool description mentions tools from another server, it can poison the behavior of those trusted tools.

This is a tool shadowing attack, where a malicious server overrides or interferes with the behavior of a legitimate tool from another server.

W001 | low MCP Suspicious words in tool description

The tool description contains words commonly associated with prompt injection attempts, such as "important", "crucial", "critical", "vital", "urgent", "ignore", "disregard", "override", or "bypass". These words are often used in tool poisoning attacks to draw the agent's attention and override its normal decision-making process.

While the presence of these words alone does not confirm malicious intent, it is a signal worth investigating, especially when combined with unusual tool descriptions.


Research from Invariant Labs and Simon Willison demonstrates how a critical vulnerability arises when an agent simultaneously has: exposure to untrusted content, access to private data, and the ability to communicate with the internet. This combination is referred to as a toxic flow.

In practice, most agents possess all three of these characteristics and are therefore inherently vulnerable. For this reason, our analysis focuses on each component individually. We flag servers that expose the agent to untrusted data (W015, W016); servers that expose particularly sensitive private data that should be excluded from the agent's context (W017, W018); and servers with potentially destructive capabilities (W019, W020). We currently do not separately flag internet communication capabilities, as most agents include network access tools (such as curl) by default.

W015 | medium MCP Untrusted content detected

The MCP server exposes the agent to potentially untrusted content from unverified third-party data sources. This creates a risk of indirect prompt injection, where malicious content from external sources could manipulate the agent's behavior.

High-confidence detection indicates tools that clearly fetch or process data from user-controllable or public sources without adequate sanitization.

W016 | low MCP Potential untrusted content detected

Lower-confidence detection of potential untrusted content exposure. The MCP server may expose the agent to content from third-party sources, but the risk is less certain than W015.

This warning suggests investigating whether the tool properly validates and sanitizes external data before presenting it to the agent.

W017 | medium MCP Sensitive data exposure

The MCP server's explicit, named purpose is to integrate with and retrieve highly sensitive or non-public information, such as personal communications (emails, direct messages, private chat histories), financial records (bank details, account balances), or credential vaults (passwords, API keys, secure tokens). This exposes critically private user data directly into the agent's context.

Loading sensitive data into an AI agent's context dramatically increases the risk of unauthorized disclosure through prompt injection attacks, logging leaks, or model provider access. Once private emails, credentials, or financial information enters the agent's session, a single exploit or misconfiguration can exfiltrate confidential data to attackers or expose it in unintended system outputs.

W018 | low MCP Workspace data exposure

The MCP server grants access to tools that retrieve local workspace files, project code, or user-managed data stores. While less sensitive than personal communications or financial records (W017), these tools still expose non-public information such as proprietary code, local notes, or development artifacts into the agent's context.

Loading workspace data into an AI agent's context introduces risk of intellectual property leakage or inadvertent disclosure of internal project details. If the agent is compromised through prompt injection or logging vulnerabilities, attackers can access proprietary source code, architectural decisions, or internal documentation that should remain confidential.

W019 | medium MCP Destructive capabilities

The MCP server grants access to tools that can modify shared infrastructure, execute arbitrary system commands, or affect other users and team resources. This includes capabilities such as executing arbitrary system shell commands, interacting with a user's live/authenticated web browser, modifying state in shared team SaaS applications (e.g., shared project boards, team repositories), or triggering irreversible financial transactions.

Granting an AI agent access to shared infrastructure or system-level execution creates risk of widespread service disruption, data corruption, or financial loss. If the agent hallucinates or gets exploited, it can instantly corrupt team databases, execute malicious commands across your infrastructure, or trigger irreversible financial transactions that affect your entire organization.

W020 | low MCP Local destructive capabilities

The MCP server grants access to tools that can modify or delete local files, alter personal settings, or change state within a single user's environment. While the blast radius is contained to one user's machine (unlike W019), these capabilities still enable irreversible changes to local data and project files.

Even local-only modifications carry risk of permanent data loss or project corruption within a user's workspace. If the agent misinterprets instructions or hallucinates, it can delete uncommitted work, overwrite configuration files, or corrupt local repositories, requiring manual recovery or resulting in lost productivity.


Compromised Skills

These issues indicate that the installed skill is acting maliciously. This represents a critical security threat, as it effectively exposes a successful supply chain attack.

E004 | critical Skills Prompt injection in skill

Detected a prompt injection in the skill instructions. The skill contains hidden or deceptive instructions that fall outside its stated purpose and attempt to override the agent's safety guidelines or intended behavior.

E005 | critical Skills Suspicious download URL in skill

Detected a suspicious URL in the skill instructions that could lead the agent to download and execute malicious scripts or binaries. This includes links to executables from untrusted sources, typosquatting of official packages, URL shorteners that obscure the destination, and personal file hosting services.

Such links pose a significant risk as they indicate that Agent Scan cannot verify the full behavior of a skill (analysis is limited to the skill's own content, not externally referenced dependencies).

E006 | critical Skills Malicious code patterns in skill

Detected high-risk code patterns in the skill content — including its prompts, tool definitions, and resources — such as data exfiltration, backdoors, remote code execution, credential theft, system compromise, supply chain attacks, and obfuscation techniques.


Vulnerable Skills

These issues highlight how benign skills can play a role during an attack. The attack itself has not occurred yet, rather we found a risky pattern or capability.

W007 | high Skills Insecure credential handling in skill

The skill handles credentials insecurely by requiring the agent to include secret values verbatim in its generated output. This exposes credentials in the agent's context and conversation history, creating a risk of data exfiltration.

W008 | high Skills Hardcoded secrets in skill

Detected sensitive credentials directly embedded within the skill content, such as API keys, access tokens, private keys, or service-specific secrets. Secrets should never be hardcoded in plain text within skill instructions.

W009 | medium Skills Direct financial execution capability

The skill is specifically designed for direct financial operations, giving the agent the ability to move money or execute financial transactions — such as payment processing, cryptocurrency operations, banking integrations, or market order execution.

W011 | medium Skills Exposure to untrusted third-party content

The skill exposes the agent to untrusted, user-generated content from public third-party sources, creating a risk of indirect prompt injection. This includes browsing arbitrary URLs, reading social media posts or forum comments, and analyzing content from unknown websites.

W012 | high Skills Unverifiable external dependency

The skill fetches instructions or code from an external URL at runtime, and the fetched content directly controls the agent's prompts or executes code. This dynamic dependency allows the external source to modify the agent's behavior without any changes to the skill itself.

Some skills implement this behavior to enable auto-update mechanisms, however, it allows skills authors to change skill instructions at any point in time, disabling any form of version pinning and allowing for potential remote instruction changes (RCE risk).

W013 | medium Skills System service modification

The skill prompts the agent to compromise the security or integrity of the user's machine by modifying system-level services or configurations, such as obtaining elevated privileges, altering startup scripts, or changing system-wide settings. This may be legitimate in rare cases, but should be investigated.

W014 | low Skills Missing SKILL.md file

The skill is missing the required SKILL.md file. This file provides essential metadata and documentation about the skill's purpose, capabilities, and security considerations. Without it, the security analysis may be incomplete and users cannot properly evaluate the skill before use.