In early 2026, CISA added Langflow to its Known Exploited Vulnerabilities catalog. Around the same time, Flowise -- another popular AI workflow platform -- disclosed a series of authentication bypasses that allowed unauthenticated users to execute arbitrary code on servers running the platform. These were not theoretical vulnerabilities found by researchers in controlled environments. They were being actively exploited in the wild.
If you are building AI systems and you were not paying attention to these disclosures, you should be worried. Not because your specific stack might be vulnerable (though it might be), but because these exploits reveal a fundamental problem with how the AI tooling ecosystem thinks about security: it doesn't.
What Actually Happened
Langflow is a visual framework for building LLM applications. You drag and drop components -- models, tools, chains, memory stores -- and it generates a working AI pipeline. Flowise does roughly the same thing. Both are popular because they lower the barrier to building AI agents from "software engineer" to "anyone with a browser."
The Langflow vulnerability (CVE-2025-3248) was a pre-authentication remote code execution flaw. An attacker could send a crafted request to the Langflow API and execute arbitrary Python code on the server. No credentials required. No authentication bypass needed -- the endpoint was simply not protected.
Think about that for a moment. Langflow is a platform specifically designed to execute code -- that is its entire purpose. And the API that triggers code execution was wide open to the internet.
Flowise had a different but equally damaging pattern. Multiple endpoints that should have required authentication were accessible without it. An attacker could read and modify AI workflows, access stored API keys (including OpenAI, Anthropic, and other LLM provider credentials), and inject malicious components into existing flows. In some configurations, this also led to server-side code execution.
AI platforms are uniquely dangerous when compromised. They have API keys to expensive services, access to sensitive data, and the ability to execute arbitrary code. A compromised AI agent is not just a data breach -- it is a foothold with superpowers.
Why AI Platforms Are Uniquely Vulnerable
Traditional web applications have a well-understood attack surface: SQL injection, XSS, CSRF, authentication bypasses. We have decades of tooling, frameworks, and best practices for these. Rails has built-in CSRF protection. Django escapes HTML by default. Express middleware libraries handle session management.
AI platforms introduce three new attack dimensions that most security tooling does not handle:
1. Code Execution by Design
AI agent frameworks need to execute code. That is the whole point -- the AI decides what code to run, and the framework runs it. This means the platform cannot simply "disable code execution" as a security measure. The challenge is constraining execution to safe, intended operations while blocking malicious ones. This is a sandboxing problem, and sandboxing is one of the hardest problems in security.
Langflow and Flowise both failed at this. Their code execution environments had insufficient isolation, and their APIs did not properly gate who could trigger execution.
2. LLM-Mediated Actions
When an LLM decides what tool to call or what query to execute, you introduce prompt injection as an attack vector. An attacker does not need to exploit a code vulnerability -- they can craft input that tricks the LLM into performing unintended actions. This is not a hypothetical attack. Prompt injection has been demonstrated against every major AI agent framework.
Imagine a customer support agent built on Flowise that has access to a database. An attacker sends a support ticket containing carefully crafted text that instructs the AI to dump the customer database. The LLM cannot distinguish between legitimate instructions and injected ones because both are just text.
3. External Tool Calling and MCP
Modern AI systems connect to external tools -- databases, APIs, file systems, code interpreters. The Model Context Protocol (MCP) standardizes these connections. Each tool is a potential attack surface. I run 14+ MCP servers in production daily, and I can tell you that securing every tool endpoint is tedious, error-prone work that most developers skip because they are focused on making the AI features work.
An MCP server that exposes a file system tool without proper path validation can be tricked into reading /etc/passwd. A database tool without query sanitization is a SQL injection vector. A code execution tool without sandboxing is, well, Langflow.
How I Secure My Own Infrastructure
I run multiple MCP servers, AI agents, and automation pipelines on my VPS infrastructure. Here is how I approach security, informed by seeing exactly how these platforms fail.
Tiered Access Control
Every MCP tool in my system is classified into one of three tiers:
- Tier 1 (Read-only): Tools that only read data. These are the safest and get the lightest authentication. Examples: querying a database, searching files, reading metrics.
- Tier 2 (Write): Tools that modify state. These require explicit authentication and authorization. Examples: creating records, sending emails, modifying files.
- Tier 3 (Destructive): Tools that delete data, execute code, or modify system configuration. These require the highest level of authentication, are rate-limited aggressively, and log every invocation. Examples: database migrations, code execution, file deletion.
Langflow's mistake was treating a Tier 3 operation (code execution) like it was Tier 1 (no auth needed). This classification should be the first thing you design, not an afterthought.
Dynamic Analysis with Frida
I use Frida for dynamic instrumentation of running applications. When I deploy a new AI tool or integrate a third-party component, I hook into its runtime to verify:
- What system calls it makes (file access, network connections, process spawning)
- Whether it attempts to access resources outside its expected scope
- How it handles malformed input at runtime, not just in unit tests
// Frida script to monitor file access from an MCP server
Interceptor.attach(Module.findExportByName(null, 'open'), {
onEnter: function(args) {
const path = args[0].readUtf8String();
// Alert on any access outside the allowed directory
if (!path.startsWith('/srv/mcp-data/')) {
send({
type: 'alert',
severity: 'high',
message: `Unauthorized file access: ${path}`
});
}
}
});
This catches entire categories of vulnerabilities that static analysis misses, because you see what the code actually does under real conditions.
Binary Analysis with Ghidra
For compiled components -- native modules, Go binaries, Rust services -- I use Ghidra for static binary analysis. This is particularly important for third-party tools where you do not have source code. I check for:
- Hardcoded credentials or API keys in the binary
- Insecure cryptographic implementations
- Known vulnerable library versions linked into the binary
- Suspicious network communication patterns
Rate Limiting Everything
Every endpoint in my infrastructure is rate-limited. Not just the obvious ones like login endpoints, but every tool invocation, every API call, every webhook. This is the simplest security measure that prevents the widest class of attacks, and it is astonishing how many AI platforms ship without it.
My configuration looks roughly like this:
# nginx rate limiting for MCP endpoints
limit_req_zone $binary_remote_addr zone=mcp_read:10m rate=30r/s;
limit_req_zone $binary_remote_addr zone=mcp_write:10m rate=5r/s;
limit_req_zone $binary_remote_addr zone=mcp_exec:10m rate=1r/s;
location /mcp/read/ {
limit_req zone=mcp_read burst=50 nodelay;
# ...
}
location /mcp/write/ {
limit_req zone=mcp_write burst=10 nodelay;
# ...
}
location /mcp/exec/ {
limit_req zone=mcp_exec burst=2 nodelay;
# ...
}
Tier 3 operations get 1 request per second. If a compromised agent tries to exfiltrate data through rapid API calls, the rate limiter stops it cold.
What Most AI Developers Get Wrong
After reviewing dozens of AI projects -- both my own and others' -- here are the recurring security failures I see:
- API keys stored in plaintext. In environment files committed to git. In database columns without encryption. In configuration files that the AI agent itself can read. If your agent can access its own API keys, an attacker who compromises the agent gets those keys for free.
- No input validation on tool arguments. The AI says "search for file X" and the tool blindly searches for file X without checking whether X is a valid, safe path. This is how directory traversal attacks work, and they are trivial to execute against most MCP server implementations.
- Running agents as root. I have seen production AI agents running with root privileges because "it was easier to set up." A compromised root-level agent owns the entire server.
- No audit logging. If you cannot answer "what did the AI agent do at 3:47am last Tuesday?" you have a security problem. Every tool invocation should be logged with timestamps, arguments, and results.
- Trusting LLM output. The LLM says to execute a SQL query. You execute it. But the LLM was influenced by prompt injection in the user's input, and the query drops your database. Always validate and sanitize LLM-generated commands before execution.
A Practical Security Checklist
If you are building or deploying AI systems, run through this checklist. I use it for every project:
- Authentication on every endpoint. No exceptions. No "this is internal so it is fine." Internal services get compromised through SSRF and lateral movement.
- Tiered authorization. Read, write, and execute operations should have different permission levels.
- Input validation on all tool arguments. Validate types, lengths, paths, and query structures before execution.
- Rate limiting on every endpoint. Tier-appropriate limits. Aggressive limits on anything that executes code.
- Secret management. API keys in environment variables or a secrets manager, never in code or config files. Rotate keys regularly.
- Principle of least privilege. The AI agent runs as a non-root user with access only to the specific resources it needs.
- Audit logging. Every tool invocation logged with full context. Logs stored separately from the application.
- Sandboxed execution. Code execution happens in isolated containers or VMs. No shared filesystem with the host.
- Output sanitization. LLM-generated commands are validated against an allowlist of safe operations before execution.
- Dependency scanning. Regular scanning of all dependencies for known vulnerabilities. Automated alerts on new CVEs.
The Bigger Picture
The Langflow and Flowise exploits are not anomalies. They are the predictable result of an industry moving faster than its security practices. AI tooling is being built by ML engineers who are experts in model architecture and prompt engineering but have limited experience with application security. The frameworks they build optimize for developer experience and AI capability, not for defense in depth.
This is going to get worse before it gets better. As AI agents gain more capabilities -- file system access, database control, API integrations, code execution -- the blast radius of a compromised agent increases. A compromised chatbot in 2024 could leak conversation history. A compromised autonomous agent in 2026 can drain your cloud credits, modify your production database, and send emails as your CEO.
The fix is not to slow down AI development. The fix is to apply the security lessons we learned from two decades of web application security to this new domain. Authentication, authorization, input validation, rate limiting, least privilege, audit logging -- none of these are new concepts. They just need to be applied consistently to AI systems, which most teams are not doing.
If you are shipping AI agents to production without a security review, you are building the next headline. The exploits are not hypothetical. They are already here.