BSides Canberra 2025

From Sandbox Escapes to MCP Database Hijacks: Unveiling Agentic Vulnerabilities
2025-09-25 , Off-Main Track

In today’s AI-driven world, autonomous agents powered by advanced language models are handling everything from file processing to SQL queries with each capability opening up new attack vectors. In this talk, we draw on our year-long tracking of production-grade agentic AIs (including OpenAI’s ChatGPT) to reveal three classes of real-world threats and their defenses:

  • Sandbox Escapes & Code Execution: We dissect containerized sandboxes—revealing how malformed file uploads or hidden background daemons can break isolation, persist code, or hijack Jupyter kernel.

  • Steganographic Exfiltration & Indirect Prompt Injection: By embedding malicious prompts into innocuous images or Office documents, attackers can coerce multimodal models (e.g., GPT-4o) into leaking credentials or data without user interaction.

  • AI-Native MCP SQL Injection: We uncover how malicious prompts directed at Model Context Protocol (MCP) endpoints can silently tamper with or exfiltrate entire database backends—quickly cascading into downstream AI pipelines.

We demonstrate how an LLM-powered agents can be compromised by utilizing a proof-of-concept AI agent with these vulnerabilities, showing the impact of the exploits and emphasizing the critical need for advanced security measures.

Sean Park is a reverse engineer and AI security researcher who hunts for blind spots in modern agentic systems. From prompt injections to compromised MCP servers, he uncovers how small flaws in AI workflows can trigger full-scale compromise. Whether analyzing Jupyter kernel traffic, tracing hallucinated dependencies, or stress-testing sandboxed agents, Sean blends automation, adversarial thinking, and low-level precision to stay ahead of emerging threats. His motto: every system can be mapped, exploited—and secured.