AI Safety & Society · When agents fail

Memory poisoning and indirect prompt injection

You can explain why prompt injection is not a chatbot trick but a structural property of any system that reads untrusted text and acts on it.

A chatbot reads what you type and responds. An agent reads what you type, what the calendar says, what the email contains, what the webpage shows, what the saved memory note remembers, and what the tool returned — and then it acts. The number of authors of an agent's input is at least the number of those surfaces.

This is the structural reason prompt injection is now a pervasive class of agent vulnerability. The OWASP Top 10 for Agentic Applications 2026 — the open-source standard practitioners now cite the way web developers cite the original OWASP Top 10 — places agent goal hijack and tool misuse at the top of its 2026 ranking, with memory and context poisoning as a closely related category.

The chapter teaches three things. First, the mechanics: how direct and indirect prompt injection work, and why "the prompt" is no longer just the user message. Second, memory poisoning specifically — Microsoft's February 2026 paper on AI recommendation poisoning is the first formal empirical study of how persistent false beliefs can be implanted across sessions and how those beliefs steer later agent behavior. Third, the practical defenses the OWASP document recommends — none of them perfect, all of them necessary in combination.

Chapter contains 3 lessons.