AI Agents · Lesson

Prompt Injection Defences

Layered defenses: input sanitization, instruction hierarchy, and treating retrieved content as untrusted.

The Attack

Prompt injection is the OWASP Top-1 LLM vulnerability. Attackers smuggle instructions into untrusted content that override your system prompt.

Examples:

LLMs treat all tokens as input. They cannot reliably tell "developer instructions" from "data". You can mitigate, not eliminate.

Treat injection like SQL injection: a structural risk requiring layered defense.