Prompt injection is hard to combat because normal sandboxing doesn't work without a million permission dialogs.

· Bits and Bobs 5/5/25
  • Prompt injection is hard to combat because normal sandboxing doesn't work without a million permission dialogs.
    • The stuff you'd use to contain the prompt injection (LLMs) is the same stuff that can be tricked by anything you show it.
      • Turtles all the way down.
    • "Do you trust this domain to get information from this chat?"
    • Asking that for every domain would mean a huge number of permission dialogs (see the sketch after this list).
    • Could you imagine if the web showed a permission dialog for every third-party domain a page requested?
      • It would be overwhelming.
      • The web doesn't need those dialogs because it doesn't let third-party requests carry sensitive data, only data the user already trusted the origin to have access to.
      • The origin might trust more third parties than the user realizes, but technically the user is delegating that trust to the origin, or rather to the employees who can ship code for that origin.
      • But an LLM can't be delegated those trust decisions, because LLMs are inherently confusable.
      • Presumably everyone shipping trusted code for an origin is a professional who is weighing the security risks.
      • Not true for an LLM.

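To make the dialog idea concrete, here's a minimal sketch (not from the episode) of a per-domain permission gate on an agent's outbound requests; guardedFetch, askUser, and the decision cache are all hypothetical names. What it illustrates is that every new third-party domain the model wants to send chat-derived data to triggers another dialog.

```ts
// Hypothetical sketch: gate every outbound request from an LLM agent
// behind a per-domain permission dialog.

type Decision = "allow" | "deny";

// Remember the user's answer per domain so we don't ask twice,
// similar to how browsers cache permission grants.
const decisions = new Map<string, Decision>();

async function askUser(domain: string): Promise<Decision> {
  // Stand-in for a real UI dialog:
  // "Do you trust this domain to get information from this chat?"
  console.log(`Do you trust ${domain} to get information from this chat? [y/N]`);
  return "deny"; // this sketch defaults to deny
}

async function guardedFetch(url: string, chatContext: string): Promise<Response | null> {
  const domain = new URL(url).hostname;

  if (!decisions.has(domain)) {
    // Every new third-party domain triggers another dialog.
    // This is where the "huge number of permission dialogs" comes from.
    decisions.set(domain, await askUser(domain));
  }

  if (decisions.get(domain) === "deny") {
    return null; // never send chat-derived data to an untrusted domain
  }

  return fetch(url, { method: "POST", body: chatContext });
}
```

Even with per-domain caching, an agent that browses freely would hit askUser constantly, which is exactly the overwhelming-dialogs problem the notes describe.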