Prompt injection is hard to combat because normal sandboxing doesn't work without a million permission dialogs.
- The thing you'd use to contain the prompt injection (an LLM) is the same thing that can be tricked by anything you show it.
- Turtles all the way down.
- "Do you trust this domain to get information from this chat?"
- This would be a huge number of permission dialogs (see the sketch after this list).
- Could you imagine if the web had a permission dialog for every third-party domain a page requests?
- It would be overwhelming.
- The web doesn't do it because it doesn't expose sensitive data to those third parties (only data the user already trusted the origin to have access to).
- The origin might trust more third parties than the user realizes, but technically the user is delegating that trust to the origin, or to the employees who can ship code for that origin.
- But an LLM can't make those trust decisions on your behalf, because it is inherently confusable.
- Presumably everyone shipping trusted code for an origin is a professional who is weighing the security risks.
- Not true for an LLM.
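To make the "huge number of permission dialogs" point concrete, here is a minimal sketch of what a per-domain approval gate on an agent's web-fetch tool could look like. It is purely hypothetical: the names (`fetch_url`, `ask_user`) and the dialog wording are invented for illustration, not taken from any real framework.

```python
# Hypothetical sketch of a per-domain permission gate on an agent's
# outbound web requests. Not a real API; names are illustrative.
from urllib.parse import urlparse

approved_domains: set[str] = set()

def ask_user(question: str) -> bool:
    """Stand-in for a permission dialog shown to the user."""
    return input(f"{question} [y/N] ").strip().lower() == "y"

def fetch_url(url: str) -> str:
    """Gate every outbound request behind a one-time per-domain approval."""
    domain = urlparse(url).netloc
    if domain not in approved_domains:
        prompt = f"Do you trust {domain} to get information from this chat?"
        if not ask_user(prompt):
            raise PermissionError(f"User declined to share chat context with {domain}")
        approved_domains.add(domain)
    # ...perform the actual request here...
    return f"<contents of {url}>"
```

An ordinary web page can easily pull in dozens of third-party domains, so a gate like this would interrupt the user dozens of times per task, which is exactly the overwhelm described above.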