You can typically trust off-the-shelf LLMs to not try to manipulate you in particular.

· Bits and Bobs 6/23/25
  • You can typically trust off-the-shelf LLMs to not try to manipulate you in particular.
    • But LLMs are easy to fool.
    • So if anyone else you don't trust is feeding input into the context, then the LLM might be entirely tainted and any of its decisions not trustworthy.
    • This is why prompt injection is so fundamental for any system where LLMs make calls on security.

More on this topic

From other episodes