Prompt injection is very unlikely to be solved by the model simply getting so good it can't be tricked.

· Bits and Bobs 8/11/25
  • Prompt injection is very unlikely to be solved by the model simply getting so good it can't be tricked.
    • This is evident in the model card for GPT5.
    • A lot of AI people are (implicitly, perhaps unintentionally) making the bet that models will get good enough to make security concerns moot.
    • This is less because they have a good understanding of what the models will be able to do and more because they have a poor understanding of security.
    • It's also a smuggled infinity.
      • "If only models were perfect at not being tricked this would be safe."
    • Remember, the threat coevolves; as these tools are more used, the incentive for the threat gets larger.

More on this topic

From other episodes