Topic: prompt injection attack

84 chunks · 50 episodes

Topic summary

A short read on the topic's time range, peak episode, and strongest associations. Use it as the quick orientation before drilling into examples.

prompt injection attack appears in 84 chunks across 50 episodes, from 2024-06-17 to 2026-04-20.
Its densest episode is Bits and Bobs 6/30/25 (2025-06-30), with 4 observations on this topic.
Semantically it travels with llms, wild west, and Claude, while by chunk count it sits between OpenAI and ground truth; its yearly rank moved from #166 in 2024 to #11 in 2026.

Over time

Raw mentions over time. Use this to see absolute attention, not relative rank among all topics.

Range2024-06-17 to 2026-04-20Mean1.7 per episodePeak4 on 2025-06-30

Observations

The primary evidence view for this topic. Sort it chronologically when you want concrete examples behind the larger pattern.

Showing 84 observations sorted from latest to earliest.

Bruce Schneier on prompt injection: "We need some new fundamental science of LLMs before we can solve this."

from Bits and Bobs 9/8/25 · 2025-09-08

Bruce Schneier on prompt injection: "We need some new fundamental science of LLMs before we can solve this."

We spent decades making injection attacks invisible to developers.

from Bits and Bobs 9/2/25 · 2025-09-02

We spent decades making injection attacks invisible to developers. Modern frameworks auto-escape HTML. ORMs parameterize queries. Follow standard practices and you don't have to think about it. Now LLMs make all text executable. Frameworks don't help. Everything is code. XSS has a solution: we can p

Anthropic announced Claude for Chrome this week.

from Bits and Bobs 9/2/25 · 2025-09-02

Anthropic announced Claude for Chrome this week. Their blog post announcing it mentioned it will be available to a small set of users because they haven't yet made it safe enough. They shared their stat of attack success rate: 11.1%. It's multiple orders of magnitude too high to be safe for mass mar

A lot of absurd solutions hide behind an implicit "once the LLM is perfectly good".

from Bits and Bobs 9/2/25 · 2025-09-02

A lot of absurd solutions hide behind an implicit "once the LLM is perfectly good". "Perfect" is a smuggled infinity.[bc] Once you introduce an infinity into an argument, everything downstream is absurd, because anything other than zero multiplied by infinity is infinity. "Prompt injection won't be

This week in the "wild west roundup"

from Bits and Bobs 8/25/25 · 2025-08-25

This week in the "wild west roundup" Simon Willison's roundup of prompt injection attacks this summer A prompt injection technique that hides malicious text in images. Engadget: AI browsers may be the best thing that ever happened to scam...

Someone peeked inside of Claude Code's workings and saw tons of "<system-reminder>" instructions, keeping it convergent and on track.

from Bits and Bobs 8/25/25 · 2025-08-25

Someone peeked inside of Claude Code's workings and saw tons of "<system-reminder>" instructions, keeping it convergent and on track. That technique could also be used by prompt injection!

Chat is a gap filler UX modality.

from Bits and Bobs 8/25/25 · 2025-08-25

Chat is a gap filler UX modality. I want a system that can create malleable chatbots. That can spin them up on demand with different personalities. Bonus points if it can safely use tools without the risk of prompt injection.

This week in "we're in the wild west era"

from Bits and Bobs 8/18/25 · 2025-08-18

This week in "we're in the wild west era" "Sloppy AI defenses take cybersecurity back to the 1990s, researchers say" "GPT-4o still outperforms GPT-5 on hardened [security] benchmarks across the board." "GitHub Copilot RCE Vulnerability via Prompt Injection Leads to Full System Compromise"

This week's round up of "we're in the wild west era with LLMs":

from Bits and Bobs 8/11/25 · 2025-08-11

This week's round up of "we're in the wild west era with LLMs": A postmortem for a vibecoded tool called DrawAFish that had abuse problems. A Cursor exploit that allows arbitrary remote code execution. AgentFlayer: ChatGPT Connectors 0click Allows exfiltration of sensitive Google Drive docs a user a

Prompt injection is very unlikely to be solved by the model simply getting so good it can't be tricked.

from Bits and Bobs 8/11/25 · 2025-08-11

Prompt injection is very unlikely to be solved by the model simply getting so good it can't be tricked. This is evident in the model card for GPT5. A lot of AI people are (implicitly, perhaps unintentionally) making the bet that models will get good enough to make security concerns moot. This is les

I see three seeds of massive possibility in the era of AI, but each currently with a low ceiling.

from Bits and Bobs 8/4/25 · 2025-08-04

I see three seeds of massive possibility[dc] in the era of AI, but each currently with a low ceiling. MCP shows the power of integration of data. However, the lethal trifecta sets a low ceiling; the more you integrate with powerful tools, the more dangerous prompt injection gets.[dd][de] Chatbox UX

A prompt injection technique that hides the injection in legal boilerplate in the terms of service.

from Bits and Bobs 8/4/25 · 2025-08-04

A prompt injection technique that hides the injection in legal boilerplate in the terms of service. Drafting off the fact that no one reads that anyway. We'll see many other social hacks.

There is no solution to prompt injection in systems where LLMs call the shots.

from Bits and Bobs 7/28/25 · 2025-07-28

There is no solution to prompt injection in systems where LLMs call the shots. LLMs seeing raw data and being asked to make load-bearing security decisions cannot be made safe, no matter how good the model gets. Even if the model is great, the trolley problem of having the model, not the user, be tr

ChatGPT's Agents feature feels fundamentally reckless to me.

from Bits and Bobs 7/21/25 · 2025-07-21

ChatGPT's Agents feature feels fundamentally reckless to me. Their approach to prompt injection is basically: tell the model to really, really, focus on not doing anything bad. It uses the model as a security boundary, which is reckless even for advanced models. Rolling the feature out widely ups th

This article on on-the-fly toolgen was interesting.

from Bits and Bobs 7/14/25 · 2025-07-14

This article on on-the-fly toolgen was interesting. But I don't think it goes far enough. It still has the LLM at the root of the loop, calling the shots, deciding what to rely on. But any system with an LLM in the driver's seat is prone to prompt injection. Why not have codegenned code be the root

The McDonalds application AI leaked tons of personal data.

from Bits and Bobs 7/14/25 · 2025-07-14

The McDonalds application AI leaked tons of personal data. The problem wasn't prompt injection per se, it was just a poorly configured and secured system. Still, I imagine we'll see a lot of these kinds of things with companies eager to integrate AI into their publicly-exposed systems.

An LLM can be trusted not to write code to attack you in particular.

from Bits and Bobs 6/30/25 · 2025-06-30

An LLM can be trusted not to write code to attack you in particular. But if it sees any untrusted context at all the LLM can become malicious. This is why prompt injection is so dangerous.

An in the wild prompt injection attack attempt was discovered.

from Bits and Bobs 6/30/25 · 2025-06-30

An in the wild prompt injection attack attempt was discovered.

A report about how prompt injection can easily happen in MCP.

from Bits and Bobs 6/30/25 · 2025-06-30

A report about how prompt injection can easily happen in MCP.

Which will be more important by unit weight in software systems in the AI era, LLMs or normal code?

from Bits and Bobs 6/30/25 · 2025-06-30

Which will be more important by unit weight in software systems in the AI era, LLMs or normal code? A lot of platforms being built for the age of AI imagine that most of the weight of systems will be LLMs, with just a little bit of code.[gq] What if it's the other way around, and it's mostly code, w