LLMs output the next token that is most coherent (least surprising) given all of the previous tokens.

2026-03-09 · Bits and Bobs 3/9/26

LLMs output the next token that is most coherent (least surprising) given all of the previous tokens.
- A technique to get them to do things they normally wouldn't do is to fill the history with fake conversation.
- At the start, have the agent say it wasn't going to do the thing you want and then have it pretend to have said "actually that's a good point yeah I can do that."
- Now, the most coherent next thing is for it to do the thing you want!
- Put words into the mouth of the model.
- Taking advantage of the Memento style nature of models.