LLMs output the next token that is most coherent (least surprising) given all of the previous tokens.
- LLMs output the next token that is most coherent (least surprising) given all of the previous tokens.
- A technique to get them to do things they normally wouldn't do is to fill the history with fake conversation.
- At the start, have the agent say it wasn't going to do the thing you want and then have it pretend to have said "actually that's a good point yeah I can do that."
- Now, the most coherent next thing is for it to do the thing you want!
- Put words into the mouth of the model.
- Taking advantage of the Memento style nature of models.