It could be useful to have multiple LLM participants in a conversation.
- Even if it's the same model, each instance could wear a different "hat" in the conversation, and the interplay between them could generate useful new insights that the LLM acting as a single "individual" couldn't reach on its own (see the first sketch after this list).
- The process of "thinking" within an LLM is different from the process of outputting text and then responding to text already in the conversation.
- Having the LLM output text in one "voice" that is then passed on to be absorbed by a later model can wring more insight out of the context than a single-shot generation could.
- Distilling the fuzzy internal vibes into specific words collapses the wave function: it reduces ambiguity but forces the model to lock in a specific POV.
- This dynamic of giving the model space to reflect and collapse the wave function is similar to how chain-of-thought prompting works.
- One problem with using multiple LLMs in a conversation, though: LLMs always respond to every message.
- In a 1:1 conversation, this is reasonable: one person talks, then the other one does, and it always ping-pongs back and forth.
- LLMs are hyper-optimized for this behavior; it's basically impossible to get them to not do it.
- But in a multi-person conversation, the rules for when someone should speak are way different.
- Each participant has to judge whether they have something useful enough to add, or whether chiming in would just distract from the flow of the conversation.
- Humans are constantly reading tons of social cues to figure out if we've overstepped in a conversation; LLMs don't have that.
- LLMs today will simply respond every time they are "spoken" to, even if they have nothing interesting to say (see the second sketch below).
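To make the "different hats" idea concrete, here's a minimal sketch in Python. It's purely illustrative: `complete()` is a hypothetical stand-in for whatever chat-completion API you'd actually call, and the hat names and personas are made up.

```python
# Minimal sketch: several instances of the same model wear different
# "hats" in one shared conversation. `complete()` is a hypothetical
# stand-in for a real chat-completion API call.

HATS = {
    "optimist": "You look for what could work and build on others' ideas.",
    "skeptic": "You probe for flaws, risks, and unstated assumptions.",
    "synthesizer": "You reconcile the other voices into a concrete next step.",
}

def complete(system_prompt: str, transcript: str) -> str:
    """Stand-in for an actual LLM call; wire up your API of choice here."""
    raise NotImplementedError

def run_round(transcript: list[str], topic: str) -> list[str]:
    # Each hat sees the full transcript so far, so later voices absorb
    # and react to the specific POVs that earlier voices locked in.
    for name, persona in HATS.items():
        history = "\n".join(transcript)
        reply = complete(
            system_prompt=f"You are the {name} in a group discussion. {persona}",
            transcript=f"Topic: {topic}\n{history}\n{name}:",
        )
        transcript.append(f"{name}: {reply}")
    return transcript
```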
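And one possible way around the always-responds problem: gate each participant's turn behind a cheap "do I actually have something to add?" check before generating a full reply. This sketch reuses the hypothetical `complete()` stand-in from above; the YES/NO protocol is just one crude option.

```python
# Minimal sketch of a "should I speak?" gate, reusing the hypothetical
# `complete()` stand-in from the previous sketch.

def wants_to_speak(persona: str, history: str) -> bool:
    # Cheap pre-check: ask the model to classify its own usefulness
    # before it's allowed to generate a real reply.
    verdict = complete(
        system_prompt=(
            f"{persona} Answer with exactly YES or NO: would a reply from you "
            "add something useful to this conversation, or just distract?"
        ),
        transcript=history,
    )
    return verdict.strip().upper().startswith("YES")

def maybe_reply(name: str, persona: str, transcript: list[str]) -> None:
    history = "\n".join(transcript)
    if wants_to_speak(persona, history):
        reply = complete(system_prompt=persona, transcript=history)
        transcript.append(f"{name}: {reply}")
    # Otherwise the participant passes its turn, which a bare chat
    # completion will never do on its own.
```

A real version would probably want a scored threshold rather than a binary answer, but even this crude gate breaks the forced ping-pong.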