James Evans gave a fascinating talk I attended this week.
- He studies how collectives "think."
- Unpredictability is the best predictor of a paper being highly influential.
- These ideas are "off manifold": outside the normal landscape of research.
- As AI makes it possible to work with large data sets, human researchers are all converging on the same AI-adjacent topics, leading to less novelty.
- This is partly because humans have limited time and want to be sure they can publish a paper.
- Consider choosing between:
- An "on manifold" idea that is more likely to be correct and publishable but not particularly interesting
- Vs. an "off manifold" idea that is far less likely to be correct or publishable, but far more interesting if it pans out.
- Humans will pick the former.
- But AIs don't get bored, and you can assign them tasks that no human would agree to waste their time on.
- So you can have the AIs swarm precisely on the off-manifold ideas that humans aren't looking at.
- It's kind of like the reverse of Goodhart's Law.
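The trade-off above can be sketched as a toy expected-value calculation. All the numbers here are hypothetical, chosen only to illustrate the incentive gap between one time-limited human and a swarm of agents:

```python
# (probability the idea works out and is publishable, impact if it does)
on_manifold = (0.8, 1)     # safe, likely publishable, modest impact
off_manifold = (0.05, 30)  # long shot, but high impact if it lands

def expected_value(idea):
    p, impact = idea
    return p * impact

# A human with time for only one project optimizes for publication
# odds rather than expected value, so the safe bet wins:
human_choice = max([on_manifold, off_manifold], key=lambda idea: idea[0])
assert human_choice is on_manifold

# AIs don't get bored, so you can assign many of them to long shots
# no human would take; in aggregate, the off-manifold swarm dominates:
swarm_ev = 100 * expected_value(off_manifold)  # 100 agents on long shots
solo_ev = 1 * expected_value(on_manifold)      # one human on the safe bet
print(swarm_ev, solo_ev)  # the swarm's expected payoff is far larger
```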
- His research also finds that more complex language models naturally build a kind of "society of mind" inside of themselves.
- They literally develop multiple personas and talk to themselves in those personas.
- It's easier for a model to spit out "Sarah: Hmm, that didn't work, let me dig deeper and see if I can fix it. Bob: Wait, no, I think we should backtrack" than for a single model with a single persona to realize on its own that it should backtrack.
- It's kind of funny that this happens; I think of it as the model having a ventriloquist dummy: it's puppeting the dummy, but also listening to it as if it were a real person.
- Plato said that all insight comes from dialogue; it makes sense that models create internal dialogues to get better at having insights.