When extracting information from LLMs, we're like cavemen poking them in the dark.
- LLMs encode vastly more information than we know how to retrieve.
- We're in the very early stages of figuring out how to wring out all of the information they encode.
- Getting great results out of LLMs is entirely the domain of folk knowledge, with people like Ethan and Lilach Mollick the undisputed champs.
- For example, having LLMs have conversations with themselves to distill and dive deeper into the most promising options can give better results.
- You can look at the approaches that scale test-time compute (e.g. the approach that o1 and others use to get higher-quality reasoning) as a savvy technique to wring more baseline knowledge out of a system.
- LLMs never get bored, and never run out of ideas; if you give them space, they will spew out all kinds of ideas.
- Most of them will be crap, but some subset will be good.
- If you give them the space to spew, and have some way of sifting through what they produce, you can find high-quality results.
- Scaling test-time compute allows the LLM to unspool much more approximate knowledge in its own "internal monologue" and then select and synthesize the subset that is most promising.
- In some domains, like math proofs, you can use formal systems like Lean to cut through all of the noise and zero in on the formally valid answers.
- In other domains, you can train a reward model that learns which kinds of intermediate thoughts are most useful.
- Computing inside AI frames our current ways of interacting with LLMs as like interacting with computers before they had a GUI.
- What other techniques will we develop to extract orders of magnitude more insight out of these models?
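The "spew and sift" idea above can be sketched as a best-of-N loop: sample many candidates, score each with some verifier or reward model, and keep the best. This is a toy sketch, not a real API: `sample_idea` stands in for a high-temperature LLM call, and `score` stands in for a verifier (a formal checker like Lean, or a learned reward model); both names are hypothetical.

```python
import random

def sample_idea(rng: random.Random) -> float:
    # Stand-in for one high-temperature LLM sample. We model each idea
    # as a latent quality drawn uniformly from [0, 1): most samples are
    # mediocre, but a few are good.
    return rng.random()

def score(idea: float) -> float:
    # Stand-in for the sifter: a formal system or reward model that
    # estimates an idea's quality. Here it is a perfect verifier that
    # simply reads off the latent quality.
    return idea

def best_of_n(n: int, seed: int = 0) -> float:
    # Spew: draw n candidate ideas. Sift: keep the highest-scoring one.
    rng = random.Random(seed)
    ideas = [sample_idea(rng) for _ in range(n)]
    return max(ideas, key=score)

# With a reliable sifter, more test-time samples can only improve
# (never hurt) the best idea found.
print(best_of_n(64, seed=42) >= best_of_n(1, seed=42))
```

The whole game, as the notes above suggest, is the quality of `score`: with a perfect verifier (as in math proofs checked by Lean), scaling N reliably buys quality; with a noisy reward model, returns diminish as the sifter starts rewarding crap.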