  • LLMs make weird judgment calls sometimes.
    • Big labs fix it by fine-tuning.
      • This takes substantial capital and time, and can only be done by the lab that holds the model in the first place.
    • But a system that could automatically distill corrections from real-world usage would create a much faster correction loop.
      • That could help emegently improve it for a given task.