  • LLMs make weird judgment calls sometimes.
    • Big labs fix it by fine-tuning.
      • This takes substantial capital and time, and can only be done by the lab that holds the model in the first place.
    • But a system that could automatically distill corrections from real-world usage would create a much faster correction loop.
      • That could help emegently improve it for a given task.