LLMs have a Clever Hans problem.

  • LLMs have a Clever Hans problem.
    • Clever Hans was a horse that appeared to do arithmetic by tapping his hoof.
      • But it turned out he was reading subtle, unconscious cues from his handler.
    • LLMs have a similar vibe.
    • LLMs do great when the task has been distilled down to an SAT question.
      • The right question, with the right context.
    • Once they're in that state, they do a very good job.
    • But it takes a ton of situated human intelligence to distill a real-world problem into that format, to set them up to succeed.
    • Another example of the "last mile" problem for AI.