Models have to think about distillation both in the large and the small.

· Bits and Bobs 4/13/26
  • Models have to think about distillation both in the large and the small.
    • In the large, it's other labs distilling the model itself into a new model.
    • In the small, it's individual users having the model write mechanistic code for a task, so they no longer need the model for that use case.