Anjali has some amazing insights on token economics.

· Bits and Bobs 5/4/26
    • Prefill (processing the input tokens) is compute-bound.
    • Decode (generating the output tokens one at a time) is memory-bandwidth-bound.
    • So the ratio of input tokens to output tokens matters most for cost.
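
A rough way to see why the input/output ratio dominates is a back-of-the-envelope model. This is a minimal sketch, not anything from Anjali's analysis. It assumes about 2 FLOPs per parameter per token for prefill, that each decoded token streams all the weights from memory once (batch size 1, KV cache ignored), and that cost is proportional to accelerator time. The `Accelerator` class, the function names, and every number in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical hardware figures, not measurements of any real chip."""
    peak_flops: float     # sustained compute, FLOP/s
    mem_bandwidth: float  # memory bandwidth, bytes/s


def prefill_seconds(n_params: float, input_tokens: int, hw: Accelerator) -> float:
    # Assumption: ~2 FLOPs per parameter per input token, all tokens processed
    # in parallel, so time is limited by compute throughput.
    return 2.0 * n_params * input_tokens / hw.peak_flops


def decode_seconds(n_params: float, output_tokens: int, hw: Accelerator,
                   bytes_per_param: float = 2.0) -> float:
    # Assumption: every generated token re-reads all weights once (batch size 1,
    # KV cache traffic ignored), so time is limited by memory bandwidth.
    return output_tokens * n_params * bytes_per_param / hw.mem_bandwidth


def decode_share(n_params: float, input_tokens: int, output_tokens: int,
                 hw: Accelerator) -> float:
    # Fraction of total accelerator time spent in decode for one request.
    p = prefill_seconds(n_params, input_tokens, hw)
    d = decode_seconds(n_params, output_tokens, hw)
    return d / (p + d)


if __name__ == "__main__":
    # Made-up round numbers purely for illustration.
    hw = Accelerator(peak_flops=1e15, mem_bandwidth=1e12)
    n_params = 1e10
    for n_in, n_out in [(1_000, 1_000), (10_000, 100), (100_000, 10)]:
        share = decode_share(n_params, n_in, n_out, hw)
        print(f"in={n_in:>7} out={n_out:>5} decode share of time={share:.3f}")
```

Under these assumptions a single output token costs far more accelerator time than a single input token, so shifting the mix toward output shifts the bill quickly. Batching many requests together narrows the gap in practice, which this sketch deliberately leaves out.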