Reward functions work great in systems that have no indirect effects.

· Bits and Bobs 2/9/26
  • Reward functions work great in systems that have no indirect effects.
    • These are closed systems.
      • For example, AlphaZero just needed a single reward function based on the game's outcome (win, loss, or draw).
    • But in systems with indirect effects, it's impossible to write down one true reward function and simply optimize for it.
    • Nearly every problem that actually matters involves a system with indirect effects!
    • Closed systems are often toys.
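
To make the closed-system case concrete, here is a minimal sketch of what a reward function based only on the game's outcome could look like. The names `Outcome` and `terminal_reward` are hypothetical and chosen for illustration; this is not AlphaZero's actual code, just the general shape of a reward that depends on nothing outside the game.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of a finished game, from one player's perspective."""
    WIN = "win"
    LOSS = "loss"
    DRAW = "draw"


def terminal_reward(outcome: Outcome) -> float:
    """Map a finished game's outcome to a scalar reward.

    Assumption: +1 for a win, -1 for a loss, 0 for a draw.
    Nothing outside the game feeds into this number, which is
    what makes the system closed.
    """
    rewards = {
        Outcome.WIN: 1.0,
        Outcome.LOSS: -1.0,
        Outcome.DRAW: 0.0,
    }
    return rewards[outcome]


if __name__ == "__main__":
    for outcome in Outcome:
        print(outcome.value, terminal_reward(outcome))
```

The whole function fits in a lookup table because the game's rules define everything that counts. In a system with indirect effects there is no equivalent table to write down.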