Gradient descent gets superlinearly more effective in higher dimensions.

· Bits and Bobs 4/6/26
  • Gradient descent gets superlinearly more effective in higher dimensions.
    • You only need one dimension with a downward gradient to descend along to escape a local minimum.
    • The chance that at least one dimension has a downward gradient rises with each added dimension: the probability that every dimension is flat or uphill shrinks multiplicatively.
    • This is one of the reasons gradient descent for LLMs, and evolution itself, are unreasonably effective.
    • We're used to a puny three dimensions.
    • There's an excellent video about how evolution discovers proteins so effectively.
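
The multiplicative claim above can be sketched with a toy model: assume each dimension independently offers a downward gradient with some probability `p` (the independence assumption and the value of `p` are illustrative, not from the source). Then the chance that at least one of `d` dimensions offers an escape route is `1 - (1 - p)**d`, which approaches 1 exponentially fast as `d` grows.

```python
def escape_probability(p: float, d: int) -> float:
    """Chance that at least one of d independent dimensions has a
    downward gradient, when each has one with probability p."""
    return 1 - (1 - p) ** d

# In our puny three dimensions, escape is plausible; in a
# million-parameter landscape it is a near-certainty.
for d in (1, 3, 10, 100, 1000):
    print(f"d={d:5d}  P(escape)={escape_probability(0.5, d):.6f}")
```

With `p = 0.5`, three dimensions already give an 87.5% chance of a descent direction, and by a hundred dimensions a true local minimum (all dimensions trapped at once) is vanishingly unlikely.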
