Gradient descent gets superlinearly more effective in multiple dimensions.
- You only need one dimension with an active (downhill) gradient to escape a local minimum.
- The chance that at least one dimension has a downward gradient rises toward certainty with more dimensions, because the chance that every dimension is stuck at once shrinks multiplicatively (see the sketch after this list).
- This is one of the reasons both gradient descent for LLMs and evolution are unreasonably effective.
- We're used to a puny three dimensions.
- There's an excellent video about how evolution can discover proteins so effectively.
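
A minimal Python sketch of the multiplicative argument, under a toy assumption not from the original note: each dimension is independently "stuck" (no downhill direction) with some probability `p_stuck`. The names `prob_trapped` and `p_stuck` are illustrative only.

```python
import random

def prob_trapped(p_stuck: float, n_dims: int, trials: int = 100_000) -> float:
    """Estimate the chance that *every* dimension is stuck at the same time.

    A point only traps gradient descent if no dimension offers a way down,
    so the exact probability is p_stuck ** n_dims; this just simulates it.
    """
    trapped = 0
    for _ in range(trials):
        if all(random.random() < p_stuck for _ in range(n_dims)):
            trapped += 1
    return trapped / trials

for n in (1, 3, 10, 100):
    print(f"{n:>3} dims: simulated ~{prob_trapped(0.5, n):.6f}, "
          f"exact {0.5 ** n:.6f} chance of being trapped")
```

With `p_stuck = 0.5`, three dimensions are trapped about 12% of the time, ten dimensions under 0.1%, and a hundred dimensions essentially never: the trap probability is 0.5^n, which is the multiplicative shrinkage the bullet above describes.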