Gradient descent gets superlinearly more effective in higher dimensions.

· Bits and Bobs 4/6/26
  • Gradient descent gets superlinearly more effective in higher dimensions.
    • You only need one dimension with a downward gradient to descend along to escape a local minimum.
    • The chance that at least one dimension has a downward gradient rises with each added dimension: the probability that every dimension is flat or uphill shrinks multiplicatively.
    • This is one of the reasons gradient descent for LLMs, and evolution itself, are unreasonably effective.
    • We're used to a puny three dimensions.
    • There's an excellent video about how evolution discovers proteins so effectively.
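
The multiplicative claim above can be sketched with a toy model: assume each dimension independently offers a downward gradient with some probability `p` (the independence assumption and the value of `p` are illustrative, not from the source). Then the chance that at least one of `d` dimensions offers an escape route is `1 - (1 - p)**d`, which approaches 1 exponentially fast as `d` grows.

```python
def escape_probability(p: float, d: int) -> float:
    """Chance that at least one of d independent dimensions has a
    downward gradient, when each has one with probability p."""
    return 1 - (1 - p) ** d

# In our puny three dimensions, escape is plausible; in a
# million-parameter landscape it is a near-certainty.
for d in (1, 3, 10, 100, 1000):
    print(f"d={d:5d}  P(escape)={escape_probability(0.5, d):.6f}")
```

With `p = 0.5`, three dimensions already give an 87.5% chance of a descent direction, and by a hundred dimensions a true local minimum (all dimensions trapped at once) is vanishingly unlikely.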
