Gradient clipping is a way of forcing regularization.
- In ML, one way of regularizing is to track, for each weight, how precisely you believe you know its value.
- That is, roughly, how much it has been changed in the past: a weight that has already absorbed many updates counts as well-determined.
- When you need to update a weight, you update precise values less than imprecise values.
- But this requires significantly more bookkeeping and complexity (a sketch of one such scheme follows this list).
- Another approach is simply gradient clipping.
- Simply cut off extreme gradient values before applying the update (also sketched below).
- Each individual update is less carefully scaled, but averaged over many stochastic updates the effect is apparently about the same.
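
A minimal sketch of the precision-tracking idea. This assumes an AdaGrad-style accumulator of squared gradients stands in for the per-weight "precision"; the function name, learning rate, and epsilon are illustrative, not from any particular library:

```python
import numpy as np

def precision_scaled_update(w, grad, precision, lr=0.1, eps=1e-8):
    """Update weights, scaling each step down by accumulated 'precision'.

    precision accumulates squared gradients per weight: a weight that
    has moved a lot in the past counts as well-determined and receives
    smaller future updates. Note the extra array of state per weight,
    which is the added complexity the note mentions.
    """
    precision += grad ** 2
    w -= lr * grad / (np.sqrt(precision) + eps)
    return w, precision

w = np.random.randn(5)
precision = np.zeros_like(w)
for _ in range(100):
    grad = np.random.randn(5)  # stand-in for a real gradient
    w, precision = precision_scaled_update(w, grad, precision)
```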
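
And a minimal sketch of gradient clipping, here element-wise clip-by-value; the threshold of 1.0 is an arbitrary illustrative choice, and in practice it is a tuned hyperparameter:

```python
import numpy as np

def clip_gradient(grad, threshold=1.0):
    # Element-wise clipping: any component beyond +/- threshold is cut off.
    return np.clip(grad, -threshold, threshold)

# Plain SGD step with clipping: no per-weight state needed.
w = np.random.randn(10)
grad = np.random.randn(10) * 5.0  # pretend gradient with some extreme values
lr = 0.1
w -= lr * clip_gradient(grad, threshold=1.0)
```

Unlike the precision-tracking sketch, this keeps no history at all, which is the trade the note is pointing at: per-update crudeness in exchange for almost zero bookkeeping.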