Nate Jones has made the point that the cases where agents make mistakes are more valuable for quality improvement than the cases where they get it right.
- Those mistakes, where the human has to correct the output, are the surprisal gradient.
- Every bit of guidance from a human is a valuable jewel of input that should be kept so the system improves and becomes default-converging.
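
One way to picture "keeping every bit of guidance" is a small correction log. The sketch below is only an illustration under assumptions: the names `Correction`, `CorrectionLog`, `record`, and `guidance_for` are hypothetical, and matching on an exact task string stands in for whatever retrieval a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Correction:
    task: str          # what the agent was asked to do (hypothetical field)
    agent_output: str  # what the agent produced
    human_fix: str     # what the human changed it to


@dataclass
class CorrectionLog:
    """Hypothetical store for human corrections; not from any real library."""
    entries: list[Correction] = field(default_factory=list)

    def record(self, task: str, agent_output: str, human_fix: str) -> bool:
        # Only a mistake carries new information, so unchanged outputs are skipped.
        if agent_output == human_fix:
            return False
        self.entries.append(Correction(task, agent_output, human_fix))
        return True

    def guidance_for(self, task: str) -> list[str]:
        # Past human fixes for the same task, oldest first, to reuse as defaults.
        return [c.human_fix for c in self.entries if c.task == task]


log = CorrectionLog()
log.record("greet", "hi", "hi")            # agent was right: nothing stored
log.record("greet", "hi", "Hello, Sam")    # human corrected it: stored
print(log.guidance_for("greet"))
```

The design choice here is that only the corrected cases are stored, which mirrors the claim that mistakes are the more valuable signal.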