When training a model, it's important to keep the test set isolated.
Every time the model interacts with the test set, a little bit of it is absorbed.
If you included the test set directly in the training set, even a badly overfit model would look amazing, because it has effectively memorized that subset of the data.
You want the model to interact with the test set only rarely and indirectly, so that it isn't tainted by absorbing the test set.
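This leakage shows up even without putting test data in the training set: merely *selecting* among models by their test score lets information seep in. A minimal sketch of the effect (the data, model count, and seed here are all illustrative assumptions, not from the original text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Labels are pure coin flips: no model can genuinely beat 50% accuracy.
n_test = 200
y_test = rng.integers(0, 2, size=n_test)

# Simulate "tuning" 1000 candidate models by scoring each on the test
# set and keeping the best. Each candidate just guesses randomly.
best_acc = 0.0
for _ in range(1000):
    guesses = rng.integers(0, 2, size=n_test)
    best_acc = max(best_acc, (guesses == y_test).mean())

print(f"best test accuracy found by selection: {best_acc:.2f}")
# Well above 50%, yet the edge is pure leakage: the repeated
# interactions with the test set, not any real signal.
```

Each individual interaction is harmless, but because the *choice* of model depends on the test scores, the selected model's test accuracy is no longer an unbiased estimate.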
A similar thing happens with differential privacy.
Differential privacy guarantees work best for single-shot interactions with the data.
If you interact with the data n times, each release satisfying epsilon-differential privacy, basic composition only guarantees n·epsilon overall, so after enough queries you can still extract quite a bit of information about the underlying data.
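A small sketch of why repeated queries erode the guarantee, using the standard Laplace mechanism (the specific count, epsilon, and query budget are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

true_count = 42      # sensitive statistic we want to protect
epsilon = 0.1        # privacy budget per query
sensitivity = 1.0    # a counting query changes by at most 1 per person

def noisy_count():
    # Laplace mechanism: add noise with scale = sensitivity / epsilon.
    return true_count + rng.laplace(scale=sensitivity / epsilon)

# A single query is quite noisy (noise scale 10 around 42).
one_shot = noisy_count()

# After n queries, basic composition says the total privacy loss is
# n * epsilon; averaging the answers recovers the count almost exactly.
n = 1000
avg = np.mean([noisy_count() for _ in range(n)])

print(f"single query: {one_shot:.1f}, average of {n} queries: {avg:.1f}")
```

Each answer on its own is well protected, but the independent noise averages away across queries, which is exactly why privacy budgets are accounted for over *all* interactions rather than per query.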