For things that are privacy sensitive, the epsilon matters a lot.

· Bits and Bobs 5/27/24

Epsilon is a concept from differential privacy.

The epsilon bounds how well an observer can tell apart the results computed from two datasets that differ by a single record (which would reveal the presence or absence of that record).
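
For readers who want the formal version, this is the standard textbook definition of epsilon-differential privacy, not anything specific to this note. The symbols are introduced here for illustration: M is the randomized mechanism (the feature or query being run), D and D' are any two datasets that differ by one record, and S is any set of possible outputs.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
\quad \text{for all neighboring } D, D' \text{ and all output sets } S
```

When epsilon is 0 the two probabilities must be identical, so the output says nothing about the one record. As epsilon grows, the allowed gap between them grows by a factor of e^epsilon.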

A smaller epsilon is more private.

If the epsilon is sufficiently tight, your data is hiding in a tornado of other data, and its presence can't be traced back to you.
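
One way to put a number on "hiding in a tornado" is a minimal sketch of the usual Bayesian reading of epsilon. It assumes a pure epsilon-DP guarantee and an observer who starts with some prior belief that your record is in the dataset. The function name `max_posterior` and the 50% prior are made up for illustration.

```python
import math


def max_posterior(prior: float, epsilon: float) -> float:
    """Upper bound on an observer's belief that a record is present
    after seeing one epsilon-DP output, given their prior belief."""
    # Under epsilon-DP, the odds can grow by at most a factor of e^epsilon.
    odds = prior / (1.0 - prior) * math.exp(epsilon)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    for eps in (0.1, 1.0, 5.0):
        bound = max_posterior(0.5, eps)
        print(f"epsilon={eps}: belief can rise from 0.50 to at most {bound:.3f}")
```

At a small epsilon the observer's belief barely moves from where it started. At a large epsilon the bound is close to certainty, meaning the guarantee no longer rules much out.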

That said, low epsilons typically make it harder to get the same amount of utility out of the system. It's a tradeoff.
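
The tradeoff can be seen in a small sketch using the Laplace mechanism, one common way to make a count differentially private. This is an illustration, not a description of how any particular product works. The function names, the true count of 1000, and the trial count are all hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent Exp(1) draws is a standard
    # Laplace sample; multiplying by `scale` sets its spread.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon: float, trials: int = 20000, seed: int = 0) -> float:
    # Average distance between the noisy answer and the true answer.
    rng = random.Random(seed)
    true_count = 1000  # hypothetical value for illustration
    total = sum(
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: typical error ~ {mean_abs_error(eps):.2f}")
```

The typical error scales as 1/epsilon: each tenfold tightening of epsilon makes the reported count roughly ten times noisier, which is the utility cost of the extra privacy.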

How much a given feature might interfere with our privacy comes down to the epsilon.

There can be different epsilons for others in the ecosystem (what might an outside observer be able to figure out about you?) and for the provider (what might the provider themselves be able to detect from their privileged access to the dataset?).

In the default architecture of our current paradigm, the service provider has full, unfettered access to the totality of their data. But they still might not be doing anything with some of it, so crossing the line to sift through it, even in an automated way, may change people's perceptions.

A "Do you want to opt out of this chat service training on your DMs?" prompt comes with an implied "... at what epsilon?"

The "at what epsilon" is rarely addressed.

The epsilon matters fundamentally… and yet high- and low-epsilon implementations are presented to users the same way.

A bad high-epsilon feature can taint the user's perception of all such features, even ones that would be implemented in a low-epsilon way.