Once you frame it as a smooth problem (differentiable), it can be optimized and use hill climbing.
One of the reasons that identifying a good "self steering metric" can be powerful.
For example, for rolling out an ambitious technology: "Maximize absolute amount of usage while minimizing the number of users who have such a bad time they'll never use it again."