Tools like Information Flow Control have existed for decades.
They allow you to make formal statements about the confidentiality and integrity of a composed system.
Part of the challenge is that they require precise policy definitions of when, for example, certain information may be declassified (that is, safely used in a different context).
Writing a simple one-size-fits-all policy in a vacuum is easy, but such policies are hard to use in real-world contexts.
If you want policies that cover fractally wrinkled real-world situations, they quickly get unwieldy.
Imagine a toy example: a cake recipe whose third step calls for 5 tablespoons of tabasco sauce, and a policy that must decide whether that's reasonable.
You'd need a policy like "spicy sauces are OK to be in recipes as long as the dish is a savory one and the overall amount of sauce is less than 3% of the total volume of the dish".
It's impossible to imagine such detailed policies ever being written, especially across all of the real-world scenarios they would need to cover.
This reduces to the metacrap fallacy.
One way to think of this: to make the machine work, you'd need to have machined hyper-intricate, hyper-precise gears in advance for every possible need.
Clearly impossible!
But there's another way to do this.
There are lots of policies where 99% of the population would agree on whether something is allowed.
For example, whether 5 tablespoons of tabasco sauce is legitimate in a cake recipe.
LLMs are society-scale crystallized intuition.
You can ask the LLM: "is it reasonable for a cake recipe to call for 5 tablespoons of tabasco sauce?"
That gives you an immediate good-enough default policy for cases where the vast majority of people would agree.
Good-enough policies are ones that lead to very few nasty surprises in practice, and where users who want a bit more flexibility can ask the system to add a wrinkle for them.
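A minimal sketch of that idea, with a stubbed-out LLM call (`ask_llm` and its hard-coded verdict are purely illustrative; a real system would query an actual model and parse its answer):

```python
def ask_llm(question: str) -> bool:
    """Stub standing in for a real LLM query.

    A production system would send `question` to a model and parse a
    yes/no answer; here we hard-code the society-scale intuition for
    the toy example so the sketch is runnable.
    """
    q = question.lower()
    if "tabasco" in q and "cake" in q:
        return False  # nearly everyone agrees: no hot sauce in cakes
    return True


def default_policy(action: str) -> bool:
    """Good-enough baseline: defer to crystallized intuition."""
    return ask_llm(f"Is it reasonable for {action}?")


print(default_policy("a cake recipe to call for 5 tablespoons of tabasco sauce"))   # False
print(default_policy("a chili recipe to call for 5 tablespoons of tabasco sauce"))  # True
```

The point isn't the stub's keyword check; it's the shape of the system: the expensive, impossible step of enumerating precise rules up front is replaced by a single query against crystallized intuition.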
There are plenty of plausible judgment calls where different people might disagree, but those are many orders of magnitude rarer than the cases where the vast majority of people would agree.
If you have a good-enough baseline based on the crystallized intuition of society, you can wrinkle it with more specific needs.
For example, maybe a user protests that 5 tablespoons of tabasco sauce is legitimate… if you're making a cake to prank a friend.
In that case, you could add a wrinkle to the policy of "... unless the cake is known to be made as a prank".
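One way to sketch this wrinkling is as explicit exception rules layered over the baseline verdict. Everything here is made up for illustration: `WrinkledPolicy`, the `prank` context key, and the hard-coded baseline standing in for a real LLM call:

```python
from typing import Callable


class WrinkledPolicy:
    """A good-enough baseline plus user-added exception rules.

    Wrinkles are checked first, in the order they were added; if none
    applies, we fall back to the baseline judgment.
    """

    def __init__(self, baseline: Callable[[str], bool]):
        self.baseline = baseline
        self.wrinkles: list[tuple[Callable[[dict], bool], bool]] = []

    def add_wrinkle(self, applies: Callable[[dict], bool], verdict: bool) -> None:
        self.wrinkles.append((applies, verdict))

    def allows(self, action: str, context: dict) -> bool:
        for applies, verdict in self.wrinkles:
            if applies(context):
                return verdict
        return self.baseline(action)


def baseline(action: str) -> bool:
    """Stub for the LLM-backed default policy."""
    return not ("tabasco" in action and "cake" in action)


policy = WrinkledPolicy(baseline)
print(policy.allows("add 5 tablespoons of tabasco to a cake", {}))  # False

# "... unless the cake is known to be made as a prank"
policy.add_wrinkle(lambda ctx: ctx.get("prank", False), True)
print(policy.allows("add 5 tablespoons of tabasco to a cake", {"prank": True}))  # True
```

The design choice worth noticing: wrinkles don't rewrite the baseline, they shadow it for narrow contexts, which is what lets a consistent set of wrinkles later be promoted into new defaults.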
If there are wrinkles that a small but consistent set of independent savvy users want, you might be able to expand that logic to the general populace, getting a self-wrinkling set of default policies that handle most cases well.
This is less like a top-down ontology and more like an emergent folksonomy that can grow itself from a good-enough base of crystallized background knowledge.
Now, instead of hyper-precise clockwork gears, you have rough clay that you can smoosh into place.
More organic than mechanical.