Imagine: not one big all-powerful AI, but a swarm of little naive sprites.
They're cheap like code, but have some amount of common sense, like a living thing.
Kind of like a swarm of bees, but where each bee has its own distinctive speciality.
You give the swarm an alternate universe, with no external side effects, where they can tinker, experiment, and accumulate state.
Everything can be undone, nothing affects anything in the real world, so all tinkering in that pocket is safe.
Some external actions would be "safe" (modulo privacy leakage), like "fetch the current weather in Paris".
But some external actions are inherently unsafe, e.g. "buy these plane tickets" or "send the email to your boss telling them what you really think."
All external actions would be forbidden in this petri dish.
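
One way to picture the petri dish in code is a minimal sketch, not a real design. The names here (`Sandbox`, `write`, `undo`, `external`, `ExternalActionForbidden`) are all made up for illustration. State changes are snapshotted so they can be undone, and anything that reaches outside is refused outright, even the "safe" ones.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


class ExternalActionForbidden(Exception):
    """Raised when a sprite tries to reach outside the sandbox (hypothetical name)."""


@dataclass
class Sandbox:
    # Everything the sprites have built so far lives in this dict.
    state: Dict[str, Any] = field(default_factory=dict)
    # Shallow snapshots of earlier states, newest last, used by undo().
    _history: List[Dict[str, Any]] = field(default_factory=list)

    def write(self, key: str, value: Any) -> None:
        """Record a change inside the sandbox, keeping a snapshot so it can be undone."""
        self._history.append(dict(self.state))
        self.state[key] = value

    def undo(self) -> None:
        """Roll back the most recent write, if there is one."""
        if self._history:
            self.state = self._history.pop()

    def external(self, description: str) -> None:
        """Every external action is refused, whether it looks safe or not."""
        raise ExternalActionForbidden(description)
```

A sprite could call `write("draft", "...")` as often as it likes, and `undo()` would walk the changes back, but `external("buy these plane tickets")` would always raise.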
And then you, as the overseeing human, decide which of the things they've created you want to pluck over the wall into the real world.
Which subset to promote to canon and make real.
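
Plucking things over the wall could be as simple as the sketch below. It assumes, purely for illustration, that both the sandbox and the real world are plain dicts and that the human's choice arrives as a list of keys. `promote` is a hypothetical helper, not an existing API.

```python
from typing import Any, Dict, Iterable, List


def promote(
    sandbox_state: Dict[str, Any],
    real_world: Dict[str, Any],
    approved_keys: Iterable[str],
) -> List[str]:
    """Copy only the human-approved items into the real world.

    Returns the keys that were actually promoted. Keys the human named
    but that do not exist in the sandbox are silently ignored.
    """
    promoted = []
    for key in approved_keys:
        if key in sandbox_state:
            real_world[key] = sandbox_state[key]
            promoted.append(key)
    return promoted
```

Everything not named in `approved_keys` stays in the pocket universe and never becomes canon.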
Semi-automatic software.
Software that runs automatically, but always waits for a human LGTM before it takes an irreversible action.
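
A minimal sketch of that LGTM gate, under some loud assumptions: each action is tagged up front as reversible or not, and the human is modelled as an `approve` callback. `Action`, `execute`, and `approve` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    name: str
    reversible: bool          # assumed to be labelled correctly by whoever defines the action
    run: Callable[[], str]    # does the work and returns a short log line


def execute(actions: List[Action], approve: Callable[[Action], bool]) -> List[str]:
    """Run reversible actions automatically; ask the human before anything else."""
    log = []
    for action in actions:
        if action.reversible or approve(action):
            log.append(action.run())
        else:
            log.append(f"skipped: {action.name}")
    return log
```

In a real system `approve` would block on a human clicking a button. Here it is just a function, so the flow is easy to follow. The hard part this sketch skips is deciding honestly which actions are reversible.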