Hallucinated mini-apps are having a moment.

WebSim and Windows9x are both surprisingly compelling.

Doubly so because they start off looking like a weird toy (low expectations), which means that when they do something useful it's a mind-blowing moment.

Capped downside, significant upside.

Anthropic Artifacts are just interface sugar, but they make the feedback loop immediate and give you a smooth gradient to learn along.

But hallucinated mini-apps today have two significant problems.

1) They can't safely work with your data.

The hallucinated apps all start off with no data in them.

They also don't have any sandboxing to speak of, so you should be very wary about putting data into any of them.

2) They can't compose with other mini-apps.

That means the ceiling of functionality is the biggest mini-app that can fit in an LLM's understanding.

The LLM's capability sets a ceiling on what can be done.

Instead, you want to compose mini-apps, using them as building blocks in a much bigger whole.

The sky would be the limit of functionality, but you'd need a way to reason about the safety and data of the composed whole.

Without composition, enough tokens in context and big enough models might push the ceiling reasonably high, but never orders of magnitude higher.

The only way to get the ceiling to keep ratcheting up is to allow composition.

To solve both 1 and 2, you'd need a private, secure base that allows safe composition over your data.

For example, private cloud enclaves and information flow control.
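
To make the information flow control idea concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how any real system does it: mini-apps are modeled as plain functions, and every name (`Labeled`, `lift`, `compose`, `release`) is hypothetical. Each value carries a set of secrecy labels, the labels follow the value through every composed mini-app, and a sink can only receive the value if it is cleared for all of them.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# All names here are hypothetical, invented for illustration.

@dataclass(frozen=True)
class Labeled:
    """A value paired with the set of secrecy labels it is tainted by."""
    value: object
    labels: FrozenSet[str] = frozenset()

# Assumption: a mini-app is a function from one labeled value to another.
MiniApp = Callable[[Labeled], Labeled]

def lift(fn: Callable[[object], object]) -> MiniApp:
    """Wrap a plain function so its output inherits the input's labels."""
    def app(x: Labeled) -> Labeled:
        return Labeled(fn(x.value), x.labels)
    return app

def compose(*apps: MiniApp) -> MiniApp:
    """Chain mini-apps left to right; labels travel along the whole chain."""
    def pipeline(x: Labeled) -> Labeled:
        for app in apps:
            x = app(x)
        return x
    return pipeline

def release(x: Labeled, clearance: FrozenSet[str]) -> object:
    """Let a value leave only if the sink is cleared for every label on it."""
    missing = x.labels - clearance
    if missing:
        raise PermissionError(f"sink lacks clearance for: {sorted(missing)}")
    return x.value

# Two tiny mini-apps composed into a bigger one.
word_count = lift(lambda text: len(text.split()))
describe = lift(lambda n: f"{n} words")
summarize = compose(word_count, describe)

diary = Labeled("met Sam for coffee today", frozenset({"private:diary"}))
summary = summarize(diary)  # still carries the "private:diary" label
```

Calling `release(summary, frozenset())` raises `PermissionError`, while a sink cleared for `"private:diary"` gets the value. The point of the design is that neither mini-app has to be trusted individually: the policy is checked once, at the boundary of the composed whole.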
