A few random off-the-cuff reactions to OpenAI's strawberry model in no particular order:
Its performance is something you could sometimes already cobble together with a whole lot of prompt-fu and judgment.
e.g. https://x.com/tommy_winarta/status/1834550186099576958 and https://x.com/daveshapi/status/1834599760931569677
But now it's fully automated and doesn't require user savviness, and also it's potentially self-improving.
It's kind of "just make it do chain of thought, but hide the intermediate steps from the user so they don't distract, and also make the problem-solving parts self-improving with more training."
One of the reasons AlphaGo was self-improving (more compute = better skills, without limit) is that the rules of Go are well grounded.
It was very easy to keep it aligned with how Go actually works, so as it improved it never lost contact with reality.
A computer could run a normal program to ensure the "laws of physics" were consistent.
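To make the "normal program" point concrete, here is a minimal sketch of a rules checker. It uses tic-tac-toe as a toy stand-in for Go, and every name in it (`is_legal`, `winner`, `apply_move`) is made up for illustration; it is not how AlphaGo was actually built. The point is only that legality and the final outcome are decided by plain code, so a learner trained against it cannot drift away from the real rules.

```python
# Toy stand-in for Go: tic-tac-toe on a 3x3 board, stored as a list of 9 cells.
# All names here are hypothetical and exist only for this illustration.
EMPTY = "."

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def is_legal(board, move):
    """A move is legal only if it is on the board and the cell is empty."""
    return 0 <= move < 9 and board[move] == EMPTY


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None


def apply_move(board, move, player):
    """Return a new board, refusing any move the rules do not allow."""
    if not is_legal(board, move):
        raise ValueError(f"illegal move {move}")
    new_board = list(board)
    new_board[move] = player
    return new_board
```

However good or bad the player proposing moves is, `apply_move` rejects anything illegal and `winner` scores the result the same way every time. That is the kind of grounding a game gives you for free.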
Strawberry is expensive!
It reminds me of how supersonic consumer air travel is "physically possible but not economically viable."
There will be a point where for a given use case it's just too expensive.
So you only break out the big guns rarely, when you really need them.
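A rough way to picture the "too expensive for this use case" cutoff, as a back-of-the-envelope sketch. The function name, the parameters, and all the numbers in the usage note are hypothetical, not real prices or measured success rates: escalate to the expensive model only when the expected gain from a better answer exceeds the extra cost.

```python
def should_escalate(task_value, p_cheap_ok, p_big_ok, cheap_cost, big_cost):
    """Decide whether the expensive model is worth it for one task.

    task_value: what a correct answer is worth to you (same units as cost).
    p_cheap_ok / p_big_ok: your guess at each model's chance of getting it right.
    cheap_cost / big_cost: what one call to each model costs.
    All inputs are assumptions you supply; nothing here is measured.
    """
    expected_gain = (p_big_ok - p_cheap_ok) * task_value
    extra_cost = big_cost - cheap_cost
    return expected_gain > extra_cost
```

With made-up numbers: for a task worth 100 where the big model lifts the success chance from 0.6 to 0.9 and costs 1.00 instead of 0.01, escalating pays off. For a task worth 0.50 with the same odds and prices, it does not.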
The thing I think is most interesting is how hard it is to steer something with a very long feedback loop.
You set it going and come back to it later... and if you realize "oh crap I forgot to include some relevant context" you have to do it all again.
Longer feedback loops are inherently much harder to steer, and harder to learn how to steer, because you get fewer rounds of experience with them.
The dumber models need a human at the steering wheel, guiding them iteration by iteration. The downside is that the human has to be there and has to know how to do the lightweight steering (prompt-fu). The upside is that the human can keep making small corrections, instead of the model running off for minutes to answer a question that turns out to be ill-posed.