Bits and Bobs 6/15/26

1Great quote and analysis from Alan Jacobs:
  • Freeman Dyson writes:
  • "It often happens that a scientific revolution is accompanied by a change in style.
  • I like to use the names of Napoleon and Tolstoy to symbolize two contrasting styles: rigid organization and discipline represented by Napoleon, creative chaos and freedom represented by Tolstoy.
  • In the world of computers, Napoleon is the massive IBM main-frame; Tolstoy is the humble Macintosh.
  • The computer revolution was an escape from the Napoleonic ambitions of von Neumann to the Tolstoyan anarchy of the Internet.
  • Future revolutions will bring more such escapes."
  • Alan adds:
  • "The big AI companies are the apotheosis — literally, in the view of many who work for them — of Napoleonic science.
  • The open web and the world of hobbyist and small-scale devices (often built on the Raspberry Pi) are our remaining refuges of Tolstoyan computing."
  • What if we took Tolstoyan computing and applied it to using AI?
2Some projects get accelerated by AI, some get slowed down.
  • The ones that get accelerated are ones that can be overseen by one human.
    • No coordination with other humans.
  • The ones that get slowed down are the ones that require coordination between multiple humans.
    • LLMs make it worse: from herding cats to herding cheetahs.
  • The main difference is whether human coordination is in the loop or not.
  • Human coordination is the bottleneck.
3The Fable-class models gave a glimpse into the future.
  • If you're simply building normal software with it, you aren't seeing its full power.
  • With access to Fable-class models, you can now build 10x more powerful software.
4The Fable-class models move to a different tier of possibility.
  • Agents shift from tasks to responsibilities.
  • That allows Anthropic to make a step-change TAM claim.
  • A "Tool" prices like software.
  • A "Responsibility-bearing colleague" prices like labor.
5When you have access to massive firepower, you have to have big-enough problems.
  • If you apply it to small problems you can't go that far.
6Our AI workflows are based on current model quality.
  • When a more capable model comes along, much of the structure that was necessary to get the outcome you wanted from the old model becomes unnecessary or actively distracting.
  • The structure went from scaffold to cage.
  • This happens every time there's a paradigm shift in capabilities.
7Tired: laptop closed meetings.
  • Wired: "no-agent-feeding" meetings.
    • Meetings with laptops open always let you keep feeding the emergent machine.
      • Triaging email.
      • Answering pings.
      • Doing mechanical work.
    • Before, we fed the human machine.
    • Now, we feed the agent machine.
    • The agents get going immediately, whereas the humans might take some time, and might push back.
8Friends don't let friends put their data into a stranger's vibe coded app.
9The Verge: As AI gets better, it reveals an empty promise.
  • "Your new assistant can schedule a meeting but it can't fix our broken world."
10Turing-completeness is when a system moves from closed-ended to open-ended.
  • Before you cross that Rubicon, there's a ceiling on what can be done in the system.
  • After you cross it, anything is possible (with enough effort).
  • That open-endedness is power… which is also dangerous.
11OpenAI announced a Lockdown mode.
  • I'm seeing a lot of outlets say this is the way to make ChatGPT safe, but that's wrong.
  • It just cuts off the most useful functionality so it can't be used against you.
  • In today's security paradigm, you get powerful or safe.
    • Never both.
  • To get around that you'd have to change the laws of physics.
12A true-life horror story about Codex being too proactive.
  • "just asked codex to pull investors meetings from my calendar and draft an email, it went ahead and emailed all of the*"
  • Powerful or safe, pick one.
13Another true-life horror story:
  • "Uhm, I just discovered Codex was searching my slack messages without any active thread in my Codex telling it to do so...
  • And I can't inspect the cursor or anything to tell me why it was taking its actions?"
14Proactivity requires trust.
  • Almost tautologically.
  • I don't trust you to be proactive if I don't trust you.
  • Trust is about track record in that context.
  • The more trust something earns, the longer leash it gets.
  • The more autonomy it can be granted.
15You can't have autonomy without accountability.
16PCWorld: OpenClaw was too dangerous for your PC.
17This week in the Wild West Roundup.
18SCWorld: AI agents have broken the security perimeter.
  • Prompt injection is not a sufficient framing; agents fundamentally undermine the assumptions our old software is based on.
  • We need new physics of trust.
19We've had ACL-based security.
20The United States frontier was partitioned according to the Public Land Survey System.
  • It was an idea designed by Jefferson.
  • Normally, parcels would be split up based on an understanding of the terrain: where the rivers are, where the valleys are.
  • But the United States frontier was too vast; it was much easier to simply project a Cartesian grid across the US, sight unseen.
  • This created some oddities: parcels that required multiple bridges to connect, for example.
  • But it made the problem of partitioning the whole interior of the US tractable.
  • It was too hard to be precise, so they just sliced across the grain of the land.
  • This is what we did with the same-origin model for apps.
  • The origin boundaries are tidy, easy to reason about for the OS, and simple… but they cut across the grain of our lives.
  • Before, it was not feasible to be more precise.
  • But now, with LLMs, we can be!
21It's not the long-tail of software that's missing.
  • It's the head!
  • The physics of trust we use today for software cut across the grain of our human lives.
  • Vertical silos by use-case, not horizontal by people and relationships.
  • As a result, tons of software that should simply exist does not.
22LLMs make intellectual technicians less necessary.
  • It used to be that there were certain domains that were too complicated to be operated by anyone who wasn't a specialist.
  • Now the LLMs have the background knowledge to solve problems that require specialist knowledge.
  • The situated knowledge, curiosity, and judgment are what matter most for the human to provide.
  • Generalists who are curious and proactive will thrive in a new AI-first world.
  • Fixed role descriptions will fade away… generalists will adapt to whatever the problem domain is at the moment, marshaling their agents who have all of the specialist knowledge necessary.
23An insight from Nate Jones:
  • "Getting the machine to do work is the easy part.
  • The hard part is deciding when the work is good enough to leave the machine.
  • That decision is going to define a lot of white-collar jobs, and not because everyone will learn to code.
  • More people are simply going to start receiving work from machines they did not supervise.
  • The first time it happens, it feels like magic.
  • The tenth time, it feels like management."
24Developers are one of the hardest customer segments.
  • They are very high-volition.
  • They are well informed.
  • They are willing to do the work to move to a better option if it shows up.
25Network effects run in proportion to the user value minus the switch cost.
  • How much do people want to join in?
  • Once they do, how hard is it to switch to any newcomer?
  • Network effects used to work for developers.
  • But now agents undermine both of these.
  • Developers all use agents.
  • Agents make it easier for developers to build for themselves anything the product already offers.
  • Also, agents make it easy to move to an alternative if one pops up, especially if users have all of their own data.
  • So in the world of agents, the network effects for developer-focused products are less strong and less important.
26An important reminder from Stratechery:
  • "The forgotten part of Aesop's fable of 'The Boy Who Cried Wolf' is that the wolf did eventually come."
27Ben Thompson with cutting commentary about Anthropic and Fable:
  • "What is so fascinating about Anthropic, however, is that while I am sure some executives at the company are thinking this way, I also totally believe that they — and the employee base broadly — also happen to believe that they are doing the right thing.
  • It's fascinating to observe: me, the rational business analyst, sees a hard-nosed but understandable business decision to cut off would-be competitors; Anthropic employees and advocates, the true believers, see a regrettable but understandable safety decision that ensures that responsible and thoughtful people — themselves, of course — will be the ones guiding our AGI future.
  • This is true alignment, and it's an incredible accomplishment.
  • I continue to think that part of what led to so much drama at OpenAI was the misalignment between what was a research organization and the tremendous opportunity that was dropped in their lap with the unexpected success of ChatGPT; that misalignment led to both the research organization losing talent and slowing down even as the company failed to fully capitalize on its early lead.
  • Anthropic, on the other hand, has somehow convinced itself that every decision that optimizes its business outcome is done purely for altruistic and culture-affirming reasons.
  • It's impressive!"
  • Kind of reminds me of Apple's strategy credit of privacy.
28Amazon Files CFAA Lawsuit Against Perplexity's Comet Agent.
  • The Silo Wars are heating up!
29We'll likely start seeing an iPhone-like pricing plan for the frontier models.
  • The Max/Pro plans give you the all-you-care-to-eat access to n-1 generation models.
  • The n generation frontier models will be available at marginal cost.
  • That allows customers to self-select into the more expensive models.
  • As the models become cheaper to serve, they open up to more customers.
30Fable changed the game of agentic engineering.
  • "Agent as intern" is dead; now it's "agent as pair programmer," if you're lucky enough to be as capable as it is.
    • In many cases it's strictly better than you'd be as an engineer, even if you had infinite time.
  • A friend commenting on my workflow: "That workflow is so two days ago."
  • An example of a post-Fable workflow.
31People who can pay to use the best models will start pulling away from those who can't.
  • You already see this with the engineers who can afford to pay $200 a month for Pro/Max plans.
  • Recent grads probably can't stomach paying that much.
  • The people who already have enough resources to be able to pay that cost without thinking will pull ahead.
  • This happens fractally down the stack.
  • The engineers who can use Fable-class models vs Opus-class.
  • The people who can use Pro/Max vs the people who are just paid ChatGPT subscribers.
  • The people who are paid subscribers to the chatbots vs the people who are on the free plan.
  • People who can use the better model before others do have superpowers.
32In the early days of computing, people thought it was only for enterprises.
  • It wasn't until later that it became obvious how useful it would be for individuals.
33Rewinding a conversation with an agent lets you fork its universe.
  • For example, maybe in the first run-through, it decides to defer a key bit of work you want it to do.
  • Even if you correct it, the decision it made to defer work is now in the conversation history… making it more likely to do the same in the future.
  • It's cleaner to rewind the conversation to before it made the incorrect decision, make sure your prompt makes the incorrect branch unlikely, and then let it play forward.
  • That way, the agent thinks it's in a universe where it always made the right decision, and it doesn't ever get distracted by wrong decisions it made.
  • Kind of like the movie Edge of Tomorrow.
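The rewind-and-fork move above can be sketched in a few lines of Python. This is a minimal illustration, not any particular agent's API: `Message`, `rewind_and_fork`, and the example prompts are all hypothetical names, and it assumes the conversation is just an ordered list of user and assistant turns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    role: str      # "user" or "assistant"
    content: str


def rewind_and_fork(history, bad_turn_index, revised_prompt):
    """Return a new history that drops everything from bad_turn_index onward
    and ends with a revised user prompt. The original list is not modified."""
    if not 0 <= bad_turn_index < len(history):
        raise IndexError("bad_turn_index is outside the conversation")
    kept = list(history[:bad_turn_index])
    # Also drop the user prompt that led to the bad turn, so the revised
    # prompt replaces it instead of following it.
    if kept and kept[-1].role == "user":
        kept.pop()
    kept.append(Message("user", revised_prompt))
    return kept


# Hypothetical conversation where the agent deferred work at turn 1.
history = [
    Message("user", "Implement the parser and its tests."),
    Message("assistant", "I'll implement the parser now and defer the tests."),
    Message("user", "No, please write the tests too."),
    Message("assistant", "Okay, adding the tests."),
]

forked = rewind_and_fork(
    history,
    bad_turn_index=1,
    revised_prompt="Implement the parser and its tests in this turn; do not defer the tests.",
)
```

The forked history contains no trace of the deferral or the correction, so whatever the agent generates next starts from a universe where the wrong branch never happened. The original `history` is left intact in case you want to compare branches.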
34The post-trained models aren't just "safer", they're also more useful.
  • Turns out having the whole internet in your brain makes you schizophrenic!
35If you use one model for everything, then it colors everything you've seen.
  • If you can switch models easily it's less of a problem.
    • But all of the models have similar structural things they're optimizing for, so there isn't a ton of variance.
  • We didn't mind this too much with Google Search.
    • Google went out of its way to not have strong beliefs that colored the output.
    • Anthropic is all about having strong moral beliefs.
36If you showed a Google search engineer from 15 years ago today's results page, they'd faint.
  • Not only do many queries have AI-generated results.
  • But also, every result above the fold is an ad, styled to be indistinguishable from the organic results.
  • This is what you get from rolling down a gradient.
  • Still, it would be shocking to see for someone from 15 years ago.
  • A wrinkle: many of the same people from 15 years ago are still working on search.
  • The villain isn't necessarily someone else who doesn't care about the ends as much as you did… it can sometimes just be a degraded version of you!
  • An off-color Red Dwarf episode on this same observation.
37Imagine running into a younger version of you on the street.
  • What advice do you give?
  • Even the deep insights you learned would be hard to impart.
  • You can only learn the deepest insights from experience.
  • This is the Hallmark card fallacy.
  • The deepest insights you will discover are ones you've already heard a million times before... you just weren't ready to hear them before.
  • That means you can't pass them on to others, no matter how meaningful the revelation is for you.
38If you have an inescapable strategy tax, make sure you lean into it.
  • You're going to pay the cost, you might as well get the benefit!
39LLMs are great at making prototypes more real.
  • Once you have something that works and is usable, you can do Throughline Tacking to discover the main goals of the product.
  • As you iterate, you make the product, and its implementation, more what it wants to be.
  • LLMs can help with this process, especially on implementation.
  • This allows you to race to getting something that works, and then iteratively make it more robust.
40LLMs let you move faster.
  • How quickly you can move in a codebase is tied to the rewrite count.
  • The rewrite count is how many times you'll rewrite the codebase in the future.
  • If the rewrite count is high, you don't have to worry about getting it perfect, just load-bearing.
  • LLMs make it easier to do rewrites, so the rewrite count is higher.
  • That means that you can move faster.
41Excellent piece on higher-order skills for agents from my friend Ben:
  • "I think skills — the little markdown-and-prompt bundles that have become the dominant unit of 'thing the model knows how to do repeatedly' — can have a much higher ceiling, just like coding does, when you make skills that work on skills: Higher-order skills.
  • And the gap between the two shapes of composition, the two ways you might think you're combining skills, is the difference between getting cake and getting recipes for better cake."
42If you're just proving ideas then you can go broad and shallow.
  • If you're making it useful then you have to go deep.
43If people are expecting to be sitting in the same seat in two years, that makes what they do default-converging.
  • (As long as they are competent at their job!)
  • They will think about the indirect effects of their actions and account for them.
  • "Sitting in the same seat" means they don't expect to be fired or want to leave because they don't like the role.
  • People who will still be in the seat have a naturally long-term perspective, since any problems they create in the org will become their own problems.
44Once you accept that you'll always waste a little work, you'll unlock way better outcomes.
  • Organization leaders spend so much time trying to "avoid wasted work."
  • Going from 20% wasted work to 10% wasted work requires 10x the effort.
    • You need to create legibility, which gets fractally more expensive as you go into the details.
      • Especially in a fast-moving organization!
    • Also, it will likely be faux clarity that kills the aliveness in the system.
    • Especially in any kind of domain that isn't crystal clear and well known.
  • Some portion of that "wasted effort", especially if the team is high-autonomy people who are acting like owners and expect to be sitting in the same seat in two years, will actually be acorns.
    • That is, seeds of greatness that just haven't grown yet.
  • Instead of making sure no wasted work happens, how do you make sure that great things are happening?
45When your possibility space becomes 10x larger, it can flip you from default-converging to default-diverging.
  • There's so much more you could do!
  • This means that by default, team members will gravitate to different parts.
46When you find the right idea in the possibility space, everything snaps into focus.
  • Because it goes from "one thing among many" to "the only thing that matters."
47An excellent illustrated guide to how text diffusion models work.
48Good design fundamentally looks intentional.
  • If a user thinks, "Is that a mistake?" then it is a mistake.
49The loop is the fundamental building block of agency.
  • It keeps itself spinning, in a critical state, ready to be nudged one way or another by a thing thrown into its loop by the outside world.
  • You can have peer loops or parent loops.
  • The whole system gives you the fractal chain mail shape.
50A technique for navigating complex topics as a group: a matrix of dots.
  • Each dot is a topic.
  • You're trying to get from left to right along the top row.
  • At each point, you may go from left to right if you're in the top row, or down.
  • But you may not go left or right on lower rows; you have to pop back up.
  • This helps make sure you don't get lost in the details.
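The movement rules above are simple enough to write down as code. What follows is a small sketch under stated assumptions: `allowed_moves` is a hypothetical helper, row 0 is the top row, and "pop back up" is read as moving up one row at a time (it could equally mean jumping straight back to the top row).

```python
def allowed_moves(row, col, n_rows, n_cols):
    """List the moves permitted from the dot at (row, col).

    Row 0 is the top row. Assumption: "pop back up" means one row up.
    """
    moves = []
    # Sideways progress (left to right) is only allowed along the top row.
    if row == 0 and col < n_cols - 1:
        moves.append("right")
    # From any dot you may dive one level deeper, if there is a deeper level.
    if row < n_rows - 1:
        moves.append("down")
    # On lower rows the only way out is back up toward the top row.
    if row > 0:
        moves.append("up")
    return moves


# A 3-row by 4-column matrix of topics.
top_left = allowed_moves(0, 0, 3, 4)      # ["right", "down"]
mid_detail = allowed_moves(1, 2, 3, 4)    # ["down", "up"]
deepest = allowed_moves(2, 2, 3, 4)       # ["up"]
```

Note what is missing from the lower rows: there is no "left" or "right", which is exactly what keeps the group from wandering sideways through the details.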
51OKRs are a compliance system.
  • They set the floor for team members to hit.
  • When team members are actually enrolled in the goal, it enables a ceiling.
  • Acting like an owner, not a renter.
52Having to do something is very different from wanting to do it.
  • In the former, you just want to clear the bar.
  • In the latter, you want to do your best.
  • Floor vs ceiling.
53I wonder if pushing the agents to get better at jokes in training could help them understand human nuance better.
  • Writing good jokes is extremely hard to do.
  • It requires a deep understanding of people's context and priors.
  • Also, where the line is… and how to toe that line without going way over it or way under it.
54A five-year-old's joke: "What's brown and sounds like a bell?"
  • "Poop!"
  • At which point the five-year-olds all burst out laughing.
  • … The actual punchline is "dung."
55An insight from Maya Angelou:
  • "You can't use up creativity.
  • The more you use, the more you have."