Bits and Bobs 11/4/24
1. Last week I observed how frustrating it is when an app you use is missing one feature you want.
Why don't apps add those features?
Because in software used by many people, features that only some subset of users want increase the complexity for everyone.
Software becomes the superset of all functionality used by all users.
The more powerful it gets, the bigger the overhang of functionality for any given user, and thus the more complexity for most users.
Complexity is functionality a given user doesn't need or want.
In a world of one-size-fits-none software, you can't make it a perfect fit for any given user, because it would make it worse for everyone else.
But what if software was situated, perfectly personal?
Now you could have software that fits everyone like a glove.
This is now possible because LLMs mean software in the small is no longer expensive to write.
2. Jailbreak your data!
Transform it into a form you can tinker with, on turf where you call the shots.
One-size-fits-none software is a cage for your data.
It's your data. Free it!
3. LLMs only do superficial pattern recognition, but they can do it incredibly robustly.
They are amazing at superficial absorption of patterns.
But if you bombard them with tons of examples, you get significantly more robust pattern recognition… though still superficial.
When it's sufficiently robust, it looks like deep pattern recognition.
It's like "baking" shadows in old video game engines, where you precompute the lighting and then create object textures with the shadows baked in.
Looks great (if you don't look too closely) but is extremely cheap.
If you give LLMs a gobsmackingly large number of examples, you can get gobsmackingly robust recognition of those patterns.
4. Imagine trying to explain the power of the algorithm to someone 20 years ago.
"This algorithm you say controls everything... is it in the room with us now?"
5. Humans will imbue anything with agency and anthropomorphize it easily if it has a face.
Look at a Tamagotchi.
Anything with a face that you can talk to.
Pond scum has no face.
LLMs and other emergent algorithms are closer to pond scum intelligence than human intelligence.
But LLMs put a face on pond scum intelligence.
You don't trust your Apple computer, you trust Apple.
But now with LLMs we feel like we trust the LLM, because it has a "face" and we can talk to it.
6. ValTown is pretty great.
"Not no code, just code."
When you're programming in the small, a lot of the architectural challenges of writing code fade away.
You don't use ValTown because of its framework; you use it because of the convenience to write useful bits of software in the small, and its opinionated, slightly magical framework is worth it to get access to that environment.
From constraints comes power.
What if you could write little Vals that would run on your most sensitive data?
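To make "software in the small" concrete, here's a minimal sketch of the kind of tiny, single-purpose program this enables, shaped like a Val Town HTTP val (a web-standard Request-to-Response handler). The handler shape and the `?deadline=` parameter are illustrative assumptions, not a spec:

```ts
// A tiny single-purpose service: "how many days until my deadline?"
// The Request -> Response shape mirrors web-standard fetch handlers;
// the query parameter is hypothetical.
export default async function handler(req: Request): Promise<Response> {
  const url = new URL(req.url);
  const deadline = url.searchParams.get("deadline"); // e.g. ?deadline=2025-01-15
  if (!deadline) {
    return new Response("Usage: ?deadline=YYYY-MM-DD", { status: 400 });
  }
  const msLeft = new Date(deadline).getTime() - Date.now();
  if (Number.isNaN(msLeft)) {
    return new Response("Could not parse that date", { status: 400 });
  }
  const daysLeft = Math.ceil(msLeft / (1000 * 60 * 60 * 24));
  return new Response(`${daysLeft} day(s) left`);
}
```

The whole program is the feature; there's no architecture left to maintain.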
7. I love The Elves and the Shoemaker fairy tale.
The shoemaker can't make enough shoes.
At night the elves come to help him, unbeknownst to him.
Wouldn't it be cool if they'd make software for you while you slept?
Write down your wishes, the system tries to fulfill them!
Such a system is proactive.
It could help come up with useful things that you didn't even know to ask for.
For this system to work and not feel creepy or manipulative, it must work only for you and be fundamentally aligned with your interests.
8. When the web first came out, "real" programmers turned their noses up at it.
A serious question people asked at the time: "Do web pages count as real programming, or are they just toys?"
Everyone then knew that real programmers wrote Windows apps, not web pages.
The web was "weird" but it was weird in a way that was true to its nature.
One way to have approached the start of the web would have been: "Make it possible to write things that feel like writing Windows apps, and make them distributable in this new ecosystem."
But that would have foregrounded the wrong thing: the Windows app and all of the expectations and baggage it brought along with it.
The actual important thing for the web was the new universe of possible experiences that come with changing the laws of physics with a radically different distribution model.
The distribution model was so powerful and so good that even though "real" programmers turned their noses up at it, a whole new class of programmers came into being that were web native.
Over time, the web became more and more powerful as more people stretched it to do more things.
Today, no one argues whether people creating web apps are "real" programmers.
9. The inflection point for Chrome's adoption was adding themes.
If you're going to live in a system you might as well make it feel like yours.
10. Code is easier to tweak than to write.
Writing code from an empty file requires free recall of all the details and idioms… easy to mess up, even in an environment you're familiar with.
Code that works and does something slightly different than what you want is much easier to tweak to do something you want.
Today, it's unlikely that the code you come across is almost what you want.
Maybe it's tutorial code, or starter-kit code. It works, but doesn't do much.
The one exception is when you're tweaking a project you already use to add just one more feature–the code already roughly does what you want, modulo the one tweak you're making.
But LLMs make it so it's much easier to create working code on demand that you can then tweak if necessary.
11. We only bother thinking thoughts that could fit in a human brain.
Some human brains have more capacity than others, but no brain is more than some small multiplier above the average.
That means that structurally there are ideas that are too broad to fit in any one human's brain.
We can 'cheat' this as a society by using tools like science to accumulate ideas that are broader than any one human can hold in their brain.
It gives the illusion of breadth because even though no one human brain can comprehend it, the collective intelligence of many brains can handle pieces of it.
But LLMs are capable of much broader awareness than human brains, structurally.
Not a multiplier better, but many orders of magnitude better.
LLMs can think patiently about things that humans would get too overwhelmed to think about.
There are a lot more thoughts that society can now discover and accumulate.
12. People learning Rust are terrified of the borrow checker.
It's a big, scary concept that's hard to grapple with.
But if you write Rust in the "Rust way," using its idioms, in practice you are unlikely to run into the borrow checker very often.
It will simply work roughly as you expect it to.
Yes, there is a big scary concept lurking… but if you do as the Romans do, if you adopt the local idioms, you won't really run into it much.
13. I think there's something fundamentally gross about almond milk.
(Yes, I am a picky eater.)
The thing I find gross about it is that it pretends to be milk… but isn't.
If my expectation is "milk" when I drink it, my body will go "wait, watch out, something is off…"
If I didn't have the expectation of it as milk, I don't think I'd find it nearly as gross.
This general disgust reaction happens in a lot of contexts, where there's something that purports to be the same as a familiar item, but is off.
Frameworks that include a lot of magic feel like this.
For example, the original Golang App Engine implementation had a whole forked Go ecosystem, which always felt… off.
Often when things are "off" like this, it's hard for people to explain why they don't like it; they just feel a disgust and distrust of it.
Sometimes the faux thing can get very close to the thing it's mimicking, so that 95% of the time you won't notice a difference.
But that 5% of the time will have a bite.
The closer you get to being the same, the more that last bit will be unexpected, and the harder it will bite.
Sometimes the answer is not to pretend to be another thing, but to lean into being your own thing.
14. Sugar in a system is magic.
It makes the thing taste better, but in a way that can give you cavities.
The best kind of sugar is inductively knowable; the things it does can be described in just a few, easy-to-understand lines of userland code.
That means the sugar can make your day-to-day experience sweeter, but also that if the sugar is getting in your way, it's possible to peel it off and just step one un-intimidating layer down.
Sugar gets confusing when it does magical things nothing else can do, or it applies differently in different contexts, so it's hard to reason about where it applies.
In that case, if you do need to peel off the sugar, it's a big, scary drop down.
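A hypothetical example of inductively knowable sugar in TypeScript (the `pluck` helper is made up for illustration): the call site reads sweeter, and the desugared form is exactly one obvious step down:

```ts
// Hypothetical sugar: its entire definition is a few lines of userland code,
// so peeling it off is an un-intimidating, mechanical step.
function pluck<T, K extends keyof T>(items: T[], key: K): T[K][] {
  return items.map((item) => item[key]);
}

const users = [
  { name: "Ada", id: 1 },
  { name: "Grace", id: 2 },
];

// With the sugar:
const names = pluck(users, "name"); // ["Ada", "Grace"]

// Peeled off -- the desugared form is obvious and mechanical:
const namesDesugared = users.map((user) => user.name);

console.log(names, namesDesugared);
```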
15. Riffing on a metaphor for declarative vs. imperative from React's documentation:
Imagine you're asking someone to drive you somewhere.
Imperative is like giving a friend instructions on how to get there, describing each turn.
They won't surprise you, but they can never do better than the route you know.
It's easy to get confused or distracted and give them bad instructions.
Downside is capped, but so is upside.
Declarative is like giving a taxi driver a destination and leaving it to them to figure out how best to get there.
If they're experts, they'll likely know ways that you don't!
But with declarative, you're really leaning on the expertise of the driver.
Downside and upside are more open-ended.
In environments where expertise is important, and there is an established expert you can rely on, declarative can be a good idea.
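The same contrast in code form, as a hedged sketch (`findRoute` and `Destination` are invented stand-ins for whatever expert you're delegating to):

```ts
// Imperative: spell out every turn yourself. Nothing surprising,
// but never better than the route you already know.
const directions = [
  "head north on 3rd Ave",
  "turn left on Pine St",
  "turn right on Broadway",
];
directions.forEach((step) => console.log(step));

// Declarative: state the destination and delegate route-finding.
type Destination = { address: string };

function findRoute(dest: Destination): string[] {
  // A real router could know shortcuts you don't; this stub just
  // illustrates that the "how" now lives behind the declaration.
  return [`best available route to ${dest.address}`];
}

findRoute({ address: "123 Main St" }).forEach((step) => console.log(step));
```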
16. Private Cloud Enclaves provide a server you don't have to administer that you can trust to be honest.
If you run a server yourself, you have to do all the administration. A pain!
If someone else runs it, you have to trust them to be honest. Dangerous!
Most users don't really worry about cloud providers peeking (e.g. Google looking into your VM).
Cloud providers are contractually obligated not to peek.
They're unlikely to bother with your small VM anyway.
The real problem is trusting the service provider administering that VM to do what they say.
Private Cloud Enclaves thread this needle.
Someone else can run the server (avoiding maintenance pain).
But you can verify they're being honest (avoiding trust danger).
Remote attestation is the key technology.
It allows verifying that the VM is running exactly what it claims to.
It creates trust through verification, not promises.
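In code terms the trust argument reduces to a small check. A schematic sketch only: every type, constant, and helper below is a hypothetical stand-in for vendor-specific attestation formats and certificate chains:

```ts
// A schematic sketch of the remote-attestation trust argument.
interface AttestationQuote {
  measurement: string; // hash of the exact software the VM claims to run
  signature: string;   // signed by a key that only genuine hardware holds
}

// Assumption: we know the hash of the open, audited build we expect.
const EXPECTED_MEASUREMENT = "sha256:abc123...";

function signatureIsValid(quote: AttestationQuote): boolean {
  // Stand-in for verifying the signature against the hardware
  // vendor's published root of trust.
  return quote.signature.length > 0;
}

function verifyEnclave(quote: AttestationQuote): boolean {
  // Trust through verification, not promises: the operator never has
  // to be believed, because the hardware vouches for what is running.
  return signatureIsValid(quote) && quote.measurement === EXPECTED_MEASUREMENT;
}

console.log(verifyEnclave({ measurement: EXPECTED_MEASUREMENT, signature: "x" }));
```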
17. Some party tricks are load-bearing.
Imagine an AI feature that is potentially creepy… but that works entirely on device.
The party trick is that you can flip on airplane mode and it still works.
This demonstrates that it really is on-device.
People will rarely do the party trick in normal use.
But the fact that you can do the party trick is load-bearing.
It's a party trick that's impossible to fake.
The whole inductive argument of "here's why you should trust this" relies on that party trick.
18. Good enough in practice is orders of magnitude better than great in theory.
"In practice" means that it is viable; it works in the real world.
The hard part in practice is not greatness, it's viability.
It's easy to imagine a "great" idea that works only in your mind.
"It's perfect inside my head"
But what matters is what exists in the world.
If it's perfect in your head but you can't make it exist in the real world, maybe the idea isn't actually viable?
Greatness is hard to retrofit to an idea; greatness is more about fundamentals.
A general strategy: pick a thing that could be great, and try to get it to viability cheaply and quickly.
These things that could be great are things I often refer to as seeds.
Keep iterating until you find one that is viable and starts growing.
19. Every system is a bowl on a pedestal.
The bowl is the self-righting zone: after perturbations the system will tend to roll back to equilibrium on its own.
The pedestal is the self-intensifying zone: once a perturbation pushes the system out of the bowl, it tends to move away from equilibrium at a self-accelerating rate.
The "height" of the pedestal is the rate at which the system accelerates in this zone.
This is a tighter frame on an observation I made a few weeks ago.
When considering the system, ask yourself:
How deep is the bowl?
How high is the pedestal?
The deeper the bowl and lower the pedestal, the more robustly safe the system.
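A toy simulation of the picture, with made-up constants: `righting` stands in for the depth of the bowl, `intensifying` for the height of the pedestal:

```ts
// Inside the bowl (|x| < bowlRadius) the system self-rights toward
// equilibrium; past the lip it self-intensifies.
function step(x: number, bowlRadius = 1, righting = 0.5, intensifying = 0.3): number {
  return Math.abs(x) < bowlRadius
    ? x - righting * x        // self-righting: pulled back toward 0
    : x + intensifying * x;   // self-intensifying: accelerates away
}

function simulate(x0: number, steps = 10): number[] {
  const xs = [x0];
  for (let i = 0; i < steps; i++) xs.push(step(xs[xs.length - 1]));
  return xs;
}

// A small perturbation decays back to equilibrium...
console.log(simulate(0.8).map((x) => x.toFixed(3)));
// ...while one past the lip of the bowl runs away.
console.log(simulate(1.2).map((x) => x.toFixed(3)));
```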
20. Looseness is a downside when trying to have an efficient, correct, tightly steerable system.
Those situations are where you want something that is hard.
But looseness is great for resilient, adaptable systems.
Those are situations where you want something that is soft.
Looseness is antifragility; it allows variance to be absorbed and turned into strength.
21. Viability is not an instantaneous thing.
A thing has to be viable at all time steps, or it dies.
If the context changes, then a previously viable thing might die on the spot.
A thing that has been continuously viable for a long time (over an implied diversity of contexts) is impressive!
That's one reason that businesses are right to brag about when they were established.
22. It's easier for superpositions to exist in our heads than in the world.
If you have a nuanced idea in your head, either pay the coordination cost to write it down and communicate it concretely enough, or execute it yourself.
A superposition takes exponential complexity to write down, and once written down it's no longer a fluid thing that can adapt on its own; it's a concrete thing that is brittle and has to be changed to fit if necessary.
If you execute the idea yourself, you fundamentally can only get a single multiplier on the idea, never orders of magnitude.
This fundamental tension is part of what makes it so frustrating to get anything coherent done in organizations.
23. In a debate to discover the truth in a complex environment, who should win an argument?
The person who truly understands all of the arguments advanced by others.
A lot of debate is framed to the other person as "here's a thing you don't yet understand".
But sometimes the other party does understand it; they just see beyond what the debater sees, or along dimensions the debater isn't seeing.
In most situations we use the proxy of "person with formal authority" for "person best positioned to understand all of the relevant tradeoffs" but it's just a proxy.
If you're trying to convince someone else and you don't have formal authority, a winning move: steelman back the other person's perspective to demonstrate that you understand it, and still think a different thing is better.
The best indication that you understand an argument is the ability to steelman it (not just parrot it) to the satisfaction of one of its primary proponents.
24. It's more important for an OODA loop to be grounded in reality than to be fast.
A fast OODA loop is good, but what matters most is that it touches ground truth on every cycle.
If it doesn't, you get speed that decoheres from reality.
The faster you loop, the farther from reality you drift.
There's no time to point out ground truth, and doing so gets more and more dangerous as the loop decoheres, so people are less and less likely to point it out.
The kid who laughed at the naked emperor didn't get rewarded, and easily could have been killed.
It's easier to see if the loop is fast than to see if it's grounded.
If you go fast in a fundamentally slow environment (e.g. a large organization), you'll get performative motion.
Real speed is hard, so you get fake speed.
Chaos is OK if it's ground truthed.
If it's just a hurricane-force swirl for promo, then it's extremely dangerous.
25. The more powerful you are, the more the environment will mold itself to fit what you like.
Even if you didn't ask it to!
This process will be invisible to you.
As a hyper-successful person you don't have to kick people out of your circle actively, just involuntarily raise an eyebrow and the world will adjust to kick that person out of your circle.
An example of this dynamic is "the bear is sticky" from the show Silicon Valley.
26. Even when you know that you'll tend to prefer people who flatter you, it will still affect you.
Everyone would rather be surrounded by people who flatter them.
It's just that for most people that's not an option; we have no choice but to be surrounded by many people who won't flatter us.
Which is good–that helps us learn our flaws and own them and grow.
But on the margin, if we could choose between one person who flatters us and one who doesn't, we'd always pick the flatterer, all else equal.
If you are hyper-successful, you'll have an endless array of people who would love to hang out with you.
That means the people who you allow to hang out with you will be structurally much more likely to pander to you.
27. I imagine being hyper-successful would be a very lonely path.
Imagine starting off just like everyone else.
Then you become radically more powerful: famously rich or successful.
After that point, you know that everyone you meet is just pandering to you, so you don't trust them.
Who do you cling to?
The people you knew before you were hyper-successful.
And also people who have been through the same thing as you, your peers.
Those peers probably have a more similar experience to you, so you cling to them more tightly than people you knew before.
But if there are structural things that are more common in that path (e.g. personality traits, or blindspots, that all of you share) you'll now have an echo chamber.
That echo chamber can self-accelerate and remove you from the ground truth reality.
All of you in the peer group are successful, and you all only really trust each other.
Some weird things can emerge out of that.
29. Lots of people are talking about how LLMs might change how large organizations work.
I think LLMs will almost certainly have a big effect on how organizations work.
But I don't think it will be a panacea.
The "metagame" of corporate politics emerges inexorably.
It arises from every player trying to get an edge so they don't get knocked out of the game.
That, combined with the core asymmetry that everyone has to act like they think their boss is right.
The boss is the one formally empowered in a system to kick a report out of the company.
So it's imperative to make sure they are happy or you might be knocked out of the game.
The metagame arises inexorably from these.
If you change the substrate, the metagame doesn't go away; it just shifts and adapts to the new reality.
Perhaps in a way that isn't obvious to start, and thus is potentially more harmful.
Examples of how an LLM-based tool to give leadership status updates might lead to different on-the-ground strategies:
People will learn to say the words they've heard the CEO say, to be less likely to be flagged by the system as working on something unimportant.
People will learn that if they can get lots of people to say the same distinctive words, it's more likely to catch the attention of the CEO.
A kind of inception by toying with small structural levers.
"We don't have politics here" has never been true for any assemblage of human beings ever, and never will be.
30. Someone who embraces complexity realizes the tensions are fundamental and cannot be resolved, only dynamically and contextually balanced.
The reductionist says "if we simply push a little deeper, work a little harder, be a little smarter, we can banish the edge cases to ever smaller impacts."
But complexity is nonlinear. You can't chase down details and fix the whole. If you tighten here, it squishes out there (and perhaps more strongly than the original constriction).
31. A utilitarian / reductionist looks at the success of AI systems in chess and sees a model for fixing the world.
Chess algorithms (at least, before the Alpha-class models) worked by encoding a tuned utilitarian accounting of board states.
But chess is a perfect knowledge system with clearly defined win conditions, a closed system, rigid categories.
None of those apply in complex environments, which are the ones we live in!
The utilitarian argument of "as long as you have the perfect accounting of misery and thriving points it's easy" is a smuggled infinity.
Everything past the "perfect" is absurd, because perfection is impossible in complex environments.
32. To get a handle on causation you have to experiment: perturb and observe.
When computers learn chess they can generate more games to absorb and experiment with.
But when it's a real-world phenomenon they are predicting or affecting, they can't create more data, and can only learn correlates, not causation.
If you optimize a correlation you just accentuate the bias.
The bias could theoretically be a random happenstance at the start that constantly blew up larger and etched deeper.
Grains of dust after the big bang that grew into whole galaxies.
You can optimize a causation, not a correlation.
33. Multi-ply is hard to distill; it's exponential.
It's a superposition.
That's why wisdom is hard to pass down: it can only be earned and felt, and it's inherently multi-ply.
Do a thing, and then see and absorb the indirect effects.
The more indirect effects you absorb, the wiser you are.
34. Saying the AI has wisdom demonstrates you don't know what that word means.
LLMs are intelligent, not wise.
Intelligence can be one-ply.
Wisdom is multi-ply (the intuition born of experience, which is inherently multi-ply).
35. Beware people who make you feel good about your anger.
That's the easiest way to manipulate you.
The rest of the world is pushing back on your anger, making you feel more angry.
The person who says, "yes, you are right to be angry, you should be more angry!" will then be able to redirect your anger elsewhere.
36. How quickly does a system bitrot?
At the rate of change of the surrounding systems it has to integrate with.
37. To learn, you first have to realize you were wrong.
If your ego does not allow you to realize you were wrong, you cannot learn.
Sarumans have a structurally harder time learning.
That allows them to heroically push forward with a determination and confidence no one else can muster… and if it ends up being a great idea, that can create a lot of value for society.
But watch out if they're steamrolling forward on a bad idea!
38. Dictator mode allows you to execute quickly, but breaks the feedback loop.
All throttle, no steering wheel.
You won't realize what you've lost until it's too late.
This is a massive and fundamental tradeoff.
39. Democracy is slow but antifragile.
It's an internally-ground-truthing system.
It's hard for a democracy to execute on any coherent, bold idea with force.
But it's also much less likely to decohere from ground truth; the process for re-ground-truthing is internal to itself, and not external.
An external ground-truthing mechanism requires competition; if competition dries up for whatever reason, the ground-truthing mechanism stops working.
Antifragile systems are internally ground-truthing.
40. The authoritarian "I'm a king, you're a peasant" is hostile to innovation.
In that model, innovation can only come from the king.
The Renaissance happened partially because of human rights: "good ideas can come from anywhere, actually."
41. A provocative observation: "Founder mode is authoritarian-curious."
Within an organization, an authoritarian approach can be useful: it doesn't get distracted and pushes through the swirling chaos to make something happen.
But it can also do a lot of damage if it's not ground-truthed.
Companies are authoritarian inside but are competing in the market outside.
The market keeps companies ground-truthed; if they weren't, they'd be knocked out of the game by more capable competitors.
Companies can have founder-mode and survive, because there's an outside world they have to compete and survive in, constantly ground-truthing them by force.
But sovereign states are different.
Sovereign states have monopolies on a given geography.
The competition is much more diffuse and over time.
Authoritarianism in an environment that's not rigorously ground truthed is extraordinarily dangerous.
42AT&T is still big, but MySpace is gone!
Maybe the length of companies as a going concern is directly tied to how much margin they extract (factoring out an internal flywheel).
43. Should society care if a given company dies?
It seems like in general we should only care about the economy in toto.
Companies die all of the time in the constant process of ground truthing and the economy swarming to find good ideas.
One argument for wanting longer-lived companies is that companies that know they won't be around for long won't think about externalities at all and just extract.
Companies that are around longer have to think about the indirect effects of their actions, because those effects will swing back and affect them within their lifetime.
44. Goodhart's law shows up in complexity.
Imagine Goodhart's law as the streetlight fallacy.
You optimize the value that is internal, captured by the model, in the light of the streetlight.
You ignore whatever is in the dark (the externalities) even if it is larger and more important.
Computers will optimize for the objective metric, mercilessly.
They will create the value under the streetlight, at the cost of destroying the value that's in the dark.
In complex environments, most of the value is in the dark.
So a hyper-optimizer will destroy value if you aren't careful.
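A toy sketch of that destruction, where `measured` is the metric under the streetlight and the made-up `trueValue` includes what's in the dark:

```ts
// A hill-climber maximizes the measured metric while the true value,
// which includes what the metric can't see, quietly falls.
// All numbers are illustrative.
function measured(effort: number): number {
  return effort; // the metric only sees direct output
}

function trueValue(effort: number): number {
  return effort - 0.2 * effort * effort; // includes externalities in the dark
}

let effort = 0;
for (let i = 0; i < 10; i++) {
  effort += 1; // the optimizer mercilessly climbs the metric
  console.log(
    `effort=${effort} measured=${measured(effort)} true=${trueValue(effort).toFixed(1)}`,
  );
}
```

The measured score climbs forever while the true value peaks and then goes negative.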
The problem in this situation is not the computer, the problem is having a single entity that doesn't have to compete with others.
It is imperative that we don't centralize the power of AI algorithms but keep them federated and competing, keeping one another grounded and in check.
45. Momentum fixes everything.
It gives you a natural direction to align to; the direction the thing is going.
Similar to "growth solves all known problems."
When things are growing, it allows a positive-sum-by-default mentality.
Things get weird when a previously positive-sum environment turns zero-sum.
Of course, it's important to understand and reckon with the externalities of that growth; it's possible it actually is zero-sum, and just exporting the downsides outside of the system.
46. In some conversation environments, you inject just a bit of novelty and it snuffs itself out.
But in others, you inject a bit of novelty and it intensifies, gathering more and more energy and excitement.
Those kinds of generative environments are magic, cherish them.
Work to create them if you can!