Let's pull on a thread starting from the observation that LLMs only "think" one token at a time.
Imagine a prompt like "Write a synopsis of X, and bold the most salient words."
The LLM has to choose to emit the markdown bold characters and then output the word.
But the LLM is effectively two separate invocations: one entity generates that asterisk, and a separate entity generates the important word that follows.
On the second iteration, how does it know what the first iteration was thinking?
How can it act as a single coherent thing when it's actually multiple things?
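To make the one-token-at-a-time point concrete, here's a minimal sketch of greedy decoding, assuming the Hugging Face transformers library and GPT-2 purely as stand-ins (any causal LM and decoding strategy would make the same point): every pass through the loop is a brand-new forward pass that sees nothing but the text emitted so far.

```python
# Minimal sketch of token-by-token (greedy) decoding.
# Model and prompt are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("Write a synopsis of X, and bold the most salient words.",
          return_tensors="pt").input_ids

for _ in range(40):  # arbitrary length for the sketch
    with torch.no_grad():
        logits = model(ids).logits            # a fresh computation every iteration
    next_id = logits[0, -1].argmax()          # the single most likely next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # the only "memory" is the text itself

print(tok.decode(ids[0]))
```

The invocation that emits the asterisks and the invocation that emits the bolded word are the same loop body run twice, with nothing carried over between them except the growing text.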
Imagine five friends sitting down and being told to generate one picture in five vertical slices.
The first friend draws the first slice, then the next friend draws the next, and so on.
They aren't allowed to talk or otherwise communicate.
The first friend draws a bunch of guns pointing to the right.
The next friend sees that and goes "I guess there's something scary here... I'll draw a dragon."
The next friend sees those first two slices and goes, "OK, there's a scary monster here, maybe it's because the dragon is guarding a cave of gold."
Each friend is operating entirely independently and trying to add something that is coherent with what came before, without any explicit coordination.
But the result, if everyone collaborates, could look 100% coherent.
The key thing is that as long as each contributor is operating fully in good faith and earnestly trying to add something coherent with what has come before (and with what will likely come after--if you're writing a collaborative poem and someone ends their line with 'orange', they're just being a jerk), you can get the illusion of a pre-planned result when there was no plan at all.
Another mental model: LLMs are like the main character in Memento.
The main character "wakes up", sees the tattoos and the sticky notes, and says "well I guess the next step is I'm supposed to call Teddy."
He does that action, then blanks, then wakes up and says "the phone is ringing, I guess it's Teddy I'm calling."
The LLM is, in some ways, little slices of independent agency, coordinating by leaving tokens in the shared working space (stigmergy!) that each slice trusts as input.
By the way, this is not too dissimilar from how our brains actually work.
(Not in terms of 'each token is a totally new computation from a blank slate' but in terms of 'lots of independent neurons cooperating to give the illusion of coherence')
The different parts of our brain are constantly working to maintain this consistent illusion of a coherent self.
E.g. the left brain post-hoc rationalizing what the right brain decided to do in split-brain patients.
Ian Couzin's research on how fish schools decide which target to navigate toward shows that it is precisely the same algorithm our neurons use to coordinate as a group and emergently make the same kinds of decisions.
We feel like a coherent entity, but we're closer to a slime mold than we think!
This is not too dissimilar from how plain-old-code works, either.
The code is executed, looks in its protected storage partition, and thinks "Well, I guess this user is named Alex Komoroske, because there's a note here that says so, and I'll presume that no one other than previous iterations of me has access to this locked storage room."
A corollary: the "identity" of a computing experience is code+data (data here including anything still resident in memory from the last invocation).
If you change the code, or the data, then it's a different "thing", operating with different agency.
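As a toy illustration of that corollary (the file name and fields are invented for the example), here's the "note in the locked storage room" pattern in ordinary stateless code: each invocation starts blank and reconstructs who it's dealing with from whatever the previous invocation wrote down.

```python
# Toy sketch: a stateless handler whose "identity" is code + whatever data survived the last run.
import json
from pathlib import Path

STORE = Path("session.json")  # the "locked storage room"

def handle(message: str) -> str:
    # Start from a blank slate, then rebuild identity from the note left by previous runs.
    if STORE.exists():
        state = json.loads(STORE.read_text())   # "there's a note here saying who the user is"
    else:
        state = {"user": "Alex Komoroske", "history": []}  # first run: write the note
    state["history"].append(message)
    STORE.write_text(json.dumps(state))  # leave the note for the next invocation to trust
    return f"Acting on behalf of {state['user']} ({len(state['history'])} messages so far)"
```

Change the code or delete session.json and, per the corollary above, you're dealing with a different "thing".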
A key observation: a centralized LLM in the cloud that has memory about you is potentially very scary / powerful (it could keep track of information about you to subtly manipulate you in the future).
But the very same LLM in the cloud, known to not log any state, is significantly less scary, because with each token you're effectively dealing with a whole new entity.