A short read on the topic's time range, peak episode, and strongest associations. Use it as a quick orientation before drilling into examples.
The topic "training data" appears in 14 chunks across 13 episodes, from 2023-11-13 to 2025-11-24.
Its densest episode is Bits and Bobs 6/24/24 (2024-06-24), with 2 observations on this topic.
Semantically it travels with llms, consistent bias, and ChatGPT, while by chunk count it sits between social network and web page; its yearly rank moved from #69 in 2023 to #128 in 2025.
Over time
Raw mentions over time. Use this to see absolute attention, not relative rank among all topics.
Range: 2023-11-13 to 2025-11-24 · Mean: 1.1 per episode · Peak: 2 on 2024-06-24
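The mean above can be reproduced from the overview's own totals (14 chunks across 13 episodes); a minimal sketch:

```python
# Reproducing the "Over time" summary stats from the page's totals.
chunks = 14    # total chunks mentioning the topic (from the overview)
episodes = 13  # distinct episodes (from the overview)

mean_per_episode = round(chunks / episodes, 1)
print(mean_per_episode)  # 1.1, matching "Mean: 1.1 per episode"
```

The peak (2 on 2024-06-24) is simply the densest episode named in the overview, not something derivable from these totals.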
Observations
The primary evidence view for this topic. Sort it chronologically when you want concrete examples behind the larger pattern.
Showing 14 observations sorted from latest to earliest.
...bly easy to poison a model of arbitrary size with deliberately chosen malicious training data.
Second, if there's a model that everyone uses that has a subtle but consistent bias, that bias at society scale could lead to significant society-sc...
...bias, so its consistency shows up despite the noise.
Very few "I don't know" in training data.
Because if the writer didn't know, why would they bother writing something in the first place?
Humans predict what others will do based on i...
Ben Mathes distilling Babak Nivi:
"The meaning and soul went into the training data, and it's in us as we read the text.
It's not in the LLM anywhere.
But we can get it as a result of reading the output."
...ngly random one that is nearly miraculous.
If you're important and have lots of training data in the model, be on the lookout for these highly optimized incoming arguments!
...wondering what's happening here technically, an explainer:
When there's lots of training data with a particular style, using a similar style in your prompt will trigger the LLM to respond in that style. In this case, there's LOADS of fanfic:
h...
...e essence extractors, not mechanical reproducers.
The way they learn from their training data is more than just reproducing.
Essence is a new concept.
It doesn't exist in the legal canon yet other than things like trade secrets.
Copyright is a...
...er and better, with no humans in the loop.
A self-catalyzing infinite stream of training data.
That hasn't worked for things like reasoning yet because there's no ground truth you can efficiently compare against.
But now GPT4-class m...
...explosive advance because they had all of the internet to train on.
All of that training data was pent up, ready to catch fire when the right spark came, giving a massive explosive boost.
We extrapolate from that boost to a runaway effect, but...
...the human gives the LLM in these cooperative sessions would be extremely potent training data.
It's laser focused on the stuff the human didn't like or wanted to change, the parts to update to make the model better.
...arity.
If there are things in the training data that are fundamentally similar to your task, but not superficially similar, it will get confused and not know what to do.
...who has wrestled with generative image models to remove some detail.
All of the training data has descriptions of images as they actually are.
In that case, why would you describe what's not in the image? You can just describe what is in the i...
...for situations like this... and those are tuned based on the stuff it's seen in training data.
So we get confused that they get confused because they aren't doing basic computation, they're doing vibes matching on a mind-numbingly large datase...
...sts, etc.
AI 2.0: Deep learning. Supervised learning with bespoke, high-quality training data.
AI 3.0: Unsupervised learning. LLMs. Messy, kitchen-sink, highly scaled training data.
In any given situation, it's still possible to make a better-...
...mans absorb vibes from our experience.
LLMs absorb them from massive amounts of training data.
As Gordon notes, not artificial intelligence, but planetary-scale artificial intuition.