A model being trained on data and a model using RAG at inference time have wildly different characteristics.
But a lot of discourse about LLMs doesn't differentiate the two.
There's a difference between an LLM absorbing a hologram of knowledge during training and an LLM using RAG to sift through concrete input with the background common sense it absorbed in training.
Sometimes the model's background worldly knowledge is enough to give it common sense.
But if you want details, that's not sufficient and you'll need RAG.
Adding more knowledge to a model through training is expensive, has long lead times, works on vibes, and is imprecise.
RAG can't rescue a model that lacks the right background knowledge, no matter how much context you feed it, but it can be updated quickly and enables precision on the details.
Everyone talks about these things like they're the same, but they're wildly different.
Training your own model is very capital intensive.
But in many cases you can use an off-the-shelf LLM plus RAG and produce amazing results.
The question is: how much background knowledge does the LLM need to have enough common sense to tackle your concrete tasks, where you bring the specific details for it to operate on?
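To make the off-the-shelf-LLM-plus-RAG pattern concrete, here is a minimal sketch. The retriever is a toy keyword-overlap scorer standing in for embeddings plus a vector index, and the final LLM call is left as a hypothetical step, but the shape of the pipeline is the same: retrieval supplies the fresh, precise details, and the model supplies the background common sense.

```python
# Minimal RAG sketch: retrieve the most relevant snippets for a query,
# then hand them to an off-the-shelf LLM as concrete input to operate on.
# The retriever here is a toy keyword-overlap scorer; real systems use
# embeddings and a vector index, but the pipeline has the same shape.

def score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words present in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / max(len(q_words), 1)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents by relevance to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved snippets plus the question into one prompt.
    The LLM brings background common sense; the snippets bring the details."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# Updating the knowledge base is just appending to a list (or upserting
# into a vector index): minutes of work, not a training run.
knowledge_base = [
    "Invoice 1042 was paid on 2024-03-02.",
    "Invoice 1043 is overdue by 14 days.",
    "Our refund policy allows returns within 30 days.",
]

prompt = build_prompt("What is the status of invoice 1043?", knowledge_base)
print(prompt)  # send this to any off-the-shelf LLM via its chat API
```

Notice where the division of labor falls: nothing in this code teaches the model anything; it only selects which concrete details to put in front of a model that already has enough background knowledge to interpret them.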