If you just take whatever the LLM gives you without applying your own judgment and agency, it pulls you down to the average.
But if you look at what it gives you and pick the best outputs, skimming off the above-average parts, or modify and tweak them, then you get above-average results.
An average pipeline will have some variance: some outputs land above average, some below.
If you can reliably take only the best by applying calibrated judgment, then you now have a pipeline that produces only above-average output.
Average + judgment = above-average.
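
You can see the effect in a toy simulation. This is only a sketch under assumptions that are mine, not measurements of any real model: quality fits on a single number, 0 stands for "average", each output's quality is drawn from a normal distribution, and your judgment is a noisy reading of the true quality. The `simulate` function and its parameters (`n_candidates`, `judge_noise`, `trials`, `seed`) are hypothetical names made up for the illustration.

```python
import random
import statistics


def simulate(n_candidates=4, judge_noise=0.5, trials=10_000, seed=0):
    """Compare taking the first output with picking the best-looking of several.

    Assumption: true quality ~ Normal(0, 1), so 0.0 is "average".
    Assumption: judgment sees true quality plus Normal(0, judge_noise) error.
    """
    rng = random.Random(seed)
    passive, curated = [], []
    for _ in range(trials):
        # True quality of each candidate output.
        qualities = [rng.gauss(0.0, 1.0) for _ in range(n_candidates)]
        # Passive use: accept the first output, whatever it is.
        passive.append(qualities[0])
        # Curated use: score each candidate with noisy judgment, keep the top one.
        best = max(qualities, key=lambda q: q + rng.gauss(0.0, judge_noise))
        curated.append(best)
    return statistics.mean(passive), statistics.mean(curated)


passive_mean, curated_mean = simulate()
print(f"passive: {passive_mean:+.2f}  curated: {curated_mean:+.2f}")
```

In this setup, the passive mean should sit near zero and the curated mean clearly above it. Raising `judge_noise` shrinks the gap, which is the "calibrated" part: the pipeline only rises above average to the extent that your judgment actually tracks quality.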