Fig. 2 · The retrieval funnel: recall cheap · precision expensive.
Stages (items surviving each stage): Corpus (10⁶+ passages · indexed offline) → Retrieve (~150 · hybrid bi-encoder + BM25, fused with RRF · cheap) → Rerank (~10 · cross-encoder · expensive) → Context (5–10 · shown to the model).
Axes: items surviving each stage; compute cost per item. Annotation: precision is worth paying for, so spend compute where the set is small.
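
The funnel in Fig. 2 can be sketched in a few lines. This is a minimal illustration, not the author's implementation: `rrf_fuse`, `funnel`, and `rerank_score` are hypothetical names, the constant `k = 60` is an assumed (commonly used) RRF default, and the scoring callable stands in for a cross-encoder. Only the stage sizes (~150 retrieved, ~10 kept) come from the figure.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank).

    k = 60 is an assumed default, not a value taken from the figure.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def funnel(
    dense: Sequence[str],
    sparse: Sequence[str],
    rerank_score: Callable[[str], float],
    n_retrieve: int = 150,
    n_context: int = 10,
) -> List[str]:
    """Cheap fusion over many candidates, then expensive scoring over few."""
    # Stage 1 (cheap): fuse the bi-encoder and BM25 rankings, keep ~150.
    candidates = rrf_fuse([dense, sparse])[:n_retrieve]
    # Stage 2 (expensive): score only the survivors, keep ~10 for the context.
    reranked = sorted(candidates, key=rerank_score, reverse=True)
    return reranked[:n_context]


if __name__ == "__main__":
    dense_hits = ["a", "b", "c"]   # hypothetical bi-encoder ranking
    sparse_hits = ["b", "d", "a"]  # hypothetical BM25 ranking
    toy_scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7}
    print(funnel(dense_hits, sparse_hits, toy_scores.__getitem__,
                 n_retrieve=3, n_context=2))
```

The expensive scorer is called at most `n_retrieve` times regardless of corpus size, which is the point the figure makes about spending compute where the set is small.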