TheSequence

The Sequence Opinion #509: Is RAG Dying?

Long context windows, fine-tuning, and other trends are challenging the viability of one of the most popular LLM techniques.

Mar 13, 2025

Retrieval-Augmented Generation (RAG) is a technique that enhances generative models by integrating a retrieval mechanism, allowing them to access relevant external information. In a RAG pipeline, a query first triggers a search for pertinent documents, often using a vector database or search index. The retrieved text is then fed into the language model to guide its final response. This approach was pioneered around 2020 and quickly became significant for knowledge-intensive AI tasks. It allowed smaller or general-purpose models to achieve state-of-the-art results by incorporating external facts, addressing issues like hallucinations and outdated knowledge. RAG gained widespread adoption, powering numerous research papers and commercial applications. However, with rapid advancements in AI models and architectures, is RAG still as relevant today?
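The retrieve-then-generate pipeline described above can be sketched in a few lines. This is a toy illustration, not a production system: the "embedding" is a simple bag-of-words vector standing in for a real embedding model, the in-memory corpus stands in for a vector database, and `build_prompt` shows how retrieved text is prepended to the query before being sent to a language model (the LLM call itself is omitted).

```python
import math
import re
from collections import Counter

# Toy corpus standing in for an external knowledge base (illustrative only).
DOCS = [
    "RAG was introduced around 2020 for knowledge-intensive tasks.",
    "Vector databases store document embeddings for similarity search.",
    "Long context windows let models read entire documents directly.",
]

def embed(text):
    """Bag-of-words term counts -- a stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=1):
    """Prepend retrieved context so the model can ground its answer."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("When was RAG introduced?", DOCS)
```

In a real deployment, `embed` would call an embedding model, the corpus would live in a vector index (e.g., a vector database), and `prompt` would be passed to an LLM; the control flow, however, is exactly this retrieve-augment-generate loop.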

Limitations of RAG

Despite their strengths, RAG systems introduce several challenges:

© 2026 Jesus Rodriguez