Corrective Retrieval Augmented Generation (CRAG)

Corrective Retrieval Augmented Generation (CRAG), published by Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, and Zhen-Hua Ling in January 2024, targets a specific failure mode of retrieval-augmented systems: what happens when the documents pulled back are wrong or irrelevant. Ordinary RAG feeds whatever it retrieves straight into the model, so bad retrieval reliably produces bad answers.

CRAG inserts a lightweight retrieval evaluator that scores how confident the system should be in the documents it found. Based on that score, it triggers one of several corrective actions. If retrieval looks good, a decompose-then-recompose step filters out irrelevant sentences and keeps the key content. If retrieval looks poor, the system sets the retrieved documents aside and turns to large-scale web search to find better evidence. If the score falls in between, it combines the refined documents with web results. The approach is modular, so it can be added to an existing RAG pipeline without redesigning it.
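The routing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the threshold values (upper, lower, keep_above), the naive sentence splitting, and the word-overlap scorer are all hypothetical stand-ins. In a real system the scorer would be a trained evaluator model and web_search would call an actual search service. The middle band, which merges refined local documents with web results, covers the case where the score is neither clearly good nor clearly poor.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces, assumed for illustration only.
Scorer = Callable[[str, str], float]    # (query, text) -> relevance in [0, 1]
WebSearch = Callable[[str], List[str]]  # query -> documents from the web


def split_into_strips(doc: str) -> List[str]:
    """Decompose: naive split of a document into sentence-level strips."""
    return [s.strip() for s in doc.split(".") if s.strip()]


def refine(query: str, docs: List[str], score: Scorer, keep_above: float) -> List[str]:
    """Decompose-then-recompose: keep only strips scored above keep_above."""
    strips = [s for d in docs for s in split_into_strips(d)]
    return [s for s in strips if score(query, s) > keep_above]


def corrective_retrieve(
    query: str,
    docs: List[str],
    score: Scorer,
    web_search: WebSearch,
    upper: float = 0.5,       # assumed threshold for "retrieval looks good"
    lower: float = 0.2,       # assumed threshold for "retrieval looks poor"
    keep_above: float = 0.2,  # assumed cutoff for keeping a strip
) -> Tuple[str, List[str]]:
    """Return the chosen action and the evidence to pass to the generator."""
    # Overall confidence: the best score among the retrieved documents.
    confidence = max((score(query, d) for d in docs), default=0.0)

    if confidence >= upper:
        # Good retrieval: refine the local documents and use them alone.
        return "correct", refine(query, docs, score, keep_above)

    # Otherwise web search is needed; its results are refined the same way.
    web_evidence = refine(query, web_search(query), score, keep_above)
    if confidence <= lower:
        # Poor retrieval: drop the local documents, rely on web evidence.
        return "incorrect", web_evidence

    # In between: combine refined local documents with web evidence.
    return "ambiguous", refine(query, docs, score, keep_above) + web_evidence


def overlap_score(query: str, text: str) -> float:
    """Toy scorer: fraction of query words that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().replace(".", " ").split())
    return len(q & t) / len(q) if q else 0.0
```

As a usage example, calling corrective_retrieve("capital of france", ["Paris is the capital of France. Bananas are yellow."], overlap_score, some_search_function) takes the "correct" branch and keeps only the sentence about Paris, because the toy scorer gives the banana sentence a score of zero. Keeping the scorer and the search function as parameters mirrors the modular design the text describes: either can be swapped without touching the routing logic.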

The authors reported that, in experiments on four datasets covering both short-form and long-form generation tasks, CRAG improved the performance of RAG-based approaches, with the gains coming specifically from the cases where retrieval would otherwise have misled the model.

For a business, CRAG reflects a maturing view of RAG: rather than trusting retrieval blindly, the system checks its own evidence and corrects course, which is essential when a confident but wrong answer carries real cost.
