Computer Science > Software Engineering

arXiv:2510.04905 (cs)

[Submitted on 6 Oct 2025]

Title: Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches

Authors: Yicheng Tao, Yao Qin, Yepang Liu

Abstract: Recent advancements in large language models (LLMs) have substantially improved automated code generation. While function-level and file-level generation have achieved promising results, real-world software development typically requires reasoning across entire repositories. This gives rise to the challenging task of Repository-Level Code Generation (RLCG), where models must capture long-range dependencies, ensure global semantic consistency, and generate coherent code spanning multiple files or modules. To address these challenges, Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm that integrates external retrieval mechanisms with LLMs, enhancing context-awareness and scalability. In this survey, we provide a comprehensive review of research on Retrieval-Augmented Code Generation (RACG), with an emphasis on repository-level approaches. We categorize existing work along several dimensions, including generation strategies, retrieval modalities, model architectures, training paradigms, and evaluation protocols. Furthermore, we summarize widely used datasets and benchmarks, analyze current limitations, and outline key challenges and opportunities for future research. Our goal is to establish a unified analytical framework for understanding this rapidly evolving field and to inspire continued progress in AI-powered software engineering.

Subjects:	Software Engineering (cs.SE); Computation and Language (cs.CL)
Cite as:	arXiv:2510.04905 [cs.SE]
	(or arXiv:2510.04905v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2510.04905

Submission history

From: Yicheng Tao
[v1] Mon, 6 Oct 2025 15:20:03 UTC (1,425 KB)
