Computer Science > Computation and Language

arXiv:2407.15268 (cs)

[Submitted on 21 Jul 2024 (v1), last revised 6 Feb 2025 (this version, v2)]

Title:Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation

Authors:Liwen Sun, James Zhao, Megan Han, Chenyan Xiong

Abstract:Multimodal foundation models hold significant potential for automating radiology report generation, thereby assisting clinicians in diagnosing cardiac diseases. However, generated reports often suffer from serious factual inaccuracy. In this paper, we introduce FactMM-RAG, a fact-aware multimodal retrieval-augmented pipeline for generating accurate radiology reports. We first leverage RadGraph to mine factual report pairs, then integrate this factual knowledge to train a universal multimodal retriever. Given a radiology image, our retriever can identify high-quality reference reports to augment multimodal foundation models, thus enhancing the factual completeness and correctness of report generation. Experiments on two benchmark datasets show that our multimodal retriever outperforms state-of-the-art retrievers on both language generation and radiology-specific metrics, with gains of up to 6.5% in F1CheXbert and 2% in F1RadGraph. Further analysis indicates that our factually informed training strategy provides an effective supervision signal without relying on explicit diagnostic label guidance, and successfully propagates fact-aware capabilities from the multimodal retriever to the multimodal foundation model in radiology report generation.
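
The pair-mining step mentioned in the abstract can be illustrated with a toy sketch. The abstract does not say how report pairs are scored, so everything here is an assumption made for illustration: reports are represented as sets of (entity, status) tuples standing in for RadGraph annotations, similarity is a plain set-overlap F1, and the names `entity_f1`, `mine_pairs`, and the threshold value are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def entity_f1(a: set, b: set) -> float:
    """Set-overlap F1 between two entity sets (0.0 if nothing is shared)."""
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(a)
    recall = overlap / len(b)
    return 2 * precision * recall / (precision + recall)


def mine_pairs(entities_by_report: dict, threshold: float) -> list:
    """Return report-id pairs whose entity overlap F1 meets the threshold.

    Hypothetical helper: a stand-in for mining factually similar
    report pairs, not the authors' implementation.
    """
    pairs = []
    for (id_a, ents_a), (id_b, ents_b) in combinations(entities_by_report.items(), 2):
        if entity_f1(ents_a, ents_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Toy annotations (assumed format): each report maps to (entity, status) tuples.
toy = {
    "r1": {("effusion", "present"), ("cardiomegaly", "present")},
    "r2": {("effusion", "present"), ("cardiomegaly", "present"), ("edema", "absent")},
    "r3": {("pneumothorax", "absent")},
}

pairs = mine_pairs(toy, threshold=0.7)
print(pairs)
```

In a setup like this, the mined pairs would serve as positives when training a retriever, so that an image query is pulled toward reports that agree on clinical facts rather than only on surface wording.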

Comments:	NAACL 2025 main
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2407.15268 [cs.CL]
	(or arXiv:2407.15268v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.15268

Submission history

From: Liwen Sun
[v1] Sun, 21 Jul 2024 21:04:28 UTC (1,015 KB)
[v2] Thu, 6 Feb 2025 05:51:07 UTC (827 KB)
