From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)
Introduction to retrieval augmented generation
Retrieval-augmented generation, or RAG for short, is arguably the most popular use case for LLMs in a business context. What is RAG? It is a framework that combines knowledge from a curated knowledge base with the language capabilities of an LLM to produce accurate, well-structured answers. In RAG, we use a knowledge base for context-specific knowledge and an LLM for language generation, getting the best of both worlds. When a user asks a question through a prompt, the knowledge base supplies the relevant context and the LLM generates a well-structured answer.

What are some key features and advantages of RAG? With RAG, we can answer questions from enterprise and confidential data sources, knowledge that a standalone third-party LLM does not have, since it was never trained on that data. RAG also lets us combine multiple data sources in different formats into a single knowledge base: product manuals in PDF format, support tickets from a ticketing system, and content from web pages can all live together. The input data sources can be curated so that only relevant knowledge is extracted and used for RAG, and this data can be continuously updated and pruned to keep it current.

To find answers to queries, we can combine scalar and vector searches. Vector search finds semantically relevant content in the knowledge base, while scalar filters narrow down the context. For example, if the user asks a troubleshooting question about a specific product, a scalar filter restricts the search to entries for that product. Finally, RAG can use standard, out-of-the-box LLMs for language generation, with no need to create or fine-tune custom models, which significantly reduces cost. A minimal code sketch of this retrieve-then-generate flow follows.
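To make the flow concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the course's implementation: a small in-memory list stands in for a vector database, embed() is a placeholder for a real embedding model, and call_llm() is a placeholder for a hosted or local LLM. The product names and helper functions are hypothetical. The sketch shows the hybrid pattern described above, a scalar filter on product followed by vector similarity ranking, with the retrieved text passed to the LLM as context.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: pseudo-random unit vectors, consistent within a run.
    A real system would call an embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(16)
    return v / np.linalg.norm(v)

def call_llm(prompt: str) -> str:
    """Placeholder for a call to an out-of-the-box LLM."""
    return f"[LLM answer grounded in the {len(prompt)}-character prompt]"

# Hypothetical knowledge base: each entry pairs scalar metadata
# ("product") with an embedding of its text.
KNOWLEDGE_BASE = [
    {"product": "router-x1", "text": "To reset Router X1, hold the reset button for 10 seconds."},
    {"product": "router-x1", "text": "Router X1 supports firmware updates over the web UI."},
    {"product": "camera-z9", "text": "Camera Z9 streams video only after pairing with the app."},
]
for entry in KNOWLEDGE_BASE:
    entry["vector"] = embed(entry["text"])

def retrieve(question: str, product: str | None = None, top_k: int = 2) -> list[str]:
    """Hybrid search: apply the scalar filter first, then rank by vector similarity."""
    candidates = [e for e in KNOWLEDGE_BASE if product is None or e["product"] == product]
    q = embed(question)
    candidates.sort(key=lambda e: float(np.dot(q, e["vector"])), reverse=True)
    return [e["text"] for e in candidates[:top_k]]

def answer(question: str, product: str | None = None) -> str:
    """RAG: retrieved context plus the user's question go into the LLM prompt."""
    context = "\n".join(retrieve(question, product))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

if __name__ == "__main__":
    # The scalar filter limits the search to one product before vector ranking.
    print(answer("How do I reset my router?", product="router-x1"))
```

In a real deployment, the in-memory list and placeholder functions would be replaced by a vector database with scalar filtering support plus real embedding and LLM endpoints; the retrieve-then-generate structure stays the same.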