RAG

npm install @convex-dev/rag

A component for semantic search, usually used to look up context for LLMs. Use with an Agent for Retrieval-Augmented Generation (RAG).

✨ Key Features

  • Add Content: Add or replace content with text chunks and embeddings.
  • Semantic Search: Vector-based search using configurable embedding models.
  • Namespaces: Organize content into namespaces for per-user search.
  • Custom Filtering: Filter content with custom indexed fields.
  • Importance Weighting: Weight content by providing a 0 to 1 "importance" (see the sketch after this list).
  • Chunk Context: Get surrounding chunks for better context.
  • Graceful Migrations: Migrate content or whole namespaces without disruption.
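
For instance, importance is just an option when adding content. A minimal sketch, assuming the option is passed directly to rag.add as the feature list suggests (check the package docs for the exact name and default):

// Down-weight boilerplate so it ranks below primary content in search results.
await rag.add(ctx, {
  namespace: "all-users",
  text,
  importance: 0.5, // 0 to 1; assumed to scale this entry's match scores
});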

Found a bug? Feature request? File an issue on the GitHub repo.

Installation

Create a convex.config.ts file in your app's convex/ folder and install the component by calling use:

// convex/convex.config.ts
import { defineApp } from "convex/server";
import rag from "@convex-dev/rag/convex.config.js";

const app = defineApp();
app.use(rag);

export default app;

Run npx convex codegen if npx convex dev isn't already running.

Basic Setup

// convex/example.ts
import { components } from "./_generated/api";
import { RAG } from "@convex-dev/rag";
// Any AI SDK model that supports embeddings will work.
import { openai } from "@ai-sdk/openai";

const rag = new RAG(components.rag, {
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  embeddingDimension: 1536, // Needs to match your embedding model
});

Add context to RAG

Add content with text chunks. Each call to add creates a new entry. Chunks are embedded automatically if you don't provide embeddings yourself.

// convex/example.ts (continued) — the action examples below assume these imports:
import { action } from "./_generated/server";
import { v } from "convex/values";

export const add = action({
  args: { text: v.string() },
  handler: async (ctx, { text }) => {
    // Add the text to a namespace shared by all users.
    await rag.add(ctx, {
      namespace: "all-users",
      text,
    });
  },
});

See below for how to chunk the text yourself or add content asynchronously, e.g. to handle large files.
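
If you want to chunk the text yourself, here is a minimal sketch, assuming rag.add accepts a chunks array in place of text (the chunking strategy itself is up to you):

export const addPreChunked = action({
  args: { chunks: v.array(v.string()) },
  handler: async (ctx, { chunks }) => {
    // Each string becomes one chunk; embeddings are still generated automatically.
    await rag.add(ctx, {
      namespace: "all-users",
      chunks,
    });
  },
});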

Semantic Search

Search across content with vector similarity. rag.search returns:

  • text is the full content of the results in one string, for convenience. It is ordered by entry, with titles at each entry boundary and separators between non-sequential chunks. See below for more details.
  • results is an array of matching chunks, with scores and other metadata.
  • entries is an array of the entries that matched the query. Each result has an entryId referencing one of these source entries.
  • usage contains embedding token usage information. It will be { tokens: 0 } if no embedding was performed (e.g. when passing pre-computed embeddings).

export const search = action({
  args: {
    query: v.string(),
  },
  handler: async (ctx, args) => {
    const { results, text, entries, usage } = await rag.search(ctx, {
      namespace: "global",
      query: args.query,
      limit: 10,
      vectorScoreThreshold: 0.5, // Only return results with a score >= 0.5
    });

    return { results, text, entries, usage };
  },
});

Generate a response based on RAG context

Once you have searched for the context, you can use it with an LLM.

Generally you'll already be using something to make LLM requests, e.g. the Agent Component, which tracks message history for you. See the Agent Component docs for more details on doing RAG with an Agent.

However, if you just want a one-off response, you can use the generateText function as a convenience.

This will automatically search for relevant entries and use them as context for the LLM, using default formatting.

The arguments to rag.generateText are compatible with all of the arguments to the AI SDK's generateText.

export const askQuestion = action({
  args: {
    prompt: v.string(),
  },
  handler: async (ctx, args) => {
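    // getAuthUserId is an app-level auth helper (e.g. from @convex-dev/auth), not part of this component.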
    const userId = await getAuthUserId(ctx);
    const { text, context } = await rag.generateText(ctx, {
      search: { namespace: userId, limit: 10 },
      prompt: args.prompt,
      model: openai.chat("gpt-4o-mini"),
    });
    return { answer: text, context };
  },
});

Note: You can specify any of the search options available on rag.search.
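
For example, a sketch reusing search options shown elsewhere in this document inside the search argument:

const { text, context } = await rag.generateText(ctx, {
  search: {
    namespace: userId,
    limit: 10,
    vectorScoreThreshold: 0.5, // as in the Semantic Search example above
    chunkContext: { before: 2, after: 1 }, // see the last section below
  },
  prompt: args.prompt,
  model: openai.chat("gpt-4o-mini"),
});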

Filtered Search

You can provide filters when adding content and use them to search. To do this, you'll need to give the RAG component a list of the filter names. You can optionally provide a type parameter for type safety (no runtime validation).

Note: filters are OR'd together when searching. To get an AND, provide a single filter with a composite value, such as categoryAndType below.

// convex/example.ts
import { components } from "./_generated/api";
import { RAG } from "@convex-dev/rag";
// Any AI SDK model that supports embeddings will work.
import { openai } from "@ai-sdk/openai";

// Optional: Add type safety to your filters.
type FilterTypes = {
  category: string;
  contentType: string;
  categoryAndType: { category: string; contentType: string };
};

const rag = new RAG<FilterTypes>(components.rag, {
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  embeddingDimension: 1536, // Needs to match your embedding model
  filterNames: ["category", "contentType", "categoryAndType"],
});

Adding content with filters:

await rag.add(ctx, {
  namespace: "global",
  text,
  filterValues: [
    { name: "category", value: "news" },
    { name: "contentType", value: "article" },
    {
      name: "categoryAndType",
      value: { category: "news", contentType: "article" },
    },
  ],
});

Search with metadata filters:

export const searchForNewsOrSports = action({
  args: {
    query: v.string(),
  },
  handler: async (ctx, args) => {
    const userId = await getUserId(ctx);
    if (!userId) throw new Error("Unauthorized");

    const results = await rag.search(ctx, {
      namespace: userId,
      query: args.query,
      filters: [
        { name: "category", value: "news" },
        { name: "category", value: "sports" },
      ],
      limit: 10,
    });

    return results;
  },
});
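
To require both fields to match (an AND), search on the composite categoryAndType filter added above. A sketch, assuming filter values are matched by exact equality:

const results = await rag.search(ctx, {
  namespace: userId,
  query: args.query,
  filters: [
    // A single composite value, so only entries tagged with both fields match.
    {
      name: "categoryAndType",
      value: { category: "news", contentType: "article" },
    },
  ],
  limit: 10,
});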

Add surrounding chunks to results for context

Instead of returning only the matching chunk, you can request surrounding chunks so each result includes more context.

Note: If results have overlapping ranges, duplicate chunks are not returned; instead, priority goes to adding the "before" context to each chunk. For example, if you requested 2 before and 1 after, and the results were chunks 1, 4, and 7 of the same entryId, the results would be:

[
  // Only one before chunk available, and leaves chunk2 for the next result.
  { order: 1, content: [chunk0, chunk1], startOrder: 0, ... },
  // 2 before chunks available, but leaves chunk5 for the next result.
  { order: 4, content: [chunk2, chunk3, chunk4], startOrder: 2, ... },
  // 2 before chunks available, and includes one after chunk.
  { order: 7, content: [chunk5, chunk6, chunk7, chunk8], startOrder: 5, ... },
]

export const searchWithContext = action({
  args: {
    query: v.string(),
    userId: v.string(),
  },
  handler: async (ctx, args) => {
    const { results, text, entries, usage } = await rag.search(ctx, {
      namespace: args.userId,
      query: args.query,
      chunkContext: { before: 2, after: 1 }, // Include 2 chunks before, 1 after
      limit: 5,
    });

    return { results, text, entries, usage };
  },
});