From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)
Tune vector DB performance
How do we tune the performance of a vector database? For a vector database, the key performance area is vector search, so the real question is how to tune the performance of vector search.

What impacts the effectiveness of vector search? First, there is the data in the database: how similar its items are and how large it is. Then there is the embedding model and its ability to represent the content and semantics of the text. Then comes the metric type and how well it helps find similar items. Finally, there is the threshold, which separates what is a match from what is not.

How do we find the best combination of embedding model, metric type, and threshold for a given use case? We do that by experimentation. First, we need a good test dataset that closely represents the type of data that will be used in production. The data should be labeled: it should have search strings and the corresponding correct results. With this, we try out different embedding models and metric types and determine which combination results in the best accuracy across the dataset. Then we also experiment with distance thresholds and find the value that best differentiates between matches and non-matches in the test dataset.

Once these values are determined, implemented, and deployed, the performance of the application should also be monitored from time to time to ensure that the values are still the best options. Sometimes changes in the nature of the data or the search queries can make these values suboptimal.
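To make the experimentation loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two "embedding models" (`embed_words`, `embed_trigrams`) are toy stand-ins for real models, the documents and labeled test set are invented, and the threshold grid is arbitrary. It also searches model, metric, and threshold jointly rather than in two stages, which is a simplification. In a real setup the distance computation would typically be done by the vector database itself.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical stand-ins for real embedding models: each maps text to a sparse vector.
def embed_words(text):
    return Counter(text.lower().split())

def embed_trigrams(text):
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def _norm(v):
    return math.sqrt(sum(x * x for x in v.values()))

# Two candidate metric types (smaller distance means more similar).
def cosine_distance(a, b):
    na, nb = _norm(a), _norm(b)
    if na == 0 or nb == 0:
        return 1.0
    dot = sum(a[k] * b[k] for k in a)  # Counter returns 0 for missing keys
    return 1.0 - dot / (na * nb)

def l2_distance(a, b):
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in set(a) | set(b)))

# Invented documents standing in for the contents of the database.
DOCS = {
    "reset": "how to reset your account password",
    "refund": "request a refund for an order",
    "shipping": "track the shipping status of an order",
}

# Labeled test set: (search string, expected doc id, or None if nothing should match).
TEST_SET = [
    ("reset my password", "reset"),
    ("I want a refund", "refund"),
    ("where is my order shipping", "shipping"),
    ("what is the weather today", None),
]

def evaluate(embed, distance, threshold):
    """Return accuracy of nearest-neighbor search with a distance cutoff."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in DOCS.items()}
    correct = 0
    for query, expected in TEST_SET:
        q = embed(query)
        best_id, best_dist = min(
            ((doc_id, distance(q, v)) for doc_id, v in doc_vecs.items()),
            key=lambda pair: pair[1],
        )
        # The threshold decides whether the nearest item counts as a match at all.
        predicted = best_id if best_dist <= threshold else None
        correct += predicted == expected
    return correct / len(TEST_SET)

MODELS = {"words": embed_words, "trigrams": embed_trigrams}
METRICS = {"cosine": cosine_distance, "l2": l2_distance}
THRESHOLDS = [0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 3.0]

# Try every combination and keep the one with the highest accuracy.
results = [
    (evaluate(MODELS[m], METRICS[d], t), m, d, t)
    for m, d, t in product(MODELS, METRICS, THRESHOLDS)
]
best_accuracy, best_model, best_metric, best_threshold = max(results)
print(best_accuracy, best_model, best_metric, best_threshold)
```

Note that a threshold only has meaning relative to a specific model and metric, since different metrics produce distances on different scales. Rerunning the same evaluation periodically against fresh labeled samples is one way to do the ongoing monitoring described above.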