From the course: Vector Databases in Practice: Deep Dive

Hybrid searches

- [Instructor] A hybrid search is a really interesting best-of-both-worlds type of search. Under the hood, the vector database actually performs two searches: a vector search and a separate keyword search. The hybrid search fusion algorithm then combines the two sets of results into one final set of results. It sounds relatively simple, and in some ways it is, but it has very high utility. In fact, many of our users tell us how useful they find it for their real-life applications.

So now let's take a look at a few examples of hybrid search. You can perform a hybrid search with just one search query, like we do here for the word stellar. If we perform the search, we see those results come up. You'll notice that the top result includes the exact word used in the query, while the others do not. Of course, that could still happen in a vector search, so why don't we dig a little deeper into our results?

Let's update our search query to retrieve the score and explain score metadata and see what that tells us. First, let's inspect the score. The score is a measure of how well that object meets the search criteria, much like the BM25 score. Here you'll also see the explanation of the score, which shows us what hybrid search does. What we're seeing is that the top result did well in the vector search as well as in the BM25, or keyword, search. Remember that BM25 is the specific keyword search algorithm that we're using.

Because hybrid search combines two searches, the vector and the keyword searches, the results are weighted relative to each other based on a parameter called alpha, which is adjustable by the user. You can tune the hybrid search to act more like a keyword search by moving alpha towards zero, or more like a vector search by moving alpha towards one.
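To make the role of alpha concrete, here is a minimal sketch of one way two result sets could be fused. It is an illustration only: the transcript does not say which fusion algorithm the database uses, so the min-max normalization, the function names `normalize` and `hybrid_fuse`, and the sample scores below are all assumptions made for this example.

```python
def normalize(scores):
    """Min-max scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal (or there is a single result): treat them as full matches.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_fuse(vector_scores, keyword_scores, alpha=0.5):
    """Combine two result sets into one ranking (hypothetical fusion scheme).

    alpha = 1.0 behaves like a pure vector search,
    alpha = 0.0 behaves like a pure keyword search.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")

    fused = {}
    weighted_sets = (
        (alpha, normalize(vector_scores)),
        (1.0 - alpha, normalize(keyword_scores)),
    )
    for weight, scores in weighted_sets:
        if weight == 0:
            continue  # a search with zero weight contributes no candidates
        for doc, score in scores.items():
            fused[doc] = fused.get(doc, 0.0) + weight * score

    # Highest combined score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up scores: three objects match the vector search,
# but only object "a" contains the exact query word.
vector_scores = {"a": 0.70, "b": 0.90, "c": 0.40}
keyword_scores = {"a": 2.1}

for alpha in (0.0, 0.5, 1.0):
    ranking = [doc for doc, _ in hybrid_fuse(vector_scores, keyword_scores, alpha)]
    print(f"alpha={alpha}: {ranking}")
```

In this sketch, object "b" ranks first when alpha is one, object "a" moves to the top at an alpha of 0.5 because it scores in both searches, and only "a" remains when alpha is zero. A real database may normalize or rank the scores differently, so treat the numbers as illustrative.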
To demonstrate, let's try the same search with an alpha of zero, which is a pure keyword search. As you might expect, the only results that come up now are the objects that contain the exact word, of which there is just one. You can try the same query yourself with the keyword search syntax, and the results should be identical to our hybrid query with alpha set to zero.

Hybrid search is a very practical tool that we see quite a lot in the wild. Of course, there's no best or one-size-fits-all search type, but hybrid search can work very well with real-life datasets by providing a balanced approach. Because it combines two complementary search types and lets you weight them with the alpha value, we often see it adopted as the favorite search type. As always, please try it out, including trying different queries and alpha values to see what happens.

Next, we'll take a look at retrieval augmented generation. This is where we can go beyond simple data retrieval by combining data with the power of large language models.
