From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)

Search vector fields

We saw some examples of scalar queries in the previous video. Now, let's execute a vector search. As discussed earlier, to do a vector search, we need to convert the search string to a vector as well, so that it can be compared using the same distance measure as the index.

First, we define the search parameters. The metric type is set to L2. This should be the same metric type that was used to build the index on the vector field. An index on the vector field is a prerequisite before it can be used for semantic search. We set the offset to zero. This means the results are returned starting from the first match for the input query. Offsets can be used for pagination. ignore_growing is a Boolean parameter. Milvus internally processes data in segments. This parameter controls whether the search should ignore growing segments, that is, segments that are not yet fully populated. If set to true, the search may miss some newly added data. Setting it to false includes all new data, at an additional query cost. nprobe indicates the number of clusters to search, starting from the cluster that best matches the query. Reducing nprobe improves efficiency, but the search may miss matches that lie outside the clusters searched.

Then there is the search string. In this case, it is "machine learning." We are looking for descriptions that closely match this search string. We use the same embedding model as before to get the embedding for the string.

For the search, we use the search method on the collection object. The data parameter takes the search string's embedding. anns_field indicates the vector field to search on. The param field is for the search parameters. limit caps the number of returned records. We set it to 10, so it will only return the top 10 matches. expr is used to pass additional filters on the scalar fields, similar to how scalar queries were done. output_fields lists the fields that need to be returned from the search.
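As a rough sketch of how these pieces fit together, the snippet below assembles the search parameters and the arguments for the search call. The field names desc_embedding and title, the nprobe value of 10, and the "Strong" consistency level are assumptions made for illustration; they are not taken from the course files. The helper names build_search_request and run_search are hypothetical, and the collection passed in is assumed to be a loaded pymilvus Collection.

```python
from typing import Any, Dict, List, Optional

# Hypothetical field names; replace with the ones in your own schema.
VECTOR_FIELD = "desc_embedding"
OUTPUT_FIELDS = ["title"]


def build_search_request(
    query_embedding: List[float],
    limit: int = 10,
    expr: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a Milvus collection search."""
    search_params = {
        "metric_type": "L2",      # must match the metric used to build the index
        "offset": 0,              # start from the first match; raise to paginate
        "ignore_growing": False,  # False: also search segments still being filled
        "params": {"nprobe": 10}, # clusters to probe (illustrative value)
    }
    return {
        "data": [query_embedding],        # one embedding per query string
        "anns_field": VECTOR_FIELD,       # vector field to search on
        "param": search_params,
        "limit": limit,                   # top-N matches to return
        "expr": expr,                     # optional scalar filter expression
        "output_fields": OUTPUT_FIELDS,   # fields to return with each hit
        "consistency_level": "Strong",    # assumed setting, for illustration
    }


def run_search(collection: Any, request: Dict[str, Any]) -> Any:
    """Run the search; `collection` is assumed to be a loaded pymilvus Collection."""
    return collection.search(**request)
```

With a real collection, this would be used as run_search(collection, build_search_request(embedding)), where embedding comes from the same embedding model that was used when the data was inserted.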
Consistency level controls whether data that is still being processed will be considered for the search. The results are returned in the s_results variable. This contains the hits object that can be iterated over to get the results. We iterate over this object and print the entity ID, distance, and title. Let's run this search now. Based on the ordering and the distances, we can see that the courses most related to machine learning are returned first, based on their descriptions.

Now, let's do a search on a string that is not related to the contents of the vector DB. The search string is "best movies of the year." We execute a similar set of steps. On executing the query, we again see all the records being returned, even though the contents are in no way related to the search string. This is the problem with vector search. It will always return results, ordered from the closest match to the farthest, up to the limit. So how do we ensure that we only get results that are actually similar to the search string? We need to use the distances returned and apply a similarity cutoff threshold. We can see that the distances returned in this query are around 0.6. This is much higher than the distances we saw in the earlier query, especially for the courses related to machine learning. With L2, a smaller distance means a closer match, so we can set a threshold of, say, 0.5 and only use results where the distance is below the threshold.

We can also run searches using the Attu interface. Here we go to the vector search link. We can select a database and a collection here. But for searching, we need to provide a vector directly, not a search string. You can use this option for testing if needed.
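The iteration over the hits and the distance cutoff described above could look something like the following sketch. The helper names collect_rows and filter_by_distance are hypothetical, the title field name is an assumption, and the sample rows are made up purely for illustration; they are not output from the course's collection.

```python
from typing import Any, Iterable, List, Tuple

Row = Tuple[Any, float, Any]  # (entity ID, distance, title)


def collect_rows(s_results: Iterable[Iterable[Any]]) -> List[Row]:
    """Flatten search results into (id, distance, title) tuples."""
    rows: List[Row] = []
    for hits in s_results:      # one hits object per query vector
        for hit in hits:        # hits arrive ordered closest first
            rows.append((hit.id, hit.distance, hit.entity.get("title")))
    return rows


def filter_by_distance(rows: List[Row], max_distance: float = 0.5) -> List[Row]:
    """Keep rows whose distance is below the cutoff (L2: smaller is closer)."""
    return [row for row in rows if row[1] < max_distance]


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = [
        (1, 0.31, "placeholder title A"),
        (2, 0.48, "placeholder title B"),
        (3, 0.62, "placeholder title C"),
    ]
    for entity_id, distance, title in filter_by_distance(sample):
        print(entity_id, distance, title)
```

The cutoff of 0.5 follows the value suggested in the video; a suitable value depends on the embedding model and the metric, so it is worth checking against distances from queries that are known to be relevant and irrelevant.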
