From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)
Managing data in Milvus
Inserting, updating, and deleting data are among the key operations of any database. Let's quickly review the capabilities Milvus provides in this regard. We will demonstrate them with code examples in the next chapter.

So what data management capabilities exist in Milvus? In Milvus, data is stored in collections. Each row or record in a collection is referred to as an entity. We will use entities and rows interchangeably in this course.

In addition to inserting a single row of data, Milvus provides the ability to upload data in bulk. Bulk insertion is the most common pattern, especially when loading large documents, and it is more efficient than inserting rows one at a time.

After inserts are done, a flush operation is needed before the newly inserted data is indexed according to the indexes created on the collection. Milvus automatically flushes data once the pending records reach a specific size. But if immediate querying is needed, it is recommended to trigger the flush operation manually.

Milvus also supports the upsert operation. In this case, if a record is inserted with the same primary key as an existing record, the existing record is updated rather than a new record being created. Records can also be deleted using the primary key or a Boolean expression as a filter.
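To make these semantics concrete before the real examples in the next chapter, here is a minimal in-memory sketch in Python. It is a toy model, not the Milvus client API: the class `ToyCollection`, its method names, the `auto_flush_size` parameter, and the use of a Python function in place of a Boolean filter expression are all hypothetical stand-ins chosen for illustration. It only mimics the behavior described above: bulk insert into a pending buffer, flush (manual or size-triggered) before rows count as indexed, upsert by primary key, and delete by primary key or filter.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Entity = Dict[str, Any]  # one row (entity) in the collection


class ToyCollection:
    """Hypothetical in-memory stand-in for a collection (not the Milvus API)."""

    def __init__(self, primary_key: str = "id", auto_flush_size: int = 1000) -> None:
        self.primary_key = primary_key
        self.auto_flush_size = auto_flush_size  # assumed threshold, in rows
        self.pending: List[Entity] = []   # inserted but not yet flushed
        self.indexed: List[Entity] = []   # flushed and treated as indexed

    def insert(self, rows: Iterable[Entity]) -> None:
        # Bulk insert: buffer all rows at once, then flush automatically
        # if the pending buffer has reached the threshold.
        self.pending.extend(dict(row) for row in rows)
        if len(self.pending) >= self.auto_flush_size:
            self.flush()

    def flush(self) -> None:
        # Move every pending row into the indexed store.
        self.indexed.extend(self.pending)
        self.pending = []

    def delete(
        self,
        ids: Optional[Iterable[Any]] = None,
        predicate: Optional[Callable[[Entity], bool]] = None,
    ) -> int:
        # Delete by primary key, by a filter function, or both.
        # Returns the number of rows removed.
        id_set = set(ids or [])

        def doomed(row: Entity) -> bool:
            by_key = row[self.primary_key] in id_set
            by_filter = predicate is not None and predicate(row)
            return by_key or by_filter

        before = len(self.pending) + len(self.indexed)
        self.pending = [r for r in self.pending if not doomed(r)]
        self.indexed = [r for r in self.indexed if not doomed(r)]
        return before - (len(self.pending) + len(self.indexed))

    def upsert(self, rows: Iterable[Entity]) -> None:
        # Replace any existing row that shares a primary key, then insert.
        rows = list(rows)
        self.delete(ids=[row[self.primary_key] for row in rows])
        self.insert(rows)


# Usage: two rows stay pending until flush() is called.
col = ToyCollection(auto_flush_size=3)
col.insert([{"id": 1, "text": "intro"}, {"id": 2, "text": "setup"}])
col.flush()
col.upsert([{"id": 2, "text": "setup v2"}])  # updates id 2, adds no new row
col.flush()
```

In this sketch, upsert is modeled as delete-then-insert, which is one simple way to get the "update instead of duplicate" behavior; it is an assumption of the toy model rather than a statement about how Milvus implements it.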