rag

package
v0.0.0-...-d9fdc95 Latest
Warning

This package is not in the latest version of its module.

Published: Oct 28, 2025 License: MIT Imports: 11 Imported by: 0

Documentation


Constants

This section is empty.

Variables

This section is empty.

Functions

func AddDocumentToVectorStore

func AddDocumentToVectorStore(ctx context.Context, g *genkit.Genkit, vectorStore *MemoryVectorStore, embedder ai.Embedder, text string) error

AddDocumentToVectorStore is a helper function that adds the given text to the vector store as a document.

func ChunkText

func ChunkText(text string, chunkSize, overlap int) []string

ChunkText takes a text string and divides it into chunks of a specified size with a given overlap. It returns a slice of strings, where each string represents a chunk of the original text.

Parameters:

  • text: The input text to be chunked.
  • chunkSize: The size of each chunk.
  • overlap: The amount of overlap between consecutive chunks.

Returns:

  • []string: A slice of strings representing the chunks of the original text.
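The idea of fixed-size chunks with overlap can be sketched as follows. This is a minimal illustration, not the package's implementation: chunkWithOverlap is a hypothetical name, and the sketch assumes sizes are counted in runes and that each chunk starts chunkSize-overlap runes after the previous one. ChunkText itself may count bytes or handle edge cases differently.

```go
package main

import "fmt"

// chunkWithOverlap is a hypothetical stand-in for ChunkText.
// Assumption: chunkSize and overlap are measured in runes.
func chunkWithOverlap(text string, chunkSize, overlap int) []string {
	runes := []rune(text)
	if chunkSize <= 0 || len(runes) == 0 {
		return nil
	}
	// An overlap outside [0, chunkSize) would give a non-positive step
	// and loop forever, so fall back to no overlap.
	if overlap < 0 || overlap >= chunkSize {
		overlap = 0
	}
	step := chunkSize - overlap

	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last chunk reached the end of the text
		}
	}
	return chunks
}

func main() {
	for i, c := range chunkWithOverlap("abcdefghij", 4, 2) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

With a chunk size of 4 and an overlap of 2, consecutive chunks share their last and first two runes, so context that straddles a boundary appears in both chunks.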

func ChunkWithMarkdownHierarchy

func ChunkWithMarkdownHierarchy(content string) []string

func CosineSimilarity

func CosineSimilarity(v1, v2 []float32) float64

CosineSimilarity calculates the cosine similarity between two vectors: their dot product divided by the product of their magnitudes.
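For reference, the standard cosine similarity computation looks like the sketch below. The function name cosine and its handling of mismatched lengths and zero vectors (returning 0) are assumptions made for illustration; the package's CosineSimilarity may treat those cases differently.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns dot(v1, v2) / (|v1| * |v2|).
// Assumption: mismatched lengths and zero vectors yield 0.
func cosine(v1, v2 []float32) float64 {
	if len(v1) != len(v2) || len(v1) == 0 {
		return 0
	}
	var dot, norm1, norm2 float64
	for i := range v1 {
		a, b := float64(v1[i]), float64(v2[i])
		dot += a * b
		norm1 += a * a
		norm2 += b * b
	}
	if norm1 == 0 || norm2 == 0 {
		return 0
	}
	return dot / (math.Sqrt(norm1) * math.Sqrt(norm2))
}

func main() {
	fmt.Println(cosine([]float32{1, 0}, []float32{0, 1}))  // orthogonal vectors
	fmt.Println(cosine([]float32{1, 0}, []float32{-1, 0})) // opposite vectors
}
```

The result lies in [-1, 1]: values near 1 mean the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions.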

func DefineMemoryVectorRetriever

func DefineMemoryVectorRetriever(g *genkit.Genkit, vectorStore *MemoryVectorStore, embedder ai.Embedder) ai.Retriever

DefineMemoryVectorRetriever defines an in-memory vector retriever backed by the given MemoryVectorStore and embedder.

func ExampleUsage

func ExampleUsage(g *genkit.Genkit, embedder ai.Embedder)

ExampleUsage demonstrates how to use the custom retriever.

func SplitMarkdownBySections

func SplitMarkdownBySections(markdown string) []string

SplitMarkdownBySections splits markdown content by headers (#, ##, ###, etc.). It returns a slice where each element contains a section starting with a header.
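A header-based split can be sketched as below. splitByHeaders and isHeader are hypothetical names, and the sketch assumes ATX-style headers (one to six '#' characters followed by a space) and does not special-case '#' lines inside fenced code blocks. How SplitMarkdownBySections treats text before the first header is not documented here; this sketch keeps it as its own element.

```go
package main

import (
	"fmt"
	"strings"
)

// isHeader reports whether line looks like an ATX header ("# Title").
func isHeader(line string) bool {
	trimmed := strings.TrimLeft(line, "#")
	level := len(line) - len(trimmed)
	return level >= 1 && level <= 6 && strings.HasPrefix(trimmed, " ")
}

// splitByHeaders is a hypothetical stand-in for SplitMarkdownBySections.
func splitByHeaders(markdown string) []string {
	var sections, current []string
	flush := func() {
		if s := strings.TrimSpace(strings.Join(current, "\n")); s != "" {
			sections = append(sections, s)
		}
		current = nil
	}
	for _, line := range strings.Split(markdown, "\n") {
		if isHeader(line) {
			flush() // a new header closes the previous section
		}
		current = append(current, line)
	}
	flush()
	return sections
}

func main() {
	doc := "# A\nintro\n## B\nbody\n"
	for i, s := range splitByHeaders(doc) {
		fmt.Printf("section %d: %q\n", i, s)
	}
}
```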

func SplitTextWithDelimiter

func SplitTextWithDelimiter(text string, delimiter string) []string

SplitTextWithDelimiter splits the given text using the specified delimiter and returns a slice of strings.

Parameters:

  • text: The text to be split.
  • delimiter: The delimiter used to split the text.

Returns:

  • []string: A slice of strings containing the split parts of the text.
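The behavior described above matches what the standard library's strings.Split provides, so a minimal equivalent can be sketched with it. splitWithDelimiter is a hypothetical name; whether SplitTextWithDelimiter trims whitespace or drops empty parts is not stated, and this sketch does neither.

```go
package main

import (
	"fmt"
	"strings"
)

// splitWithDelimiter is a hypothetical stand-in for SplitTextWithDelimiter.
// It keeps empty parts and does not trim whitespace.
func splitWithDelimiter(text, delimiter string) []string {
	return strings.Split(text, delimiter)
}

func main() {
	parts := splitWithDelimiter("first---second---third", "---")
	for i, p := range parts {
		fmt.Printf("part %d: %q\n", i, p)
	}
}
```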

Types

type MarkdownChunk

type MarkdownChunk struct {
	Header         string
	Content        string
	Level          int
	Prefix         string
	ParentLevel    int
	ParentHeader   string
	ParentPrefix   string
	Hierarchy      string
	SimpleMetaData string                 // Additional metadata if needed
	Metadata       map[string]interface{} // additional metadata
	KeyWords       []string               // Keywords that could be extracted from the content
}

func ParseMarkdownHierarchy

func ParseMarkdownHierarchy(content string) []MarkdownChunk

ParseMarkdownHierarchy parses the given markdown content and returns a slice of MarkdownChunk structs that preserve each section's hierarchical context.
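One way to preserve hierarchical context is to track the most recent header at each level while scanning the document. The sketch below illustrates that idea with a trimmed-down, hypothetical section type. The " > " separator used for Hierarchy is an assumption, and none of this is taken from the package's implementation of ParseMarkdownHierarchy.

```go
package main

import (
	"fmt"
	"strings"
)

// section is a hypothetical, reduced analogue of MarkdownChunk.
type section struct {
	Header    string
	Level     int
	Hierarchy string // assumed format: "Parent > Child"
	Content   string
}

// parseHierarchy records, for every header, the chain of its ancestors.
func parseHierarchy(markdown string) []section {
	var out []section
	var path []string // path[i] is the latest header seen at level i+1
	for _, line := range strings.Split(markdown, "\n") {
		trimmed := strings.TrimLeft(line, "#")
		level := len(line) - len(trimmed)
		if level >= 1 && level <= 6 && strings.HasPrefix(trimmed, " ") {
			if len(path) >= level {
				path = path[:level-1] // leave deeper or sibling sections
			}
			for len(path) < level-1 {
				path = append(path, "") // pad skipped levels
			}
			title := strings.TrimSpace(trimmed)
			path = append(path, title)
			var names []string
			for _, p := range path {
				if p != "" {
					names = append(names, p)
				}
			}
			out = append(out, section{Header: title, Level: level,
				Hierarchy: strings.Join(names, " > ")})
			continue
		}
		if len(out) > 0 {
			out[len(out)-1].Content += line + "\n"
		}
	}
	return out
}

func main() {
	doc := "# Guide\n## Install\nrun it\n## Usage\n### CLI\n"
	for _, s := range parseHierarchy(doc) {
		fmt.Printf("%d %s\n", s.Level, s.Hierarchy)
	}
}
```

Carrying the ancestor chain with each chunk lets a retrieved fragment be presented with its place in the document, which is often lost when text is split purely by size.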

type MemoryVectorRetrieverOptions

type MemoryVectorRetrieverOptions struct {
	Limit      float64 // Minimum similarity threshold
	MaxResults int     // Maximum number of results to return
}

MemoryVectorRetrieverOptions defines the options for the memory vector retriever.

type MemoryVectorStore

type MemoryVectorStore struct {
	Records map[string]VectorRecord
}

func (*MemoryVectorStore) GetAll

func (mvs *MemoryVectorStore) GetAll() ([]VectorRecord, error)

func (*MemoryVectorStore) LoadFromJSONFile

func (mvs *MemoryVectorStore) LoadFromJSONFile(filename string) error

LoadFromJSONFile loads the vector store from a JSON file.

func (*MemoryVectorStore) Save

func (mvs *MemoryVectorStore) Save(vectorRecord VectorRecord) (VectorRecord, error)

func (*MemoryVectorStore) SaveJSONToFile

func (mvs *MemoryVectorStore) SaveJSONToFile(filename string) error

SaveJSONToFile persists the vector store to a JSON file.
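A JSON save and load round trip over a map of records can be sketched with encoding/json and os. The record type mirrors the fields of VectorRecord, but saveRecords, loadRecords, and roundTrip are hypothetical helpers, and the on-disk layout (a JSON object keyed by record id) is an assumption rather than the package's documented format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// record mirrors the fields of VectorRecord for illustration only.
type record struct {
	Id               string    `json:"id"`
	Prompt           string    `json:"prompt"`
	Embedding        []float32 `json:"embedding"`
	CosineSimilarity float64
}

// saveRecords writes the records as an indented JSON object.
func saveRecords(filename string, records map[string]record) error {
	data, err := json.MarshalIndent(records, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(filename, data, 0o644)
}

// loadRecords reads records previously written by saveRecords.
func loadRecords(filename string) (map[string]record, error) {
	data, err := os.ReadFile(filename)
	if err != nil {
		return nil, err
	}
	records := make(map[string]record)
	if err := json.Unmarshal(data, &records); err != nil {
		return nil, err
	}
	return records, nil
}

// roundTrip saves to a temporary file and loads the result back.
func roundTrip(records map[string]record) (map[string]record, error) {
	dir, err := os.MkdirTemp("", "vectorstore")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "store.json")
	if err := saveRecords(path, records); err != nil {
		return nil, err
	}
	return loadRecords(path)
}

func main() {
	in := map[string]record{
		"doc-1": {Id: "doc-1", Prompt: "hello", Embedding: []float32{0.1, 0.2}},
	}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out["doc-1"].Prompt)
}
```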

func (*MemoryVectorStore) SearchSimilarities

func (mvs *MemoryVectorStore) SearchSimilarities(embeddingFromQuestion VectorRecord, limit float64) ([]VectorRecord, error)

SearchSimilarities searches the MemoryVectorStore for vector records whose cosine similarity to the given record is greater than or equal to the given limit.

Parameters:

  • embeddingFromQuestion: the vector record to compare against.
  • limit: the minimum cosine similarity threshold.

Returns:

  • []VectorRecord: a slice of vector records whose cosine similarity is greater than or equal to the limit.
  • error: an error if any occurred during the search.
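The threshold search described above can be sketched as a linear scan that scores every stored record against the query and keeps those at or above the limit. The names record, cosine, and filterBySimilarity are hypothetical, and storing the score on each returned record is an assumption suggested by the CosineSimilarity field of VectorRecord.

```go
package main

import (
	"fmt"
	"math"
)

// record is a reduced, hypothetical analogue of VectorRecord.
type record struct {
	Id               string
	Embedding        []float32
	CosineSimilarity float64
}

// cosine returns the cosine similarity, or 0 for unusable input.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// filterBySimilarity keeps records scoring at least limit against query.
// Map iteration order is random, so the result order is unspecified.
func filterBySimilarity(records map[string]record, query []float32, limit float64) []record {
	var matches []record
	for _, r := range records {
		score := cosine(query, r.Embedding)
		if score >= limit {
			r.CosineSimilarity = score // r is a copy; the store is untouched
			matches = append(matches, r)
		}
	}
	return matches
}

func main() {
	store := map[string]record{
		"a": {Id: "a", Embedding: []float32{1, 0}},
		"b": {Id: "b", Embedding: []float32{0, 1}},
		"c": {Id: "c", Embedding: []float32{1, 1}},
	}
	for _, r := range filterBySimilarity(store, []float32{1, 0}, 0.5) {
		fmt.Printf("%s %.3f\n", r.Id, r.CosineSimilarity)
	}
}
```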

func (*MemoryVectorStore) SearchTopNSimilarities

func (mvs *MemoryVectorStore) SearchTopNSimilarities(embeddingFromQuestion VectorRecord, limit float64, max int) ([]VectorRecord, error)

SearchTopNSimilarities searches for the top N similar vector records based on the given embedding from a question. It returns a slice of vector records and an error if any. The limit parameter specifies the minimum similarity score for a record to be considered similar. The max parameter specifies the maximum number of vector records to return.

type VectorRecord

type VectorRecord struct {
	Id               string    `json:"id"`
	Prompt           string    `json:"prompt"`
	Embedding        []float32 `json:"embedding"`
	CosineSimilarity float64
}

func GetTopNVectorRecords

func GetTopNVectorRecords(records []VectorRecord, max int) []VectorRecord
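GetTopNVectorRecords is undocumented here. Judging by its name and signature, it likely orders records by similarity and keeps at most max of them. A sketch under that assumption follows; topN and record are hypothetical names, and sorting a copy in descending CosineSimilarity order is a guess at the intended behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a reduced, hypothetical analogue of VectorRecord.
type record struct {
	Id               string
	CosineSimilarity float64
}

// topN returns up to n records with the highest CosineSimilarity.
// The input slice is copied so the caller's ordering is preserved.
func topN(records []record, n int) []record {
	sorted := make([]record, len(records))
	copy(sorted, records)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].CosineSimilarity > sorted[j].CosineSimilarity
	})
	if n < 0 {
		n = 0
	}
	if n < len(sorted) {
		sorted = sorted[:n]
	}
	return sorted
}

func main() {
	records := []record{
		{Id: "low", CosineSimilarity: 0.2},
		{Id: "high", CosineSimilarity: 0.9},
		{Id: "mid", CosineSimilarity: 0.5},
	}
	for _, r := range topN(records, 2) {
		fmt.Printf("%s %.1f\n", r.Id, r.CosineSimilarity)
	}
}
```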