
Conversation

@falonso81 falonso81 (Contributor) commented Dec 20, 2024

The following changes are included in this PR:

  • Added Context Cache for VertexAI and GeminiAI plugins
  • Context Cache Samples for both plugins
  • Unit tests

Checklist (if applicable):

@falonso81 falonso81 changed the title [WIP] Feat: create context caching for vertexai Dec 27, 2024
@hugoaguirre hugoaguirre self-assigned this Feb 25, 2025
@hugoaguirre hugoaguirre requested a review from apascal07 March 3, 2025 17:17

Contributor, commenting on vertexai.go (translateResponse):

VertexAI should not replicate this function anymore, since u.CachedContentTokenCount (at line 558) is not provided in the response struct (see https://pkg.go.dev/cloud.google.com/go/vertexai/genai#UsageMetadata).

@hugoaguirre hugoaguirre changed the title feature(go/plugins/vertexai): create context caching for vertexai Mar 3, 2025
@hugoaguirre hugoaguirre changed the title feature(go/plugins/vertexai): create context caching for VertexAI and GoogleAI plugins Mar 3, 2025
@hugoaguirre hugoaguirre changed the title feature(go/plugins/vertexai): add support for context caching for VertexAI and GoogleAI plugins Mar 3, 2025
return nil, err
}

return client.CreateCachedContent(ctx, cc)
@hugoaguirre hugoaguirre (Contributor) commented Mar 4, 2025:

A thing to mention: in JS we do a cache lookup based on the displayName (https://github.com/google-gemini/generative-ai-js/blob/0c8d8cde00a33ea9da94e5d1bb033dc261d0588f/types/server/caching.ts#L35) of the CachedContent. In Go, this field is not available, making us unable to perform such operations.

@apascal07 apascal07 (Collaborator) left a comment:

Overall, this follows the JS implementation a little too closely, with overly complicated logic. Let's keep it simple for now, focusing only on the cache marker + TTL in the metadata logic. If we ever add more fields, we can return a struct. It's all internal functions, so there won't be breaking changes.
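The "cache marker + TTL in metadata" idea the reviewer asks for could be sketched roughly as follows. The key names ("cache", "ttlSeconds") and the function name are illustrative assumptions, not the plugin's final API:

```go
package main

import "fmt"

// cacheTTLSeconds inspects a message metadata map and reports whether the
// message carries a cache marker and, if so, its TTL in seconds. This is a
// minimal sketch of the metadata shape discussed in the review; the real
// plugin may structure the marker differently.
func cacheTTLSeconds(metadata map[string]any) (int, bool) {
	c, ok := metadata["cache"].(map[string]any)
	if !ok {
		return 0, false
	}
	ttl, ok := c["ttlSeconds"].(int)
	if !ok {
		return 0, false
	}
	return ttl, true
}

func main() {
	md := map[string]any{
		"cache": map[string]any{"ttlSeconds": 300},
	}
	ttl, ok := cacheTTLSeconds(md)
	fmt.Println(ttl, ok)
}
```

With this shape, no extra option structs are needed: the marker travels with the message, and internal code only has to check one metadata key.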


// WithTextPrompt adds a simple text user prompt to ModelRequest.
-func WithTextPrompt(prompt string) GenerateOption {
+func WithTextPrompt(prompt string, ttl ...int) GenerateOption {
Collaborator:

Why is this modified and why is it a variadic arg?

Contributor:

Good catch, this shouldn't be there

},
}
// copy the existing metadata and add cache
if m.Metadata != nil {
Collaborator:

This does the opposite of what the comment says: it will overwrite cache with whatever was already in there. It makes sense for this deliberate call to overwrite any existing metadata instead.
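The merge order the reviewer wants could look like this: copy the existing metadata first, then let the deliberate call win by setting the cache entry last. The "cache" key name and the helper name are assumptions for illustration:

```go
package main

import "fmt"

// withCacheMetadata copies any existing message metadata and then
// deliberately overwrites the "cache" entry, so the code matches the
// comment "copy the existing metadata and add cache". Sketch only; the
// plugin's actual key and value types may differ.
func withCacheMetadata(existing map[string]any, cacheValue any) map[string]any {
	md := make(map[string]any, len(existing)+1)
	for k, v := range existing { // copy first...
		md[k] = v
	}
	md["cache"] = cacheValue // ...then overwrite cache deliberately
	return md
}

func main() {
	old := map[string]any{"cache": "stale", "other": 1}
	md := withCacheMetadata(old, "fresh")
	fmt.Println(md["cache"], md["other"])
}
```

Setting the key after the copy loop is what guarantees the new value wins; doing it before the loop reproduces the bug the reviewer describes.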

}

var ContextCacheSupportedModels = [...]string{
"gemini-1.5-flash-001",
Collaborator:

gemini-1.5-flash-002, gemini-1.5-pro-002, and Gemini 2.0 don't support caching?

Contributor:

You are right; added gemini-2.0 stable models as well.
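An allow-list check along these lines could back the supported-models array. The exact model names here are assumptions pieced together from the thread (the 1.5 -001/-002 variants and gemini-2.0 stable models were mentioned); the real plugin's list may differ:

```go
package main

import (
	"fmt"
	"strings"
)

// contextCacheSupportedModels is an illustrative allow-list; the actual
// set of caching-capable models should be taken from the provider docs.
var contextCacheSupportedModels = []string{
	"gemini-1.5-flash-001",
	"gemini-1.5-flash-002",
	"gemini-1.5-pro-001",
	"gemini-1.5-pro-002",
}

// supportsContextCache reports whether the given model is in the
// allow-list, comparing case-insensitively.
func supportsContextCache(model string) bool {
	for _, m := range contextCacheSupportedModels {
		if strings.EqualFold(m, model) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(supportsContextCache("gemini-1.5-flash-001")) // true
	fmt.Println(supportsContextCache("gemini-1.0-pro"))       // false
}
```

Keeping the list in one variable makes the "which models support caching" question a one-line update when new stable models ship.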

return nil, nil
}
if endOfCachedContents < 0 || endOfCachedContents >= len(request.Messages) {
return nil, fmt.Errorf("invalid endOfCachedContents index")
Collaborator:

Suggested change:
-return nil, fmt.Errorf("invalid endOfCachedContents index")
+return nil, fmt.Errorf("end of cached content index %q is invalid", cacheEndIdx)
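A self-contained sketch of the bounds check with the more descriptive error. Note that since the index is an int, `%d` is the natural verb (`%q` on an int prints it as a quoted rune); the names `cacheEndIdx` and `messageCount` are illustrative:

```go
package main

import "fmt"

// validateCacheEndIndex checks that the end-of-cached-contents index falls
// inside the message slice, returning a descriptive error otherwise.
// Sketch of the reviewer's suggestion, with %d instead of %q for the int.
func validateCacheEndIndex(cacheEndIdx, messageCount int) error {
	if cacheEndIdx < 0 || cacheEndIdx >= messageCount {
		return fmt.Errorf("end of cached content index %d is invalid", cacheEndIdx)
	}
	return nil
}

func main() {
	fmt.Println(validateCacheEndIndex(5, 3)) // out of range: returns an error
	fmt.Println(validateCacheEndIndex(1, 3)) // valid: <nil>
}
```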
Temperature float64 `json:"temperature,omitempty"`
TopK int `json:"topK,omitempty"`
TopP float64 `json:"topP,omitempty"`
TTL time.Duration `json:"ttl,omitempty"`
Collaborator:

This is a generated file, there's no corresponding TTL field in the source:

export const GenerationCommonConfigSchema = z.object({

Contributor:

Yeah, after addressing your comments, the TTL config should not live here but in Message.Metadata instead.

OutputTokens int `json:"outputTokens,omitempty"`
OutputVideos float64 `json:"outputVideos,omitempty"`
TotalTokens int `json:"totalTokens,omitempty"`
CachedTokens int `json:"cachedTokens,omitempty"`
Collaborator:

Same with this. We can probably either add it as a field or add it to Custom.

)

// DEFAULT_TTL in seconds (5 minutes)
const DEFAULT_TTL = 300
Collaborator:

There's an inconsistent mix of int seconds and time.Duration throughout.

Contributor:

Fixed: ttlSeconds is always an int, and the conversion to time.Duration is done right before the cache marker is detected.

@hugoaguirre hugoaguirre (Contributor) commented:

Moving efforts to address issue #2159 first, since the VertexAI plugin does not fully support context caching. Once #2159 is complete, this PR should be updated to use the new SDK functions and to remove the code duplicated between the VertexAI and GoogleAI plugins.

cc @yesudeep @apascal07

@hugoaguirre hugoaguirre (Contributor) commented:

Since we are moving forward with the unified go-genai SDK in the GoogleAI and VertexAI plugins, this PR will be closed and the work moved to #2266.

@hugoaguirre hugoaguirre closed this Mar 7, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Genkit Backlog Mar 7, 2025

3 participants