VisionEmbeddingModelResult

Prediction format for the large vision model embedding API.

Fields

imageEmbedding array (ListValue format)

The 1024-dimensional image embedding computed from the provided image.

textEmbedding array (ListValue format)

The 1024-dimensional text embedding computed from the provided text.

videoEmbeddings[] object (VideoEmbedding)

The embeddings computed from the provided video, each with the start and end offsets it covers.

JSON representation
{
  "imageEmbedding": array,
  "textEmbedding": array,
  "videoEmbeddings": [
    {
      object (`VideoEmbedding`)
    }
  ]
}
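As an illustration of the shape above, the following is a minimal sketch of how a client might parse one prediction into typed objects. It is not part of the API: the class names, `parse_result`, and `EMBEDDING_DIM` are hypothetical, and the sketch assumes that a field is simply absent when the corresponding input (image, text, or video) was not provided.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Dimension stated in the field descriptions of this page.
EMBEDDING_DIM = 1024


@dataclass
class VideoEmbedding:
    start_offset_sec: int
    end_offset_sec: int
    embedding: List[float]


@dataclass
class VisionEmbeddingModelResult:
    # Assumption: a field is missing when its input was not supplied.
    image_embedding: Optional[List[float]] = None
    text_embedding: Optional[List[float]] = None
    video_embeddings: List[VideoEmbedding] = field(default_factory=list)


def _check_dim(name: str, values: Optional[List[float]]) -> Optional[List[float]]:
    """Return values unchanged, or raise if the length is not EMBEDDING_DIM."""
    if values is not None and len(values) != EMBEDDING_DIM:
        raise ValueError(
            f"{name}: expected {EMBEDDING_DIM} values, got {len(values)}"
        )
    return values


def parse_result(prediction: Dict[str, Any]) -> VisionEmbeddingModelResult:
    """Convert one decoded JSON prediction into a typed result (hypothetical helper)."""
    videos = [
        VideoEmbedding(
            start_offset_sec=int(item["startOffsetSec"]),
            end_offset_sec=int(item["endOffsetSec"]),
            embedding=_check_dim("embedding", list(item["embedding"])),
        )
        for item in prediction.get("videoEmbeddings", [])
    ]
    return VisionEmbeddingModelResult(
        image_embedding=_check_dim("imageEmbedding", prediction.get("imageEmbedding")),
        text_embedding=_check_dim("textEmbedding", prediction.get("textEmbedding")),
        video_embeddings=videos,
    )
```

Validating the vector length at parse time is a design choice of this sketch; it surfaces a truncated or malformed response early instead of at the point where the embedding is used.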

VideoEmbedding

The video embedding message.

Fields

startOffsetSec integer

The start offset of the video segment, in seconds.

endOffsetSec integer

The end offset of the video segment, in seconds.

embedding array (ListValue format)

The 1024-dimensional video embedding computed from the provided video.

JSON representation
{
  "startOffsetSec": integer,
  "endOffsetSec": integer,
  "embedding": array
}
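One possible use of these objects is to rank the returned video embeddings against a query vector, such as a text embedding. The sketch below does this with cosine similarity in plain Python. The helper names `cosine_similarity` and `rank_segments` are hypothetical, and the sketch assumes that the query and video embeddings are comparable, which this page does not state.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as having no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(
    query: Sequence[float],
    video_embeddings: List[Dict[str, Any]],
) -> List[Tuple[int, int, float]]:
    """Return (startOffsetSec, endOffsetSec, score) tuples, best match first.

    Assumption: each item is a decoded VideoEmbedding JSON object.
    """
    scored = [
        (
            item["startOffsetSec"],
            item["endOffsetSec"],
            cosine_similarity(query, item["embedding"]),
        )
        for item in video_embeddings
    ]
    return sorted(scored, key=lambda entry: entry[2], reverse=True)
```

For example, `rank_segments(text_embedding, result["videoEmbeddings"])[0]` would give the offsets and score of the closest entry, where `text_embedding` and `result` stand for values taken from a decoded response.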


Last updated 2025-10-20 UTC.