Open
Description
Path: /components/vectordbs/dbs/databricks
The documentation gives the following configuration for using Databricks as a vector store:
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "access_token": "your-access-token",
            "endpoint_name": "your-vector-search-endpoint",
            "index_name": "catalog.schema.index_name",
            "source_table_name": "catalog.schema.source_table",
            "embedding_dimension": 1536
        }
    }
}
The latest release, mem0 1.0.0, supports Databricks, but because the documentation is incorrect, this configuration fails with the following error:
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for MemoryConfig
vector_store
  Value error, Extra fields not allowed: source_table_name, index_name. Please input only the following fields: index_type, pipeline_type, warehouse_name, query_type, workspace_url, endpoint_name, azure_client_id, client_secret, collection_name, embedding_model_endpoint_name, azure_client_secret, catalog, embedding_dimension, endpoint_type, client_id, table_name, schema, access_token [type=value_error, input_value={'workspace_url': 'https:...edding_dimension': 1536}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error
{"pid": 17559, "job_id": "AJ_g7dLw5nNYaAa"}
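The rejected keys can be confirmed without calling mem0 by comparing the documented config against the field names listed in the error message. This is only a minimal sketch: `ALLOWED_FIELDS` is copied from the error text above, and `extra_fields` is a hypothetical helper, not part of mem0.

```python
# Field names copied from the ValidationError message above.
ALLOWED_FIELDS = {
    "index_type", "pipeline_type", "warehouse_name", "query_type",
    "workspace_url", "endpoint_name", "azure_client_id", "client_secret",
    "collection_name", "embedding_model_endpoint_name", "azure_client_secret",
    "catalog", "embedding_dimension", "endpoint_type", "client_id",
    "table_name", "schema", "access_token",
}

# The inner "config" dict exactly as shown in the documentation.
documented_config = {
    "workspace_url": "https://your-workspace.databricks.com",
    "access_token": "your-access-token",
    "endpoint_name": "your-vector-search-endpoint",
    "index_name": "catalog.schema.index_name",
    "source_table_name": "catalog.schema.source_table",
    "embedding_dimension": 1536,
}


def extra_fields(config: dict) -> list[str]:
    """Return the keys in config that are not in ALLOWED_FIELDS (hypothetical helper)."""
    return sorted(set(config) - ALLOWED_FIELDS)


print(extra_fields(documented_config))
```

Both keys it reports, `index_name` and `source_table_name`, are the ones named in the error.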
Looking at mem0/vector_stores/databricks.py, I cannot find a source_table_name parameter anywhere.
The only parameters documented there are:
workspace_url (str): Databricks workspace URL.
access_token (str, optional): Personal access token for authentication.
client_id (str, optional): Service principal client ID for authentication.
client_secret (str, optional): Service principal client secret for authentication.
azure_client_id (str, optional): Azure AD application client ID (for Azure Databricks).
azure_client_secret (str, optional): Azure AD application client secret (for Azure Databricks).
endpoint_name (str): Vector search endpoint name.
catalog (str): Unity Catalog catalog name.
schema (str): Unity Catalog schema name.
table_name (str): Source Delta table name.
index_name (str, optional): Vector search index name (default: "mem0").
index_type (str, optional): Index type, either "DELTA_SYNC" or "DIRECT_ACCESS" (default: "DELTA_SYNC").
embedding_model_endpoint_name (str, optional): Embedding model endpoint for Databricks-computed embeddings.
embedding_dimension (int, optional): Vector embedding dimensions (default: 1536).
endpoint_type (str, optional): Endpoint type, either "STANDARD" or "STORAGE_OPTIMIZED" (default: "STANDARD").
pipeline_type (str, optional): Sync pipeline type, either "TRIGGERED" or "CONTINUOUS" (default: "TRIGGERED").
warehouse_name (str, optional): Databricks SQL warehouse name (if using SQL warehouse).
query_type (str, optional): Query type, either "ANN" or "HYBRID" (default: "ANN").
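Based on the parameters listed above, a config that avoids the rejected keys might look like the sketch below. It is untested and rests on two assumptions: that the fully qualified `catalog.schema.source_table` value from the docs should be split across `catalog`, `schema`, and `table_name`, and that the index name goes in `collection_name`, since the error message lists `collection_name` as allowed but rejects `index_name`. All values are placeholders.

```python
# Untested sketch: keys are taken from the allowed-field list in the error
# message; how they map onto the old documented keys is an assumption.
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "access_token": "your-access-token",
            "endpoint_name": "your-vector-search-endpoint",
            # Assumed split of "catalog.schema.source_table" from the docs.
            "catalog": "catalog",
            "schema": "schema",
            "table_name": "source_table",
            # Assumption: collection_name carries the index name.
            "collection_name": "index_name",
            "embedding_dimension": 1536,
        },
    }
}

# To try it against a real workspace (requires mem0 and valid credentials):
# from mem0 import Memory
# memory = Memory.from_config(config)
```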