Issue in docs related to supported vector databases in databricks section #3638

@rahul-anand1

Path: /components/vectordbs/dbs/databricks

The documentation shows the following configuration for using Databricks as a vector store:

config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "access_token": "your-access-token",
            "endpoint_name": "your-vector-search-endpoint",
            "index_name": "catalog.schema.index_name",
            "source_table_name": "catalog.schema.source_table",
            "embedding_dimension": 1536
        }
    }
}

The latest release, mem0 1.0.0, supports Databricks, but because the documented configuration is incorrect, I get this error:

validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for MemoryConfig
vector_store
  Value error, Extra fields not allowed: source_table_name, index_name. Please input only the following fields: index_type, pipeline_type, warehouse_name, query_type, workspace_url, endpoint_name, azure_client_id, client_secret, collection_name, embedding_model_endpoint_name, azure_client_secret, catalog, embedding_dimension, endpoint_type, client_id, table_name, schema, access_token [type=value_error, input_value={'workspace_url': 'https:...edding_dimension': 1536}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error
{"pid": 17559, "job_id": "AJ_g7dLw5nNYaAa"}
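To make the mismatch explicit, here is a small sketch that compares the documented keys against the field names listed in the error message. The helper name `find_extra_fields` is made up for illustration and is not part of mem0; the allowed set is copied from the error text, not from the library source.

```python
# Field names copied from the "Please input only the following fields"
# part of the ValidationError above (not read from mem0 itself).
ALLOWED_FIELDS = {
    "index_type", "pipeline_type", "warehouse_name", "query_type",
    "workspace_url", "endpoint_name", "azure_client_id", "client_secret",
    "collection_name", "embedding_model_endpoint_name", "azure_client_secret",
    "catalog", "embedding_dimension", "endpoint_type", "client_id",
    "table_name", "schema", "access_token",
}


def find_extra_fields(vector_store_config: dict) -> list[str]:
    """Return the keys that the validator would reject, sorted by name."""
    return sorted(set(vector_store_config) - ALLOWED_FIELDS)


# The inner "config" dict exactly as shown in the documentation.
documented = {
    "workspace_url": "https://your-workspace.databricks.com",
    "access_token": "your-access-token",
    "endpoint_name": "your-vector-search-endpoint",
    "index_name": "catalog.schema.index_name",
    "source_table_name": "catalog.schema.source_table",
    "embedding_dimension": 1536,
}

print(find_extra_fields(documented))  # ['index_name', 'source_table_name']
```

This reports the same two keys as the error: `source_table_name` and `index_name`.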

Checking mem0/vector_stores/databricks.py, I cannot find a source_table_name parameter anywhere.
The only parameters I can see documented there are:

workspace_url (str): Databricks workspace URL.
access_token (str, optional): Personal access token for authentication.
client_id (str, optional): Service principal client ID for authentication.
client_secret (str, optional): Service principal client secret for authentication.
azure_client_id (str, optional): Azure AD application client ID (for Azure Databricks).
azure_client_secret (str, optional): Azure AD application client secret (for Azure Databricks).
endpoint_name (str): Vector search endpoint name.
catalog (str): Unity Catalog catalog name.
schema (str): Unity Catalog schema name.
table_name (str): Source Delta table name.
index_name (str, optional): Vector search index name (default: "mem0").
index_type (str, optional): Index type, either "DELTA_SYNC" or "DIRECT_ACCESS" (default: "DELTA_SYNC").
embedding_model_endpoint_name (str, optional): Embedding model endpoint for Databricks-computed embeddings.
embedding_dimension (int, optional): Vector embedding dimensions (default: 1536).
endpoint_type (str, optional): Endpoint type, either "STANDARD" or "STORAGE_OPTIMIZED" (default: "STANDARD").
pipeline_type (str, optional): Sync pipeline type, either "TRIGGERED" or "CONTINUOUS" (default: "TRIGGERED").
warehouse_name (str, optional): Databricks SQL warehouse name (if using a SQL warehouse).
query_type (str, optional): Query type, either "ANN" or "HYBRID" (default: "ANN").
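Based on the parameters above and the allowed fields in the error message, a corrected configuration might look like the sketch below. Two things here are assumptions rather than confirmed behavior: that the three-level names from the docs should be split into `catalog`, `schema`, and `table_name`, and that the index name goes under `collection_name` (the validator lists `collection_name` but rejects `index_name`, even though the parameter list above mentions `index_name`).

```python
# A possible corrected config, using only keys the validator accepts.
# Placeholder values are carried over from the documented example.
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "access_token": "your-access-token",
            "endpoint_name": "your-vector-search-endpoint",
            # Assumption: "catalog.schema.source_table" is split into three fields.
            "catalog": "catalog",
            "schema": "schema",
            "table_name": "source_table",
            # Assumption: the index name is passed as collection_name,
            # since the validator rejects index_name.
            "collection_name": "index_name",
            "embedding_dimension": 1536,
        },
    }
}

# Untested against a real workspace; shown only to indicate intended usage:
# from mem0 import Memory
# memory = Memory.from_config(config)
```

If this passes validation, the docs page at /components/vectordbs/dbs/databricks would need the same key changes.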
