Resolve issue where config from API is cast to incorrect model type (faulty discriminator logic) #114

Open
@aaronsteers

Description

With our library of 300+ sources and 60+ destinations, certain API endpoints should return a "configuration" typed to the correct source or destination class, but the response does not deserialize into the right class. Instead, deserialization lands on the first match alphabetically (e.g. "Airtable" instead of "Snowflake" or "MySQL").

I've worked around this by hacking a bit and getting the original raw dict object, but this has been a stumbling block for specific use cases.

Workaround logic is here:

https://github.com/airbytehq/PyAirbyte/blob/f7b88eba400d7aa768c8c370cfbac6f18dfc61c6/airbyte/_util/api_util.py#L577-L597
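
The linked helper is not reproduced here. As a rough illustration of the general idea only, the sketch below reads the discriminating field straight from the raw dict instead of trusting the type of the deserialized model. The `RAW_RESPONSE` payload and the `connector_type_from_raw` helper are hypothetical names made up for this example; they are not part of PyAirbyte or the generated SDK.

```python
from typing import Any

# Hypothetical raw API payload; only the field names mirror this issue.
RAW_RESPONSE: dict[str, Any] = {
    "name": "my-warehouse",
    "configuration": {"destinationType": "snowflake", "host": "example.invalid"},
}


def connector_type_from_raw(response: dict[str, Any], type_key: str) -> str:
    """Return the connector type recorded in the raw configuration dict."""
    config = response.get("configuration")
    if not isinstance(config, dict):
        raise ValueError("response has no dict-valued 'configuration'")
    if type_key not in config:
        raise ValueError(f"configuration is missing {type_key!r}")
    return str(config[type_key])


print(connector_type_from_raw(RAW_RESPONSE, "destinationType"))
```

This sidesteps the wrongly typed model entirely, which is why it works as a stopgap but does not fix typed access to the configuration fields.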


Speakeasy has some docs on how to set up discriminator logic. Example code from the docs:

components:
  responses:
    OrderResponse:
      oneOf:
        - $ref: "#/components/schemas/DrinkOrder"
        - $ref: "#/components/schemas/IngredientOrder"
      discriminator:
        propertyName: orderType

In our case, the discriminating property does exist in the data as sourceType and destinationType, but it is not declared with the above syntax.

Here is an example declaration showing that sourceType should be ready to use once we reference it in the discriminator declaration:

    source-aha:
      type: "object"
      required:
      - "api_key"
      - "url"
      - "sourceType"
      properties:
        api_key:
          type: "string"
          title: "API Bearer Token"
          airbyte_secret: true
          description: "API Key"
          order: 0
          x-speakeasy-param-sensitive: true
        url:
          type: "string"
          description: "URL"
          title: "Aha Url Instance"
          order: 1
        sourceType:
          title: "aha"
          const: "aha"
          enum:
          - "aha"
          order: 0
          type: "string"
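
To make the difference concrete, here is a toy sketch contrasting first-match deserialization with dispatch on the `sourceType` value. It assumes a deserializer that tries each `oneOf` member in order and accepts the first one that does not fail, which is a guess at the behavior described in this issue, not confirmed SDK internals. The two dataclasses are simplified stand-ins, not the generated models.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class SourceAirtable:
    # Toy stand-in: no required fields, so it "matches" almost any payload.
    sourceType: str = "airtable"


@dataclass
class SourceAha:
    api_key: str
    url: str
    sourceType: str = "aha"


# Order mimics a oneOf list in which a permissive model comes first.
CANDIDATES = [SourceAirtable, SourceAha]


def _build(model: type, raw: dict[str, Any]) -> Any:
    """Instantiate `model` from `raw`, silently ignoring unknown keys."""
    names = {f.name for f in fields(model)}
    return model(**{k: v for k, v in raw.items() if k in names})


def first_match(raw: dict[str, Any]) -> Any:
    """Assumed faulty behavior: return the first candidate that constructs."""
    for model in CANDIDATES:
        try:
            return _build(model, raw)
        except TypeError:  # raised when a required field is missing
            continue
    raise ValueError("no candidate matched")


def by_discriminator(raw: dict[str, Any]) -> Any:
    """Pick the model whose sourceType default equals the payload's value."""
    by_tag = {
        next(f.default for f in fields(m) if f.name == "sourceType"): m
        for m in CANDIDATES
    }
    return _build(by_tag[raw["sourceType"]], raw)


raw = {"api_key": "k", "url": "https://example.invalid", "sourceType": "aha"}
print(type(first_match(raw)).__name__)
print(type(by_discriminator(raw)).__name__)
```

With the same payload, the first-match path yields the permissive toy model, while the discriminator path selects the model whose `sourceType` constant matches.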

Currently, it does not appear that we define any discriminator logic for DestinationConfiguration or SourceConfiguration.

Below is the DestinationConfiguration declaration. Note that there is oneOf logic but no discriminator logic defined. The same is true for SourceConfiguration, which I'm not showing because it is much larger.

https://raw.githubusercontent.com/airbytehq/airbyte-platform/refs/heads/main/airbyte-api/server-api/src/main/openapi/api_sdk.yaml

    DestinationConfiguration:
      description: The values required to configure the destination.
      example: { user: "charles" }
      oneOf:
        - title: destination-google-sheets
          $ref: "#/components/schemas/destination-google-sheets"
        - title: destination-astra
          $ref: "#/components/schemas/destination-astra"
        - title: destination-aws-datalake
          $ref: "#/components/schemas/destination-aws-datalake"
        - title: destination-azure-blob-storage
          $ref: "#/components/schemas/destination-azure-blob-storage"
        - title: destination-bigquery
          $ref: "#/components/schemas/destination-bigquery"
        - title: destination-clickhouse
          $ref: "#/components/schemas/destination-clickhouse"
        - title: destination-convex
          $ref: "#/components/schemas/destination-convex"
        - title: destination-databricks
          $ref: "#/components/schemas/destination-databricks"
        - title: destination-dev-null
          $ref: "#/components/schemas/destination-dev-null"
        - title: destination-duckdb
          $ref: "#/components/schemas/destination-duckdb"
        - title: destination-dynamodb
          $ref: "#/components/schemas/destination-dynamodb"
        - title: destination-elasticsearch
          $ref: "#/components/schemas/destination-elasticsearch"
        - title: destination-firebolt
          $ref: "#/components/schemas/destination-firebolt"
        - title: destination-firestore
          $ref: "#/components/schemas/destination-firestore"
        - title: destination-gcs
          $ref: "#/components/schemas/destination-gcs"
        - title: destination-iceberg
          $ref: "#/components/schemas/destination-iceberg"
        - title: destination-milvus
          $ref: "#/components/schemas/destination-milvus"
        - title: destination-mongodb
          $ref: "#/components/schemas/destination-mongodb"
        - title: destination-motherduck
          $ref: "#/components/schemas/destination-motherduck"
        - title: destination-mssql
          $ref: "#/components/schemas/destination-mssql"
        - title: destination-mysql
          $ref: "#/components/schemas/destination-mysql"
        - title: destination-oracle
          $ref: "#/components/schemas/destination-oracle"
        - title: destination-pgvector
          $ref: "#/components/schemas/destination-pgvector"
        - title: destination-pinecone
          $ref: "#/components/schemas/destination-pinecone"
        - title: destination-postgres
          $ref: "#/components/schemas/destination-postgres"
        - title: destination-pubsub
          $ref: "#/components/schemas/destination-pubsub"
        - title: destination-qdrant
          $ref: "#/components/schemas/destination-qdrant"
        - title: destination-redis
          $ref: "#/components/schemas/destination-redis"
        - title: destination-redshift
          $ref: "#/components/schemas/destination-redshift"
        - title: destination-s3
          $ref: "#/components/schemas/destination-s3"
        - title: destination-s3-glue
          $ref: "#/components/schemas/destination-s3-glue"
        - title: destination-sftp-json
          $ref: "#/components/schemas/destination-sftp-json"
        - title: destination-snowflake
          $ref: "#/components/schemas/destination-snowflake"
        - title: destination-snowflake-cortex
          $ref: "#/components/schemas/destination-snowflake-cortex"
        - title: destination-teradata
          $ref: "#/components/schemas/destination-teradata"
        - title: destination-timeplus
          $ref: "#/components/schemas/destination-timeplus"
        - title: destination-typesense
          $ref: "#/components/schemas/destination-typesense"
        - title: destination-vectara
          $ref: "#/components/schemas/destination-vectara"
        - title: destination-weaviate
          $ref: "#/components/schemas/destination-weaviate"
        - title: destination-yellowbrick
          $ref: "#/components/schemas/destination-yellowbrick"

Proposed fix

To resolve, we should add a discriminator to the DestinationConfiguration declaration in the OpenAPI spec:

    DestinationConfiguration:
      # ...
      discriminator:
        propertyName: destinationType
      oneOf:
        - title: destination-google-sheets
          $ref: "#/components/schemas/destination-google-sheets"
      # ...

and similarly for sources:

    SourceConfiguration:
      # ...
      discriminator:
        propertyName: sourceType
      oneOf:
        - title: ...
          $ref: ...
      # ...
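
With hundreds of members, the edit could be scripted rather than made by hand. The sketch below is one possible approach, assuming the spec has already been loaded into a plain dict (for example with a YAML parser) and that every referenced schema declares a `const` for the discriminating property, as `source-aha` does above. The `add_discriminator` helper and the toy `spec` are hypothetical. It also writes an explicit `mapping`: in OpenAPI, a discriminator without `mapping` matches the property value against schema names, and here the values (`aha`) differ from the schema names (`source-aha`). Whether Speakeasy needs the explicit mapping is not confirmed in this issue.

```python
from typing import Any


def add_discriminator(spec: dict[str, Any], union_name: str, prop: str) -> None:
    """Add a discriminator with an explicit mapping to a oneOf schema, in place."""
    schemas = spec["components"]["schemas"]
    union = schemas[union_name]
    mapping: dict[str, str] = {}
    for member in union["oneOf"]:
        ref = member["$ref"]
        target = schemas[ref.rsplit("/", 1)[-1]]
        tag = target.get("properties", {}).get(prop, {}).get("const")
        if tag is None:
            raise ValueError(f"{ref} declares no const for {prop!r}")
        if tag in mapping:
            raise ValueError(f"duplicate {prop} value {tag!r}")
        mapping[tag] = ref
    union["discriminator"] = {"propertyName": prop, "mapping": mapping}


# Toy spec holding only the pieces shown in this issue.
spec: dict[str, Any] = {
    "components": {
        "schemas": {
            "SourceConfiguration": {
                "oneOf": [
                    {"title": "source-aha", "$ref": "#/components/schemas/source-aha"}
                ]
            },
            "source-aha": {
                "type": "object",
                "required": ["api_key", "url", "sourceType"],
                "properties": {
                    "sourceType": {"const": "aha", "enum": ["aha"], "type": "string"}
                },
            },
        }
    }
}

add_discriminator(spec, "SourceConfiguration", "sourceType")
print(spec["components"]["schemas"]["SourceConfiguration"]["discriminator"])
```

Failing loudly on a missing or duplicate `const` is deliberate: a silent gap in the mapping would reintroduce the ambiguity this issue is about.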
