Broadcast Timeout on Ontario GeoJson boundary #224

Open
@bdgeise

Using the attached GeoJSON boundary for Ontario, a job run against roughly 1 billion points failed with a broadcast timeout. The points dataset is stored as Parquet.

Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree: Exchange SinglePartition

This also appeared as a broadcast timeout on the Spark side, and increasing spark.sql.autoBroadcastJoinThreshold and spark.sql.broadcastTimeout did not help.
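For anyone trying to reproduce this, here is a minimal sketch of how the two settings might be passed to spark-submit. The helper name `spark_submit_conf_flags` and both values are hypothetical; the report does not state which values were actually tried. The threshold is given in bytes and the timeout in seconds.

```python
def spark_submit_conf_flags(settings):
    """Render a dict of Spark settings as spark-submit --conf arguments."""
    flags = []
    for key, value in sorted(settings.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


# Hypothetical values for illustration only, not the ones from the report.
settings = {
    "spark.sql.autoBroadcastJoinThreshold": 512 * 1024 * 1024,  # bytes
    "spark.sql.broadcastTimeout": 1200,  # seconds
}

flags = spark_submit_conf_flags(settings)
print(" ".join(flags))
```

Setting spark.sql.autoBroadcastJoinThreshold to -1 disables automatic broadcast joins altogether. The report does not say whether that was tried.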

We did notice that the conversion process used to create the GeoJSON structure produced coordinates with very high precision (>15 digits).
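To test whether the precision matters, the coordinates could be rounded before loading the boundary. The following is a rough sketch using only the standard library. The function names and the choice of 6 decimal places (about 0.1 m at the equator) are assumptions, and it is not confirmed that precision is the cause of the timeout.

```python
import json


def round_coords(node, ndigits):
    """Recursively round every float in a nested coordinates array."""
    if isinstance(node, float):
        return round(node, ndigits)
    if isinstance(node, list):
        return [round_coords(child, ndigits) for child in node]
    return node


def reduce_precision(geojson, ndigits=6):
    """Return a copy of a parsed GeoJSON object with rounded coordinates."""
    if isinstance(geojson, dict):
        return {
            key: round_coords(value, ndigits)
            if key == "coordinates" and isinstance(value, list)
            else reduce_precision(value, ndigits)
            for key, value in geojson.items()
        }
    if isinstance(geojson, list):
        return [reduce_precision(item, ndigits) for item in geojson]
    return geojson


# Tiny made-up polygon, not taken from the attached file.
sample = {
    "type": "Feature",
    "properties": {"name": "example"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-79.123456789012345, 43.987654321098765],
            [-80.0, 44.0],
            [-79.5, 45.25],
            [-79.123456789012345, 43.987654321098765],
        ]],
    },
}

rounded = reduce_precision(sample)
print(json.dumps(rounded["geometry"]["coordinates"][0][0]))
```

Rounding can collapse neighbouring vertices into duplicates, so the result may need a validity check before use.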

ontario_ca.geojson.zip
