Collaborator

fivetran-kwoodbeck commented Dec 29, 2025

Added transpilation of Snowflake's UNIFORM function to DuckDB.

Changes

sqlglot/dialects/duckdb.py

  • Added UNIFORM_INT_TEMPLATE and UNIFORM_FLOAT_TEMPLATE class attributes using pre-parsed SQL templates with placeholders
  • Added uniform_sql() method to transpile Snowflake's UNIFORM(min, max, gen) to DuckDB equivalent

tests/dialects/test_snowflake.py

  • Added DuckDB targets to existing UNIFORM test cases

Transpilation Logic

Snowflake's UNIFORM(min, max, gen) generates random values in the range [min, max]:

| Input | DuckDB Output |
| --- | --- |
| UNIFORM(1, 10, RANDOM()) | CAST(FLOOR(1 + RANDOM() * (10 - 1 + 1)) AS BIGINT) |
| UNIFORM(1, 10, RANDOM(5)) | CAST(FLOOR(1 + RANDOM() * (10 - 1 + 1)) AS BIGINT) |
| UNIFORM(1, 10, 5) (seed) | CAST(FLOOR(1 + (ABS(HASH(5)) % 1000000) / 1000000.0 * (10 - 1 + 1)) AS BIGINT) |
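
The range arithmetic in the integer rows can be checked outside SQL. Below is a minimal Python sketch, assuming `random.random()` stands in for DuckDB's RANDOM() (both return a float in [0, 1)). The helper names `uniform_int` and `uniform_float` are hypothetical and are not part of the PR.

```python
import math
import random


def uniform_int(lo: int, hi: int, r: float) -> int:
    # Mirrors CAST(FLOOR(lo + r * (hi - lo + 1)) AS BIGINT) for r in [0, 1).
    return math.floor(lo + r * (hi - lo + 1))


def uniform_float(lo: float, hi: float, r: float) -> float:
    # Mirrors lo + r * (hi - lo); float bounds get no FLOOR and no CAST.
    return lo + r * (hi - lo)


# Assumption: a seeded Python generator replaces RANDOM() for repeatability.
rng = random.Random(0)
samples = [uniform_int(1, 10, rng.random()) for _ in range(10_000)]
print(min(samples), max(samples))
```

The `+ 1` inside the integer formula is what makes the upper bound reachable: without it, FLOOR could never produce `hi`, because `r` is always strictly below 1.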

Implementation Details

  • Uses template-based approach with exp.maybe_parse() and exp.replace_placeholders()
  • Integer bounds → integer result (wrapped in FLOOR and CAST ... AS BIGINT)
  • Float bounds → float result (no casting)
  • Numeric seed values use deterministic hash-based random: (ABS(HASH(seed)) % 1000000) / 1000000.0
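
The rules above can be sketched as a plain string rewrite. This is only an illustration of the mapping described in this PR, not its implementation: the PR works on SQLGlot expression trees, while the hypothetical helpers `random_source` and `uniform_to_duckdb` below just format strings.

```python
import re


def random_source(gen: str) -> str:
    # RANDOM() and RANDOM(n) both map to DuckDB's RANDOM(); per the table
    # in the description, the argument of RANDOM(n) is dropped.
    if re.fullmatch(r"RANDOM\(\s*\d*\s*\)", gen, flags=re.IGNORECASE):
        return "RANDOM()"
    # Anything else is treated here as a numeric seed (an assumption of
    # this sketch) and turned into a deterministic fraction in [0, 1).
    return f"(ABS(HASH({gen})) % 1000000) / 1000000.0"


def uniform_to_duckdb(lo, hi, gen: str) -> str:
    rnd = random_source(gen)
    if isinstance(lo, int) and isinstance(hi, int):
        # Integer bounds: integer result via FLOOR and CAST ... AS BIGINT.
        return f"CAST(FLOOR({lo} + {rnd} * ({hi} - {lo} + 1)) AS BIGINT)"
    # Float bounds: float result, no casting.
    return f"{lo} + {rnd} * ({hi} - {lo})"


print(uniform_to_duckdb(1, 10, "RANDOM()"))
print(uniform_to_duckdb(1, 10, "5"))
print(uniform_to_duckdb(1.5, 2.5, "RANDOM()"))
```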
Contributor

github-actions bot commented Dec 29, 2025

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:feature/transpile-uniform, sqlglot version: feature/transpile-uniform)
  • baseline (main, sqlglot version: 28.5.1.dev45)

⚠️ Limited to dialects: snowflake, duckdb

By Dialect

| dialect | main | sqlglot:feature/transpile-uniform | difference | links |
| --- | --- | --- | --- | --- |
| duckdb -> duckdb | 4003/4003 passed (100.0%) | 4003/4003 passed (100.0%) | No change | full result / delta |
| snowflake -> duckdb | 626/1085 passed (57.7%) | 626/1085 passed (57.7%) | No change | full result / delta |
| snowflake -> snowflake | 981/1085 passed (90.4%) | 981/1085 passed (90.4%) | No change | full result / delta |

Overall

main: 6173 total, 5610 passed (pass rate: 90.9%), sqlglot version: 28.5.1.dev45

sqlglot:feature/transpile-uniform: 6173 total, 5610 passed (pass rate: 90.9%), sqlglot version: feature/transpile-uniform

Difference: No change

Collaborator

@VaggelisD VaggelisD left a comment

Nice work once again @fivetran-kwoodbeck! It may be an extreme concern, but I was wondering whether DuckDB's RANDOM() is good enough to simulate a uniform distribution. It's probably fine.

Leaving a few comments to consider:

Comment on lines 1610 to 1614
UNIFORM_INT_TEMPLATE: exp.Expression = exp.maybe_parse(
    "CAST(FLOOR(:min + :random * (:max - :min + 1)) AS BIGINT)"
)

UNIFORM_FLOAT_TEMPLATE: exp.Expression = exp.maybe_parse(":min + :random * (:max - :min)")
Collaborator

We don't really need to parse these as templates; it should be doable to build these expressions directly.

Note that SQLGlot expressions overload Python's arithmetic magic methods, so math operations between expressions are automatically built into subtrees:

>>> foo
Var(this=foo)
>>> math_expr = foo + foo * (foo - 1)
>>> math_expr
Add(
  this=Var(this=foo),
  expression=Paren(
    this=Mul(
      this=Var(this=foo),
      expression=Paren(
        this=Sub(
          this=Var(this=foo),
          expression=Literal(this=1, is_string=False))))))
>>> math_expr.sql()
'foo + (foo * (foo - 1))'
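
The mechanism behind this can be illustrated with a toy class. This is not SQLGlot's code: `Node` and its `_binop` helper are hypothetical, and the class only tracks SQL strings, whereas SQLGlot builds a full expression tree. It is a sketch of how overloaded operators can wrap nested binary operations in parentheses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    sql: str
    binary: bool = False  # True when this node is itself a binary operation

    def _binop(self, op: str, other) -> "Node":
        # Plain Python values (e.g. 1) are converted to literal nodes.
        other = other if isinstance(other, Node) else Node(str(other))
        # Parenthesize operands that are binary operations so the grouping
        # chosen by Python's operator precedence survives in the SQL text.
        left = f"({self.sql})" if self.binary else self.sql
        right = f"({other.sql})" if other.binary else other.sql
        return Node(f"{left} {op} {right}", binary=True)

    def __add__(self, other):
        return self._binop("+", other)

    def __sub__(self, other):
        return self._binop("-", other)

    def __mul__(self, other):
        return self._binop("*", other)


foo = Node("foo")
expr = foo + foo * (foo - 1)
print(expr.sql)  # foo + (foo * (foo - 1))
```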
Collaborator Author

Yes, I originally wrote it as expressions but thought that might be too complex. I've reverted to that approach; let me know if that works.

@fivetran-kwoodbeck
Collaborator Author

I tested 100 samples across a variety of parameters; the results are in the Jira. RANDOM seems to work OK.
