Conversation

@weiznich
Member

This commit introduces a new diesel_infer_query crate to parse and analyze SQL queries. For now the main use case is to analyze the queries used to define views, in order to infer whether or not a field returned by a SQL view might contain null values.

Practically speaking, we use the sqlparser crate to parse the SQL and then traverse the resulting AST to:

  • Resolve all query sources
  • Collect all fields in the SELECT clause

We then pass the SELECT clause to a resolver that has actual access to the database (for resolving referenced tables, views, etc.) and resolve information about the fields in the SELECT clause one by one.
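To illustrate the idea of resolving SELECT fields one by one against a schema source, here is a minimal sketch. All names in it (`SelectField`, `SchemaResolver`, `InMemoryResolver`, `resolve_fields`) are hypothetical and are not taken from diesel_infer_query; the in-memory map is an assumption that stands in for a resolver with real database access.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one entry of a parsed SELECT clause.
#[derive(Debug, Clone, PartialEq)]
pub enum SelectField {
    /// A non-NULL literal such as `1` or `'abc'`.
    Literal,
    /// A column reference such as `users.name`.
    Column { table: String, column: String },
}

/// Hypothetical resolver interface that answers schema questions.
pub trait SchemaResolver {
    /// `Some(true)` if the column is declared nullable, `Some(false)` if it
    /// is declared NOT NULL, `None` if the resolver does not know the column.
    fn column_is_nullable(&self, table: &str, column: &str) -> Option<bool>;
}

/// In-memory resolver, used here instead of a real database connection.
pub struct InMemoryResolver {
    columns: HashMap<(String, String), bool>,
}

impl InMemoryResolver {
    /// Builds the resolver from `(table, column, nullable)` triples.
    pub fn new(columns: &[(&str, &str, bool)]) -> Self {
        let columns = columns
            .iter()
            .map(|(table, column, nullable)| {
                ((table.to_string(), column.to_string()), *nullable)
            })
            .collect();
        Self { columns }
    }
}

impl SchemaResolver for InMemoryResolver {
    fn column_is_nullable(&self, table: &str, column: &str) -> Option<bool> {
        self.columns
            .get(&(table.to_owned(), column.to_owned()))
            .copied()
    }
}

/// Resolves every field of the SELECT clause in order.
/// `None` means that nothing could be inferred for that field.
pub fn resolve_fields(
    fields: &[SelectField],
    resolver: &dyn SchemaResolver,
) -> Vec<Option<bool>> {
    fields
        .iter()
        .map(|field| match field {
            // A non-NULL literal can never produce NULL.
            SelectField::Literal => Some(false),
            // Column references are looked up in the schema.
            SelectField::Column { table, column } => {
                resolver.column_is_nullable(table, column)
            }
        })
        .collect()
}
```

Keeping the resolver behind a trait is one way such a design could later swap the database-backed lookup for a different schema source without touching the traversal code.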

Overall I tried to restrict the current implementation to known cases as much as possible for now. We can always extend this later step by step.

This new crate is licensed as MPL-2.0 as I don't want others to just copy the source without contributing back. I feel that MPL offers a good balance between being restrictive and enforcing that, compared to MIT/Apache 2.0 on the one side and AGPL on the other side.

This functionality is then used in Diesel CLI to correct the nullability of view fields. This is currently gated behind an experimental flag that needs to be enabled explicitly. If inference fails due to unsupported SQL constructs, etc., we just assume that the field is nullable. I chose to make fields Nullable<T> by default if we don't know more, as that will certainly work in all cases.
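The fallback rule described above can be sketched as follows. This is only an illustration under assumptions: `Nullability`, `InferenceError`, `nullability_or_default` and `render_sql_type` are hypothetical names, not the actual Diesel CLI code.

```rust
/// Hypothetical result of nullability inference for one view field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Nullability {
    NotNull,
    Nullable,
}

/// Hypothetical error for SQL constructs the inference does not support.
#[derive(Debug, PartialEq)]
pub struct InferenceError(pub String);

/// Falls back to `Nullable` whenever inference failed, because a nullable
/// field can hold every value a non-nullable field can.
pub fn nullability_or_default(
    inferred: Result<Nullability, InferenceError>,
) -> Nullability {
    inferred.unwrap_or(Nullability::Nullable)
}

/// Renders the SQL type as it could appear in a generated schema,
/// wrapping it in `Nullable<...>` when required.
pub fn render_sql_type(sql_type: &str, nullability: Nullability) -> String {
    match nullability {
        Nullability::NotNull => sql_type.to_owned(),
        Nullability::Nullable => format!("Nullable<{sql_type}>"),
    }
}
```

The design choice here is that a wrong `Nullable<T>` only costs an unnecessary `Option` on the Rust side, while a wrong non-nullable type could fail when a NULL is actually returned.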

The current implementation is able to handle basic queries that select literals or fields from tables. We also have some support for joins, to correctly infer whether or not a field comes from a left joined table. Otherwise this is relatively incomplete for now.
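The reason joins matter is that a column declared NOT NULL can still yield NULL when its table sits on the optional side of a LEFT JOIN. A minimal sketch of that rule, with hypothetical names (`JoinKind`, `optional_sources`, `field_is_nullable`) and covering only INNER and LEFT joins as a simplifying assumption:

```rust
/// Hypothetical subset of join kinds; RIGHT and FULL joins are left out.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinKind {
    Inner,
    Left,
}

/// For the first table in FROM and every joined table, reports whether the
/// rows of that source may be missing in the result (`true` = optional).
pub fn optional_sources<'a>(
    first: &'a str,
    joins: &[(JoinKind, &'a str)],
) -> Vec<(&'a str, bool)> {
    // The first table in FROM always contributes a row.
    let mut sources = vec![(first, false)];
    // A table on the right side of a LEFT JOIN may contribute no row at all.
    sources.extend(
        joins
            .iter()
            .map(|(kind, table)| (*table, *kind == JoinKind::Left)),
    );
    sources
}

/// A selected column is nullable if it is declared nullable or if its
/// source table is optional because of a LEFT JOIN.
pub fn field_is_nullable(declared_nullable: bool, source_is_optional: bool) -> bool {
    declared_nullable || source_is_optional
}
```

For `FROM users INNER JOIN profiles ... LEFT JOIN posts ...`, this marks only `posts` as optional, so a NOT NULL column such as `posts.title` would still be treated as nullable in the view.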

For both the new crate and diesel-cli, this commit adds a few tests that demonstrate what is already possible and what is not.

I'm nevertheless opening this PR and want to merge it, as I feel that this is kind of the minimal valuable version of this feature.

Future improvements include extending the supported SQL to more expression types and more complex queries. Having real-world examples would be really helpful here. See #4805

In the long run we might also use this (or similar) functionality for an improved raw SQL entry point in diesel itself. For that we would likely use a schema resolver based on the schema.rs the user provides instead.

Part of #43

@weiznich weiznich force-pushed the feature/diesel_infer_query branch 3 times, most recently from 5f94e5c to 8022d65 Compare October 31, 2025 17:34
@weiznich weiznich force-pushed the feature/diesel_infer_query branch from 8022d65 to b6830ff Compare October 31, 2025 17:35