Skip to content

feat(rules): add W023 scalar-udf-in-where#34

Merged
Pawansingh3889 merged 1 commit into
Pawansingh3889:mainfrom
mvanhorn:osc/30-w023-scalar-udf
Apr 29, 2026
Merged

feat(rules): add W023 scalar-udf-in-where#34
Pawansingh3889 merged 1 commit into
Pawansingh3889:mainfrom
mvanhorn:osc/30-w023-scalar-udf

Conversation

@mvanhorn

Copy link
Copy Markdown
Contributor

Summary

Closes #30. Part of the v0.7 Performance Rules Pack.

Adds W023 scalar-udf-in-where, mirroring W019 / CountDistinctUnbounded in shape and test layout (the issue itself points at W019 as the canonical reference).

What it catches

-- Fires
SELECT order_id FROM orders WHERE dbo.fn_IsHighValue(total) = 1;

-- Does NOT fire (built-in, no schema prefix)
SELECT order_id FROM orders WHERE LEN(name) > 5;

The rule extracts WHERE / HAVING / ON predicate bodies and looks for any <schema>.<name>( call inside them. The schema-prefix requirement is the key heuristic: built-ins do not carry a dbo. / <schema>. prefix in T-SQL, so LEN, UPPER, SUBSTRING, CONVERT, YEAR, etc. naturally don't match. SELECT-list calls don't match either because the clause-body extraction excludes them.

Edge cases covered (per the issue + tests)

  • Built-in functions in WHERE: LEN, UPPER, SUBSTRING -- no fire (no schema prefix).
  • Schema-qualified UDF in SELECT list only -- no fire (extracted from WHERE/HAVING/ON only).
  • EXISTS (... WHERE dbo.fn_X(col) = 1) correlated subquery -- fires on inner WHERE.
  • HAVING dbo.fn_X(col) > 0 -- fires.
  • JOIN b ON dbo.fn_X(a.id) = b.id -- fires.
  • WHERE x.y = 1 (table.column reference) -- no fire (the regex requires a trailing ().

Files

  • sql_guard/rules/warnings.py: ScalarUdfInWhere class.
  • sql_guard/rules/__init__.py: import + ALL_RULES registration.
  • tests/test_new_rules.py: 10 new tests covering positive + negative cases.
  • tests/fixtures/warnings.sql: W023 fixture line.
  • tests/test_rules.py: bumped test_all_rules_loaded (34 -> 35) and test_24_warnings (24 -> 25).
  • README.md: rule table row + contributors entry update.
  • CHANGELOG.md: ### Added line under [Unreleased].

Verification

  • pytest tests/test_new_rules.py -k w023 -- 10 passed.
  • pytest tests/test_rules.py -k "all_rules_loaded or 24_warnings" -- 2 passed.
  • The existing test_deep_nesting_detected failure in tests/test_structural.py is pre-existing on main (verified by stashing my changes) and unrelated to this PR.

Out of scope

  • Postgres VOLATILE function detection (the issue notes this as v0.8).
  • T-prefix re-coding to T006 if the maintainer prefers T-SQL-only rules to use a T prefix; happy to rename in a follow-up.
  • Detecting UDFs without a schema prefix (bare MyFunc(col)). T-SQL convention is to prefix them; tracking the trade-off in the issue's edge-case 1 + 2 discussion.
T-SQL scalar UDFs in WHERE/HAVING/ON clauses force row-by-row
evaluation and prevent index seeks. Add W023 that flags
<schema>.<name>(...) calls inside predicate clauses. Built-ins
(LEN, UPPER, SUBSTRING, etc.) lack a schema prefix and are not
affected.

Mirrors W019 / CountDistinctUnbounded in shape and test layout.
@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Pawansingh3889 Pawansingh3889 merged commit 50968ad into Pawansingh3889:main Apr 29, 2026
7 checks passed
@Pawansingh3889

Copy link
Copy Markdown
Owner

Schema-prefix heuristic is the right call. Cleanly separates user UDFs from built-ins, no allow-list to maintain. Test coverage hits the cases I'd worry about: HAVING, JOIN ON, EXISTS subquery inner predicate, SELECT-list-only false positive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants