-
Notifications
You must be signed in to change notification settings - Fork 16.1k
fix(datasets): prevent double time filter application in virtual datasets #35890
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…sets Fixes #34894 When using get_time_filter(remove_filter=True) in virtual datasets, time filters were incorrectly applied twice - once in the inner query (virtual dataset) and again in the outer query. Root cause: Timeseries charts use the __timestamp alias for time filters, but virtual dataset templates use actual column names (e.g., "dttm"). The removed_filters check didn't recognize these as the same column, causing the filter to be re-applied. Solution: Enhanced the filter skip logic to check both the __timestamp alias and the actual datetime column name when determining if a filter should be skipped. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Code Review Agent Run #103684Actionable Suggestions - 0Additional Suggestions - 7
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've completed my review and didn't find any issues.
Files scanned
| File Path | Reviewed |
|---|---|
| superset/models/helpers.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
Interaction Diagram by BitosequenceDiagram
participant User as User Request
participant API as REST API<br/>
participant Explore as ExploreMixin.get_sqla_query<br/>🔄 Updated | ●●○ Medium
participant TP as TemplateProcessor
participant Filter as Filter Logic<br/>🔄 Updated | ●●○ Medium
participant DB as Database
User->>API: Query with time filter
API->>Explore: get_sqla_query(query_obj)
Explore->>TP: Process Jinja template
TP-->>Explore: Returns removed_filters list
Explore->>Filter: Check filter_col_name & __timestamp
Note over Filter: If flt_col == __timestamp, check both<br/>alias and actual column name
Filter-->>Explore: should_skip_filter decision
Explore->>DB: Build WHERE clause (skip if in removed_filters)
DB-->>API: Return compiled SQL
API-->>User: Response
Critical path: User Request -> REST API -> ExploreMixin.get_sqla_query -> Filter Logic -> Database
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses issue #34894 where time filters are being applied twice in virtual datasets - once in the inner query (from the Jinja template with remove_filter=True) and again in the outer query. The fix ensures that when remove_filter=True is used with get_time_filter() in a Jinja template, the filter is recognized and skipped in the outer query, particularly when the __timestamp alias is used in timeseries queries.
Key Changes
- Enhanced filter skip logic to handle
__timestampalias properly by checking both the alias and the actual column name - Added comprehensive test coverage for the fix with three test scenarios
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| superset/models/helpers.py | Added logic to recognize __timestamp alias as equivalent to the actual datetime column when checking if a filter should be skipped |
| tests/unit_tests/models/test_time_filter_double_application.py | Added comprehensive test suite to verify time filter is only applied once when using remove_filter=True |
| print(f"\n\nGenerated SQL:\n{generated_sql}\n\n") | ||
|
|
Copilot
AI
Oct 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statements should be removed from test code or replaced with proper logging. Consider using logger.debug() or removing this entirely.
| print(f"\n\nGenerated SQL:\n{generated_sql}\n\n") |
| print(f"\n\nSimple case SQL:\n{generated_sql}\n\n") | ||
|
|
Copilot
AI
Oct 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statements should be removed from test code or replaced with proper logging. Consider using logger.debug() or removing this entirely.
| print(f"\n\nSimple case SQL:\n{generated_sql}\n\n") |
| print(f"\n\nTimestamp alias case SQL:\n{generated_sql}\n\n") | ||
|
|
Copilot
AI
Oct 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statements should be removed from test code or replaced with proper logging. Consider using logger.debug() or removing this entirely.
| print(f"\n\nTimestamp alias case SQL:\n{generated_sql}\n\n") |
SUMMARY
Fixes #34894
When using
get_time_filter(remove_filter=True)in virtual datasets, time filters were incorrectly applied twice:Root Cause: Timeseries charts use the special
__timestampalias for time filters, but virtual dataset Jinja templates use actual column names (e.g., "dttm"). Theremoved_filterscheck didn't recognize these as referring to the same column, causing the filter to be re-applied in the outer query.Solution: Enhanced the filter skip logic in
superset/models/helpers.pyto check both the__timestampalias and the actual datetime column name when determining if a filter should be skipped.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
BEFORE (Bug): Virtual dataset with
get_time_filter('dttm', remove_filter=True)would generate SQL like:AFTER (Fixed): The filter is only applied once in the inner query:
TESTING INSTRUCTIONS
Run the new unit tests to verify the fix:
Run existing time filter tests to ensure no regressions:
Manual test (optional):
get_time_filter('your_time_col', remove_filter=True, target_type='DATE')ADDITIONAL INFORMATION
Code Changes:
superset/models/helpers.py: Added logic to handle__timestampalias when checking removed_filters (10 lines)tests/unit_tests/models/test_time_filter_double_application.py: Added comprehensive unit tests (360 lines)Testing Coverage:
remove_filter=True__timestampalias handling in timeseries queriesAll pre-commit checks pass (mypy, ruff, pylint).