Tony2h commented Oct 31, 2025

Description

Please briefly describe the code changes in this pull request.

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?
@gemini-code-assist

Summary of Changes

Hello @Tony2h, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical issue in the state window aggregation mechanism, specifically addressing errors that occurred when processing data blocks containing entirely null values or when state windows extended across multiple data blocks. The changes involve significant refactoring of the aggregation logic, introducing more robust handling for partial data blocks and nulls, and ensuring that state windows are correctly processed, initialized, and reset. This enhances the reliability and accuracy of state window operations, especially in complex data streaming scenarios.

Highlights

  • Refactored State Window Aggregation: The core logic for aggregating data within closed state windows has been extracted into a new helper function, processClosedStateWindow, improving modularity and readability of the code.
  • Enhanced State Window Management: A new resetWindowRowsSup function was introduced to ensure proper initialization and resetting of state window tracking structures, which is crucial for correct window boundary detection and aggregation.
  • Improved Data Block Handling: The system now robustly handles state windows that span across multiple data blocks. This includes logic for merging partial blocks and carrying over unprocessed rows, ensuring continuity in state window calculations.
  • Addressed All-Null Datablock Error: Specific logic has been implemented to correctly manage state windows when the state column contains all-null data blocks, preventing previous errors and ensuring accurate aggregation in such scenarios.
  • New Test Cases for Null Handling: New test data (data3-3.csv) and a Python script to generate it (dataMoreThan4096.py) have been added to thoroughly validate the fix, particularly focusing on scenarios with numerous null values in state window columns.
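
To make the null-handling highlight concrete, here is a minimal Python sketch of state window detection over a single column of state values. It is not the PR's C implementation. The function name find_state_windows is hypothetical, and the rule that a null state neither opens nor closes a window is an assumption made for illustration, not something the PR confirms.

```python
from typing import List, Optional, Tuple

Window = Tuple[int, int, int]  # (first_row, last_row, state)


def find_state_windows(states: List[Optional[int]]) -> List[Window]:
    """Group consecutive rows that share a state value.

    Assumption (not confirmed by the PR): a null state neither opens nor
    closes a window, so null rows are skipped when looking for boundaries.
    """
    windows: List[Window] = []
    start = last = None  # row span of the currently open window
    current = None       # state value of the currently open window
    for row, state in enumerate(states):
        if state is None:
            continue  # nulls never change the open window
        if start is not None and state != current:
            windows.append((start, last, current))  # state changed: close it
            start = None
        if start is None:
            start, current = row, state  # open a new window at this row
        last = row
    if start is not None:
        windows.append((start, last, current))  # close the trailing window
    return windows
```

Under that assumption, an all-null input produces no windows at all, and nulls that sit between two rows with the same state do not split the window.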

Tony2h commented Oct 31, 2025

/gemini review

gemini-code-assist bot left a comment

Code Review

This pull request addresses an error in state window handling when a data block is entirely composed of null values. The fix involves a significant refactoring of the state window aggregation logic. The new code buffers the unprocessed part of a data block and merges it with the next block, so state windows that span multiple blocks are handled correctly, including when nulls are present. The logic is also clearer and easier to maintain now that helpers such as processClosedStateWindow have been extracted. My review includes one critical comment asking to remove a leftover debug print statement and two medium-severity suggestions on code and repository hygiene.
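
The buffer-and-merge idea described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the PR's C code: the names last_window_start and split_on_closed_windows are hypothetical, rows are modeled as (timestamp, state) tuples, and null states are assumed never to start a new window.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[str, Optional[int]]  # (timestamp, state); state may be null


def last_window_start(rows: List[Row]) -> int:
    """Index of the first row of the last (possibly still open) window.

    Returns 0 when the rows hold at most one state value, including the
    all-null case, because nothing before the end is known to be closed.
    """
    start = 0
    current = None
    for i, (_, state) in enumerate(rows):
        if state is None:
            continue  # assumption: nulls never start a new window
        if current is not None and state != current:
            start = i  # a new window begins here
        current = state
    return start


def split_on_closed_windows(blocks: Iterable[List[Row]]) -> Iterator[List[Row]]:
    """Yield chunks of rows in which every state window is complete."""
    carry: List[Row] = []
    for block in blocks:
        merged = carry + block  # merge the buffered tail with the new block
        cut = last_window_start(merged)
        if cut > 0:
            yield merged[:cut]  # these windows can no longer grow
        carry = merged[cut:]    # the last window may continue in the next block
    if carry:
        yield carry  # end of input closes the last window
```

With this approach an all-null block is simply appended to the buffered tail, so a window that was open before the null block can still be extended by the block that follows it.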

SExprSupp* pSup = &pOperator->exprSupp;

SColumnInfoData* pStateColInfoData = taosArrayGet(pBlock->pDataBlock, pInfo->stateCol.slotId);
printDataBlock(pBlock, "tooony", "tooony", 9898);

critical

This printDataBlock call appears to be a leftover debug statement. It should be removed before merging to avoid polluting logs in production environments.

T_LONG_JMP(pTaskInfo->env, code);
}



medium

This is an extra blank line that can be removed to improve code conciseness.

@@ -0,0 +1,1808 @@
'2025-10-10 14:16:32',null,-1

medium

Adding large, generated data files like this one to the repository can significantly increase its size over time. It's generally better practice to generate such files on-the-fly during the test setup phase (using the accompanying test/dataMoreThan4096.py script) and then clean them up afterward. This keeps the repository lean and avoids bloating the git history.
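
One possible shape for the on-the-fly approach is sketched below in Python. The helper name generated_csv, the column layout (quoted timestamp, state, value), and the null pattern are all hypothetical; they only mimic the row format visible in the diff and are not taken from dataMoreThan4096.py.

```python
import os
import tempfile
from contextlib import contextmanager
from datetime import datetime, timedelta


@contextmanager
def generated_csv(n_rows, null_run=5):
    """Create a throwaway CSV for one test and delete it afterwards.

    Hypothetical layout: '<timestamp>',<state>,<value>. Only every
    null_run-th row carries a state; the rows in between are null.
    """
    with tempfile.TemporaryDirectory(prefix="state_window_") as tmp_dir:
        path = os.path.join(tmp_dir, "generated.csv")
        start = datetime(2025, 10, 10, 14, 16, 32)
        with open(path, "w", newline="\n") as f:
            for i in range(n_rows):
                ts = (start + timedelta(seconds=i)).strftime("%Y-%m-%d %H:%M:%S")
                state = "null" if i % null_run else str(i // null_run % 2)
                f.write(f"'{ts}',{state},{i}\n")
        yield path  # the directory is removed when the with-block exits
```

A test could wrap its body in `with generated_csv(5000) as path:` and load the file from `path`; nothing is left on disk or in the git history once the block exits.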

@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!
