-
Notifications
You must be signed in to change notification settings - Fork 2k
TypeError: object of type 'NoneType' has no len() fixed #972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AI Code Review by LlamaPReview
🎯 TL;DR & Recommendation
Recommendation: Approve with suggestions
This PR fixes a critical TypeError that crashes the application when empty training data is provided, but introduces minor code style and type safety inconsistencies that could mask data quality issues.
🌟 Strengths
- Directly resolves a runtime crash reported in issue #971, improving application stability.
| Priority | File | Category | Impact Summary | Anchors |
|---|---|---|---|---|
| P2 | src/vanna/base/base.py | Bug | Silently skips None documentation, masking data issues | - |
| P2 | src/vanna/base/base.py | Maintainability | Method lacks input validation, remains vulnerable | - |
| P2 | src/vanna/base/base.py | Architecture | Type hint mismatch with actual data flow | - |
🔍 Notable Themes
- Data Validation Gaps: Multiple findings highlight insufficient input validation and type consistency in the documentation processing flow, suggesting a need for broader defensive programming improvements.
💡 Have feedback? We'd love to hear it in our GitHub Discussions.
✨ This review was generated by LlamaPReview Advanced, which is free for all open-source projects. Learn more.
| if len(documentation_list) > 0: | ||
| initial_prompt += "\n===Additional Context \n\n" | ||
|
|
||
| for documentation in documentation_list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: High
The fix addresses the immediate TypeError by adding a None check before calling str_to_approx_token_count(). However, the solution introduces a potential logical inconsistency - when documentation is None, it's silently skipped without any indication or logging. This could mask data quality issues in the training pipeline. Additionally, the code formatting is inconsistent with the rest of the condition structure.
| for documentation in documentation_list: | |
| if documentation is not None and ( | |
| self.str_to_approx_token_count(initial_prompt) | |
| + self.str_to_approx_token_count(documentation) | |
| < max_tokens | |
| ): |
Evidence: method:add_documentation_to_prompt, method:str_to_approx_token_count
| max_tokens: int = 14000, | ||
| ) -> str: | ||
| if len(documentation_list) > 0: | ||
| initial_prompt += "\n===Additional Context \n\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Contextual Comment]
This comment refers to code near real line 515. Anchored to nearest_changed(541) line 541.
P2 | Confidence: High
Speculative: The root cause analysis suggests the str_to_approx_token_count method lacks input validation. While the immediate fix handles the None case at the call site, the underlying method remains vulnerable to similar issues if called elsewhere with invalid inputs. Adding defensive programming at the method level would make the code more robust.
Code Suggestion:
def str_to_approx_token_count(self, string: str) -> int:
if string is None:
return 0
return len(string) / 4Evidence: method:str_to_approx_token_count
[Contextual Comment]
This comment refers to code near real line 534. Anchored to nearest_changed(541) line 541.
P2 | Confidence: Medium
Speculative: The method signature indicates it accepts list[str] but the actual data flow includes None values, suggesting a type contract violation. This indicates either the type hint is incorrect or there's a data integrity issue upstream. The method should either validate input types or update the type hint to reflect the actual contract (list[Optional[str]]).
Code Suggestion:
def add_documentation_to_prompt(
self,
initial_prompt: str,
documentation_list: list[Optional[str]], # Updated type hint
max_tokens: int = 14000,
) -> str:Evidence: method:add_documentation_to_prompt
fix issue #971