-
Notifications
You must be signed in to change notification settings - Fork 2k
Fix chart generation for categorical and mixed data (post-validation + tests) #954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…tests for validation Co-authored-by: Genie <genie@cosine.sh>
…plots Co-authored-by: Genie <genie@cosine.sh>
…f categorical and numeric data are present Co-authored-by: Genie <genie@cosine.sh>
…f categorical and numeric data are present Co-authored-by: Genie <genie@cosine.sh> Fix graphics generation for categorical data in plots
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auto Pull Request Review from LlamaPReview
🎯 TL;DR & Recommendation
Recommendation: Approve with suggestions
This PR effectively addresses misleading chart generation for categorical and mixed data with robust fallbacks and comprehensive tests, but includes minor suggestions for code quality and robustness improvements.
🌟 Strengths
- Fixes a critical user-reported issue with incorrect visuals, enhancing data representation.
- Adds deterministic tests that ensure reliability without external dependencies.
| Priority | File | Category | Impact Summary | Anchors |
|---|---|---|---|---|
| P2 | src/vanna/base/base.py | Bug | May incorrectly override valid charts due to index matching. | |
| P2 | src/vanna/base/base.py | Maintainability | DRY violation in type detection logic increases maintenance. | |
| P2 | src/vanna/base/base.py | Robustness | Exception swallowing masks potential bugs without logging. | |
| P2 | src/vanna/base/base.py | Bug | Assumes value_counts output format, risking breakage on changes. | |
| P2 | tests/test_graphics.py | Testing | Brittle test helper may fail with new chart types. |
💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.
| numeric_cols = df.select_dtypes(include=["number"]).columns.tolist() | ||
| categorical_cols = df.select_dtypes( | ||
| include=["object", "category"] | ||
| ).columns.tolist() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: High
The column type detection logic is duplicated in both the try and except blocks. This violates DRY principles and increases maintenance overhead. Future changes to type handling would require updates in multiple places.
Code Suggestion:
numeric_cols = df.select_dtypes(include=["number"]).columns.tolist()
categorical_cols = df.select_dtypes(include=["object", "category"]).columns.tolist()
# Use these precomputed lists throughout the function| looks_like_default_index = ( | ||
| len(x_values) == len(df) | ||
| and all(isinstance(v, (int, float)) for v in x_values) | ||
| and x_values == list(range(len(df))) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: Medium
The heuristic for detecting default index plots may produce false positives if the data contains integer values matching the index range (e.g., a column with values [0,1,2,3] in a 4-row DataFrame). This could incorrectly override valid LLM-generated charts. The condition relies on exact index matching, which might not account for legitimate integer-based categorical data.
| looks_like_default_index = ( | |
| len(x_values) == len(df) | |
| and all(isinstance(v, (int, float)) for v in x_values) | |
| and x_values == list(range(len(df))) | |
| ) | |
| looks_like_default_index = ( | |
| len(x_values) == len(df) | |
| and all(isinstance(v, (int, float)) for v in x_values) | |
| and x_values == df.index.tolist() # More explicit index match | |
| ) |
| except Exception: | ||
| # If inspection fails, leave the LLM chart as-is | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: Medium
Silently swallowing exceptions during figure inspection could mask underlying issues (e.g., malformed figures). While intended to preserve the original chart on errors, this might hide bugs that should be logged for debugging.
Code Suggestion:
except Exception as inspect_error:
# Log debug information but keep the original chart
logging.debug(f"Figure inspection failed: {inspect_error}")| def _extract_counts(fig): | ||
| trace = fig.data[0] | ||
| values = getattr(trace, "values", None) | ||
| if values is None: | ||
| values = getattr(trace, "y", None) | ||
| return sorted(list(values)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: Medium
The test helper function assumes the first trace contains either 'values' (pie) or 'y' (bar) data. This might not hold for future chart types or multi-trace figures. Tests could become brittle if the visualization approach changes.
Code Suggestion:
def _extract_counts(fig):
if hasattr(fig.data[0], 'values'): # Pie chart
return sorted(fig.data[0].values)
elif hasattr(fig.data[0], 'y'): # Bar chart
return sorted(fig.data[0].y)
else:
raise ValueError("Unsupported chart type in test")| vc = df[category].value_counts(dropna=False).reset_index() | ||
| vc.columns = [category, "count"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: Low
The column renaming assumes value_counts() returns exactly two columns ('index' and the count). This may break if pandas changes its output format or if the category column is named 'index'. While currently stable, this implicit coupling is fragile.
| vc = df[category].value_counts(dropna=False).reset_index() | |
| vc.columns = [category, "count"] | |
| vc = df[category].value_counts(dropna=False).reset_index(name='count') | |
| vc = vc.rename(columns={'index': category}) # Handle potential column name conflicts |
Summary
Addresses incorrect visuals reported in issue #952 where categorical-only or mixed data often produced misleading charts (e.g., line plots over the default 0..N-1 index).
Adds post-execution validation of LLM-generated figures:
Categorical-only: always replace with count-based charts (pie if ≤10 categories, otherwise bar).
Mixed data (categorical + numeric): if LLM emits a generic line over the index, replace with bar(x=categorical, y=first numeric).
Improves fallback when LLM code fails: categorical-only also uses count-based charts.
Implementation
src/vanna/base/base.py:
After executing LLM plotly_code, detect categorical-only or “line over index” patterns and override figure accordingly.
Fallback logic for categorical-only data aggregates counts to avoid equal-sized slices.
tests/test_graphics.py:
test_categorical_only_uses_counts_in_chart
test_categorical_only_overrides_llm_line_chart
test_mixed_data_overrides_llm_line_chart_to_bar
CI/Testing
Tests are deterministic and do not require API keys or network.
Run locally:
tox
or: pytest -q
Backwards compatibility
If LLM returns a sensible, explicit chart (e.g., bar with categorical x and numeric y), validation does not override it.
Closes
Closes #952