-
Notifications
You must be signed in to change notification settings - Fork 4k
[python-package] Add TypeGuard return type to _is_pyarrow_table() for mypy type narrowing #7115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[python-package] Add TypeGuard return type to _is_pyarrow_table() for mypy type narrowing #7115
Conversation
jameslamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your interest in LightGBM and for taking this on!
For PRs like this, including the exact error text from mypy is helpful. It helps other contributors know specifically what you're trying to fix, and helps others searching GitHub for similar errors to find your solution.
I guess in this case it's all of these?
python-package/lightgbm/basic.py:2110: error: Item "str" of "str | Path | ndarray[tuple[Any, ...], dtype[Any]] | Any | Sequence | list[Sequence] | list[ndarray[tuple[Any, ...], dtype[Any]]]" has no attribute "column_names" [union-attr]
python-package/lightgbm/basic.py:2110: error: Item "Path" of "str | Path | ndarray[tuple[Any, ...], dtype[Any]] | Any | Sequence | list[Sequence] | list[ndarray[tuple[Any, ...], dtype[Any]]]" has no attribute "column_names" [union-attr]
python-package/lightgbm/basic.py:2110: error: Item "ndarray[tuple[Any, ...], dtype[Any]]" of "str | Path | ndarray[tuple[Any, ...], dtype[Any]] | Any | Sequence | list[Sequence] | list[ndarray[tuple[Any, ...], dtype[Any]]]" has no attribute "column_names" [union-attr]
python-package/lightgbm/basic.py:2110: error: Item "Sequence" of "str | Path | ndarray[tuple[Any, ...], dtype[Any]] | Any | Sequence | list[Sequence] | list[ndarray[tuple[Any, ...], dtype[Any]]]" has no attribute "column_names" [union-attr]
python-package/lightgbm/basic.py:2110: error: Item "list[Sequence]" of "str | Path | ndarray[tuple[Any, ...], dtype[Any]] | Any | Sequence | list[Sequence] | list[ndarray[tuple[Any, ...], dtype[Any]]]" has no attribute "column_names" [union-attr]
python-package/lightgbm/basic.py:2110: error: Item "list[ndarray[tuple[Any, ...], dtype[Any]]]" of "str | Path | ndarray[tuple[Any, ...], dtype[Any]] | Any | Sequence | list[Sequence] | list[ndarray[tuple[Any, ...], dtype[Any]]]" has no attribute "column_names" [union-attr]
If so, awesome find! I can see how mypy wouldn't be able to understand this function call, because it's only returning bool:
LightGBM/python-package/lightgbm/basic.py
Line 405 in 52dbf06
| def _is_pyarrow_table(data: Any) -> bool: |
But there are several calls to that function and you're only changing 1 here.
git grep '_is_pyarrow_table'Instead, could you please:
- revert this change
- update the return hint of
_is_pyarrow_tableto"TypeGuard[pa_Table]"
And see if that resolves it?
https://mypy.readthedocs.io/en/latest/type_narrowing.html describes TypeGuard if you haven't seen it before.
@wagner-austin I see you've opened more PRs with type-hinting fixes. Much appreciated, and it's great to have them split up in small contributions! Before we review any more of those, please do this (including the exact text from |
|
Oh and can you please also link all of them to the |
… mypy type narrowing Change _is_pyarrow_table() return type from bool to TypeGuard[pa_Table] to enable mypy type narrowing at all call sites. This follows the existing pattern used by _is_pyarrow_array() and allows mypy to recognize data.column_names as valid when extracting feature names from PyArrow tables. Fixes: microsoft#3867
4c20333 to
f83c38e
Compare
jameslamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the investigation and changes, looks great!
Confirmed in the CI logs that this change eliminates those errors and gets us down to just 12 mypy errors remaining 🎉
Found 12 errors in 4 files (checked 9 source files)
Don't worry if the R-package / rchck CI job fails... that's a known issue unrelated to any changes you make in the Python package (#7113).
Summary
Change
_is_pyarrow_table()return type frombooltoTypeGuard[pa_Table]to enable mypy type narrowing at all call sites. This follows the existing pattern used by_is_pyarrow_array().Mypy Error (before fix)
Related Issue
Fixes: #3867
Test Plan
pre-commit run --files python-package/lightgbm/basic.py- all checks passed