-
Notifications
You must be signed in to change notification settings - Fork 4.6k
xds: give up pool lock before closing xdsclient channel #8445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…eadlocks with nested xds
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #8445 +/- ##
==========================================
- Coverage 82.48% 82.27% -0.21%
==========================================
Files 414 414
Lines 40418 40422 +4
==========================================
- Hits 33338 33257 -81
- Misses 5732 5800 +68
- Partials 1348 1365 +17
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for finding and fixing it back. Perhaps we should file an issue for writing the test?
| delete(p.clients, name) | ||
|
|
||
| client.Close() | ||
| for _, s := range client.bootstrapConfig.XDSServers() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Even these 2 cleanups of authorities and xds servers were being done after releasing the lock before because they were within client's close method. May be we should move them down as well after closing the generic client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about actual races here, since the bootstrapConfig appears to be a pointer, and could be modified later.
In my opinion, we should have added a test when we first fixed the issue. Now that it has occurred again, I think we should add the test as soon as possible. I can try to write it before we attempt to import the fix back into google3. |
* xdsclient: preserve original bytes for decoding when the resource is wrapped (#8411) * xds: Avoid error logs when setting fallback bootstrap config (#8419) * xdsclient: relay marshalled bytes of complete resource proto to decoders (#8422) * xds: give up pool lock before closing xdsclient channel (#8445) * transport: release mutex before returning on expired deadlines in server streams (#8451) --------- Co-authored-by: Easwar Swaminathan <easwars@google.com> Co-authored-by: Arjan Singh Bal <46515553+arjan-bal@users.noreply.github.com> Co-authored-by: Purnesh Dixit <purneshdixit@google.com> Co-authored-by: Doug Fawley <dfawley@google.com>
This prevents deadlocks with nested xds channels.
I believe this must be cherry-picked into the 1.74 release branch, too.
RELEASE NOTES: none