Skip to content

Conversation

@dfawley
Copy link
Member

@dfawley dfawley commented Jul 10, 2025

This prevents deadlocks with nested xds channels.

I believe this must be cherry-picked into the 1.74 release branch, too.

RELEASE NOTES: none

@dfawley dfawley added this to the 1.74 Release milestone Jul 10, 2025
@dfawley dfawley requested review from arjan-bal and purnesh42H July 10, 2025 22:34
@codecov
Copy link

codecov bot commented Jul 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.27%. Comparing base (3d0cb79) to head (de39873).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8445      +/-   ##
==========================================
- Coverage   82.48%   82.27%   -0.21%     
==========================================
  Files         414      414              
  Lines       40418    40422       +4     
==========================================
- Hits        33338    33257      -81     
- Misses       5732     5800      +68     
- Partials     1348     1365      +17     
Files with missing lines Coverage Δ
xds/internal/xdsclient/pool.go 75.55% <100.00%> (+0.74%) ⬆️

... and 21 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Copy link
Contributor

@purnesh42H purnesh42H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding and fixing it back. Perhaps we should file an issue for writing the test?

delete(p.clients, name)

client.Close()
for _, s := range client.bootstrapConfig.XDSServers() {
Copy link
Contributor

@purnesh42H purnesh42H Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Even these 2 cleanups of authorities and xds servers were being done after releasing the lock before because they were within client's close method. May be we should move them down as well after closing the generic client.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about actual races here, since the bootstrapConfig appears to be a pointer, and could be modified later.

@arjan-bal
Copy link
Contributor

Thanks for finding and fixing it back. Perhaps we should file an issue for writing the test?

In my opinion, we should have added a test when we first fixed the issue. Now that it has occurred again, I think we should add the test as soon as possible. I can try to write it before we attempt to import the fix back into google3.

@dfawley dfawley merged commit eb4a783 into grpc:master Jul 11, 2025
17 checks passed
@dfawley dfawley deleted the dlock branch July 11, 2025 20:28
Pranjali-2501 pushed a commit to Pranjali-2501/grpc-go that referenced this pull request Jul 17, 2025
arjan-bal added a commit that referenced this pull request Jul 17, 2025
* xdsclient: preserve original bytes for decoding when the resource is wrapped (#8411)

* xds: Avoid error logs when setting fallback bootstrap config (#8419)

* xdsclient: relay marshalled bytes of complete resource proto to decoders (#8422)

* xds: give up pool lock before closing xdsclient channel (#8445)

* transport: release mutex before returning on expired deadlines in server streams (#8451)

---------

Co-authored-by: Easwar Swaminathan <easwars@google.com>
Co-authored-by: Arjan Singh Bal <46515553+arjan-bal@users.noreply.github.com>
Co-authored-by: Purnesh Dixit <purneshdixit@google.com>
Co-authored-by: Doug Fawley <dfawley@google.com>
dimpavloff pushed a commit to dimpavloff/grpc-go that referenced this pull request Aug 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants