
runtime: SIGPROF during stack barrier install can panic #11863

@aclements


The following sequence of events can lead to an out-of-bounds access attempt in the runtime:

  1. A SIGPROF comes in on a thread while the G on that thread is in _Gsyscall. The sigprof handler calls gentraceback, which saves a local copy of the G's stkbar slice. Currently the G has no stack barriers, so this slice is empty.
  2. On another thread, the GC concurrently scans the stack of the goroutine being profiled (it considers it stopped because it's in _Gsyscall) and installs stack barriers.
  3. Back on the sigprof thread, gentraceback comes across a stack barrier in the stack, tries to look it up in its (zero-length) copy of the G's old stkbar slice, and attempts an out-of-bounds access.
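
The three steps above can be replayed sequentially in a toy model. This is only a sketch, not runtime code: the `stkbar` and `g` types, their fields, and `lookupBarrier` are simplified, hypothetical stand-ins, and the real race happens across two threads rather than in one function.

```go
package main

import "fmt"

// stkbar is a simplified, hypothetical stand-in for a stack barrier record.
type stkbar struct {
	savedLRPtr uintptr // stack slot whose return PC was overwritten
	savedLRVal uintptr // original return PC that was in that slot
}

// g is a toy goroutine descriptor holding only the field relevant here.
type g struct {
	stkbar []stkbar
}

// lookupBarrier indexes the snapshot without a bounds check of its own.
// The recover exists only so this demo can report the failure; in the
// signal handler context described above there is no such safety net.
func lookupBarrier(snapshot []stkbar, pos int) (lr uintptr, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("lookup failed: %v", r)
		}
	}()
	return snapshot[pos].savedLRVal, nil
}

// raceDemo replays the interleaving from the list, one step after another.
func raceDemo() error {
	gp := &g{}

	// Step 1: the profiler saves a local copy of the (empty) slice.
	snapshot := gp.stkbar

	// Step 2: the GC installs a barrier; only gp.stkbar sees the new length.
	gp.stkbar = append(gp.stkbar, stkbar{savedLRPtr: 0x1000, savedLRVal: 0x2000})

	// Step 3: the profiler hits the barrier and consults its stale copy.
	_, err := lookupBarrier(snapshot, 0)
	return err
}

func main() {
	fmt.Println(raceDemo())
}
```

The point of the sketch is that copying a slice copies its length, so the profiler's copy stays at length zero no matter what the GC installs afterwards.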

Because of the particularly prickly context, this double faults and turns into a "panic: fatal error: malloc deadlock".

I can reproduce this ~1 in 10 runs by applying https://go-review.googlesource.com/12674 and running

cd $GOROOT/src/runtime/pprof
go test -c
stress ./pprof.test -test.v -test.short

This should have nothing to do with CL 12674, but applying the CL makes it easy to reproduce (presumably because of some effect on timings).

I'm not sure what the solution to this is. We already have a few cases where we just give up when we're walking the stack for a profile. We could do that here, too, if gentraceback encounters a stack barrier it wasn't expecting. Alternatively, we could make sigprof pickier about Gs in _Gsyscall, though I'm not sure how exactly.
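
To make the first option concrete, here is a rough sketch of a stack walk that gives up when it meets a barrier it has no record for. It is a toy model under stated assumptions, not a patch to gentraceback: the `frame` type, the `walk` function, and the `truncated` result are all hypothetical names invented for illustration.

```go
package main

import "fmt"

// frame is a hypothetical, simplified stack frame.
type frame struct {
	pc        uintptr
	isBarrier bool // true if the return PC was replaced by a stack barrier
}

// walk collects return PCs. savedPCs plays the role of the walker's local
// copy of the barrier records. If the walk meets a barrier with no matching
// record, it stops and returns what it has, with truncated set to true.
func walk(frames []frame, savedPCs []uintptr) (pcs []uintptr, truncated bool) {
	next := 0 // index of the next barrier record we expect to consume
	for _, f := range frames {
		pc := f.pc
		if f.isBarrier {
			if next >= len(savedPCs) {
				// Unexpected barrier: give up instead of indexing out of bounds.
				return pcs, true
			}
			pc = savedPCs[next]
			next++
		}
		pcs = append(pcs, pc)
	}
	return pcs, false
}

func main() {
	frames := []frame{{pc: 0x10}, {isBarrier: true}, {pc: 0x30}}

	// Stale (empty) copy of the records: the walk is cut short.
	pcs, truncated := walk(frames, nil)
	fmt.Println(len(pcs), truncated) // prints "1 true"

	// Up-to-date records: the walk completes.
	pcs, truncated = walk(frames, []uintptr{0x20})
	fmt.Println(len(pcs), truncated) // prints "3 false"
}
```

The cost of this approach is a truncated profile sample whenever the race is hit, which matches the existing cases where the profiler already gives up mid-walk.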

@rsc @RLH @ianlancetaylor @dvyukov
