Race condition between sample capture and jitdump record #127

@vvuk

I'm observing a scenario where samply has samples that are timestamped before the timestamp of the relevant jitdump record, causing the sample to not be resolved. This is on macOS, M3 Max, with CoreCLR. I frequently see samples with a timestamp that's 35µs, 75µs, etc. before the jitdump record.

My theory is that this comes from the sampling interval: each sample is timestamped with the start of its interval instead of the end. The sequence of events that I think is happening:

  1. samply captures get_monotonic_timestamp() for a sample in the macOS Sampler::run
    a. (side note -- on macOS, there is clock_gettime_nsec_np(CLOCK_UPTIME_RAW) which is defined to be the same as mach_absolute_time() after appropriate timebase-info conversion; should maybe use this instead of the low-level mach functions?)
  2. samply starts looping through tasks, calling task.sample
  3. In the interval between get_monotonic_timestamp() and actually sampling the task's threads, the JIT finished compiling something, emitted a jitdump record, and jumped to the JIT code that is now being sampled.
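
To make the ordering concrete, here is a minimal, deterministic sketch of that timeline. It is not samply's code: `JitRecord`, `resolve`, and `race_timeline` are hypothetical names, the timestamps are made up, and it assumes the resolver only considers jitdump records whose timestamp is at or before the sample's timestamp.

```rust
/// Hypothetical, simplified stand-in for a jitdump code-load record.
struct JitRecord {
    timestamp_ns: u64,
    start: u64,
    size: u64,
    name: &'static str,
}

/// Resolve `address` using only the records that already existed at
/// `sample_ts_ns` (assumed behavior of the resolver, for illustration).
fn resolve(records: &[JitRecord], address: u64, sample_ts_ns: u64) -> Option<&'static str> {
    records
        .iter()
        .filter(|r| r.timestamp_ns <= sample_ts_ns)
        .find(|r| address >= r.start && address < r.start + r.size)
        .map(|r| r.name)
}

/// Replays steps 1-3 with made-up timestamps and returns the lookup result
/// for (timestamp taken before the loop, timestamp taken after the capture).
fn race_timeline() -> (Option<&'static str>, Option<&'static str>) {
    // Step 1: the sampler reads the clock once, before looping over tasks.
    let t_sampler_timestamp: u64 = 1_000_000;
    // Step 3: 35µs later the JIT emits its record and jumps to the new code.
    let t_jit_record: u64 = t_sampler_timestamp + 35_000;
    // The thread is actually sampled a little after that.
    let t_actual_capture: u64 = t_jit_record + 10_000;

    let records = [JitRecord {
        timestamp_ns: t_jit_record,
        start: 0x1000,
        size: 0x100,
        name: "jitted_method",
    }];
    let sampled_pc: u64 = 0x1010; // program counter inside the fresh JIT code

    (
        resolve(&records, sampled_pc, t_sampler_timestamp),
        resolve(&records, sampled_pc, t_actual_capture),
    )
}
```

With the early timestamp the record is still "in the future" and the lookup fails; with the post-capture timestamp the same address resolves.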

If I replace the captured sample's now_mono in ThreadProfiler::sample_impl with a fresh call to get_monotonic_timestamp(), so that the sample is timestamped immediately after it was captured, all the JIT resolution issues go away.
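
A rough sketch of the shape of that change, not the actual patch: `Sample` and `sample_timestamped_after_capture` are hypothetical names, and `std::time::Instant` stands in for get_monotonic_timestamp(). The only point is that the clock is read after the capture returns, so anything that happened before the thread was stopped gets an earlier or equal timestamp.

```rust
use std::time::Instant;

/// Hypothetical sample type: a timestamp plus the captured stack addresses.
struct Sample {
    timestamp_ns: u64,
    stack: Vec<u64>,
}

/// Runs `capture` first and only then reads the monotonic clock, so the
/// sample is stamped with the end of its capture window, not the start.
fn sample_timestamped_after_capture(
    epoch: Instant,
    capture: impl FnOnce() -> Vec<u64>,
) -> Sample {
    let stack = capture();
    // Instant is monotonic, so this reading cannot be earlier than any
    // reading taken while (or before) `capture` was running.
    let timestamp_ns = epoch.elapsed().as_nanos() as u64;
    Sample { timestamp_ns, stack }
}
```

The trade-off is one clock read per sampled thread instead of one per sampling pass, and samples from the same pass no longer share an identical timestamp.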
