[IPC Protocol][dotnet-gcdump] Add non-lossy dotnet-gcdump mode #5886
mdh1418 wants to merge 5 commits into
CollectTracing6 (0x0207) extends CollectTracing5 with a trailing sessionBufferMode field on the streaming-session payload: 0 = Drop (lossy circular buffer), 1 = Block (non-lossy; producers block until the reader drains). The user_events payload is unchanged. Available in .NET 11.0 and later.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR adds a new non-lossy (Block) buffering mode for IPC streaming EventPipe sessions (CollectTracing6) and wires it into dotnet-gcdump collect via a new --non-lossy flag to improve reliability when collecting heap snapshots under high event volume.
Changes:
- Extend `Microsoft.Diagnostics.NETCore.Client` to support CollectTracing5/6 payload serialization (event filters + buffering mode) and command selection.
- Add a `--non-lossy` option to `dotnet-gcdump collect` and plumb it through to `StartEventPipeSession` buffering configuration.
- Update IPC protocol documentation to describe CollectTracing6 and its `sessionBufferMode` field.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/Tools/dotnet-gcdump/DotNetHeapDump/EventPipeDotNetHeapDumper.cs | Adds a nonLossy path and uses EventPipeSessionConfiguration with EventPipeBufferingMode.Block. |
| src/Tools/dotnet-gcdump/CommandLine/ReportCommandHandler.cs | Updates call into TryCollectMemoryGraph for the new signature. |
| src/Tools/dotnet-gcdump/CommandLine/CollectCommandHandler.cs | Adds --non-lossy, propagates it to collection, and prints a helpful error on unsupported runtimes. |
| src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcCommands.cs | Adds CollectTracing5/6 command IDs. |
| src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsClient/EventPipeSessionConfiguration.cs | Introduces buffering mode + V5/V6 serialization (including event filters). |
| src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsClient/EventPipeSession.cs | Selects CollectTracing5/6 based on filters/buffering mode and adds filter detection. |
| src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsClient/EventPipeProvider.cs | Adds per-provider Event ID filtering support via EventPipeProviderEventFilter. |
| documentation/design-docs/ipc-protocol.md | Documents CollectTracing6 and the new sessionBufferMode field. |
EventPipeProviderEventFilter exposes the per-provider Event ID allow/deny list that CollectTracing5 introduced (available on .NET 10+). It is opted into via a new EventPipeProvider constructor overload, leaving the original constructor signature intact for binary compatibility. A session whose providers carry a filter is started with CollectTracing5 via SerializeV5, which adds the session-type wire prefix. The filter encoding is: enable = true with a list of IDs is an allow-list; enable = false with a list of IDs is a deny-list; and enable = false with an empty list (the default for a null filter) allows all events.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
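The three filter encodings described above can be illustrated with a minimal sketch. `EventFilterSketch` and its members are hypothetical stand-ins written for this example, not the `EventPipeProviderEventFilter` type from the PR. The sketch also assumes that enable = true with an empty list allows nothing, which the commit message does not state.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the per-provider event filter; not the PR's real type.
public sealed record EventFilterSketch(bool Enable, IReadOnlyCollection<uint> EventIds)
{
    // A null filter is encoded as enable = false with no IDs: a deny-list that denies nothing.
    public static readonly EventFilterSketch AllowAll = new(false, Array.Empty<uint>());

    public bool Allows(uint eventId)
    {
        bool listed = EventIds.Contains(eventId);
        // enable = true  -> allow-list: only listed IDs pass
        //                   (assumption: an empty allow-list passes nothing).
        // enable = false -> deny-list: listed IDs are blocked; an empty list blocks nothing.
        return Enable ? listed : !listed;
    }
}
```

Under these assumptions, `new EventFilterSketch(true, new uint[] { 1, 2 })` passes only events 1 and 2, while `EventFilterSketch.AllowAll` passes every event ID.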
EventPipeBufferingMode {Default, Block} selects the session buffering mode, exposed through a new EventPipeSessionConfiguration constructor overload (the original constructor signatures are left intact for binary compatibility). It is serialized by SerializeV6 (the CollectTracing5 streaming payload plus a trailing sessionBufferMode). A session with a non-default buffering mode is started with CollectTracing6 (.NET 11+); Block requests non-lossy collection in which the runtime blocks producers rather than dropping events when the buffer fills.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
collect gains --non-lossy, which starts the GCHeapSnapshot session in Block buffering mode so the runtime blocks producers instead of dropping events when the buffer fills. This produces a complete gcdump on large heaps, at the cost of slower collection, and requires a target runtime that supports CollectTracing6. When the target is too old, collect catches UnsupportedCommandException and reports that non-lossy requires .NET 11+, suggesting a plain (lossy) collection instead.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Covers which CollectTracing command (2 through 6) is chosen for a given EventPipeSessionConfiguration, and the CollectTracing5/6 payload layout: the IpcStream session-type prefix, the per-provider event filter (allow-list, deny-list, and the null = allow-all encoding), and the trailing buffering mode that distinguishes V6 from V5. EventPipeSession.CreateStartMessage is made internal so the selection logic is testable via InternalsVisibleTo.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
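A rough sketch of the selection logic these tests cover: pick the oldest command that can express the requested configuration, so older runtimes keep working. All type and member names below are hypothetical. The V5 and V6 conditions follow the commit messages above. The V2 to V4 conditions (rundown keyword, stack walk) are assumptions based on how earlier versions are commonly described, not something this PR states.

```csharp
using System;

// Hypothetical mirror of the wire values: 0 = Drop, 1 = Block.
public enum BufferingModeSketch { Drop = 0, Block = 1 }

// Hypothetical, flattened view of a session configuration.
public sealed record SessionConfigSketch(
    BufferingModeSketch BufferingMode = BufferingModeSketch.Drop,
    bool HasEventFilters = false,
    bool UsesCustomRundownKeyword = false,
    bool RequestStackwalk = true);

public static class CommandSelectionSketch
{
    // Returns N for the CollectTracingN command that would be sent.
    public static int SelectVersion(SessionConfigSketch config)
    {
        // V6: a non-default buffering mode was requested.
        if (config.BufferingMode != BufferingModeSketch.Drop) return 6;
        // V5: at least one provider carries an event filter.
        if (config.HasEventFilters) return 5;
        // Assumption: V4 is needed for a custom rundown keyword.
        if (config.UsesCustomRundownKeyword) return 4;
        // Assumption: V3 is needed to disable stack walks.
        if (!config.RequestStackwalk) return 3;
        // Otherwise the oldest command in the covered range suffices.
        return 2;
    }
}
```

For example, `CommandSelectionSketch.SelectVersion(new SessionConfigSketch(BufferingMode: BufferingModeSketch.Block, HasEventFilters: true))` returns 6, because the V6 payload also carries the filters.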
fd1f00f to 40be140
noahfalk left a comment
LGTM modulo a couple comments inline.
    if (config.RundownKeyword != DefaultRundownKeyword && config.RundownKeyword != 0)
    if (config.BufferingMode != EventPipeBufferingMode.Default)
    {
        // V6 adds an opt-in session buffering mode (its payload also carries any event filters)
Suggested change: replace `// V6 adds an opt-in session buffering mode (its payload also carries any event filters)` with `// V6 adds an opt-in session buffering mode`.
Nit: seemed arbitrary to highlight event filters here among all the different data in the payload
    /// <summary>
    /// The runtime default: a circular buffer that drops events when it overflows (lossy).
    /// </summary>
    Default = 0,
Suggested change: replace `Default = 0,` with `Drop = 0,`.
    };

    private static readonly Option<bool> NonLossyOption =
        new("--non-lossy")
I'd suggest we don't offer this as a command-line option. I assume all users will want non-lossy behavior whenever it is supported. So the tool can try CollectTracing6 and if that reports unsupported then fall back.
If we make it default, should we still have a way to move back to lossy behavior if some workload hits issues/deadlocks with the new non-lossy behavior?
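One way to combine the two comments above is to try Block mode first, fall back when the runtime rejects the newer command, and keep an opt-out for workloads that misbehave. The sketch below assumes a caller-supplied `startSession` callback and uses a hypothetical exception type in place of the client library's `UnsupportedCommandException`; none of these names come from the PR.

```csharp
using System;

// Hypothetical stand-in for the client library's UnsupportedCommandException.
public sealed class UnsupportedCommandSketchException : Exception
{
    public UnsupportedCommandSketchException(string message) : base(message) { }
}

public static class NonLossyFallbackSketch
{
    // startSession(block) is a hypothetical callback: it starts a session in Block
    // mode when block is true, and throws UnsupportedCommandSketchException when the
    // target runtime does not understand the newer command.
    public static (T Session, bool NonLossy) Start<T>(Func<bool, T> startSession, bool forceLossy = false)
    {
        if (!forceLossy)
        {
            try
            {
                return (startSession(true), true);
            }
            catch (UnsupportedCommandSketchException)
            {
                // Older runtime: fall through to the pre-existing lossy behavior.
            }
        }
        return (startSession(false), false);
    }
}
```

The `forceLossy` parameter is the escape hatch raised in the reply: it skips the Block attempt entirely, so a workload that deadlocks under Block mode can still be collected.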
    /// <param name="dsrouter">The dsrouter command to use for collecting the gcdump.</param>
    /// <returns></returns>
    private static async Task<int> Collect(CancellationToken ct, int processId, string output, int timeout, bool verbose, string name, string diagnosticPort, string dsrouter)
    private static async Task<int> Collect(CancellationToken ct, int processId, string output, int timeout, bool verbose, string name, string diagnosticPort, string dsrouter, bool nonLossy)
Should we use the enum instead of a bool and name this parameter bufferingMode? If so, the same naming should be applied throughout this PR.
Diagnostics counterpart to dotnet/runtime#129457 for fixing #2404
This PR proposes a new CollectTracing command, `CollectTracing6`, that exposes a new `sessionBufferMode` configuration option to specify how an IPC Streaming EventPipe Session using buffers should handle events when buffers are full.
0 - Drop (default, pre-existing behavior)
1 - Block (non-lossy, parks producer threads until buffer space is available)
Additionally, this PR introduces serialization for CollectTracing5 (non-UserEvents) and CollectTracing6, and adds a new `--non-lossy` option to `dotnet-gcdump collect` to leverage the new non-lossy mode.
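The difference between the two modes can be illustrated by analogy with a bounded `System.Threading.Channels` channel. This is only an analogy under stated assumptions, not how EventPipe implements its buffers: `BufferModeSketch` is a hypothetical name, `DropWrite` stands in for Drop, and `Wait` stands in for Block.

```csharp
using System.Threading.Channels;
using System.Threading.Tasks;

public static class BufferModeSketch
{
    // Writes `total` events into a buffer of `capacity` slots and returns how
    // many of them a late-starting reader actually receives.
    public static async Task<int> CountDeliveredAsync(bool block, int capacity, int total)
    {
        var options = new BoundedChannelOptions(capacity)
        {
            // Wait: the producer is parked until the reader frees a slot (Block-like).
            // DropWrite: a write to a full buffer is silently discarded (Drop-like).
            FullMode = block ? BoundedChannelFullMode.Wait : BoundedChannelFullMode.DropWrite,
        };
        Channel<int> channel = Channel.CreateBounded<int>(options);

        // The producer runs until it finishes (Drop-like) or parks on a full buffer (Block-like).
        Task producer = ProduceAsync(channel.Writer, total);

        int delivered = 0;
        await foreach (int item in channel.Reader.ReadAllAsync())
        {
            delivered++;
        }
        await producer;
        return delivered;
    }

    private static async Task ProduceAsync(ChannelWriter<int> writer, int total)
    {
        for (int i = 0; i < total; i++)
        {
            await writer.WriteAsync(i);
        }
        writer.Complete();
    }
}
```

With a capacity of 4 and 100 events, the Drop-like run delivers only the 4 events that fit before the reader starts, while the Block-like run delivers all 100 at the cost of stalling the producer. That is the same trade-off the PR describes for `--non-lossy`: a complete gcdump, but slower collection.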