[benchmark]: collect resource metrics during benchmark runs#21759
Signed-off-by: AR21SM <mahajanashishar21sm@gmail.com>
cc @serathius @siyuanfoundation :)
I have verified it works as described. @kishen-v Can you verify the change is compatible with the expected data format of perf-dash?
    generatePerfReport bool

    metricsURL string

The metrics collector should use the same endpoint as the benchmark, with the /metrics path.
Instead of taking the URL as an argument, it would be better to pass in a list of metric names.
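One way to read this suggestion, as a minimal sketch: derive the scrape URL from the benchmark's own endpoint and let the caller choose which metrics to sample. The names `samplerConfig` and `metricsURLFor` are hypothetical and do not come from the PR; the sketch also assumes plain `http` when the endpoint carries no scheme.

```go
package main

import (
	"net/url"
	"strings"
)

// samplerConfig is a hypothetical configuration: the benchmark endpoint plus
// the metric names to sample, instead of a separate metrics URL.
type samplerConfig struct {
	Endpoint    string
	MetricNames []string
}

// metricsURLFor is a hypothetical helper that derives the scrape URL from a
// client endpoint by replacing its path with /metrics.
func metricsURLFor(endpoint string) (string, error) {
	// Endpoints are often given without a scheme, e.g. "127.0.0.1:2379".
	// Assumption: default to plain http in that case.
	if !strings.Contains(endpoint, "://") {
		endpoint = "http://" + endpoint
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	u.Path = "/metrics"
	u.RawPath = ""
	u.RawQuery = ""
	u.Fragment = ""
	return u.String(), nil
}

// scrapeURL resolves the URL the sampler would poll for this configuration.
func (c samplerConfig) scrapeURL() (string, error) {
	return metricsURLFor(c.Endpoint)
}
```

With this shape the `--metrics-url` flag could be replaced by a flag listing metric names, while the endpoint is reused from the existing benchmark configuration.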
    var summaries []MetricSummary
    var samples []MetricSample
    if sampler != nil {
        summaries, samples = sampler.stop()

If something fails before this line, sampler.stop() will never be called.
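A common way to address this in Go is to register the stop call with `defer` immediately after the sampler is created, so it runs on every return path. The sketch below uses a hypothetical `metricSampler` stand-in and a simplified `stop()` signature; it is not the PR's actual type.

```go
package main

import "errors"

// errRunFailed stands in for any failure during the benchmark body.
var errRunFailed = errors.New("benchmark run failed")

// metricSampler is a hypothetical stand-in for the PR's sampler type.
type metricSampler struct {
	stopped bool
}

// stop marks the sampler as stopped and returns whatever it collected.
func (s *metricSampler) stop() []string {
	s.stopped = true
	return []string{"process_resident_memory_bytes"}
}

// runWithSampler runs the benchmark body and guarantees that stop() is
// called on every return path, including early error returns and panics.
func runWithSampler(s *metricSampler, run func() error) (collected []string, err error) {
	if s != nil {
		// The deferred closure writes to the named result, so the collected
		// data is still returned to the caller when run() fails.
		defer func() { collected = s.stop() }()
	}
	err = run()
	return collected, err
}
```

Deferred calls do not run when the process exits through `os.Exit`, so any failure path that exits directly would still need to stop the sampler explicitly.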
    Samples []MetricSample `json:"samples"`
    }

    func writeMetricTimeSeriesReport(benchmarkOp string, samples []MetricSample) {

This could be generalized to cover writePerfDashReport as well; the metric in that case would be latency.
    client *http.Client

    mu sync.Mutex
    samples map[string][]float64

What is the difference between samples and series?
More documentation in general would be appreciated.
Hey @siyuanfoundation, sure! I'll take a look at the changes and confirm whether they are in line with what perf-dash expects. I will post the findings here in a day or two. Thanks!
Related issues: #21634, #16467
This PR adds Prometheus resource metric sampling to the benchmark report path.

When `--metrics-url` is provided, the benchmark samples the following etcd resource metrics during the run:

- `process_resident_memory_bytes`
- `go_memstats_heap_alloc_bytes`
- `go_memstats_heap_inuse_bytes`

The max value for each sampled metric is appended to the existing perfdash JSON report when `--report-perfdash` is enabled.

Raw sampled resource metric time-series are written to a separate JSON artifact: `EtcdResourceMetrics_benchmark_<operation>_<timestamp>.json`

Sample artifact snippets from a local run:

Perfdash report with the appended max values:

    {
      "version": "v1",
      "dataItems": [
        {
          "data": { "Perc50": 0.5683, "Perc90": 0.809, "Perc99": 1.7786 },
          "labels": { "Operation": "PUT" },
          "unit": "ms"
        },
        {
          "data": { "Max": 39854080 },
          "labels": { "Metric": "process_resident_memory_bytes", "Operation": "PUT" },
          "unit": "bytes"
        },
        {
          "data": { "Max": 5165704 },
          "labels": { "Metric": "go_memstats_heap_alloc_bytes", "Operation": "PUT" },
          "unit": "bytes"
        },
        {
          "data": { "Max": 7127040 },
          "labels": { "Metric": "go_memstats_heap_inuse_bytes", "Operation": "PUT" },
          "unit": "bytes"
        }
      ]
    }

Raw time-series artifact:

    {
      "version": "v1",
      "operation": "PUT",
      "samples": [
        {
          "timestamp": "2026-05-17T03:54:28.369478719Z",
          "values": {
            "go_memstats_heap_alloc_bytes": 4903944,
            "go_memstats_heap_inuse_bytes": 7086080,
            "process_resident_memory_bytes": 39854080
          }
        },
        {
          "timestamp": "2026-05-17T03:54:29.039713056Z",
          "values": {
            "go_memstats_heap_alloc_bytes": 5165704,
            "go_memstats_heap_inuse_bytes": 7127040,
            "process_resident_memory_bytes": 39854080
          }
        }
      ]
    }