feat: check for spans in agent log group by avi-alpert · Pull Request #1404 · aws/agentcore-cli

avi-alpert · 2026-05-28T01:39:54Z

Description

Add runtime log group as an additional span source alongside aws/spans for all span query sites.

For every place we check the aws/spans log group, also check for spans in the runtime log group and union the results. These changes are backwards compatible, so commands still work with spans in either location.

Changes:

get-trace.ts — fetchSpans now queries both aws/spans and the runtime log group in parallel, concatenates results, swallows ResourceNotFoundException from either
span-collector.ts — fetchSessionSpans now queries both aws/spans and the runtime log group for OTEL spans in parallel, concatenates results; added executeQueryGraceful helper
run-eval.ts — discoverSessions now queries both aws/spans and the runtime log group in parallel, concatenates results
fetch-session-spans.ts (recommendation command) — now makes 3 parallel calls instead of 2: aws/spans for span records, runtime log group for span records, runtime log group for log records
ABTestDetailScreen.tsx — debug checks now query both aws/spans and the runtime log group for experiment spans, summing counts across both
post-deploy-ab-tests.ts — added runtime log group ARN wildcard to the AB test role's IAM policy so the online eval service can read spans there
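
The pattern shared by these call sites can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `QueryFn`, `queryGraceful`, and `fetchFromBoth` are hypothetical names, the query function is injected so the sketch runs without the AWS SDK, and detecting the missing log group through `error.name` is an assumption about how the error surfaces.

```typescript
type Row = Record<string, string>;

// Hypothetical stand-in for whatever runs a Logs Insights query against one log group.
type QueryFn = (logGroup: string) => Promise<Row[]>;

// Treat a missing log group as "no rows"; rethrow every other error.
async function queryGraceful(run: QueryFn, logGroup: string): Promise<Row[]> {
  try {
    return await run(logGroup);
  } catch (err) {
    // Assumption: the missing-log-group error is identified by its name.
    if (err instanceof Error && err.name === 'ResourceNotFoundException') {
      return [];
    }
    throw err;
  }
}

// Query both sources in parallel and concatenate the results.
async function fetchFromBoth(
  run: QueryFn,
  spansLogGroup: string,
  runtimeLogGroup: string
): Promise<Row[]> {
  const [spansRows, runtimeRows] = await Promise.all([
    queryGraceful(run, spansLogGroup),
    queryGraceful(run, runtimeLogGroup),
  ]);
  return [...spansRows, ...runtimeRows];
}
```

Because each query is wrapped individually, a missing log group on one side does not discard rows returned by the other.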

I tested by running the following commands on agents writing spans to aws/spans and the agent log group:

agentcore traces list and traces get
agentcore run eval
agentcore run recommendation

Related Issue

Closes #

Documentation PR

Type of Change

Testing

How have you tested the change?

I ran npm run test:unit and npm run test:integ
I ran npm run typecheck
I ran npm run lint
If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

github-actions · 2026-05-28T01:40:48Z

Package Tarball

aws-agentcore-0.15.0.tgz

How to install

gh release download pr-1404-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.15.0.tgz

agentcore-devx-automation · 2026-05-28T01:42:24Z

Claude Security Review: no high-confidence findings. (run)

agentcore-cli-automation

Nice feature — querying both log groups makes the tooling resilient as the span-emission story evolves. I have one main concern that shows up in three places: when a span exists in both aws/spans (via X-Ray Transaction Search) and the runtime log group (emitted directly by the agent), the new code concatenates the result sets without deduplication. Depending on the user's setup this can lead to duplicate session entries in the picker, double-counted spans sent to evaluators, and duplicate rows in trace output. Inline comments below.

agentcore-cli-automation · 2026-05-28T01:43:01Z

+    executeQueryGraceful(client, SPANS_LOG_GROUP, query, startTimeSec, endTimeSec),
+    executeQueryGraceful(client, runtimeLogGroupName, query, startTimeSec, endTimeSec),
+  ]);
+  const rows = [...spansRows, ...runtimeRows];


When the same sessionId exists in both aws/spans and the runtime log group, the merged rows will produce two SessionInfo entries for that session — one with the count from aws/spans and one from the runtime log group. The TUI session picker (RunEvalScreen, RunBatchEvalFlow, RecommendationScreen) will then show duplicates.

A few ways to handle this:

1. Merge by sessionId in JS — sum spanCount, take the earlier firstSeen.

2. Run a single Insights query that lists both log groups (StartQuery accepts logGroupNames: [...]), so the stats ... by sessionId aggregation happens server-side.

3. Dedupe by sessionId and just keep the row with the higher count.

Option 2 is probably the cleanest since it avoids the double-count of spans entirely.
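
For comparison, the first option (merging by sessionId in JS) could look roughly like the sketch below. The `SessionInfo` shape used here (`sessionId`, `spanCount`, and `firstSeen` as an epoch timestamp) is an assumption for illustration; the actual type in the CLI may carry different fields. `mergeSessions` is a hypothetical helper name.

```typescript
// Assumed shape; the real SessionInfo in the CLI may differ.
interface SessionInfo {
  sessionId: string;
  spanCount: number;
  firstSeen: number; // epoch milliseconds (assumption)
}

// Collapse rows that share a sessionId: sum the counts, keep the earliest firstSeen.
function mergeSessions(rows: SessionInfo[]): SessionInfo[] {
  const bySession = new Map<string, SessionInfo>();
  for (const row of rows) {
    const existing = bySession.get(row.sessionId);
    if (existing) {
      existing.spanCount += row.spanCount;
      existing.firstSeen = Math.min(existing.firstSeen, row.firstSeen);
    } else {
      // Copy the row so the caller's input objects are not mutated.
      bySession.set(row.sessionId, { ...row });
    }
  }
  return Array.from(bySession.values());
}
```

Summing the counts is only correct if the two log groups never contain the same span; otherwise the merged spanCount would still double-count.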

Can we please check this? The TUI screen concern seems valid.

agentcore-cli-automation · 2026-05-28T01:43:01Z

+    executeQueryGraceful(client, SPANS_LOG_GROUP, spanQuery, startTimeSec, endTimeSec),
+    executeQueryGraceful(client, runtimeLogGroup, spanQuery, startTimeSec, endTimeSec),
+  ]);
+  const allSpanRows = [...sharedSpanRows, ...runtimeSpanRows];


Same duplication concern as discoverSessions: if a span lands in both aws/spans and the runtime log group (transaction search + agent-emitted), allSpanRows will contain it twice, the same parsed doc will be pushed to sessionMap.get(sessionId) twice, and the duplicates will be sent to the evaluators. For TRACE/TOOL_CALL evaluators in particular this could meaningfully skew results.

Options:

Issue a single StartQuery with both log groups in logGroupNames (server-side dedup is still your responsibility, but at least there's only one result set to walk).

Dedupe allSpanRows by spanId (or traceId+spanId) before building sessionMap.

Prefer one source over the other (e.g., if any rows came back from aws/spans, ignore the runtime log group rows for spans, and only use the runtime log group for the runtime-log lookup further down).
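
The last option (prefer one source) might be sketched as below. `SpanRow`, `pickSpanSource`, and `groupBySession` are hypothetical names, and the row shape is an assumption; only `sharedSpanRows`, `runtimeSpanRows`, and `sessionMap` echo identifiers from the diff.

```typescript
// Assumed minimal row shape for illustration.
interface SpanRow {
  sessionId?: string;
  spanId?: string;
}

// Use aws/spans rows when any exist; otherwise fall back to the runtime log group.
function pickSpanSource(sharedSpanRows: SpanRow[], runtimeSpanRows: SpanRow[]): SpanRow[] {
  return sharedSpanRows.length > 0 ? sharedSpanRows : runtimeSpanRows;
}

// Group the chosen rows by session, skipping rows without a sessionId.
function groupBySession(rows: SpanRow[]): Map<string, SpanRow[]> {
  const sessionMap = new Map<string, SpanRow[]>();
  for (const row of rows) {
    if (!row.sessionId) continue;
    const bucket = sessionMap.get(row.sessionId) ?? [];
    bucket.push(row);
    sessionMap.set(row.sessionId, bucket);
  }
  return sessionMap;
}
```

This trades completeness for simplicity: if some sessions write spans only to the runtime log group while others land in aws/spans, the runtime-only sessions are dropped whenever aws/spans returns anything.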

agentcore-cli-automation · 2026-05-28T01:43:01Z

+    } else {
+      return { success: false, error: result.error };
+    }
+  }


allRows may contain the same span twice when it shows up in both log groups, which would result in duplicate CloudWatchSpanRecord entries in the trace returned to the web UI / CLI consumers (and would break parent/child tree rendering since the same spanId would appear twice).

Suggest deduping by spanId after concatenation, e.g.:

const seen = new Set<string>();
const spans: CloudWatchSpanRecord[] = allRows
  .filter(row => row.traceId && row.spanId && !seen.has(row.spanId!) && (seen.add(row.spanId!), true))
  .map(row => ({ ... }));

or pass both log groups to a single StartQuery call.

agentcore-cli-automation · 2026-05-28T01:43:01Z

  ]);

-  onProgress?.(`Found ${spanRecords.length} span records, ${logRecords.length} log record candidates`);
+  const allSpanRecords = [...sharedSpanRecords, ...runtimeSpanRecords];


allSpanRecords is the union of records from both log groups. If a span exists in both (which is the whole reason we're querying both), it gets parsed and pushed to spans twice, which means the OTEL mapper on the recommendation Lambda will see duplicate spans for the same traceId/spanId. That can produce inflated trajectory counts and skew tool-description recommendations.

After parsing, please dedupe by something stable (e.g. traceId + spanId, or JSON.stringify(parsed) for log records that don't have spanIds) before pushing into spans.
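
One way the suggested stable key could be built is sketched below. `ParsedRecord`, `recordKey`, and `dedupeRecords` are hypothetical names, and the record shape is an assumption rather than the type used in fetch-session-spans.ts.

```typescript
// Assumed shape of a parsed record; log records may lack traceId/spanId.
interface ParsedRecord {
  traceId?: string;
  spanId?: string;
  [field: string]: unknown;
}

// Prefer traceId + spanId; fall back to the serialized record when ids are missing.
function recordKey(record: ParsedRecord): string {
  if (record.traceId && record.spanId) {
    return `${record.traceId}:${record.spanId}`;
  }
  return JSON.stringify(record);
}

// Keep the first occurrence of each key, preserving input order.
function dedupeRecords(records: ParsedRecord[]): ParsedRecord[] {
  const seen = new Set<string>();
  const unique: ParsedRecord[] = [];
  for (const record of records) {
    const key = recordKey(record);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(record);
  }
  return unique;
}
```

Note that the `JSON.stringify` fallback is sensitive to property order, so two records with the same fields in a different order would not be treated as duplicates.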

A span won't ever be duplicated. However, in the current behavior, log events in the agent log group and spans in aws/spans can share the same sessionId.

avi-alpert · 2026-05-28T01:45:59Z

Re the automated comments: the same span will never live in both log groups.

Hweinstock

makes sense to me! Worth getting someone who worked on evals/logs in CLI to look at it.

Hweinstock · 2026-05-28T02:35:51Z


  try {
    const createResult = await iamClient.send(
      new CreateRoleCommand({


totally out of scope, but any idea why we create a role imperatively here? Is this not supported in the CDK?

Hweinstock · 2026-05-28T02:40:56Z

        ],
        Resource: [
          `${arnPrefix(region)}:logs:*:${accountId}:log-group:/aws/bedrock-agentcore/evaluations/*`,
+          `${arnPrefix(region)}:logs:*:${accountId}:log-group:/aws/bedrock-agentcore/runtimes/*`,


since this is managed imperatively, is there a risk that this permission isn't added to existing deployed ab-tests? (not familiar with the logic here)

Wondering if this blocks customers who have a deployed ab-test and want to use the new log group.
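
If it turns out that existing roles need checking, one possible way to detect a role that predates this change is to compare its policy statements against the required resource list. The sketch below is an illustration only: `PolicyStatement` and `missingResources` are hypothetical, the policy shape is simplified, and the comparison is an exact string match that neither evaluates IAM wildcards nor inspects actions.

```typescript
// Simplified policy statement shape; real IAM policy documents allow more forms.
interface PolicyStatement {
  Effect: 'Allow' | 'Deny';
  Action: string[];
  Resource: string[];
}

// Return the required ARNs that no Allow statement lists verbatim.
function missingResources(statements: PolicyStatement[], required: string[]): string[] {
  const granted = new Set<string>();
  for (const statement of statements) {
    if (statement.Effect !== 'Allow') continue;
    for (const resource of statement.Resource) {
      granted.add(resource);
    }
  }
  return required.filter((arn) => !granted.has(arn));
}
```

A non-empty result would indicate that the role's policy has to be rewritten before the online eval service can read spans from the runtime log group.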

Hweinstock · 2026-05-28T02:42:36Z

+/**
+ * Execute a CloudWatch Logs Insights query, returning [] if the log group does not exist.
+ */
+export async function executeQueryGraceful(


nit: can we wrap the params in an object? I find that with 3+ parameters in TypeScript, calls become hard to read without named arguments.
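
A sketch of what the object-parameter form might look like is given below. `ExecuteQueryParams`, `PositionalQuery`, and `withNamedParams` are hypothetical; the field names follow the positional arguments visible in the diff, and the adapter only illustrates the call-site difference rather than the real implementation.

```typescript
type QueryRow = Record<string, string>;

// Hypothetical options object replacing the five positional parameters.
interface ExecuteQueryParams<C> {
  client: C;
  logGroupName: string;
  query: string;
  startTimeSec: number;
  endTimeSec: number;
}

// The positional signature, as it appears at the call sites in the diff.
type PositionalQuery<C> = (
  client: C,
  logGroupName: string,
  query: string,
  startTimeSec: number,
  endTimeSec: number
) => Promise<QueryRow[]>;

// Adapt a positional function so callers pass named fields instead.
function withNamedParams<C>(positional: PositionalQuery<C>) {
  return (params: ExecuteQueryParams<C>): Promise<QueryRow[]> =>
    positional(
      params.client,
      params.logGroupName,
      params.query,
      params.startTimeSec,
      params.endTimeSec
    );
}
```

With named fields, swapping `startTimeSec` and `endTimeSec` by accident becomes visible at the call site, which is harder to spot with two adjacent positional numbers.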

feat: check for spans in agent log group

97e92ba

avi-alpert requested a review from a team May 28, 2026 01:39

github-actions Bot added the size/m PR size: M label May 28, 2026

avi-alpert temporarily deployed to e2e-testing May 28, 2026 01:40 — with GitHub Actions Inactive

github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label May 28, 2026

agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 28, 2026

agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 28, 2026

agentcore-cli-automation suggested changes May 28, 2026

View reviewed changes

github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 28, 2026

Hweinstock approved these changes May 28, 2026

View reviewed changes

avi-alpert merged commit 2e94fdd into aws:main May 28, 2026
36 checks passed
