log(db): Add db failure cause log message #4749

Merged
TheodoreSpeaks merged 1 commit into staging from log/db-failure-cause on May 27, 2026

Conversation

@TheodoreSpeaks
Collaborator

Summary

  • Log the underlying Postgres/driver error cause (code, severity, detail, routine, errno, syscall) when workflow execution fails
  • Drizzle wraps the real driver error as error.cause, which the logger's Error serializer dropped — leaving only an opaque Failed query: ... message with no pg error code
  • Walk the cause chain (depth-bounded to 10) and surface the first error carrying a code; this distinguishes a connection drop (08006), rejected connection (53300), and statement timeout (57014)
  • Wrapped in try/catch so the diagnostic can never throw or hang inside the error handler — worst case is no cause field, never a masked failure
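
The walk described in this summary can be sketched roughly as follows. This is a minimal illustration and not the merged code: `MAX_CAUSE_DEPTH` is a hypothetical name, the local `filterUndefined` is a stand-in for the repo helper of the same name, and the depth-1 fallback follows the behavior described in the review comments on this PR.

```typescript
// Hypothetical sketch; the helper merged into execution-core.ts may differ.
type CauseFields = Record<string, unknown>

const MAX_CAUSE_DEPTH = 10 // depth bound stated in the PR summary

// Stand-in for the repo's filterUndefined: drops keys whose value is undefined.
function filterUndefined(fields: CauseFields): CauseFields {
  const out: CauseFields = {}
  for (const key of Object.keys(fields)) {
    if (fields[key] !== undefined) out[key] = fields[key]
  }
  return out
}

function describeErrorCause(error: unknown): CauseFields | undefined {
  try {
    let current: unknown = error
    let driver: Record<string, unknown> | undefined
    for (let depth = 0; depth < MAX_CAUSE_DEPTH; depth++) {
      if (!(current instanceof Error)) break
      const candidate = current as unknown as Record<string, unknown>
      if (candidate.code !== undefined) {
        driver = candidate // first error carrying a code wins
        break
      }
      if (depth === 1) driver = candidate // fallback: the direct cause
      current = candidate.cause
    }
    if (!driver) return undefined
    return filterUndefined({
      name: driver.name,
      message: driver.message,
      code: driver.code,
      severity: driver.severity,
      detail: driver.detail,
      routine: driver.routine,
      errno: driver.errno,
      syscall: driver.syscall,
    })
  } catch {
    // Diagnostics must never throw inside the error handler.
    return undefined
  }
}
```

Because the loop is bounded, a cyclic `cause` chain terminates after ten hops instead of hanging the error handler.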

Type of Change

  • Improvement (logging/observability)

Testing

Tested manually. bun run lint, bun run check:api-validation:strict, and tsc --noEmit all pass. Motivated by intermittent schedule-execution failures in which the workspace/personal env reads during preprocessing fail with an opaque "Failed query" message and no driver error code.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@cursor

cursor Bot commented May 26, 2026

PR Summary

Low Risk
Observability-only change in the execution error handler; no execution, auth, or data-path behavior changes.

Overview
When workflow execution fails in execution-core, logs now include a structured cause object alongside the top-level error.

A new describeErrorCause helper walks up to 10 levels of error.cause (typical Drizzle → Postgres/Node driver wrapping) and picks the first error with a code, then logs fields like code, severity, detail, routine, errno, and syscall via filterUndefined. That makes opaque “Failed query” failures easier to classify (e.g. connection drop vs pool rejection vs statement timeout). The helper is wrapped in try/catch so diagnostics never break the failure path.
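
Once the `code` field is in the log entry, classification can be a simple lookup. The sketch below is illustrative only: `classifyDbFailure` and the `FailureKind` labels are hypothetical names, not part of this PR. The three SQLSTATE values are the ones named in the PR description; note that 57014 is Postgres's general `query_canceled` code, so it also covers cancellations other than a statement timeout.

```typescript
// Hypothetical classifier; names and labels are illustrative, not from the PR.
type FailureKind =
  | "connection_drop"
  | "connection_rejected"
  | "statement_timeout_or_cancel"
  | "unknown"

function classifyDbFailure(cause: { code?: unknown } | undefined): FailureKind {
  switch (cause?.code) {
    case "08006": // connection_failure
      return "connection_drop"
    case "53300": // too_many_connections
      return "connection_rejected"
    case "57014": // query_canceled (includes statement_timeout)
      return "statement_timeout_or_cancel"
    default:
      return "unknown"
  }
}
```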

Reviewed by Cursor Bugbot for commit 81aa8d0.

@greptile-apps
Contributor

greptile-apps Bot commented May 26, 2026

Greptile Summary

This PR adds structured driver-error logging when workflow execution fails. Drizzle wraps the raw pg error as error.cause, which the existing logger serializer discarded — leaving only an opaque "Failed query" message with no Postgres error code.

  • Introduces describeErrorCause, which walks the error chain (starting at error itself, bounded to 10 hops), prefers the first node with a code property (distinguishing e.g. connection drop 08006, statement timeout 57014), and falls back to error.cause when no code is found anywhere in the chain.
  • The extracted fields (code, severity, detail, routine, errno, syscall) are passed as a guarded spread ...(errorCause ? [{ cause: errorCause }] : []) so the logger's production path picks them up via Object.assign(entry, { cause: errorCause }) — adding a nested, CloudWatch-queryable cause block alongside the existing error/stack fields.
  • Wrapped in try/catch so the diagnostic path can never throw or mask the original failure.
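
The guarded spread can be illustrated with a toy logger. Everything here is an assumption for illustration: `buildEntry` is a hypothetical stand-in for the project's logger (it only mimics the `Object.assign` merge mentioned above), and `logExecutionFailure` and the log message are invented names.

```typescript
// Hypothetical stand-in for the logger's production path; not the repo's logger.
function buildEntry(message: string, ...args: unknown[]): Record<string, unknown> {
  const entry: Record<string, unknown> = { message }
  for (const arg of args) {
    if (arg instanceof Error) {
      entry.error = arg.message
      entry.stack = arg.stack
    } else if (typeof arg === "object" && arg !== null) {
      Object.assign(entry, arg) // merges { cause: ... } into the entry
    }
  }
  return entry
}

function logExecutionFailure(
  error: unknown,
  errorCause?: Record<string, unknown>
): Record<string, unknown> {
  // When errorCause is undefined the spread contributes no argument,
  // so the call has the same shape as it would without the cause.
  return buildEntry(
    "Workflow execution failed",
    error,
    ...(errorCause ? [{ cause: errorCause }] : [])
  )
}
```

With a cause present, the entry gains a nested `cause` object next to `error` and `stack`; without one, no `cause` key is added at all.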

Confidence Score: 5/5

Safe to merge — the change is purely additive observability work confined to the catch block of the workflow executor; the error path behaviour is unchanged if describeErrorCause returns nothing.

The walk is correctly bounded, the depth-1 fallback is only kept when no code-carrying error is found deeper in the chain, filterUndefined removes irrelevant undefined fields, and the try/catch ensures no secondary throw can occur. The guarded spread means the log call signature is identical to the original when no cause is found. All previously raised review concerns have been addressed in this commit.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/lib/workflows/executor/execution-core.ts | Adds describeErrorCause to walk the error chain and extract pg/driver diagnostic fields, then spreads them as a structured cause arg into the logger. The helper is wrapped in try/catch and depth-bounded to 10. No logic errors found; the depth-1 fallback, guard on the spread, and filterUndefined usage are all correct. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[catch error: unknown] --> B[describeErrorCause error]
    B --> C{current instanceof Error?}
    C -- no --> D[return undefined]
    C -- yes --> E{candidate.code !== undefined?}
    E -- yes --> F[driver = candidate, break]
    E -- no --> G{depth === 1?}
    G -- yes --> H[driver = candidate fallback]
    G -- no --> I[advance: current = candidate.cause]
    H --> I
    I --> J{depth < 10?}
    J -- yes --> C
    J -- no --> K{driver set?}
    F --> K
    K -- no --> D
    K -- yes --> L[filterUndefined name/message/code/severity/detail/routine/errno/syscall]
    L --> M[return Record]
    M --> N{errorCause truthy?}
    N -- no --> O[logger.error msg, error]
    N -- yes --> P[logger.error msg, error, cause: errorCause]
    P --> Q[Structured JSON entry with cause block]

Reviews (2): Last reviewed commit: "log(db): Add db failure cause log messag..."

Comment thread apps/sim/lib/workflows/executor/execution-core.ts Outdated
Comment thread apps/sim/lib/workflows/executor/execution-core.ts
@TheodoreSpeaks
Collaborator Author

@greptile review

@TheodoreSpeaks merged commit 0fedceb into staging on May 27, 2026
14 checks passed
@TheodoreSpeaks deleted the log/db-failure-cause branch on May 27, 2026 at 00:30