New serverless pattern - lambda-df-slack #3111
marcojahn
left a comment
Hello @ras-al-jil, thank you for your contribution.
Your PR has 2167 files; the .src folder in particular contains a lot of library files. Can you please clean this up and fix the setup?
At a high level, I've also found the following issues. Please review and fix them before this PR can be merged:
- Move the Slack bot token and signing secret out of plaintext Lambda environment variables into AWS Secrets Manager or SSM Parameter Store (SecureString).
- Stop passing the Slack secrets through Terraform `null_resource` `triggers`; they get persisted to state in plaintext.
- Scope the `bedrock-agentcore:InvokeAgentRuntime` permission to the specific runtime ARN instead of `Resource: "*"`.
- Scope the AgentCore runtime role's `bedrock:InvokeModel` to the Claude models/inference profiles and the ECR pull actions to the repository ARN (keep only `ecr:GetAuthorizationToken` on `*`).
- Scope the `lambda:SendDurableExecutionCallback*` actions to the orchestrator function ARN instead of `Resource: "*"`.
- Add explicit server-side encryption to the DynamoDB table.
- Remove vendored dependencies that leaked into the source tree (`six.py`, `bin/jp.py`) and add a `.gitignore` (Python caches, Terraform state, generated zip/txt artifacts).
- Fix the broken architecture diagram reference in the README (the image markdown is malformed).
- Align the container prerequisite with the deployment scripts: the README says "Docker or Finch" but the scripts call `finch` directly.
- Resolve the Bedrock model mismatch: the README references Claude 3.5 Sonnet v2 while the code uses `claude-sonnet-4-6`.
- Document the minimum AWS CLI version required for durable functions in the prerequisites.
- Add a dependency installation/bundling step: the durable execution SDK is never installed into the package, so the function will fail to import it at runtime (this is the most critical issue; see next point).
- Bundle the durable execution SDK into the Lambda deployment package (or attach it as a layer); the current `archive_file` zips the source only.
- Read the Bedrock model ID from the `BEDROCK_MODEL_ID` environment variable instead of hardcoding it.
- Remove the hardcoded/inconsistent region fallbacks and rely on the runtime-provided region.
- Replace `print()` with structured logging in the non-durable modules.
- Don't silently swallow callback-send failures; log/raise so a failed callback doesn't leave the conversation stuck.
- Fix AWS service naming throughout the README and metadata: use "Lambda durable functions" (lowercase, since it is an AWS Lambda feature, not the Azure product), and full official names on first reference (Amazon Bedrock AgentCore, Amazon DynamoDB, Amazon API Gateway).
- Replace the in-memory Slack event deduplication with a durable store (e.g., DynamoDB conditional write + TTL); the current in-memory dict doesn't hold across concurrent Lambda instances.
- Move the non-deterministic `execution_id` timestamp generation inside a `context.step()`; generating it outside a step breaks the replay model.
- Make the `wait_for_callback` submitter writes deterministic; the wall-clock timestamps used to pick the active callback can diverge on replay.
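To illustrate the deduplication point above (DynamoDB conditional write + TTL), here is a minimal sketch, not the pattern's actual code. It assumes `table` behaves like a boto3 DynamoDB `Table` resource; the function name `claim_event`, the partition key `event_id`, and the TTL attribute `expires_at` are hypothetical, and TTL would need to be enabled on that attribute for old entries to expire.

```python
import time


def claim_event(table, event_id, ttl_seconds=3600, now=None):
    """Return True if this call is the first to claim event_id, else False.

    Assumptions (hypothetical names): `table` is a boto3-style DynamoDB Table
    with partition key `event_id` and TTL attribute `expires_at`.
    """
    now = int(time.time()) if now is None else int(now)
    try:
        table.put_item(
            Item={"event_id": event_id, "expires_at": now + ttl_seconds},
            # The write succeeds only if no item with this key exists yet.
            ConditionExpression="attribute_not_exists(event_id)",
        )
        return True
    except Exception as exc:
        # botocore's ClientError exposes the error code via exc.response.
        error = getattr(exc, "response", {}).get("Error", {})
        if error.get("Code") == "ConditionalCheckFailedException":
            return False  # another invocation already claimed this event
        raise  # any other failure is real; do not swallow it
```

Because the check and the write happen in one conditional `put_item`, two concurrent Lambda instances receiving the same Slack event cannot both get `True`, which is what the in-memory dict cannot guarantee.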
Thank you for the input, @marcojahn. All the recommendations have been addressed and applied. Please let me know if there are any further changes to be made!
marcojahn
left a comment
Hi @ras-al-jil, I found some gaps during setup. Can you please investigate and fix them?
    {
      Effect = "Allow"
      Action = [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ]
      Resource = "*"
    },
The AgentCore runtime role grants `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` on `Resource = "*"`. Because the pattern uses a specific Claude inference profile (`us.anthropic.claude-sonnet-4-6`), the resource can be scoped to foundation models and inference profiles.
Suggested change:

    - {
    -   Effect = "Allow"
    -   Action = [
    -     "bedrock:InvokeModel",
    -     "bedrock:InvokeModelWithResponseStream"
    -   ]
    -   Resource = "*"
    - },
    + {
    +   Effect = "Allow"
    +   Action = [
    +     "bedrock:InvokeModel",
    +     "bedrock:InvokeModelWithResponseStream"
    +   ]
    +   Resource = [
    +     "arn:aws:bedrock:*::foundation-model/anthropic.claude-*",
    +     "arn:aws:bedrock:*:*:inference-profile/*"
    +   ]
    + },
    terraform init
    terraform apply -auto-approve
    ```
    > **Note:** The build script (`terraform/build.sh`) automatically installs Python dependencies into a `build/` directory during `terraform apply`. No manual dependency installation is needed.
This variant of the setup is very opinionated:
- E.g. I use `uv pip`, not `pip`, so it fails. I can fix it, but I don't think most people will do that.
│ Error: local-exec provisioner error
│
│ with null_resource.lambda_build,
│ on main.tf line 401, in resource "null_resource" "lambda_build":
│ 401: provisioner "local-exec" {
│
│ Error running command 'bash ./build.sh': exit status 127. Output: Building Lambda deployment package...
│ Installing dependencies...
│ ./build.sh: line 22: pip: command not found
- `finch` does not exist
│ Error: local-exec provisioner error
│
│ with null_resource.agentcore_image_build,
│ on main.tf line 782, in resource "null_resource" "agentcore_image_build":
│ 782: provisioner "local-exec" {
│
...
│
│ # Save image digest
│ DIGEST=$(finch images --no-trunc --format '{{.ID}}' ACCOUNT_ID.dkr.ecr.us-east-2.amazonaws.com/dfs-agentcore-agent:latest)
│ echo "$DIGEST" > ./image_digest.txt
│ echo "Image digest saved: $DIGEST"
│ ': exit status 1. Output: Login Succeeded
│ time="2026-06-23T11:04:45+02:00" level=fatal msg="instance \"finch\" does not exist, run `finch vm init` to create a new instance"
Which is not true on my machine:
❯ which finch
/usr/local/bin/finch
I would recommend splitting the setup into a few separate commands (I can relate to the intention to make it easy here), which lets people (a) see what's happening and (b) adapt it easily to their local machine.
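One way to make the individual steps more portable is a small preflight check that runs before any build command. The sketch below is only an illustration under stated assumptions: the helper names `pick_container_cli` and `missing_tools` are hypothetical, and the tool lists are examples rather than the pattern's actual requirements. It only checks that a binary is on `PATH`; it would not catch the case above where `finch` is installed but its VM instance has not been initialized.

```python
import shutil


def pick_container_cli(candidates=("finch", "docker"), which=shutil.which):
    """Return the first container CLI found on PATH, or None if none is."""
    for name in candidates:
        if which(name):
            return name
    return None


def missing_tools(required=("terraform", "pip"), which=shutil.which):
    """Return the required tools that are not on PATH, in the given order."""
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    cli = pick_container_cli()
    print(f"container CLI: {cli or 'none found (install Docker or Finch)'}")
    for tool in missing_tools():
        print(f"missing required tool: {tool}")
```

Passing `which` as a parameter keeps the helpers easy to test and lets a user swap in their own lookup, for example one that accepts `uv` in place of `pip`.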
Issue #3112
Description of changes:
A Slack chatbot using AWS Lambda durable functions for stateful, multi-turn conversations with human-in-the-loop interactions, generating travel itineraries via Amazon Bedrock through AgentCore.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.