Skip to content

fix: webchat file attachment renamed with UUID instead of original name#8486

Open
march-7th-mini wants to merge 5 commits into
AstrBotDevs:masterfrom
march-7th-mini:fix/webchat-file-rename
Open

fix: webchat file attachment renamed with UUID instead of original name#8486
march-7th-mini wants to merge 5 commits into
AstrBotDevs:masterfrom
march-7th-mini:fix/webchat-file-rename

Conversation

@march-7th-mini
Copy link
Copy Markdown

@march-7th-mini march-7th-mini commented Jun 1, 2026

Replace UUID-based file naming with original filenames and add dedup counter.

修复 WebChat 文件被 UUID 重命名的问题

Fixes #8470

AstrBot WebChat sends files via SSE stream. The _send() method in webchat_event.py used uuid.uuid4() to rename files, causing them to appear as UUID strings (e.g. c697f0da-198b-49df-abaa-977b1a59e2df.html) instead of their original names. This fix preserves the original filename and adds a counter suffix (_1, _2...) when conflicts occur.

AstrBot WebChat 通过 SSE 流发送文件时,webchat_event.py_send() 方法用 uuid.uuid4() 重命名文件,导致用户收到的文件名是一串 UUID 乱码而非原始名称。本次修改保留原始文件名,同名文件自动添加 _1_2 计数器后缀防冲突。

Modifications / 改动点

File modified / 修改的文件:

  • astrbot/core/platform/sources/webchat/webchat_event.py

Changes / 改动内容:

  • Replace uuid.uuid4() with original filename in _send() method's File handler

  • Add dedup counter (_1, _2...) to avoid filename conflicts

  • 移除 uuid.uuid4() 重命名,改用原始文件名保存

  • 添加 _1_2 计数器后缀防同名冲突

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

Test Scenarios / 测试场景:

Scenario / 场景 Before / 修复前 After / 修复后
SSE streaming during conversation / 对话中 SSE 流式发送 ❌ UUID: c697f0da-...html ✅ Original name: final_test.html
Background delivery after closing browser / 关闭浏览器后台发送 ❌ UUID ✅ Original name: offline_test.html
Multiple versions regression / 多版本回归测试 ❌ UUID (v4.25.1, v4.25.2) ✅ All passed

Disk attachment directory comparison / 附件目录对比:

Before / 修复前:
  c697f0da-198b-49df-abaa-977b1a59e2df.html  ← UUID乱码

After / 修复后:
  final_test.html                              ← 原始文件名
  offline_test.html                            ← 原始文件名(含防重名后缀)

Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Preserve original filenames for WebChat file attachments while ensuring name uniqueness on disk.

Bug Fixes:

  • Fix WebChat SSE file attachments being renamed to UUIDs instead of using the original filenames.

Enhancements:

  • Add filename de-duplication by appending a counter suffix when an attachment with the same name already exists.

Replace UUID-based file naming with original filenames and add dedup counter.
@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. feature:chatui The bug / feature is about astrbot's chatui, webchat labels Jun 1, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request changes how files are saved locally in the webchat platform source, switching from using a random UUID to preserving the original filename and appending a counter to handle duplicates. The review highlights a high-severity path traversal vulnerability because the filename is not sanitized before being joined to the destination path, and identifies a redundant splitext computation inside the deduplication loop. A code suggestion is provided to resolve both issues.

Comment on lines 114 to +121
original_name = comp.name or os.path.basename(file_path)
ext = os.path.splitext(original_name)[1] or ""
filename = f"{uuid.uuid4()!s}{ext}"
dest_path = os.path.join(attachments_dir, filename)
dest_path = os.path.join(attachments_dir, original_name)
# dedup: if file with same name exists, append a counter
counter = 1
while os.path.exists(dest_path):
name_part, ext_part = os.path.splitext(original_name)
dest_path = os.path.join(attachments_dir, f"{name_part}_{counter}{ext_part}")
counter += 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Security & Efficiency Review

  1. Path Traversal Vulnerability (High Severity): Using comp.name directly in os.path.join allows potential path traversal attacks if the filename contains directory traversal sequences (e.g., ../ or ..\). This could allow an attacker to write files outside the intended attachments_dir directory. Wrapping it in os.path.basename sanitizes the filename.
  2. Redundant Computation: Calling os.path.splitext(original_name) inside the while loop is redundant because original_name does not change. Moving it outside the loop improves performance.
Suggested change
original_name = comp.name or os.path.basename(file_path)
ext = os.path.splitext(original_name)[1] or ""
filename = f"{uuid.uuid4()!s}{ext}"
dest_path = os.path.join(attachments_dir, filename)
dest_path = os.path.join(attachments_dir, original_name)
# dedup: if file with same name exists, append a counter
counter = 1
while os.path.exists(dest_path):
name_part, ext_part = os.path.splitext(original_name)
dest_path = os.path.join(attachments_dir, f"{name_part}_{counter}{ext_part}")
counter += 1
original_name = os.path.basename(comp.name or file_path)
name_part, ext_part = os.path.splitext(original_name)
dest_path = os.path.join(attachments_dir, original_name)
# dedup: if file with same name exists, append a counter
counter = 1
while os.path.exists(dest_path):
dest_path = os.path.join(attachments_dir, f"{name_part}_{counter}{ext_part}")
counter += 1

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • When using the original filename, consider normalizing/sanitizing original_name (e.g., stripping directory components, disallowing .. or path separators, enforcing a max length) before os.path.join to avoid potential path traversal and filesystem issues from untrusted filenames.
  • To make the dedup logic clearer and slightly more efficient, you could precompute name_part and ext_part once before the while loop and update only the suffix inside the loop rather than calling os.path.splitext on every iteration.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- When using the original filename, consider normalizing/sanitizing `original_name` (e.g., stripping directory components, disallowing `..` or path separators, enforcing a max length) before `os.path.join` to avoid potential path traversal and filesystem issues from untrusted filenames.
- To make the dedup logic clearer and slightly more efficient, you could precompute `name_part` and `ext_part` once before the `while` loop and update only the suffix inside the loop rather than calling `os.path.splitext` on every iteration.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/webchat/webchat_event.py" line_range="114" />
<code_context>
-                # save file to local
+                # save file to local with original name
                 file_path = await comp.get_file()
                 original_name = comp.name or os.path.basename(file_path)
-                ext = os.path.splitext(original_name)[1] or ""
-                filename = f"{uuid.uuid4()!s}{ext}"
</code_context>
<issue_to_address>
**🚨 issue (security):** Consider sanitizing `original_name` to avoid unsafe or unexpected path components.

`comp.name` might include path separators or `..` segments, which could create unintended paths when joined with `attachments_dir`. Normalize `original_name` before use (e.g., wrap it with `os.path.basename` and optionally filter out unsafe characters) to avoid path traversal or malformed paths.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread astrbot/core/platform/sources/webchat/webchat_event.py
优化版代码:加了 os.path.basename 防路径穿越 + splitext 提到循环外。本地测试一切正常。
name_part, ext_part = os.path.splitext(safe_name)
# dedup: if file with same name exists, append a counter
counter = 1
while os.path.exists(dest_path):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里可以限制循环上限(比如 100_000),防止一些奇怪的问题

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

谢谢大佬!空了麻烦您再帮忙看看。

起初我只是觉得AI每次回复的文件都被重命名,就很不方便,
于是我问AstrBot,帮忙修改实现的这个效果,

因为我完全没有任何编程经验,我判断不了这个修改是否符合项目的规范,可能会带来哪些风险。

fix: 限制防重名循环上限为 10000
fix: 降低防重名循环上限至 1000,正常情况下使用足够了,修复 ruff 格式检查。
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature:chatui The bug / feature is about astrbot's chatui, webchat size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

2 participants