chore: sync lark-doc skill from online-doc#1749

Open
liuxin-0319 wants to merge 1 commit into main from chore/sync-lark-doc-v43-20260704-182205
liuxin-0319 (Collaborator) commented Jul 4, 2026

Summary

  • Sync ActionHub lark-doc v43 changes back to public skills/lark-doc.
  • Apply v43 - v34 via restore_online_doc_to_lark_doc.py.
  • Run dir: /Users/bytedance/xiage_workspaces/runs/20260704-182205-lark-doc-v43

Verification

  • python3 -m py_compile skills/lark-doc/scripts/count_chars.py
  • node scripts/skill-format-check/index.js skills
  • git diff --check
  • git diff --cached --check

Summary by CodeRabbit

  • New Features

    • Added clearer guidance for creating and updating documents, including the required latest document API mode and built-in word-count validation.
    • Improved support for whiteboards and diagrams, with more explicit handling for Mermaid and SVG content.
  • Documentation

    • Updated writing rules, self-check steps, numbering guidance, and formatting recommendations to make document output more consistent.
    • Refined reference notes for callouts, grids, and synced references.

github-actions Bot added the labels domain/ccm (PR touches the ccm domain) and size/M (Single-domain feat or fix with limited business impact) on Jul 4, 2026
@coderabbitai

coderabbitai Bot commented Jul 4, 2026

📝 Walkthrough


This PR updates the lark-doc skill documentation to require Feishu Docs v2 API usage (--api-version v2) across create/fetch/update workflows, revises quick-decision routing, whiteboard/Mermaid insertion rules, style guide checklists, and SubAgent contracts, and adds a new count_chars.py script for word-count validation integrated into create/update workflows.

Changes

Lark Doc v2 documentation and word-count tooling

| Layer | File(s) | Summary |
|---|---|---|
| SKILL.md v2 contract and routing | `skills/lark-doc/SKILL.md` | Enforces `--api-version v2` for docs commands, rewrites quick-decision routing for fetch modes/block IDs/whiteboard insertion and word-count checks, updates `synced_reference` fetch, and removes history shortcut row. |
| Reference tag documentation | `skills/lark-doc/references/lark-doc-xml.md` | Adds pointers to `lark-doc-style.md` writing principles for `<callout>` and `<grid>`/`<column>` tags. |
| Writing style guide | `skills/lark-doc/references/style/lark-doc-style.md` | Refines paragraph/table guidance, removes checkbox rendering rule, clarifies numbering rules with `<ol seq="auto">`, simplifies color guidance, and rewrites self-check checklist. |
| count_chars.py script | `skills/lark-doc/scripts/count_chars.py` | New CLI script fetching document `raw_content` via lark-cli, counting Hanzi/CJK punctuation/Latin words/digits, computing pass/under/over verdict against min/max/approx targets, and printing JSON output. |
| Create-workflow integration | `skills/lark-doc/references/style/lark-doc-create-workflow.md` | Updates create command to v2, rewrites self-check and whiteboard prioritization steps, replaces word-count validation with a `count_chars.py`-based loop capped at 2 rounds, and adds whiteboard SubAgent input contracts. |
| Update-workflow integration | `skills/lark-doc/references/style/lark-doc-update-workflow.md` | Expands context-fetch options, enforces sequential per-section rewriting, adds detailed validation checklist, replaces word-count step with `count_chars.py`-based procedure, and expands whiteboard SubAgent contracts. |

Estimated code review effort: 3 (Moderate) | ~25 minutes

Possibly related PRs

  • larksuite/cli#502: Prior whiteboard-skill refactor in the same SKILL.md file, related to Mermaid insertion/quick-routing guidance.
  • larksuite/cli#1097: Related refactor of Mermaid vs SVG vs existing-board routing and SubAgent responsibilities.
  • larksuite/cli#1697: Adds an analogous word-count helper (doc_word_stat.py) and SKILL.md routing wired similarly to this PR's count_chars.py.

Suggested labels: size/L

Suggested reviewers: SunPeiYang996, caojie0621, fangshuyu-768

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | Accurately summarizes the PR as syncing the lark-doc skill from online-doc. |
| Description check | ✅ Passed | Includes a solid summary and verification steps, but it omits the required Changes and Related Issues sections from the template. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

    text = fetch_raw_content(args.doc, args.identity)
elif args.file:
    try:
        text = open(args.file, encoding="utf-8").read()
@codecov

codecov Bot commented Jul 4, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.42%. Comparing base (c45ff56) to head (39f1356).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1749   +/-   ##
=======================================
  Coverage   74.42%   74.42%           
=======================================
  Files         854      854           
  Lines       88457    88457           
=======================================
  Hits        65832    65832           
  Misses      17556    17556           
  Partials     5069     5069           
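
As a quick sanity check, the headline percentage can be reproduced from the table's own numbers. The sketch below assumes Codecov's usual ratio of hits to the sum of hits, misses, and partials; the variable names are illustrative.

```python
# Figures copied from the coverage table above (identical for base and head).
hits, misses, partials = 65832, 17556, 5069

lines = hits + misses + partials          # tracked lines in total
coverage = round(100 * hits / lines, 2)   # percentage, two decimals

print(lines, coverage)
```

Because base and head report the same counts, the ratio is unchanged, which is consistent with a PR that touches only documentation and an untested helper script.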


@github-actions

github-actions Bot commented Jul 4, 2026


🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@39f13561f6a56e55b77afadbad496df0b61a1c0c

🧩 Skill update

npx skills add larksuite/cli#chore/sync-lark-doc-v43-20260704-182205 -y -g

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (1)
skills/lark-doc/scripts/count_chars.py (1)

46-51: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Narrow the blind except Exception.

Line 50 catches all exceptions when parsing raw_content. Narrowing to the expected failure modes (KeyError, TypeError, json.JSONDecodeError) would avoid masking unrelated bugs while still producing a clear error message.
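
A narrowed handler along the lines the nitpick describes might look like the sketch below. The helper name `parse_raw_content` and the English error message are hypothetical; the PR's function parses inline and reports in Chinese. The three exception types are the ones named in the comment.

```python
import json


def parse_raw_content(stdout):
    """Extract data.content from the CLI's JSON output (hypothetical helper)."""
    try:
        return json.loads(stdout)["data"]["content"]
    except (KeyError, TypeError, json.JSONDecodeError) as e:
        # Only the expected parse failures are converted into an exit message;
        # any other exception propagates and surfaces as a real bug.
        raise SystemExit(f"failed to parse raw_content: {e!r}\n{stdout[:300]}")
```

`KeyError` covers a missing `data` or `content` key, `TypeError` covers a `null` or non-object payload, and `json.JSONDecodeError` covers output that is not JSON at all.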

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/lark-doc/scripts/count_chars.py` around lines 46 - 51, The raw_content
parsing in count_chars.py uses a blind except Exception around
json.loads(out.stdout)["data"]["content"], so narrow that handler to the
expected failures only. Update the try/except in the parsing block to catch
KeyError, TypeError, and json.JSONDecodeError, and keep the existing sys.exit
message with the exception details so the failure context stays clear without
masking unrelated bugs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/lark-doc/references/style/lark-doc-create-workflow.md`:
- Around line 51-55: The workflow currently omits support for exact word-count
requests even though “写 N 字” is mentioned as a valid trigger; update the
normalization rules in lark-doc-create-workflow.md so the step explicitly maps
exact counts from the workflow into validator arguments, using the same counting
flow as the `scripts/count_chars.py` command, or remove “写 N 字” from the
supported cases if exact matching is not meant to be handled. Keep the guidance
aligned with the existing `--min/--max/--approx` mapping and ensure the
`verdict` handling still works with the chosen exact-count behavior.

In `@skills/lark-doc/references/style/lark-doc-style.md`:
- Around line 37-44: Make the numbering guidance internally consistent in
lark-doc-style.md: the current examples and rules around `一、→(一)→ 1.→(1)`, `1 →
1.1 → 1.1.1`, and the “标题层级只给章节” guidance conflict with each other. Update the
wording in the numbering section so it clearly distinguishes fully Chinese
numbering from mixed/Arabic numbering, and align the advice on when to use
headings versus manual numbering so authors are not told both to reserve
headings for chapters and to use headings for hand-typed Chinese numbering.

In `@skills/lark-doc/references/style/lark-doc-update-workflow.md`:
- Around line 51-58: The word-count workflow in “步骤四:字数校验” is missing handling
for exact targets like “写 N 字”, so update the parameter mapping in this section
to translate that trigger into the validator’s exact range form using the
existing `scripts/count_chars.py` flow; adjust the supported cases list and the
conversion rules together so `x 字左右`, `x-y`, `>x`, `<y`, and exact `写 N 字` all
map consistently, and keep the instructions aligned with the `verdict`/rerun
logic already described.

In `@skills/lark-doc/scripts/count_chars.py`:
- Around line 39-51: Add a timeout to the lark-cli invocation in
fetch_raw_content so the script cannot hang indefinitely; update the
subprocess.run call to use a reasonable timeout and handle the resulting timeout
exception by exiting with a clear message, alongside the existing
FileNotFoundError and nonzero return code handling. Keep the fix localized to
fetch_raw_content and ensure any caller in the word-count validation flow still
fails fast when raw content retrieval stalls.

In `@skills/lark-doc/SKILL.md`:
- Line 58: The synced reference row in SKILL.md is missing the block-ID detail
needed to resolve src-block-id. Update the docs fetch guidance in that table
entry to use the v2 fetch command with --detail with-ids, and keep the existing
src-token/src-block-id mapping note so readers know how to locate the referenced
block.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 50392c30-4a98-4436-bf1b-d1726e069dcc

📥 Commits

Reviewing files that changed from the base of the PR and between c45ff56 and 39f1356.

📒 Files selected for processing (6)
  • skills/lark-doc/SKILL.md
  • skills/lark-doc/references/lark-doc-xml.md
  • skills/lark-doc/references/style/lark-doc-create-workflow.md
  • skills/lark-doc/references/style/lark-doc-style.md
  • skills/lark-doc/references/style/lark-doc-update-workflow.md
  • skills/lark-doc/scripts/count_chars.py

Comment on lines +51 to +55
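**Only when** the user has given an explicit word-count requirement (写 N 字 "write N characters" / x-y 字 "x to y characters" / x 字左右 "about x characters" / 上下浮动 "give or take") does this step run; otherwise **skip this step**. Word counts must be measured with the script, never estimated.

1. Normalize the requirement into arguments: `>x`→`--min x`; `<y`→`--max y`; `x-y`→`--min x --max y`; `x 字左右`→`--approx x` (automatically ±10%)
2. Measure the actual count (aligned with Feishu's 「总字数」 total word count): `uv run scripts/count_chars.py --doc <document_id> <target arguments from above>` (the script lives under `scripts/` in the lark-doc skill root)
3. Check the output `verdict`: `pass` means it passes; `under` → add **substantive content** (not padding) in the section most worth expanding; `over` → trim from the longest / most redundant parts. After editing, **rerun the script to re-measure**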

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Handle exact word-count targets.

写 N 字 is listed as a supported trigger, but this step never maps it to validator args. If exact counts are intended, translate it to --min N --max N; otherwise drop it from the supported cases so the workflow stays executable.

Proposed fix
-1. Normalize the requirement into arguments: `>x`→`--min x`; `<y`→`--max y`; `x-y`→`--min x --max y`; `x 字左右`→`--approx x` (automatically ±10%)
+1. Normalize the requirement into arguments: `写 N 字`→`--min N --max N`; `>x`→`--min x`; `<y`→`--max y`; `x-y`→`--min x --max y`; `x 字左右`→`--approx x` (automatically ±10%)
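
The normalization rule, including the exact-count case proposed in this comment, could be expressed as a small lookup like the one below. It is a sketch under assumptions: `normalize_target` is a hypothetical helper (the workflow describes this step in prose for an agent, not as code), and the regular expressions cover only the literal phrasings quoted in the step.

```python
import re


def normalize_target(request):
    """Map a word-count phrase to count_chars.py arguments (hypothetical helper)."""
    s = request.strip()
    if m := re.fullmatch(r"(\d+)\s*-\s*(\d+)(?:\s*字)?", s):   # x-y 字
        return ["--min", m[1], "--max", m[2]]
    if m := re.fullmatch(r">\s*(\d+)", s):                      # >x
        return ["--min", m[1]]
    if m := re.fullmatch(r"<\s*(\d+)", s):                      # <y
        return ["--max", m[1]]
    if m := re.fullmatch(r"(\d+)\s*字左右", s):                  # about x
        return ["--approx", m[1]]
    if m := re.fullmatch(r"写\s*(\d+)\s*字", s):                 # exactly N
        return ["--min", m[1], "--max", m[1]]
    raise ValueError(f"unrecognized word-count requirement: {request!r}")


print(normalize_target("写 800 字"))
```

Mapping an exact target to an equal `--min` and `--max` reuses the existing range check, so the `verdict` handling needs no new branch.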

Comment on lines +37 to 44
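- **One numbering scheme, consistent across the whole document; the worst offense is mixing Chinese and Arabic numbering.**
- Official documents / formal materials: 「一、→(一)→ 1.→(1)」 (all-Chinese hierarchy).
- Academic / technical / business reports: 「1 → 1.1 → 1.1.1」 or 「一、→(一)→ 1.」, **pick one**.
- ⚠️ **「一、」 may only be paired with 「(一)」; to use Arabic decimals, use 「1 / 1.1」 from the top level down. Never pair 「一、」 with 「1.1 / 2.1」**; this is the most common mix-up.
- **Do not mix** multiple schemes (no "第X部分" ("Part X") + "一、" + "1." together); **do not skip numbers within a level**; **do not skip levels**.
-- **Numbering / heading levels are for "chapters" only**; do not number every small item with 「(一)」 or promote it to a heading just to fill out the scheme (for how to handle small items, see 「二、默认写连贯段落」 above).
-- For simple 1.2.3 parallel items, use native `<ol><li seq="auto">…</li></ol>` so that Feishu numbers and aligns them automatically; 「一、(一)」 cannot be produced natively, so only then type them as text. In that case express the hierarchy with heading levels, **not manual indentation**, and keep every level flush left (full-width parentheses 「()」 combined with manual indentation look misaligned).
+- **Numbering / heading levels are for "chapters" only**; do not number every small item with 「(一)」 or promote it to a heading just to fill out the scheme (for where small items go, see §二).
+- For simple 1.2.3 parallel items, use **native `<ol seq="auto">`** so that Feishu numbers and aligns them automatically; 「一、(一)」 cannot be produced natively, so only then type them as text. In that case express the hierarchy with heading levels, **not manual indentation**, and keep every level flush left (full-width parentheses 「()」 combined with manual indentation look misaligned).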

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Make the numbering guidance consistent.

Line 38 labels 一、→(一)→ 1.→(1) as “全中文层级” ("all-Chinese hierarchy") even though its lower tiers use Arabic numerals, and Line 43 then suggests using heading levels for hand-typed Chinese numbering. That clashes with the earlier “标题层级只给章节” ("heading levels are for chapters only") rule and will confuse authors.
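
The one rule in the quoted hunk that is stated without ambiguity, never pairing 「一、」 with 「1.1 / 2.1」, is mechanical enough to check. The sketch below is purely illustrative: `mixes_schemes` and both patterns are assumptions and are not part of the skill or its self-check.

```python
import re

# Top-level Chinese numbering such as 「一、」 or 「十二、」.
CHINESE_TOP = re.compile(r"^[一二三四五六七八九十百]+、")
# Arabic decimal numbering such as 「1.1」 or 「2.1.3」 (a bare 「1.」 does not match).
ARABIC_DECIMAL = re.compile(r"^\d+(?:\.\d+)+")


def mixes_schemes(headings):
    """Return True if Chinese top-level and Arabic decimal numbering co-occur."""
    stripped = [h.strip() for h in headings]
    has_chinese = any(CHINESE_TOP.match(h) for h in stripped)
    has_decimal = any(ARABIC_DECIMAL.match(h) for h in stripped)
    return has_chinese and has_decimal


print(mixes_schemes(["一、背景", "1.1 目标"]))
```

A check like this flags only the forbidden pairing; it says nothing about which tiers should become headings, which is the part the review asks the authors to reconcile.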

🧰 Tools
🪛 LanguageTool

[uncategorized] ~40-~40: A verb modifier usually takes the form "adjective (adverb) + 地 + verb". Did you mean: 常见"地"混
Context: ...层全用「1 / 1.1」。绝不「一、」配「1.1 / 2.1」**——这是最常见的混用。 - 不混用多套(别"第X部分"+"一、"+"1."混着来);...

(wb4)


Comment on lines +51 to +58
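+### Step 4: Word-count validation (skip if there is no explicit word-count requirement)

-**Context-saving tip**: when the main Agent needs to re-read a section it is editing, prefer `docs +fetch --scope section --start-block-id <section heading id>` (covers the whole section automatically), or `--scope range --start-block-id xxx --end-block-id yyy` for an exact range; fetch only the current section and do not pull the full document again.
+**Only when** the user has given an explicit word-count requirement (写 N 字 "write N characters" / x-y 字 "x to y characters" / x 字左右 "about x characters" / 上下浮动 "give or take") does this step run; otherwise **skip this step**. Word counts must be measured with the script, never estimated.

+1. Normalize the requirement into arguments: `>x`→`--min x`; `<y`→`--max y`; `x-y`→`--min x --max y`; `x 字左右`→`--approx x` (automatically ±10%)
+2. Measure the actual count (aligned with Feishu's 「总字数」 total word count): `uv run scripts/count_chars.py --doc <document_id> <target arguments from above>` (the script lives under `scripts/` in the lark-doc skill root)
+3. Check the output `verdict`: `pass` means it passes; `under` → add **substantive content** (not padding) where expansion is most warranted; `over` → trim from the longest / most redundant parts. After editing, **rerun the script to re-measure**
+4. **At most 2 rounds**. If the target is still missed after 2 rounds: stop, and do not pad or delete key content just to hit the target; report honestly [target range / current count / gap and direction / 2 rounds attempted / reason for the miss] and deliver the document link. **Never falsely claim the target was met**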

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Handle exact word-count targets.

写 N 字 is listed as a supported trigger, but this step never maps it to validator args. If exact counts are intended, translate it to --min N --max N; otherwise drop it from the supported cases so the workflow stays executable.

Proposed fix
-1. Normalize the requirement into arguments: `>x`→`--min x`; `<y`→`--max y`; `x-y`→`--min x --max y`; `x 字左右`→`--approx x` (automatically ±10%)
+1. Normalize the requirement into arguments: `写 N 字`→`--min N --max N`; `>x`→`--min x`; `<y`→`--max y`; `x-y`→`--min x --max y`; `x 字左右`→`--approx x` (automatically ±10%)
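
The capped re-check loop from steps 3 and 4 of the quoted workflow can be outlined as below. This is a sketch under assumptions: `validate_with_cap`, `measure`, and `revise` are hypothetical names, `measure` stands in for running `count_chars.py` and reading its JSON, and a "round" is taken to mean one revision followed by one re-measurement.

```python
from typing import Callable


def validate_with_cap(measure: Callable[[], dict],
                      revise: Callable[[str], None],
                      max_rounds: int = 2) -> dict:
    """Re-measure after each revision, giving up honestly after max_rounds."""
    result = measure()
    rounds = 0
    while result["verdict"] != "pass" and rounds < max_rounds:
        revise(result["verdict"])   # expand on "under", trim on "over"
        rounds += 1
        result = measure()
    # "met" reflects the last measurement only; a miss is reported, never hidden.
    return {"met": result["verdict"] == "pass", "rounds": rounds, **result}


if __name__ == "__main__":
    verdicts = iter(["under", "under", "pass"])
    print(validate_with_cap(lambda: {"verdict": next(verdicts)}, lambda v: None))
```

Returning the final measurement alongside `met` gives the caller what it needs to report the target range, current count, and direction of the gap when the cap is reached.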

Comment on lines +39 to +51
def fetch_raw_content(doc_id, identity):
    cmd = ["lark-cli", "api", "GET",
           f"/open-apis/docx/v1/documents/{doc_id}/raw_content", "--as", identity]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        sys.exit("未找到 lark-cli:请先安装/配置 lark-cli,或改用 --file / stdin 传入文本")
    if out.returncode != 0:
        sys.exit(f"取 raw_content 失败: {out.stderr or out.stdout}")
    try:
        return json.loads(out.stdout)["data"]["content"]
    except Exception as e:
        sys.exit(f"解析 raw_content 失败: {e}\n{out.stdout[:300]}")

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Add a timeout to the lark-cli subprocess call.

subprocess.run at Line 43 has no timeout, so if lark-cli hangs (network stall, auth prompt, etc.) this script — and the create/update workflow's word-count validation loop that depends on it — will block indefinitely.

🔧 Proposed fix
     try:
-        out = subprocess.run(cmd, capture_output=True, text=True)
+        out = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
     except FileNotFoundError:
         sys.exit("未找到 lark-cli:请先安装/配置 lark-cli,或改用 --file / stdin 传入文本")
+    except subprocess.TimeoutExpired:
+        sys.exit("lark-cli 调用超时")
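
To see the timeout behaviour in isolation, the same pattern can be exercised without lark-cli. The sketch below uses a hypothetical `run_cli` wrapper and English messages, with the Python interpreter standing in for the real command; `subprocess.run(..., timeout=...)` and `subprocess.TimeoutExpired` are standard-library behaviour.

```python
import subprocess
import sys


def run_cli(cmd, timeout=30):
    """Run a command and exit with a clear message on the expected failures."""
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        sys.exit(f"command not found: {cmd[0]}")
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        sys.exit(f"{cmd[0]} timed out after {timeout}s")
    if out.returncode != 0:
        sys.exit(f"{cmd[0]} failed: {out.stderr or out.stdout}")
    return out.stdout


if __name__ == "__main__":
    print(run_cli([sys.executable, "-c", "print('ok')"]).strip())
```

A 30-second default mirrors the value in the proposed fix; whether that suits large documents or slow networks is a judgment call for the maintainers.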
🧰 Tools
🪛 ast-grep (0.44.0)

[error] 42-42: Use of unsanitized data to create processes
Context: subprocess.run(cmd, capture_output=True, text=True)
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').

(os-system-unsanitized-data)


[error] 42-42: Command coming from incoming request
Context: subprocess.run(cmd, capture_output=True, text=True)
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').

(subprocess-from-request)

🪛 Ruff (0.15.20)

[error] 43-43: subprocess call: check for execution of untrusted input

(S603)


[warning] 45-45: String contains ambiguous `:` (FULLWIDTH COLON). Did you mean `:` (COLON)?

(RUF001)


[warning] 45-45: String contains ambiguous `,` (FULLWIDTH COMMA). Did you mean `,` (COMMA)?

(RUF001)


[warning] 50-50: Do not catch blind exception: Exception

(BLE001)


Comment thread skills/lark-doc/SKILL.md
| `<cite type="doc" file-type="bitable" token="..." table-id="...">` | same as `<bitable>` | [`lark-base`](../lark-base/SKILL.md) |
| `<vc-transcribe-tab vc-node-id="...">` | `vc-node-id` -> note_id | [`lark-note`](../lark-note/SKILL.md): first run `note +detail --note-id <vc-node-id>` |
-| `<synced_reference src-token="..." src-block-id="...">` | `src-token` -> doc_token, `src-block-id` -> block_id | Use `docs +fetch` to read the src-token document and locate the block |
+| `<synced_reference src-token="..." src-block-id="...">` | `src-token` -> doc_token, `src-block-id` -> block_id | Use `docs +fetch --api-version v2` to read the src-token document and locate the block |

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n## File around line 58\n'
nl -ba skills/lark-doc/SKILL.md | sed -n '45,75p'

printf '\n## Search related terms\n'
rg -n "docs \+fetch|api-version v2|src-block-id|src-token|IDs-bearing|block_id|doc_token|fetch" skills/lark-doc -S

Repository: larksuite/cli

Length of output: 215


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n## File around line 58\n'
sed -n '45,75p' skills/lark-doc/SKILL.md | cat -n

printf '\n## Search related terms\n'
python3 - <<'PY'
from pathlib import Path
import re

root = Path('skills/lark-doc')
for path in root.rglob('*'):
    if path.is_file():
        text = path.read_text(encoding='utf-8', errors='ignore')
        for i, line in enumerate(text.splitlines(), 1):
            if any(term in line for term in ['docs +fetch', 'api-version v2', 'src-block-id', 'src-token', 'block_id', 'doc_token', 'fetch']):
                print(f"{path}:{i}:{line}")
PY

Repository: larksuite/cli

Length of output: 17264


Use --detail with-ids for synced references.

<synced_reference src-token="..." src-block-id="..."> needs block IDs to resolve src-block-id, so this row should say docs +fetch --api-version v2 --detail with-ids instead of plain fetch.
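
The attribute mapping in that table row is simple to illustrate. The sketch below assumes the tag arrives as a well-formed, self-closed XML element, and `resolve_synced_reference` is a hypothetical helper. How the fetched document exposes block IDs is not shown in this PR, so the lookup step itself is left out.

```python
import xml.etree.ElementTree as ET


def resolve_synced_reference(tag):
    """Map src-token / src-block-id to doc_token / block_id (hypothetical helper)."""
    elem = ET.fromstring(tag)
    if elem.tag != "synced_reference":
        raise ValueError(f"not a synced_reference element: {elem.tag}")
    return {
        "doc_token": elem.attrib["src-token"],    # document to fetch
        "block_id": elem.attrib["src-block-id"],  # block to locate inside it
    }


print(resolve_synced_reference(
    '<synced_reference src-token="doc123" src-block-id="blk456"/>'
))
```

The returned `block_id` is only useful if the fetched document carries block IDs, which is the reason the comment asks for `--detail with-ids` on that row.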


@github-actions

github-actions Bot commented Jul 4, 2026


PR Quality Summary

CI did not complete successfully. Use the failed check links below to decide whether this PR needs a code change or a rerun.

Failed checks

  • license-header — failure — details
  • results — failure — details
