Skip to content

fix(tools): lenient UpdateTodoList tasks JSON in normalize_json_array#13

Closed
JessicaMulein wants to merge 1 commit into
dev-integrationfrom
fix/lenient-task-json
Closed

fix(tools): lenient UpdateTodoList tasks JSON in normalize_json_array#13
JessicaMulein wants to merge 1 commit into
dev-integrationfrom
fix/lenient-task-json

Conversation

@JessicaMulein

@JessicaMulein JessicaMulein commented Jun 2, 2026

Copy link
Copy Markdown
Member

Summary

  • Add try_parse_lenient_task_array for local-model UpdateTodoList tasks args with unescaped inner JSON quotes
  • Wire lenient repair through normalize_json_array before raising ToolError
  • Test: unescaped [{"done": true, "task": "..."}, ...] array strings

Context

BrightVision dogfood on qwen3.6 hit Invalid tasks parameter JSON when the model glued { "tasks": "[{...}]" }{ "path": "..." }. UI-side JSON tree rendering is separate (BrightVision head); this fixes tool execution in cecli.

Test plan

  • pytest tests/tools/test_tool_arguments.py::test_normalize_json_array_lenient_unescaped_tasks -q
  • pytest tests/tools/test_update_todo_list.py -q

Made with Cursor

PR Summary by Typo

Overview

This PR addresses an issue where UpdateTodoList tasks JSON might contain unescaped inner quotes, causing parsing failures. It introduces a more lenient parsing mechanism to correctly handle these malformed JSON strings, particularly from models like local Qwen.

Key Changes

  • Added a new function try_parse_lenient_task_array in cecli/helpers/responses.py to robustly parse UpdateTodoList tasks with unescaped inner JSON quotes.
  • Modified normalize_json_array in cecli/tools/utils/helpers.py to utilize this new lenient parser when standard JSON parsing fails for tasks parameters, and to handle nested tasks objects.
  • Included a new test case test_normalize_json_array_lenient_unescaped_tasks to validate the improved parsing logic.

Work Breakdown

Category Lines Changed
New Work 60 (93.8%)
Rework 4 (6.2%)
Total Changes 64
To turn off PR summary, please visit Notification settings.

@typo-app

typo-app Bot commented Jun 2, 2026

Copy link
Copy Markdown

Static Code Review 📊

🛑 3 quality checks failed!

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1afb157-d615-450d-b1a8-e8fd992d0113

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/lenient-task-json

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review

Copy link
Copy Markdown

Review Summary by Qodo

Fix lenient UpdateTodoList tasks JSON parsing for local models

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Add lenient JSON parser for UpdateTodoList tasks with unescaped inner quotes
• Handle local model glued JSON objects in tool arguments
• Wire lenient repair through normalize_json_array before raising errors
• Add test coverage for unescaped task array parsing
Diagram
flowchart LR
  A["Local Model Output<br/>Glued/Unescaped JSON"] -->|"normalize_json_array"| B["try_parse_lenient_task_array"]
  B -->|"Regex Pattern Match"| C["Parsed Task Array"]
  B -->|"Fallback"| D["Standard json.loads"]
  C --> E["UpdateTodoList Success"]
  D --> E

Loading

Grey Divider

File Changes

1. cecli/helpers/responses.py ✨ Enhancement +38/-0

Add lenient task array JSON parser function

• Add try_parse_lenient_task_array function to parse UpdateTodoList tasks with unescaped inner
 JSON quotes
• Handles dict inputs with nested tasks, string inputs with regex pattern matching
• Supports glued JSON objects via merge_glued_json_objects and split_concatenated_json
• Extracts task objects using regex pattern for done boolean and task string fields

cecli/helpers/responses.py


2. cecli/tools/utils/helpers.py ✨ Enhancement +14/-4

Wire lenient parsing into normalize_json_array

• Integrate try_parse_lenient_task_array into normalize_json_array workflow
• Attempt lenient parsing before falling back to standard json.loads
• Add nested task extraction logic for dict inputs with tasks parameter
• Preserve backward compatibility with existing JSON parsing behavior

cecli/tools/utils/helpers.py


3. tests/tools/test_tool_arguments.py 🧪 Tests +12/-0

Add test for lenient unescaped tasks parsing

• Add test_normalize_json_array_lenient_unescaped_tasks test case
• Verify parsing of unescaped task array with multiple task objects
• Assert correct boolean and string field extraction from parsed tasks

tests/tools/test_tool_arguments.py


Grey Divider

Qodo Logo

@qodo-code-review

qodo-code-review Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 🔗 Cross-repo conflicts (0)

Grey Divider


Action required

1. Partial task array accepted 🐞 Bug ☼ Reliability
Description
try_parse_lenient_task_array returns any regex matches without verifying the full array was
parsed, so malformed inputs can be partially accepted and silently drop tasks. This can cause
UpdateTodoList to write an incomplete todo list without surfacing an error.
Code

cecli/helpers/responses.py[R526-530]

Evidence
The implementation returns rows whenever at least one match is found, with no check that the
entire string was parsed; thus malformed inputs can result in partial task lists being accepted.

cecli/helpers/responses.py[522-530]
cecli/tools/utils/helpers.py[360-393]
tests/tools/test_tool_arguments.py[248-259]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The lenient task parser builds `rows` from `pattern.finditer(raw)` and returns them as long as there’s at least one match. It does not validate that the remaining text is only brackets/commas/whitespace, so garbage between/after objects is ignored and tasks can be dropped silently.

### Issue Context
This function is invoked as a recovery path before raising `ToolError`, so returning partial results turns what should be a hard failure into a silent data-loss behavior.

### Fix Focus Areas
- After collecting matches, validate that the input contains *only*:
 - the matched objects
 - commas
 - surrounding `[` `]`
 - whitespace
 If anything else remains, return `None` (so the caller raises a ToolError).
- Add a test that includes trailing/embedded garbage and asserts an error is raised (no partial success).

### Fix Focus Areas (code pointers)
- cecli/helpers/responses.py[522-530]
- cecli/tools/utils/helpers.py[360-393]
- tests/tools/test_tool_arguments.py[248-259]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Lenient parse not scoped 🐞 Bug ≡ Correctness
Description
normalize_json_array attempts try_parse_lenient_task_array for any parameter when JSON parsing
fails, so non-tasks list params can be mis-parsed into todo-task dicts instead of raising
ToolError. This can silently change tool behavior if another tool ever passes a malformed array
string containing {"done": ..., "task": ...}-shaped objects.
Code

cecli/tools/utils/helpers.py[R366-375]

Evidence
normalize_json_array unconditionally calls the task-specific lenient parser on any string that
fails try_parse_json_value, while BaseTool applies normalize_json_array to every tool’s
LIST_PARAMS, not just UpdateTodoList.

cecli/tools/utils/helpers.py[360-382]
cecli/tools/utils/base_tool.py[125-128]
cecli/helpers/responses.py[495-531]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`normalize_json_array()` calls `responses.try_parse_lenient_task_array(text)` even when `param_name` is not `"tasks"`. This makes the lenient, task-specific recovery logic reachable for unrelated tools/params whenever JSON parsing fails.

### Issue Context
`BaseTool` normalizes *all* tool list params through `normalize_json_array`, so this behavior is not limited to `UpdateTodoList`.

### Fix Focus Areas
- Only attempt `try_parse_lenient_task_array` when `param_name == "tasks"` (or when the tool name is UpdateTodoList, if available at call site).
- Keep current behavior unchanged for all other params.

### Fix Focus Areas (code pointers)
- cecli/tools/utils/helpers.py[360-382]
- cecli/tools/utils/base_tool.py[125-128]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Tasks regex drops fields 🐞 Bug ≡ Correctness
Description
try_parse_lenient_task_array only matches objects that end immediately after the task string, so
any task object with additional fields (e.g., current) won’t be parsed and will still trigger
ToolError in the unescaped-quote scenario. This breaks valid UpdateTodoList payloads because the
tool schema supports current and formatting logic depends on it.
Code

cecli/helpers/responses.py[R522-530]

Evidence
The regex requires the } right after the task string, so objects containing additional fields
(like UpdateTodoList’s current) won’t match. UpdateTodoList’s schema explicitly allows current
and its formatter checks it.

cecli/helpers/responses.py[495-530]
cecli/tools/update_todo_list.py[36-47]
cecli/tools/update_todo_list.py[50-83]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The lenient task parser uses a regex that only matches `{"done": <bool>, "task": <string>}` and requires the object to close right after `task`. Legitimate task objects with extra keys like `current` (supported by UpdateTodoList) won’t match and therefore won’t be repaired.

### Issue Context
UpdateTodoList supports `current` in its JSON schema and uses it to decide how tasks are rendered.

### Fix Focus Areas
- After stripping outer quotes / handling glued objects, try parsing `raw` via `responses.try_parse_json_value(raw)` / `json.loads(raw)` first.
- If parsed value is a list of dicts, return it directly (preserving extra fields like `current`).
- Keep regex as a last-resort fallback only when JSON parsing still fails.

### Fix Focus Areas (code pointers)
- cecli/helpers/responses.py[495-530]
- cecli/tools/update_todo_list.py[36-47]
- cecli/tools/update_todo_list.py[50-83]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@typo-app typo-app Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI Code Review 🤖

Files Reviewed: 3
Comments Added: 0
Lines of Code Analyzed: 68
Critical Issues: 0

PR Health: Excellent 🔥

Give 👍 or 👎 on each review comment to help us improve.

Comment on lines +526 to +530
rows = [
{"done": m.group(1).lower() == "true", "task": m.group(2).replace('\\"', '"')}
for m in pattern.finditer(raw)
]
return rows if rows else None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Partial task array accepted 🐞 Bug ☼ Reliability

try_parse_lenient_task_array returns any regex matches without verifying the full array was
parsed, so malformed inputs can be partially accepted and silently drop tasks. This can cause
UpdateTodoList to write an incomplete todo list without surfacing an error.
Agent Prompt
### Issue description
The lenient task parser builds `rows` from `pattern.finditer(raw)` and returns them as long as there’s at least one match. It does not validate that the remaining text is only brackets/commas/whitespace, so garbage between/after objects is ignored and tasks can be dropped silently.

### Issue Context
This function is invoked as a recovery path before raising `ToolError`, so returning partial results turns what should be a hard failure into a silent data-loss behavior.

### Fix Focus Areas
- After collecting matches, validate that the input contains *only*:
  - the matched objects
  - commas
  - surrounding `[` `]`
  - whitespace
  If anything else remains, return `None` (so the caller raises a ToolError).
- Add a test that includes trailing/embedded garbage and asserts an error is raised (no partial success).

### Fix Focus Areas (code pointers)
- cecli/helpers/responses.py[522-530]
- cecli/tools/utils/helpers.py[360-393]
- tests/tools/test_tool_arguments.py[248-259]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Local models emit glued tool args and unescaped inner quotes on tasks arrays.
Add try_parse_lenient_task_array and wire it through normalize_json_array so
UpdateTodoList survives qwen-style { "tasks": "[{...}]" }{ "path": "..." }.

Co-authored-by: Cursor <cursoragent@cursor.com>
@JessicaMulein

Copy link
Copy Markdown
Member Author

Superseded by upstream PR to cecli-dev/cecli (branch rebased on upstream/main).

@JessicaMulein

Copy link
Copy Markdown
Member Author

Upstream PR opened: cecli-dev#550

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant