
change(web): adjust TokenizationCorrector spec 🚂 🔪#15955

Open
jahorton wants to merge 1 commit into change/web/multi-token-prediction-intermediates from change/web/adjust-tokenization-corrector-spec

Conversation

@jahorton jahorton commented May 13, 2026

Contributor

This PR makes a number of smaller adjustments to the TokenizationCorrector type to prepare it further for direct use in actual correction-searching processes:

  1. When part of the corrector's range is unable to yield a viable correction, the TokenizationCorrector behaves as follows:
    • If NO part can yield a viable correction, it will not return any correction result, signaling 'none'.
    • If at least one viable correction has been found, results will be produced, but with no actual correction set for the uncorrectable part.
  2. The corrector now tracks how many codepoints are currently considered 'correctable', which is useful for comparison against the total number of detected correction-edits.
    • A standing engine principle: if there are edits, and the number of edits matches the full codepoint length, that's a 100% replacement with none of the original text remaining. That case is very inefficient to consider and should thus be ignored.
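
The two behaviors above can be sketched roughly as follows. This is an illustrative sketch only, not the actual TokenizationCorrector code: every type and function name here (`CorrectablePart`, `collectCorrections`, `isFullReplacement`, and so on) is hypothetical, and it assumes that "correctable" codepoints are those belonging to parts that yielded a viable correction.

```typescript
// Hypothetical types for illustration; none of these names come from the
// Keyman codebase.
interface PartCorrection {
  text: string;      // corrected text for this part
  editCount: number; // number of correction-edits applied to this part
}

interface CorrectablePart {
  original: string;
  // undefined models a part that could not yield a viable correction
  correction?: PartCorrection;
}

interface CorrectionResult {
  // One entry per part; undefined where no correction is set.
  parts: (PartCorrection | undefined)[];
  totalEdits: number;
  correctableCodepoints: number;
}

// Counts Unicode codepoints rather than UTF-16 code units.
function codepointLength(text: string): number {
  return Array.from(text).length;
}

function collectCorrections(parts: CorrectablePart[]): CorrectionResult | undefined {
  // If NO part yields a viable correction, signal 'none'.
  if (!parts.some((p) => p.correction !== undefined)) {
    return undefined;
  }

  let totalEdits = 0;
  let correctableCodepoints = 0;
  for (const part of parts) {
    // Assumption: only parts with a viable correction count as 'correctable'.
    if (part.correction !== undefined) {
      totalEdits += part.correction.editCount;
      correctableCodepoints += codepointLength(part.original);
    }
  }

  // At least one part was viable: produce results, leaving the
  // uncorrectable parts without a correction.
  return {
    parts: parts.map((p) => p.correction),
    totalEdits,
    correctableCodepoints,
  };
}

// A result whose edit count equals the correctable codepoint count keeps
// none of the original text, so a caller could choose to skip it.
function isFullReplacement(result: CorrectionResult): boolean {
  return result.totalEdits > 0 && result.totalEdits === result.correctableCodepoints;
}
```

Under these assumptions, a caller would treat an `undefined` return as 'none', and would filter out any result for which `isFullReplacement` returns `true` before searching further.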

Build-bot: skip build:web
Test-bot: skip

keymanapp-test-bot Bot commented May 13, 2026


User Test Results

Test specification and instructions

User tests are not required

Test Artifacts

  • Web
    • KeymanWeb Test Home - build : all tests passed (no artifacts on BuildLevel "build")

@keymanapp-test-bot keymanapp-test-bot Bot changed the title change(web): adjust TokenizationCorrector spec change(web): adjust TokenizationCorrector spec 🚂 May 13, 2026
@keymanapp-test-bot keymanapp-test-bot Bot added this to the A19S29 milestone May 13, 2026
@github-actions github-actions Bot added the change Minor change in functionality, but not new label May 13, 2026
@keyman-server keyman-server modified the milestones: A19S29, A19S30 May 23, 2026
@jahorton jahorton force-pushed the change/web/multi-token-prediction-intermediates branch 2 times, most recently from 71a9162 to e1bba09 Compare June 2, 2026 21:30
@jahorton jahorton force-pushed the change/web/adjust-tokenization-corrector-spec branch from 1e505fe to 88298ef Compare June 2, 2026 21:48
@keymanapp-test-bot keymanapp-test-bot Bot changed the title change(web): adjust TokenizationCorrector spec 🚂 change(web): adjust TokenizationCorrector spec 🚂 🔪 Jun 2, 2026
@keyman-server keyman-server modified the milestones: A19S30, A19S31 Jun 8, 2026
@jahorton jahorton force-pushed the change/web/multi-token-prediction-intermediates branch from e1bba09 to acec50b Compare June 12, 2026 14:25
@jahorton jahorton force-pushed the change/web/adjust-tokenization-corrector-spec branch 3 times, most recently from bce38c7 to a6f3bc7 Compare June 12, 2026 16:05
Build-bot: skip build:web
Test-bot: skip
@jahorton jahorton force-pushed the change/web/adjust-tokenization-corrector-spec branch 2 times, most recently from 01b7965 to 245f43f Compare June 12, 2026 17:06
@jahorton jahorton requested a review from ermshiperete June 12, 2026 17:36
@jahorton jahorton marked this pull request as ready for review June 12, 2026 17:36

Projects

Status: Todo


2 participants