[3.13] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080) #150780

Merged
encukou merged 1 commit into python:3.13 from encukou:backport-991224b-3.13 on Jun 2, 2026

@encukou
Member

@encukou encukou commented Jun 2, 2026

Replace the insertion sort used for canonical ordering of combining
characters with a hybrid approach: insertion sort for short runs (< 20)
and counting sort for longer runs, reducing worst-case complexity from
O(n^2) to O(n). This prevents denial of service via crafted Unicode
strings with many combining characters in alternating CCC order.
(cherry picked from commit 991224b)

Co-authored-by: Seth Larson <seth@python.org>
Co-authored-by: ch4n3-yoon <ch4n3.yoon@gmail.com>
Co-authored-by: Seokchan Yoon <13852925+ch4n3-yoon@users.noreply.github.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Maurycy Pawłowski-Wieroński <maurycy@maurycy.com>
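
For readers who want to see the idea in the description above in miniature, here is a rough Python sketch of a hybrid canonical ordering pass. It is not the C code from this PR: the names `SHORT_RUN_LIMIT`, `sort_run`, and `canonical_order` are hypothetical, and only the threshold of 20 and the insertion sort / counting sort split come from the description. It reorders runs of combining characters only and does not decompose anything, so it is not a full NFD implementation.

```python
import unicodedata

# Threshold taken from the PR description; the constant name is hypothetical.
SHORT_RUN_LIMIT = 20


def _insertion_sort_by_ccc(run):
    """Stable insertion sort by canonical combining class (CCC), O(n^2) worst case."""
    for i in range(1, len(run)):
        ch = run[i]
        ccc = unicodedata.combining(ch)
        j = i - 1
        # Strict comparison keeps characters with equal CCC in their original order.
        while j >= 0 and unicodedata.combining(run[j]) > ccc:
            run[j + 1] = run[j]
            j -= 1
        run[j + 1] = ch
    return run


def _counting_sort_by_ccc(run):
    """Stable counting (bucket) sort by CCC; CCC values fit in 0..255, so O(n)."""
    buckets = [[] for _ in range(256)]
    for ch in run:
        buckets[unicodedata.combining(ch)].append(ch)
    return [ch for bucket in buckets for ch in bucket]


def sort_run(run):
    """Pick the sort by run length: insertion sort for short runs, counting sort otherwise."""
    if len(run) < SHORT_RUN_LIMIT:
        return _insertion_sort_by_ccc(run)
    return _counting_sort_by_ccc(run)


def canonical_order(text):
    """Reorder each run of combining characters (CCC != 0) by ascending CCC."""
    out = []
    run = []
    for ch in text:
        if unicodedata.combining(ch) == 0:
            # A starter ends the current run of combining characters.
            out.extend(sort_run(run))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sort_run(run))
    return "".join(out)


# Alternating CCC order: U+0301 (above) followed by U+0316 (below), repeated.
crafted = "a" + "\u0301\u0316" * 50
ordered = canonical_order(crafted)
```

Both sorts are stable, which canonical ordering requires: characters with the same combining class must keep their relative order. For the `crafted` string, which contains no decomposable characters, `ordered` should match `unicodedata.normalize("NFD", crafted)`.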

@StanFromIreland
Member

Oh, ya' beat me to it! :-) I just wrote the same PR.

@encukou
Member Author

encukou commented Jun 2, 2026

Sorry!

@encukou encukou enabled auto-merge (squash) June 2, 2026 11:56
@StanFromIreland
Member

> Sorry!

Oh no worries, I'm the one who's sorry!

@encukou encukou merged commit ba785b8 into python:3.13 Jun 2, 2026
83 of 85 checks passed
@miss-islington-app

Thanks @encukou for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11, 3.12.
🐍🍒⛏🤖

@miss-islington-app

Sorry, @encukou, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker ba785b88add96acbf403d65cb157fb2743a33a32 3.12

@miss-islington-app

Sorry, @encukou, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker ba785b88add96acbf403d65cb157fb2743a33a32 3.11

@miss-islington-app

Sorry, @encukou, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker ba785b88add96acbf403d65cb157fb2743a33a32 3.10

Labels

needs backport to 3.10 (only security fixes), needs backport to 3.11 (only security fixes), needs backport to 3.12 (only security fixes), topic-unicode, type-security (A security issue)
