Fix tag_no_case panic on case-folding byte-length changes #1885
Open
Booyaka101 wants to merge 1 commit into
tag_no_case split the input at the tag's byte length, but the matched prefix can have a different length when case folding changes a character's UTF-8 length (e.g. U+212A KELVIN SIGN folds to ASCII 'k'). The split could then land inside a multi-byte char and panic.
Resolve the split point with Input::slice_index over the input's own elements instead.
Fixes rust-bakery#1883
tag_no_case on &str panics with "byte index N is not a char boundary" when the matched input prefix has a different byte length than the tag, e.g. U+212A KELVIN SIGN (3 bytes) folds to ASCII 'k' (1 byte). compare_no_case correctly matches the two, but TagNoCase then split the input at tag.input_len() (the tag's byte length) instead of the matched prefix's, slicing inside the multi-byte character.
This resolves the split point with Input::slice_index over the input's own elements, so the cut always lands on a boundary. &[u8] and ASCII &str are unaffected (element count equals byte length there). Both complete and streaming go through the shared TagNoCase, so both are fixed.
Reachable from any parser using tag_no_case on attacker-controlled &str (a single character can crash the parser), so it carries a small DoS angle.
Reproducing test added to tests/issues.rs.
Fixes #1883, fixes #1884. The diagnosis is @zhangjiashuo-cs's from the issues; happy to step aside if they'd rather submit it.
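For readers who want to see the failure mode without pulling in nom, here is a minimal std-only sketch of the idea. It is not the code in this PR: `matched_prefix_len` and `split_no_case` are hypothetical helpers, and the char-by-char lowercase comparison is only an assumption about how a case-insensitive match might be done. The point it illustrates is that the split index has to come from the input's own characters, not from the tag's byte length.

```rust
/// Hypothetical helper (not part of nom): returns the byte length of the
/// prefix of `input` that matches `tag` case-insensitively, or `None` if
/// the prefix does not match or the input is too short.
fn matched_prefix_len(input: &str, tag: &str) -> Option<usize> {
    let mut input_chars = input.char_indices();
    let mut end = 0;
    for t in tag.chars() {
        // Walk the input one char at a time, in step with the tag.
        let (idx, c) = input_chars.next()?;
        // Compare the lowercase expansions of both chars.
        if !c.to_lowercase().eq(t.to_lowercase()) {
            return None;
        }
        // Track the end of the matched prefix in *input* bytes.
        end = idx + c.len_utf8();
    }
    Some(end)
}

/// Hypothetical helper: splits `input` into (remaining, matched) at the
/// end of the matched prefix, which is always a char boundary.
fn split_no_case<'a>(input: &'a str, tag: &str) -> Option<(&'a str, &'a str)> {
    let end = matched_prefix_len(input, tag)?;
    Some((&input[end..], &input[..end]))
}
```

With the input "\u{212A}elvin" and the tag "k", the tag's byte length is 1, and `input.is_char_boundary(1)` is false, so `&input[..1]` would panic. `matched_prefix_len` returns `Some(3)` for the same pair, and `split_no_case` yields the remainder "elvin" with the KELVIN SIGN as the matched part.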