Parse SVG arc paths with packed large-arc/sweep flags (fixes #129)#245
Open
gaoflow wants to merge 1 commit into
Open
Parse SVG arc paths with packed large-arc/sweep flags (fixes #129)#245gaoflow wants to merge 1 commit into
gaoflow wants to merge 1 commit into
Conversation
parse_path() raised IndexError on valid arc commands whose large-arc-flag and sweep-flag are written without a separator before the next number (e.g. 'A 5 5 0 0110 0', as emitted by SVGO and accepted by every browser). The tokenizer used FLOAT_RE.findall for every command, so it greedily read the packed flag pair '0110' as the single number 110, leaving the arc branch's seven pops to run the token list dry. Tokenize arc arguments flag-aware: read the two flag fields one '0'/'1' character at a time. The token stream for a packed arc is then identical to its spaced-out form, which the existing parser already handles. Fixes mathandy#129.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #129.
The bug
parse_path()raisesIndexErroron valid SVG arc commands whoselarge-arc-flag/sweep-flagare written without a separator before the following number:Per the SVG path grammar, each flag is a single
"0"/"1"token that may be immediately followed by the next number with no separator, so0110 0means flag0, flag1, number10, number0. Every browser and librsvg accept this, and SVGO emits it to save bytes — which is exactly the real-world path in #129 (the simple-icons "emlakjet" icon,…a3.543 3.543 0 00-1.267…).Root cause
_tokenize_pathtokenizes every command's arguments withFLOAT_RE.findall:FLOAT_REis greedy and command-agnostic, so for an arc it reads the packed flag pair0110as the single number110. The arc branch then pops seven values (rx, ry, rotation, large_arc, sweep, x, y) but only five tokens exist, soelements.pop()empties the list →IndexErroratpath.py:3361.The fix
Tokenize arc arguments flag-aware: the two flag fields are read one
"0"/"1"character at a time, while every other field stays a normal (greedy) number. The resulting token stream for a packed arc is byte-identical to its spaced-out form, so the existing pop-based parser is unchanged — only the tokenizer learned that arc flags are single characters. Non-arc commands keep usingFLOAT_RE.findallexactly as before.This mirrors the upstream
svg.pathlibrary (from which this parser was originally copied), which already tokenizes arc flags this way.Tests
test_arc_flag_packingasserts that six packed forms (absolute and relative, a negative coordinate directly after a flag, and a two-arc group with the second packed) parse identically to their spaced-out equivalents, that the flags carry through to the resultingArc, and that the real #129 simple-icons path parses instead of raising. The new test fails onmasterwith the originalIndexError; the full suite (102 tests) passes with the fix.