https://bugzilla.redhat.com/show_bug.cgi?id=978233
Bug ID: 978233
Summary: perl-5.18: Regex \8 and \9 after literals no longer work
Product: Fedora
Version: rawhide
Component: perl
Severity: unspecified
Priority: unspecified
Assignee: mmaslano(a)redhat.com
Reporter: ppisar(a)redhat.com
QA Contact: extras-qa(a)fedoraproject.org
CC: cweyl(a)alumni.drew.edu, iarnell(a)gmail.com,
jplesnik(a)redhat.com, kasal(a)ucw.cz, lkundrak(a)v3.sk,
mmaslano(a)redhat.com,
perl-devel(a)lists.fedoraproject.org, ppisar(a)redhat.com,
psabata(a)redhat.com, rc040203(a)freenet.de,
tcallawa(a)redhat.com
There is a regression: \8 and \9 back-references have not worked since
v5.17.0-543-g726ee55. This has been partly fixed by:
commit f1e1b256c5c1773d90e828cca6323c53fa23391b
Author: Yves Orton <demerphq(a)gmail.com>
Date: Tue Jun 25 21:01:27 2013 +0200

Fix rules for parsing numeric escapes in regexes

Commit 726ee55d introduced better handling of things like \87 in a
regex, but as an unfortunate side effect broke latex2html.

The rules for handling backslashes in regexen are a bit arcane.
Anything starting with \0 is octal.
The sequences \1 through \9 are always backrefs.
Any other sequence is interpreted as a decimal number, and if at least
that many capture buffers are defined in the pattern at that point
then the sequence is a backreference. If, however, it is larger
than the number of buffers, the sequence is treated as an octal escape.

A consequence of this is that \118 could be a backreference to
the 118th capture buffer, or it could be the string "\11" . "8". In
other words, depending on the context we might even use a different
number of digits for the escape!

This also left an awkward edge case: multi-digit sequences
starting with 8 or 9, like m/\87/, which would result in us parsing
as though we had seen /87/ (in other words, a null byte at the start)
or worse, like /\x{00}87/, which is clearly wrong.

This patch fixes the cases where the capture buffers are defined,
and causes things like \87 or \97 to throw the same error that
/\8/ would. One might argue we should complain about an illegal
octal sequence, but this seems more consistent with an error like
/\9/ and IMO will be less surprising in an error message.
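
The rules described above can be modeled in a few lines. The following is only a sketch in Python, not the code from the Perl regex compiler: the names classify_escape and _split_octal are hypothetical, the input is assumed to be the full run of decimal digits following the backslash, and an octal escape is assumed to take at most three octal digits.

```python
OCTAL_DIGITS = "01234567"


def _split_octal(digits):
    """Split digits into (octal escape digits, trailing literal digits).

    Assumption: an octal escape consumes at most three octal digits.
    """
    i = 0
    while i < len(digits) and i < 3 and digits[i] in OCTAL_DIGITS:
        i += 1
    return ("octal", digits[:i], digits[i:])


def classify_escape(digits, groups_so_far):
    """Model how a backslash followed by `digits` is interpreted.

    groups_so_far is the number of capture buffers defined in the
    pattern before the escape. Returns ("backref", n) or
    ("octal", octal_digits, literal_rest); raises ValueError where the
    commit message says an error is thrown.
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected one or more decimal digits")
    # Anything starting with \0 is octal.
    if digits[0] == "0":
        return _split_octal(digits)
    # \1 through \9 are always backrefs (whether the group exists is
    # not checked in this sketch).
    if len(digits) == 1:
        return ("backref", int(digits))
    # Any other sequence is read as a decimal number and is a backref
    # if that many capture buffers are already defined.
    n = int(digits)
    if n <= groups_so_far:
        return ("backref", n)
    # Otherwise fall back to octal. A sequence starting with 8 or 9 has
    # no octal reading, so it is reported like a bad backreference.
    if digits[0] in "89":
        raise ValueError("reference to nonexistent group \\" + digits)
    return _split_octal(digits)
```

With this model, classify_escape("118", 118) gives ("backref", 118), while classify_escape("118", 3) gives ("octal", "11", "8"), which is the \118 ambiguity described above. classify_escape("87", 2) raises ValueError.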

This patch includes exhaustive tests of patterns of the form
/(a)\1/, /((a))\2/ etc., so that we don't break this again if we
change the logic more.
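
Patterns of that shape are easy to generate mechanically. The sketch below is in Python and is not the test code from the patch: the function names are hypothetical, and it assumes the series continues by adding one more level of nesting and referring to the innermost group. The sanity check uses Python's own re module, whose escape rules differ from Perl's, so it only confirms that each generated pattern is well formed and matches "aa".

```python
import re


def nested_backref_pattern(depth):
    """Build "(a)\\1", "((a))\\2", ... for the given nesting depth."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # The innermost group is opened last, so its number equals depth.
    return "(" * depth + "a" + ")" * depth + "\\" + str(depth)


def matches_doubled_a(depth):
    """Check the generated pattern against "aa" using Python's engine."""
    return re.fullmatch(nested_backref_pattern(depth), "aa") is not None


# The first three patterns in the series.
patterns = [nested_backref_pattern(d) for d in range(1, 4)]
```

Here patterns holds the strings (a)\1, ((a))\2 and (((a)))\3. Feeding the same strings to perl would be the real test of the behavior discussed in this bug.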