[icu] apply upstream fix to regexcmp code which was causing Chromium crashes

Fri Aug 31 14:12:27 UTC 2012

commit ed001727b7376fabd4c6d11fe6bb6c71b8025e09
Author: Tom Callaway <spot at fedoraproject.org>
Date:   Fri Aug 31 10:13:19 2012 -0400

    apply upstream fix to regexcmp code which was causing Chromium crashes

 icu.9283.regexcmp.crash.patch |   36 ++++++++++++++++++++++++++++++++++++
 icu.spec                      |    7 ++++++-
 2 files changed, 42 insertions(+), 1 deletions(-)
---
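
Note on the change below: the old URX_STRING_I branch set currentLen to INT32_MAX, treating a case-insensitive literal as having no length bound. The new branch instead adds the compiled (case-folded) string's length through safeIncrement. The body of safeIncrement is not part of this patch, so the following is only a minimal sketch, assuming it is an overflow-safe add that sticks at INT32_MAX. The names saturatingAdd and maxLengthOfLiterals are hypothetical and do not come from ICU.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for safeIncrement (assumption: not ICU's actual code).
// Returns val + delta, or INT32_MAX if the sum would overflow.
// Both arguments are assumed to be non-negative, as match lengths are.
static int32_t saturatingAdd(int32_t val, int32_t delta) {
    const int32_t kMax = std::numeric_limits<int32_t>::max();
    return (delta <= kMax - val) ? val + delta : kMax;
}

// Hypothetical helper: accumulates the lengths of consecutive literal pieces
// the way the patched branch accumulates currentLen. Once the running total
// reaches INT32_MAX ("unbounded"), it stays there.
static int32_t maxLengthOfLiterals(const std::vector<int32_t>& pieceLengths) {
    int32_t currentLen = 0;
    for (int32_t len : pieceLengths) {
        currentLen = saturatingAdd(currentLen, len);
    }
    return currentLen;
}
```

With this sketch, two literals of lengths 3 and 5 give a bound of 8, while any piece that is already INT32_MAX keeps the total at INT32_MAX rather than wrapping to a negative value.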

diff --git a/icu.9283.regexcmp.crash.patch b/icu.9283.regexcmp.crash.patch
new file mode 100644
index 0000000..9cf7e3e
--- /dev/null
+++ b/icu.9283.regexcmp.crash.patch
@@ -0,0 +1,36 @@
+--- icu/source/i18n/regexcmp.cpp	(revision 31398)
++++ icu/source/i18n/regexcmp.cpp	(revision 31782)
+@@ -3307,8 +3307,29 @@
+ 
+         case URX_STRING_I:
+-            // TODO:  Is the case-folded string the longest?
+-            //        If so we can optimize this the same as URX_STRING.
+-            loc++;
+-            currentLen = INT32_MAX;
++            // TODO:  This code assumes that any user string that matches will be no longer
++            //        than our compiled string, with case insensitive matching.
++            //        Our compiled string has been case-folded already.
++            //
++            //        Any matching user string will have no more code points than our
++            //        compiled (folded) string.  Folding may add code points, but
++            //        not remove them.
++            //
++            //        There is a potential problem if a supplemental code point 
++            //        case-folds to a BMP code point.  In this case our compiled string
++            //        could be shorter (in code units) than a matching user string.
++            //
++            //        At this time (Unicode 6.1) there are no such characters, and this case
++            //        is not being handled.  A test, intltest regex/Bug9283, will fail if
++            //        any problematic characters are added to Unicode.
++            //
++            //        If this happens, we can make a set of the BMP chars that the
++            //        troublesome supplementals fold to, scan our string, and bump the
++            //        currentLen one extra for each that is found.
++            //
++            {
++                loc++;
++                int32_t stringLenOp = (int32_t)fRXPat->fCompiledPat->elementAti(loc);
++                currentLen = safeIncrement(currentLen, URX_VAL(stringLenOp));
++            }
+             break;
+ 
diff --git a/icu.spec b/icu.spec
index f429816..dc02c2a 100644
--- a/icu.spec
+++ b/icu.spec
@@ -1,6 +1,6 @@
 Name:      icu
 Version:   49.1.1
-Release:   4%{?dist}
+Release:   5%{?dist}
 Summary:   International Components for Unicode
 Group:     Development/Tools
 License:   MIT and UCD and Public Domain
@@ -13,6 +13,7 @@ Requires: lib%{name} = %{version}-%{release}
 Patch1: icu.8198.revert.icu5431.patch
 Patch2: icu.8800.freeserif.crash.patch
 Patch3: icu.7601.Indic-ccmp.patch
+Patch4: icu.9283.regexcmp.crash.patch
 
 %description
 Tools and utilities for developing with icu.
@@ -55,6 +56,7 @@ BuildArch: noarch
 %patch1 -p2 -R -b .icu8198.revert.icu5431.patch
 %patch2 -p1 -b .icu8800.freeserif.crash.patch
 %patch3 -p1 -b .icu7601.Indic-ccmp.patch
+%patch4 -p1 -b .icu9283.regexcmp.crash.patch
 
 %build
 cd source
@@ -151,6 +153,9 @@ make %{?_smp_mflags} -C source check
 %doc source/__docs/%{name}/html/*
 
 %changelog
+* Fri Aug 31 2012 Tom Callaway <spot at fedoraproject.org> - 49.1.1-5
+- apply upstream fix (bug 9283) for regexcmp crash causing Chromium segfaults
+
 * Thu Jul 19 2012 Fedora Release Engineering <rel-eng at lists.fedoraproject.org> - 49.1.1-4
 - Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild