[Fedora-i18n-bugs] [Bug 499220] Coreutils i18n patch terribly affects performance with UTF-8 locales for sort, cut and others

bugzilla at redhat.com bugzilla at redhat.com
Mon Jan 30 10:16:38 UTC 2012


https://bugzilla.redhat.com/show_bug.cgi?id=499220

István Tóth <stoty at tvnet.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stoty at tvnet.hu

--- Comment #16 from István Tóth <stoty at tvnet.hu> 2012-01-30 05:16:33 EST ---
I've run Maurizio's testcase on Fedora 16, as well as my own scripts that
were affected by this problem, and found that the performance regression is
fixed now.

export LANG=en_US
time grep [0] test.txt >/dev/null

real 0m0.004s
user 0m0.002s
sys 0m0.002s

export LANG=en_US.UTF-8
time grep [0] test.txt >/dev/null

real 0m0.004s
user 0m0.003s
sys 0m0.001s

I have also run the attached textparse.sh, and the only significant differences
between en_US and en_US.UTF-8 were in
sed (5 sec vs 9 sec), cut (13 sec vs 1.5 sec), and perl (108 sec vs 44 sec).

Looks like cut is the only program in coreutils that still has a serious
performance problem with UTF-8 (at least among those tested in textparse.sh).
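
For anyone who wants to repeat this kind of per-locale comparison, here is a
minimal sketch, assuming bash (it relies on the `time` keyword and TIMEFORMAT).
The helper name bench_locale is hypothetical and is not taken from
textparse.sh. It sets LC_ALL rather than LANG, since LC_ALL overrides LANG and
any other LC_* variables that may already be set in the environment.

```shell
#!/bin/bash
# Hypothetical helper (not part of textparse.sh): run a command under a
# given locale and print "<locale> TAB <wall-clock seconds> TAB <command name>".
bench_locale() {
    local loc=$1
    shift
    # %R makes the bash `time` keyword print only the elapsed real time.
    local TIMEFORMAT='%R'
    local elapsed
    # The command's own stdout/stderr are discarded; only the timing line,
    # which `time` writes to the stderr of the brace group, is captured.
    elapsed=$( { time env LC_ALL="$loc" "$@" >/dev/null 2>&1; } 2>&1 )
    printf '%s\t%s\t%s\n' "$loc" "$elapsed" "$1"
}
```

It could be used like this, with the file and field choices being placeholders:

    bench_locale en_US       grep '[0]' test.txt
    bench_locale en_US.UTF-8 grep '[0]' test.txt
    bench_locale en_US.UTF-8 cut -c1-10 test.txt

A single run on a small file mostly measures process startup, so a larger
input or several repetitions would give more meaningful numbers.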
