glibc January 2016

glibc@lists.fedoraproject.org

2 participants
3 discussions

Adding a C.UTF-8 locale to our glibc packages

by Mike FABIAN

This patch adds a C.UTF-8 locale to our glibc packages as a folder, /usr/lib/locale/C.utf8/. This way the locale is effectively “unremovable”: it is not affected by the --install-langs option of build-locale-archive, because build-locale-archive completely ignores folders in /usr/lib/locale/ which do not have a “_” in their name.

This is similar to how Debian did it. I added the LC_* sections which are missing in the Debian source, though, because localedef prints a warning when sections are missing and I don’t want to use the “-c” option to force the output in spite of the warnings.

This locale is very close in behaviour to C/POSIX, but LC_CTYPE is simply copied from glibc/localedata/locales/i18n (which we have just updated for Unicode 8.0.0). So C.UTF-8 gives us a locale which is very much like the C locale but uses UTF-8 encoding, and tools like "ls" will display all printable Unicode characters instead of question marks for everything non-ASCII.

Sorting (LC_COLLATE) is done strictly in Unicode code point order, which gives the same sorting for the ASCII range as the traditional C locale. Debian does it like this:

    LC_COLLATE
    order_start forward
    <U0000>
    <U0001>
    (all code points listed individually, leaving out the unassigned ranges)
    <U10FFFE>
    <U10FFFF>
    UNDEFINED
    order_end
    END LC_COLLATE

(more than 300000 lines of code points listed). I used this instead:

    LC_COLLATE
    order_start forward
    <U0000> .. <UFFFF>
    <U10000> .. <U1FFFF>
    <U20000> .. <U2FFFF>
    <UE0000> .. <UEFFFF>
    <UF0000> .. <UFFFFF>
    <U100000> .. <U10FFFF>
    UNDEFINED
    order_end
    END LC_COLLATE

This makes the source much shorter and more readable. The resulting binary is the same, and so is its size: the complete binary locale needs about 1.8M, almost all of it because of LC_COLLATE (same on Debian). Not skipping the ranges currently unassigned in Unicode would make the locale about 6.5M big.
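
As an illustration only, the ellipsis ranges in the shorter LC_COLLATE section above can be rebuilt and counted with a few lines of Python. This is a sketch, not part of the patch: the helper names (ucs_symbol, collate_section, covered, planes) are made up for this example, and the ranges are copied from the section quoted in the message.

```python
# Inclusive code point ranges, copied from the LC_COLLATE section in the message.
RANGES = [
    (0x0000, 0xFFFF),
    (0x10000, 0x1FFFF),
    (0x20000, 0x2FFFF),
    (0xE0000, 0xEFFFF),
    (0xF0000, 0xFFFFF),
    (0x100000, 0x10FFFF),
]


def ucs_symbol(cp):
    """Format a code point the way the message writes it, e.g. <U0041>."""
    return "<U{:04X}>".format(cp)


def collate_section(ranges):
    """Build the text of an LC_COLLATE section from inclusive ranges."""
    lines = ["LC_COLLATE", "order_start forward"]
    for first, last in ranges:
        lines.append("{} .. {}".format(ucs_symbol(first), ucs_symbol(last)))
    lines += ["UNDEFINED", "order_end", "END LC_COLLATE"]
    return "\n".join(lines)


def covered(ranges):
    """Number of code points listed by the ranges."""
    return sum(last - first + 1 for first, last in ranges)


def planes(ranges):
    """Unicode planes (code point >> 16) touched by the ranges."""
    found = set()
    for first, last in ranges:
        found.update(range(first >> 16, (last >> 16) + 1))
    return sorted(found)


if __name__ == "__main__":
    print(collate_section(RANGES))
    print("code points listed:", covered(RANGES))
    print("planes listed:", planes(RANGES))
```

The ranges list six whole planes (0, 1, 2, 14, 15 and 16) and leave out planes 3 to 13, which is the part the message describes as currently unassigned.
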
Listing the unassigned ranges as well seems theoretically better to me, but sorting unassigned code points by code point order probably has little benefit.

Actually I think this should be enough:

    LC_COLLATE
    order_start forward
    UNDEFINED
    order_end
    END LC_COLLATE

because of:

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html

opengroup> The symbol UNDEFINED shall be interpreted as including all coded
opengroup> character set values not specified explicitly or via the ellipsis
opengroup> symbol. Such characters shall be inserted in the character collation
opengroup> order at the point indicated by the symbol, and in ascending order
opengroup> according to their coded character set values. If no UNDEFINED symbol is
opengroup> specified, and the current coded character set contains characters not
opengroup> specified in this section, the utility shall issue a warning message and
opengroup> place such characters at the end of the character collation order.

But unfortunately it does not work like that in glibc, which is probably a bug. I tried to look at the code to find out why UNDEFINED does not work as specified in the standard but could not figure it out yet. UNDEFINED currently does not work as specified at all: most locale sources have UNDEFINED somewhere, but the characters not specified explicitly do not get inserted where UNDEFINED is; instead they end up right at the top (before all other characters).

--
Mike FABIAN <mfabian(a)redhat.com>
☏ Office: +49-69-365051027, internal 8875027
Lack of sleep is the enemy of good work.
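
To make the difference concrete, here is a toy model in Python of the two placements of UNDEFINED discussed in the message: the one the quoted standard text asks for, and the one the message reports seeing. It is a sketch under stated assumptions. It works on a small made-up character set, the function names are hypothetical, and reported_glibc_order only mimics the behaviour described in the message; it is not derived from the glibc source.

```python
UNDEFINED = "UNDEFINED"  # stand-in for the keyword in a locale source


def posix_order(spec, charset):
    """Collation order as described in the quoted standard text.

    spec    -- list of explicitly ordered characters, optionally
               containing the UNDEFINED marker once
    charset -- iterable with every character of the coded character set
    """
    explicit = [s for s in spec if s != UNDEFINED]
    listed = set(explicit)
    # Characters not specified explicitly, in ascending code point order.
    rest = sorted(c for c in charset if c not in listed)
    if UNDEFINED not in spec:
        # No UNDEFINED: unspecified characters go to the end
        # (a real utility would also issue a warning here).
        return explicit + rest
    pos = spec.index(UNDEFINED)
    return spec[:pos] + rest + spec[pos + 1:]


def reported_glibc_order(spec, charset):
    """Behaviour reported in the message: unspecified characters
    end up before all others, wherever UNDEFINED was written."""
    explicit = [s for s in spec if s != UNDEFINED]
    listed = set(explicit)
    rest = sorted(c for c in charset if c not in listed)
    return rest + explicit


if __name__ == "__main__":
    charset = "abcde"
    spec = ["c", UNDEFINED, "a"]
    print(posix_order(spec, charset))
    print(reported_glibc_order(spec, charset))
```

With a spec consisting of UNDEFINED alone, posix_order returns the whole character set in code point order, which is why the one-line LC_COLLATE section should be sufficient if UNDEFINED behaved as specified.
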

8 years, 2 months


Fedora Rawhide has group merge support.

by Carlos O'Donell

Stephen,

Your group merge feature has been added to Fedora Rawhide.

Official rawhide build in progress:
http://koji.fedoraproject.org/koji/taskinfo?taskID=12536544

Even if it doesn't go into 2.23 I expect it will go into 2.24, and because it doesn't add new ABI/API I plan to keep it in Rawhide and include it in F24, even if it's not in upstream yet.

Cheers,
Carlos.

8 years, 3 months


Tunables in rawhide?

by Siddhesh Poyarekar

Carlos, Florian,

How adventurous are you guys about me adding the tunables patches to rawhide? It is relatively harmless since it does not have any ABI implications and the patches can be backed out if they cause problems.

The only implication is that of API; we haven't decided on the method to get input from users (GLIBC_TUNABLES vs envvar for each tunable vs some other method) and that could change as discussions progress upstream. I think we could manage that by specifically announcing on the fedora-devel mailing list that it could change in future. Or I could just add the first patch to mass-smoke-test it in rawhide and hold on to the second patch till there is consensus upstream.

Thoughts?

Siddhesh

8 years, 3 months

