Anyone using non-UTF-8 locale(s)?

Fri Jul 9 01:25:37 UTC 2004

On Thu, 2004-07-08 at 15:19, Ian Pilcher wrote:
> I ran into a problem setting up an account for my wife with
> system-config-users.
> 
> Her name includes a non-ASCII character (U+00E9).  I was able to use the
> KDE character selector to enter the character, into the "Full Name" text
> field, but when I pressed the "OK" button, it complained about a non-
> ASCII character and refused to accept it.
> 
> Undaunted, I manually edited /etc/passwd, and everything appears to be
> working just fine ... except system-config-users that is.  Instead of a
> small E with an acute accent, it shows a capital A with a tilde followed
> by a copyright symbol.  Clearly, it is interpreting the UTF-8
> representation of U+00E9 (0xC3 0xA9) as some sort of 8-bit encoding
> (although I'm not sure what encoding uses 0xA9 to represent a copyright
> symbol).
> 
> I found the following in bugzilla:
> 
>      https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=74058
> 
> Comment #8 indicates that the ASCII-only restriction exists because
> there's no such thing as a global locale on a UNIX system, and "trying
> to deal with all these different encodings is just a nightmare."
> 
> I don't think that my wife will care, however.
> 
> If everyone uses UTF-8, however, this isn't a problem.  So I'd like to
> hear from anyone who's using a non-UTF-8 locale on Fedora Core 2.
> 
> Thanks!
> 

This sounds like the tool that writes the passwd file is refusing to
accept the "invalid" character because it is not standard ASCII.  The
comment in bugzilla applies globally, not just to this one tool.
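
To make the two symptoms concrete, here is a minimal Python sketch.  It
is not the actual system-config-users code; the function name is_ascii
and the sample name are made up for illustration.  It shows an
ASCII-only check of the kind the tool seems to apply, and how the UTF-8
bytes for U+00E9 (0xC3 0xA9) turn into a capital A with a tilde plus a
copyright sign when read as ISO-8859-1, which is one 8-bit encoding that
maps 0xA9 to the copyright sign.

```python
# Hypothetical full name, used only for illustration.
full_name = "Ren\u00e9e"


def is_ascii(text: str) -> bool:
    """Return True if every character fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# U+00E9 encoded as UTF-8 is the two bytes 0xC3 0xA9.
utf8_bytes = "\u00e9".encode("utf-8")

# Reading those two bytes as ISO-8859-1 yields two separate characters:
# U+00C3 (capital A with tilde) and U+00A9 (copyright sign).
misread = utf8_bytes.decode("iso-8859-1")

if __name__ == "__main__":
    print(is_ascii("Ian"), is_ascii(full_name))
    print(utf8_bytes, [hex(ord(ch)) for ch in misread])
```

Whether the tool really decodes the field as ISO-8859-1 is a guess on my
part; the sketch only shows that doing so reproduces what was reported.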

Be very careful about insisting that non-ASCII characters be accepted in
the passwd file.  This file contains data that by design must be
universal across all locales, and making it locale-specific will create
nightmares for system admins and programmers (as well as for distros
that are 'locale-specific').  The maintenance headaches would be huge.

An example would be a company that has offices around the world and uses
a single site for access control.  Users in each locale have their
machines configured for that locale.  How is the access control system
supposed to know which locale's settings to use before granting access?
Restricting the critical data used for this to universal standard ASCII
is much better than allowing locale-specific data.
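
A rough Python sketch of why this matters: the same stored byte means
different things depending on which encoding the reading machine
assumes, while pure ASCII bytes come out the same everywhere.  The three
encodings and the helper name decode_everywhere are picked only as
examples, not taken from any real access control setup.

```python
# Encodings chosen only as examples of what different sites might use.
ENCODINGS = ["iso-8859-1", "iso-8859-7", "koi8-r"]


def decode_everywhere(raw: bytes) -> dict:
    """Decode the same bytes under each example encoding."""
    return {name: raw.decode(name) for name in ENCODINGS}


# A single non-ASCII byte, as it might be stored by a Latin-1 machine.
non_ascii = decode_everywhere(b"\xe9")

# A pure ASCII field decodes identically under all three encodings.
ascii_only = decode_everywhere(b"ipilcher")

if __name__ == "__main__":
    for name in ENCODINGS:
        print(name, repr(non_ascii[name]), repr(ascii_only[name]))
```

With nothing in the file saying which encoding was used to write it, the
reading side has to guess, and only the ASCII field is safe from a wrong
guess.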

Machines that stay in one locale and never interact with another locale
may be able to use non-ASCII data, but maintaining them would be
non-standard.