[Bug 243541] default encoding is ascii, should be UTF-8, produces exceptions for i18n applications

bugzilla at redhat.com
Fri Jan 22 16:15:26 UTC 2010


https://bugzilla.redhat.com/show_bug.cgi?id=243541

Dave Malcolm <dmalcolm at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |CLOSED
         Resolution|                            |CANTFIX

--- Comment #7 from Dave Malcolm <dmalcolm at redhat.com> 2010-01-22 11:15:20 EST ---
I raised this on the upstream mailing list (python-dev), and submitted a patch
to change the tty/non-tty variation (http://bugs.python.org/issue7745).

Upstream strongly requested that I not make this change, and I'm going to honor
that request; I've withdrawn the feature request mentioned in comment #6.

An (over)simplified summary is that the situation is what it is, and that
consistency of behavior between different downstream distributions is more
important than making changes at this point in the lifecycle of Python 2.
Upstream feels that attempting to print a <unicode> instance containing code
points > U+007F to a standard stream can be wrong when that stream isn't a
terminal, and would prefer that application code be explicit about the
encoding to use, and fail if it is not (I'm not sure I agree with this, but I
don't want to diverge from upstream).
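
As an illustration of what "explicit about the encoding" could look like in
application code, here is a minimal sketch. The helper name write_encoded and
the choice of UTF-8 as the default are my assumptions for the example, not
something upstream prescribed:

```python
import io


def write_encoded(binary_stream, text, encoding="utf-8"):
    """Encode text explicitly and write the bytes to a binary stream.

    Hypothetical helper: the caller chooses the encoding, so the result
    does not depend on whether the stream is a terminal.
    """
    data = text.encode(encoding)
    binary_stream.write(data)
    return len(data)


# Demonstration against an in-memory stream.
buf = io.BytesIO()
write_encoded(buf, u"caf\u00e9\n")
```

On Python 3 the binary stream to pass would be sys.stdout.buffer; on Python 2
sys.stdout accepts byte strings directly.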

Unfortunately, this can lead to hidden bugs.  One way of ensuring better
consistency between the tty/non-tty development/deployment cases is to use the
PYTHONIOENCODING environment variable.  This value overrides the default
encoding of sys.std[in|out|err]; in pseudocode:

if PYTHONIOENCODING is set:
    encoding = PYTHONIOENCODING
else:
    if tty:
        encoding = locale encoding  # e.g. UTF-8
    else:
        encoding = ascii

so that it uses the supplied value in both cases without having this
tty/non-tty inconsistency.
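
The selection logic above can be written as a small runnable function. This is
only a sketch mirroring the pseudocode, not the interpreter's actual
implementation; the function name and its parameters are invented for the
example. It also assumes the documented "encodingname:errorhandler" form of the
variable and keeps only the encoding part:

```python
def pick_stream_encoding(environ, is_tty, locale_encoding):
    """Return the encoding the pseudocode would choose for a standard stream.

    environ         -- mapping of environment variables (e.g. os.environ)
    is_tty          -- whether the stream is attached to a terminal
    locale_encoding -- the locale's encoding (e.g. "UTF-8")
    """
    override = environ.get("PYTHONIOENCODING")
    if override:
        # The value may be "encoding" or "encoding:errors".
        encoding = override.partition(":")[0]
        if encoding:
            return encoding
    if is_tty:
        return locale_encoding
    return "ascii"
```

For instance, pick_stream_encoding(os.environ, sys.stdout.isatty(), "UTF-8")
gives the same answer whether or not output is piped, as long as
PYTHONIOENCODING is set.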

By setting PYTHONIOENCODING=ascii in the environment, you force Python to use
ascii for the encoding of the standard streams, and thus any encoding errors
that would occur when the script is deployed as a daemon/cronjob will surface
immediately during development, rather than during deployment.
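
A rough way to see this effect is to run a child interpreter with the variable
set and check that printing a non-ASCII character fails. The sketch below is an
assumption-laden illustration: the harness uses subprocess.run and therefore
needs Python 3.5 or later (whereas this bug concerns Python 2), and the helper
name and the one-line child script are made up for the example:

```python
import os
import subprocess
import sys


def run_child(io_encoding):
    """Run a child interpreter that prints one non-ASCII string.

    Hypothetical helper: PYTHONIOENCODING is set to io_encoding for the
    child only; the parent's environment is left untouched.
    """
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    return subprocess.run(
        [sys.executable, "-c", "print(u'caf\\u00e9')"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


result = run_child("ascii")
# A non-zero exit status means the child raised while printing.
failed_early = result.returncode != 0
```

With the variable set to ascii the child is expected to exit with a
UnicodeEncodeError traceback on stderr, whether or not its output is a
terminal.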

(Alternatively, you could set PYTHONIOENCODING=UTF-8 during both development
and deployment, but that assumption needs to be clearly stated in the
application's documentation; I haven't tested this latter approach).

Closing CANTFIX.
