Login scripts?

Tomas Larsson tomas at tlec.se
Tue Apr 11 13:47:30 UTC 2006


> -----Original Message-----
> From: fedora-list-bounces at redhat.com 
> [mailto:fedora-list-bounces at redhat.com] On Behalf Of Tim
> Sent: Tuesday, April 11, 2006 2:19 PM
> To: For users of Fedora Core releases
> Subject: RE: Login scripts?
> 
> Tomas Larsson:
> >>> It seems that I'm using UTF8 and so on, however how to set the 
> >>> character set to 8859-1,
> 
> Tim:
> >> Are you sure that you need to?  Anything that you can display in
> >> iso-8859-1 is also a part of UTF-8 (and in the same location).
> 
> Tomas Larsson:
> > Yes, but the encoding is quite different, even in the lower 256
> > bytes, where if I'm correct (after some extensive searches) it seems
> > that, if something is using ISO8859, it is not possible to change it
> > directly to UTF8, and vice versa, without further processing.
> 
> Well, my reading of that issue over the years is as follows:
> 
> Beginning with ASCII (the real defined one, not others'
> redefinitions, like those who like to refer to something that
> doesn't really exist, calling it extended ASCII), ISO-8859-1
> extends it (it starts the same, and adds onto the end of it).
> Then, UTF-8 does the same (it starts the *same* as ISO-8859-1 and
> adds onto the end of it).  The characters are in the same positions,
> and up until you exceed 255 it is using the same codes (character
> number 255 in ISO-8859-1 is the same as character number 255 in
> UTF-8, and the same number is used to represent it).  It's only when
> you refer to higher numbers, such as 256, that you need to use more
> bits.
> 
> If correct, then for data that is ASCII or ISO-8859-1 it's directly
> equivalent with UTF-8 (for the same characters).  I know it's
> certainly true for ASCII and UTF-8 interchangeability, and can find
> documentation detailing it.  I'm 99% sure for the ISO-8859-1 side of
> things, but only have hearsay evidence about it to hand at the
> moment.  The nearest I can come to documentation about that is that
> the first 256 code points in "Unicode" are the same as ISO-8859-1,
> and extrapolating what I know about UTF-8 encoding of Unicode
> supports what I've said (it's a single byte up until 255, it only
> starts using more than one byte to represent characters above 255).
> 

Yes, the characters are at the same code points, but for the upper half
of the ISO8859 range (128 to 255) UTF8 uses two bytes per character where
ISO8859 uses one; only the ASCII range (0 to 127) is a single, identical
byte in both. Since the Swedish ÅÄÖ are below the 256 boundary but above
127, each is one byte in ISO8859 and two bytes in UTF8, and that created
a problem for me.
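To make the byte counts concrete, here is a small Python sketch (not
something I used at the time, just an illustration). The helper name
encoded_forms() is made up for this example; the letters are written as
escapes so the script itself stays plain ASCII.

```python
# Illustration only: compare the byte representation of a few characters
# in ISO-8859-1 and in UTF-8. encoded_forms() is a hypothetical helper.

def encoded_forms(ch):
    """Return (ISO-8859-1 bytes, UTF-8 bytes) for a single character."""
    return ch.encode("iso-8859-1"), ch.encode("utf-8")

# "A" is plain ASCII; the three escapes are the Swedish letters
# A-ring (U+00C5), A-diaeresis (U+00C4) and O-diaeresis (U+00D6).
for ch in ["A", "\u00c5", "\u00c4", "\u00d6"]:
    latin1, utf8 = encoded_forms(ch)
    # Print the code point and the hex bytes of each encoding.
    print(f"U+{ord(ch):04X}  ISO-8859-1: {latin1.hex()}  UTF-8: {utf8.hex()}")
```

The ASCII letter comes out as the same single byte in both encodings,
while each of the Swedish letters is one byte in ISO-8859-1 and two bytes
in UTF-8, even though the code point number is the same in both.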

Since MySQL uses the system settings, an Ö that I typed in was converted
to the 2-byte UTF8 form, and it was subsequently misinterpreted. It could
be due to internal stuff in KDE and MySQL together with Qt; however, since
I have managed to solve it (maybe not in a very nice way), I will have to
live with it.
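The general kind of misinterpretation can be sketched in Python as below.
This only shows what happens when UTF-8 bytes are read back as
ISO-8859-1; it is an assumption that my KDE/MySQL/Qt setup failed in this
way, and the function names are made up for the example.

```python
# Illustration only: what happens when UTF-8 bytes are decoded as if
# they were ISO-8859-1. Both function names are hypothetical.

def misread_as_latin1(text):
    """Encode text as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")

def undo_misread(garbled):
    """Reverse the mistake: recover the bytes, then decode them as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")

garbled = misread_as_latin1("\u00d6")   # O-diaeresis, U+00D6
print(len(garbled), ascii(garbled))     # one letter has become two characters
print(ascii(undo_misread(garbled)))     # the original letter again
```

One typed letter turns into two unrelated characters, and the damage is
reversible only as long as the wrongly decoded bytes are kept intact.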

I got this info from:
http://gedcom-parse.sourceforge.net/doc/encoding.html#The_character_encoding_problem
http://www.intertwingly.net/stories/2004/04/14/i18n.html
http://www.debian.org/doc/manuals/intro-i18n/

With best regards

Tomas Larsson
Sweden
http://www.naks.mine.nu for downloads etc.
ftp://ktl.mine.nu for uploads. Or use the free www.yousendit.com service.

Verus Amicus Est Tamquam Alter Idem 





