Hi,
Ok, what have you done this time with charsets..
man less:
-Kcharset Causes less to use this charset instead of a charset defined in the JLESSCHARSET or LESSCHARSET environment variable.
($:~) echo "åöä hi mom" >moreisless
($:~) less moreisless
&&& hi mom
($:~) more moreisless
åöä hi mom
So how to fix this?
Jani Ollikainen wrote:
Ok, what have you done this time with charsets..
'less' works fine here, on a fresh "Personal Desktop" installation (well, that plus Japanese language support).
[gordon@vagabond:~]$ set | egrep 'LANG|^LC|LESS'
LANG=en_US.UTF-8
LANGVAR=en_US.UTF-8
LESSOPEN='|/usr/bin/lesspipe.sh %s'
[gordon@vagabond:~]$ echo "åöä hi mom" >moreisless
'less' displays what it should.
man less: -Kcharset
Not sure why that's relevant given your message.
($:~) echo "åöä hi mom" >moreisless
What shell is that?
On Fri, 2003-11-28 at 20:17, Jani Ollikainen wrote:
Hi,
Ok, what have you done this time with charsets..
Actually, the question is what you have done with your charsets.
($:~) echo "åöä hi mom" >moreisless
($:~) less moreisless
By default you are echoing UTF-8 characters into the file, that is, characters with a two-byte representation, whereas LATIN-1 is a charset with one-byte characters.
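As a rough illustration of the one-byte versus two-byte point, the sketch below counts the bytes of "ä" in both encodings. The bytes are written as octal escapes so the result does not depend on the terminal or locale; the variable names are made up for this example.

```shell
# "ä" as UTF-8 is the two bytes 0xC3 0xA4; as ISO-8859-1 it is the single byte 0xE4.
# tr strips the padding that some wc implementations add around the count.
utf8_bytes=$(printf '\303\244' | wc -c | tr -d ' ')
latin1_bytes=$(printf '\344' | wc -c | tr -d ' ')
echo "UTF-8: $utf8_bytes byte(s), Latin-1: $latin1_bytes byte(s)"
```

A file written from a Latin-1 terminal should therefore be one byte per accented letter, and a file written from a UTF-8 terminal two.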
&&& hi mom
($:~) more moreisless
åöä hi mom
So how to fix this?
Remove LATIN1 or use LANG=fi_FI@euro as your locale.
On Sun, Nov 30, 2003 at 11:02:55AM +0200, Mauri Sahlberg wrote:
($:~) echo "åöä hi mom" >moreisless
By default you are echoing utf-8 characters, that is characters with two-byte representation to a file and LATIN-1 is charset with one byte characters.
Not really, i was a little bit drunk so forgot to mention my locale settings:)
($:~) locale
LANG=C
LC_CTYPE=fi_FI@euro
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=fi_FI@euro
LC_MONETARY=fi_FI@euro
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=
So they should be only one byte characters.
Remove LATIN1 or use LANG=fi_FI@euro as your locale.
Ok.
($:~) export LESSCHARSET=""
($:~) less moreisless
åöä hi mom
It works.. So ok, but I don't get the logic in this.. RH7.3 did need that, and now it isn't needed with the same locale settings.
ps. very annoying to speak two languages at the same time..
Jani Ollikainen wrote:
It works.. So ok, but I don't get the logic in this.. RH7.3 did need that, and now it isn't needed with the same locale settings.
That's exactly the logic. If you use a UTF-8 locale, then you no longer need any additional "special" locale settings for stuff to work right.
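To see which character set the current locale actually declares, and whether LESSCHARSET is set on top of it, something like the following sketch may help. It assumes the `locale` utility is installed; the variable name `charmap` and the fallback string are made up for this example, and the printed values depend entirely on the environment.

```shell
# Ask the C library which character map the active LC_CTYPE implies.
# Falls back to a placeholder if the locale utility is missing.
charmap=$(locale charmap 2>/dev/null || echo "unknown")
echo "locale charmap: $charmap"

# Show LESSCHARSET, or a marker if it is unset or empty.
echo "LESSCHARSET: ${LESSCHARSET:-<unset or empty>}"
```

If the two disagree (for example a single-byte charmap with LESSCHARSET naming a different set), that mismatch is a plausible place to start looking.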
On Sun, Nov 30, 2003 at 01:57:11PM -0800, Gordon Messmer wrote:
It works.. So ok, but I don't get the logic in this.. RH7.3 did need that, and now it isn't needed with the same locale settings.
That's exactly the logic. If you use a UTF-8 locale, then you no longer need any additional "special" locale settings for stuff to work right.
($:~) cat /etc/sysconfig/i18n
LANG="C"
SYSFONT="lat9w-08"
SYSFONTACM="iso15"
LC_CTYPE="fi_FI@euro"
LC_COLLATE="fi_FI@euro"
LC_MONETARY="fi_FI@euro"
($:~) locale
LANG=C
LC_CTYPE=fi_FI@euro
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=fi_FI@euro
LC_MONETARY=fi_FI@euro
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=
($:~) echo -n ä>thingie
($:~) ls -la thingie
-rw-r--r-- 1 bestis users 1 Dec 1 00:54 thingie
Where's the UTF-8? Seems only 1 byte to me.
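Beyond the file size, the bytes themselves can be inspected with `od`. The sketch below feeds known byte sequences through a pipe rather than reading the `thingie` file, so that it is reproducible anywhere; the variable names are made up for this example.

```shell
# Hex-dump the bytes; -An drops the offset column, -tx1 prints one byte per field.
latin1_hex=$(printf '\344' | od -An -tx1 | tr -d ' \n')      # ISO-8859-1 "ä"
utf8_hex=$(printf '\303\244' | od -An -tx1 | tr -d ' \n')    # UTF-8 "ä"
echo "Latin-1: $latin1_hex, UTF-8: $utf8_hex"
```

Running `od -An -tx1 thingie` on the real file would be expected to show a single `e4` if the terminal wrote Latin-1 (or Latin-9), and `c3 a4` if it wrote UTF-8.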
Jani Ollikainen wrote:
On Sun, Nov 30, 2003 at 01:57:11PM -0800, Gordon Messmer wrote:
That's exactly the logic. If you use a UTF-8 locale, then you no longer need any additional "special" locale settings for stuff to work right.
($:~) cat /etc/sysconfig/i18n
LANG="C"
LC_CTYPE="fi_FI@euro"
...
($:~) echo -n ä>thingie
($:~) ls -la thingie
-rw-r--r-- 1 bestis users 1 Dec 1 00:54 thingie
Where's the UTF-8? Seems only 1 byte to me.
You tell me. That doesn't look like a Fedora-installed i18n file to me:
[gordon@vagabond:~]$ cat /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SUPPORTED="en_US.UTF-8:en_US:en:ja_JP.UTF-8:ja_JP:ja"
SYSFONT="latarcyrheb-sun16"
Your LANG should probably be set to fi_FI.UTF-8. Once you fix it, you'll have to log out and probably restart gdm to get your X session in a UTF-8 locale.
On Sun, Nov 30, 2003 at 05:54:18PM -0800, Gordon Messmer wrote:
Your LANG should probably be set to fi_FI.UTF-8. Once you fix it, you'll have to log out and probably restart gdm to get your X session in a UTF-8 locale.
Why would I want to use my machine in Finnish? I hate those translations, and I don't think that UTF-8 is here atm. Maybe in 2010 :) But this conversation is getting very much off topic. I got it working, but I don't understand the logic of why it works w/o the LESSCHARSET env variable; maybe it's some change in the less that comes with Fedora.
So I'm happy now, and will be using my machine in english and w/o utf-8.
Jani Ollikainen wrote:
On Sun, Nov 30, 2003 at 05:54:18PM -0800, Gordon Messmer wrote:
Your LANG should probably be set to fi_FI.UTF-8. Once you fix it, you'll have to log out and probably restart gdm to get your X session in a UTF-8 locale.
Why would I want to use my machine in finnish language?
My fault. I assumed that's what you wanted based on your LC_CTYPE, etc. If you want english, then it should be set to en_US.UTF-8, as in the default.
Hate those translations, and i don't think that utf-8 is here atm.
And why do you think that? Based on your own experience, I'd say that locale-specific encodings are "here" atm. UTF-8 is the one true path.
But this conversation is getting very much off topic
I think it's very much on topic. If there is any reason why UTF-8 should not be used, then it should be fixed. UTF-8 is the default, and will be the default character encoding for all future releases of the distribution. Right?
, i got it working but i don't understand the logic why it works w/o lesscharset env,
Less was probably converting the characters to something it thought would display, and your terminal was interpreting them as an unknown encoding. This is exactly why UTF-8 is the default. It's the only viable way to reliably encode character data and have it display right under any locale (as long as it's a UTF-8 locale).
Gordon Messmer wrote:
Jani Ollikainen wrote:
Hate those translations, and i don't think that utf-8 is here atm.
And why do you think that? Based on your own experience, I'd say that locale-specific encodings are "here" atm. UTF-8 is the one true path.
I meant "are not 'here'", of course.