Is there a tool that converts, for example, the u-umlaut into the ü symbol? Would be nice if it could convert any national-language characters.
Thanks and regards, Albert
On Wed, 2007-01-31 at 11:50 +0100, Albert A. Modderkolk wrote:
Is there a tool that converts, for example, the u-umlaut into the ü symbol? Would be nice if it could convert any national-language characters.
If you mean within HTML files, you can use HTML tidy and specify options about input and output encoding. Though, in this day and age, we've probably got to the stage where you're better off using UTF-8, rather than entities.
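For the direction asked about here (entity to character), the conversion can also be sketched in a few lines of Python. This is only an illustration, assuming Python 3 is available, which nobody in the thread mentions; the function name `entities_to_text` is hypothetical.

```python
import html


def entities_to_text(markup: str) -> str:
    """Replace HTML character references such as &uuml; with the characters they name."""
    # html.unescape handles named (&uuml;), decimal (&#228;) and hex (&#xE4;) references.
    return html.unescape(markup)


if __name__ == "__main__":
    sample = "M&uuml;nchen, caf&eacute;, &#228;"
    print(entities_to_text(sample))
```

Note that `html.unescape` also decodes `&lt;` and `&amp;`, so this suits text fragments better than whole HTML files, where decoding those two would change the markup.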
On 31/01/07, Tim ignored_mailbox@yahoo.com.au wrote:
If you mean within HTML files, you can use HTML tidy and specify options about input and output encoding. Though, in this day and age, we've probably got to the stage where you're better off using UTF-8, rather than entities.
I second that. Set your locale to UTF-8 and enjoy all its benefits.
Dotan Cohen
Albert A. Modderkolk:
Is there a tool that converts, for example, the u-umlaut into the ü symbol?
Tim:
If you mean within HTML files, you can use HTML tidy and specify options about input and output encoding. Though, in this day and age, we've probably got to the stage where you're better off using UTF-8, rather than entities.
Dotan Cohen:
I second that. Set your locale to UTF-8 and enjoy all its benefits.
Yes, definitely. The HTML &entity; scheme is only usable in HTML, and HTML-like formats. UTF-8 can be used in many more formats.
Tim wrote:
Though, in this day and age, we've probably got to the stage where you're better off using UTF-8, rather than entities.
This is certainly the best advice. If for some reason it's not acceptable, there is recode:
$ echo Ä | recode -d ..HTML
&Auml;
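The recode invocation above goes from characters to entities. Where recode is not installed, a rough approximation of that direction can be sketched in Python. This is an assumption-laden illustration and not a reimplementation of recode: the function name `diacritics_to_entities` is hypothetical, and only non-ASCII characters that have a named HTML entity are converted.

```python
from html.entities import codepoint2name


def diacritics_to_entities(text: str) -> str:
    """Replace non-ASCII characters that have a named HTML entity with &name;."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        # Leave ASCII alone (including <, > and &) so existing markup is untouched.
        if ord(ch) > 127 and name is not None:
            out.append(f"&{name};")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(diacritics_to_entities("Ärger über <b>"))
```

Characters without a named entity are passed through unchanged in this sketch; a numeric reference such as `&#1078;` would be one possible fallback.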
On 31/01/07, Ulrich Drepper drepper@redhat.com wrote:
This is certainly the best advice. If for some reason it's not acceptable, there is recode:
$ echo Ä | recode -d ..HTML
&Auml;
Wow, recode is _nice_. I didn't know about this package before:
recode.i386    3.6-22.fc6    extras
Matched from:
recode
The `recode' converts files between character sets and usages. It recognises or produces nearly 150 different character sets and is able to transliterate files between almost any pair. When exact transliteration are not possible, it may get rid of the offending characters or fall back on approximations. Most RFC 1345 character sets are supported.
http://recode.progiciels-bpi.ca/
Dotan Cohen