RFC: Description text in packages

Tue Dec 16 20:02:54 UTC 2008

2008/12/16 Nicolas Mailhot <nicolas.mailhot at laposte.net>:
> On Tuesday 16 December 2008 at 20:38 +0200, Nikolay Vladimirov wrote:
>
>> >> >  Currently, I'm opposed to having a Guideline that mandates UTF-8 over ASCII.
>> +1
>>
>> It's not piles of quirks, it's a simple parser.
>
> ROTFL. Sorry.

Uh?
This is a development discussion forum. If you think I said something
stupid or plain brain-damaged, explain why it's stupid. Don't just ROTFL,
please.

>> Take wiki syntax for example. All wikis (that I know of) rely on simple
>> syntax that uses ASCII characters to display somewhat structured and
>> formatted content.
>
> Wikis support UTF-8 just fine

Yes. I meant the actual syntax: it uses "*" to mark an unordered list
but displays a Unicode bullet (I can't find it on my keyboard layout).
It detects the depth from the indent and replaces the '*' so the list
looks nice.
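
A rough sketch of that kind of display-side conversion, in Python. The
function name, the two-space indent step, and the use_unicode switch are
made up for illustration; this is not how any particular wiki or
PackageKit actually does it.

```python
BULLET = "\u2022"  # Unicode bullet character


def render_bullets(text, indent_width=2, use_unicode=True):
    """Replace a leading '* ' list marker with a bullet, keeping list depth.

    The stored text stays plain ASCII; only the displayed form changes.
    """
    out = []
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        if stripped.startswith("* "):
            # Depth is derived from the number of leading spaces.
            depth = (len(line) - len(stripped)) // indent_width
            marker = BULLET if use_unicode else "*"
            out.append("  " * depth + marker + " " + stripped[2:])
        else:
            # Non-list lines pass through unchanged.
            out.append(line)
    return "\n".join(out)
```

A front end that knows its terminal can't show the bullet would call it
with use_unicode=False and still get a readable ASCII list.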

>> And it's common to use "*" to mark an unordered list.
>
> And it's common to use xml tags and all kinds of other stuff but that's
> irrelevant because spec syntax is plain text not wiki markup.
>

I'm not talking about syntax in specs, I'm talking about PackageKit.
I agree that there must be some standard for writing descriptions, with
limits on things like length, indentation, and spacing. I don't agree
with using non-ASCII (UTF-8) characters in summaries.
For example, in mutt I can use a UTF-8 character to draw threads (it's
some angle symbol), but I can't see this symbol over some ssh sessions
(yes, my software is buggy).
My point is that mutt lets me use only ASCII characters for
displaying the threads.

So if I'm using some really outdated client to connect to my Fedora
host (telnet, a serial console, or something similar) and I use
command-line tools to browse packages and read summaries, I will not
see these UTF-8 symbols.

As I understand it, the problem is that PackageKit can't display some
descriptions properly. So let's make the descriptions more standard and
leave it to PackageKit to do all the friendly displaying.

>> Also, different languages have different types of quotes; for
>> example, in Bulgarian quoted text looks like this: ,, quote " .
>
> Converting text to the appropriate typography rules and symbols is part
> of the job of the translator (just like applying the correct grammar and
> syntax ordering rules). Translating has never been limited to
> word-by-word conversion.
>

Yes. But why should I have to do all of that by hand when a machine can
do it?
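
Something along these lines is what I have in mind, as a minimal Python
sketch. The QUOTES table and the localize_quotes name are hypothetical,
and the open/close detection is deliberately naive (it just alternates),
so nested or unbalanced quotes would come out wrong.

```python
# Hypothetical per-language (opening, closing) quote pairs.
QUOTES = {
    "en": ("\u201c", "\u201d"),  # English: left/right double quotation marks
    "bg": ("\u201e", "\u201c"),  # Bulgarian: low-9 opening, high closing
}


def localize_quotes(text, lang):
    """Replace ASCII double quotes with the quote pair used for `lang`.

    Unknown languages fall back to plain ASCII quotes, so the text is
    returned unchanged.
    """
    opening, closing = QUOTES.get(lang, ('"', '"'))
    out = []
    is_open = False
    for ch in text:
        if ch == '"':
            # Alternate between opening and closing marks.
            out.append(closing if is_open else opening)
            is_open = not is_open
        else:
            out.append(ch)
    return "".join(out)
```

The maintainer or translator types plain "quotes" and the display layer
picks the marks for the reader's language.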

>> And if someone wants to translate the package summary he must go
>> through the Unicode symbol table to find the specific quote symbol.
>
> If someone does not have the quote symbols appropriate to his language
> in his keyboard layout he should get the maintainer of this layout to
> fix it.
>
>> This can be simply solved with a parser.
>
> It can not. Unicode is a monster but people jumped on it because it was
> still simpler than all the parsers and quirks and "smart" parser rules
> that antedated it.
>
>> The representation should be different from the content.
>
> This is not representation, this is text encoding.
> Choosing the font style and size is representation.
>

It is representation, since the encoding would be changed mainly because
the current summaries don't look so great and are pretty chaotic in
style. Using UTF-8 isn't going to make them more structured and
standard. Maintainers will.

>> And if a machine will display some text to a person then machine code
>> must be written.
>
> And the spec to write machine code in this context is plain text that
> uses the UTF-8 standard.
>
> --
> Nicolas Mailhot
>

--
NV