Please do not reply directly to this email. All additional comments should be made in the comments box of this bug.
Summary: font autoprovides can be a bit overzealous
https://bugzilla.redhat.com/show_bug.cgi?id=670615
Summary: font autoprovides can be a bit overzealous
Product: Fedora
Version: rawhide
Platform: Unspecified
OS/Version: Unspecified
Status: NEW
Severity: medium
Priority: low
Component: fontpackages
AssignedTo: nicolas.mailhot@laposte.net
ReportedBy: notting@redhat.com
QAContact: extras-qa@fedoraproject.org
CC: tagoh@redhat.com, nicolas.mailhot@laposte.net, fonts-bugs@lists.fedoraproject.org
Classification: Fedora
Target Release: ---
Description of problem:
# repoquery -q --whatprovides "font(:lang=xh)" | wc -l
333
That seems high, especially when something requiring fonts that support Xhosa can have its dependencies satisfied by, for example:
...
Name : cave9-mutante-fonts
Fantasy/display font used by the cave9 game, this font has only the basic characters used in the Portuguese language was made as an experiment by the designer Jonas Kühner (http://www.criatipos.com/) the font was altered by the game developer to also include numbers.
...
Version-Release number of selected component (if applicable):
1.44-1.fc14
How reproducible:
100%
Additional info:
Was discovered when using pungi, which pulls all providers of a particular require when composing trees.
--- Comment #1 from Akira TAGOH tagoh@redhat.com 2011-01-18 23:38:58 EST ---
Speaking of the above font, this is because fc-query says it has enough coverage for Xhosa. According to the orthography file for Xhosa in fontconfig, if a font has glyphs for [a-zA-Z], fontconfig seems to consider that it covers Xhosa. So if this orthography is wrong, it's a fontconfig bug.
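The coverage test described in comment #1 can be pictured roughly as a subset check. This is a minimal sketch, not fontconfig's actual code: the names XHOSA_ORTHOGRAPHY, covers and display_font are hypothetical, the orthography is assumed to be exactly [a-zA-Z] as stated above, and the glyph list is an invented stand-in rather than the real repertoire of cave9-mutante-fonts.

```python
import string

# Assumption: the Xhosa orthography is modelled as plain a-z and A-Z,
# as comment #1 describes. It is not read from fontconfig's data files.
XHOSA_ORTHOGRAPHY = frozenset(string.ascii_letters)


def covers(font_chars, orthography):
    """Return True if every orthography character has a glyph in the font."""
    return frozenset(orthography) <= frozenset(font_chars)


# Hypothetical glyph repertoire: basic Latin letters, digits and one
# accented letter, standing in for a small display font.
display_font = string.ascii_letters + string.digits + "\u00e7"

print(covers(display_font, XHOSA_ORTHOGRAPHY))  # font has all of a-zA-Z
print(covers("abc", XHOSA_ORTHOGRAPHY))         # font is missing most letters
```

Under this model any font with the full basic Latin alphabet counts as an Xhosa provider, which would explain the high number of hits.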
--- Comment #2 from Bill Nottingham notting@redhat.com 2011-01-19 11:27:00 EST ---
Maybe we should just not automatically add any lang= provides (or requires) if the orthography is the Latin alphabet?
--- Comment #3 from Akira TAGOH tagoh@redhat.com 2011-01-19 21:26:10 EST ---
Sure, that sounds good to me. However, there is no way to know whether the orthography for a given language is the Latin alphabet or not. A workaround for now may be just to hardcode the languages to be filtered out.
--- Comment #4 from Nicolas Mailhot nicolas.mailhot@laposte.net 2011-01-20 08:15:27 EST ---
(In reply to comment #2)
> Maybe we should just not automatically add any lang= provides (or requires) if the orthography is the latin alphabet?
This is a very bad idea:
1. not all Latin scripts are limited to ASCII
2. a lot of fonts have partial Latin coverage
3. right now an app knows that if it calls the font auto-installer and gets no answer, that's because no suitable font is available; if font provides are filtered out for some scripts, "no answer" will not mean anything

If the volume of font provides is too big for tools, they should be moved to a separate index on the createrepo side, as has been done for file provides.
--- Comment #5 from Bill Nottingham notting@redhat.com 2011-01-20 12:40:44 EST ---
It's not that it's too big, it's that it's *not useful*. Giving 300+ responses for a particular language is not a helpful bit of information.
--- Comment #6 from Nicolas Mailhot nicolas.mailhot@laposte.net 2011-01-20 13:19:39 EST ---
(In reply to comment #5)
> It's not that it's too big, it's that it's *not useful*. Giving 300+ responses for a particular language is not a helpful bit of information.
Script complexity is highly variable so yes you'll never have a nice uniform font support distribution. That's how human languages evolved. Some scripts will always have hundreds of hits and others almost none. You can't filter this out without hardcoding special knowledge about scripts in all apps that use fonts, and the whole point of fontconfig and font auto provides is to make i18n easier for apps by making them *not care* about language differences and use the same infrastructure and calls regardless of the language in use.
Unicode is the same thing: it's massively overblown for ASCII languages, but going full Unicode is the only way to get good i18n. You can't stop in the middle without hurting some languages, and you can't evaluate the harm without being an expert linguist, so the best solution for everyone involved is to support all of Unicode, without trying to weed out the 'useless' parts of it.
--- Comment #7 from Bill Nottingham notting@redhat.com 2011-01-20 14:02:45 EST ---
(In reply to comment #4)
> (In reply to comment #2)
> > Maybe we should just not automatically add any lang= provides (or requires) if the orthography is the latin alphabet?
> This is a very bad idea :
> - all latin scripts are not limited to ASCII
Sure. Perhaps then just weed out any language for which fontconfig claims the orthography is just ASCII? We can reasonably make the assumption that the system will have fonts with ASCII coverage.
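The filtering proposed in comment #7 could look something like the sketch below. It is only an illustration under stated assumptions: is_ascii_only and lang_provides are hypothetical helpers, not part of fontpackages or fontconfig, and the two orthographies are invented sample data (the "fr" entry only borrows a few accented letters for the example), not fontconfig's real orthography files. The font(:lang=...) string follows the provide format shown in the original report.

```python
import string


def is_ascii_only(orthography):
    """Return True if every character of the orthography is plain ASCII."""
    return all(ord(ch) < 128 for ch in orthography)


def lang_provides(font_chars, orthographies, skip_ascii_only=True):
    """List font(:lang=...) provides for the languages the font covers.

    With skip_ascii_only set, languages whose orthography is pure ASCII
    are not advertised, on the assumption that any system already has
    fonts with ASCII coverage.
    """
    font = set(font_chars)
    provides = []
    for lang, chars in sorted(orthographies.items()):
        if skip_ascii_only and is_ascii_only(chars):
            continue  # hypothetical filter from comment #7
        if set(chars) <= font:
            provides.append(f"font(:lang={lang})")
    return provides


# Invented sample data, for illustration only.
EXTRA_LETTERS = "\u00e9\u00e8\u00e7"  # e-acute, e-grave, c-cedilla
ORTHOGRAPHIES = {
    "xh": string.ascii_letters,
    "fr": string.ascii_letters + EXTRA_LETTERS,
}
FONT = string.ascii_letters + EXTRA_LETTERS

print(lang_provides(FONT, ORTHOGRAPHIES))
print(lang_provides(FONT, ORTHOGRAPHIES, skip_ascii_only=False))
```

The trade-off raised in comment #4 still applies: with the filter on, a query for font(:lang=xh) returns nothing even though suitable fonts exist.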
--- Comment #8 from Nicolas Mailhot nicolas.mailhot@laposte.net 2011-01-20 14:47:04 EST ---
ASCII is useless. Almost every ASCII-using script is ASCII + a few other things (as Alan Cox likes to point out, correct English requires ï, for example to write naïve), except the few other things are not the same, so you can't choose a small ASCII superset without excluding a lot of people that need just a few other things, and you can't include all those little things without getting something a lot wider than base Latin. French wants éèœçŷêù, German wants ß (including the new caps variant introduced a couple of years ago), in Spain Catalans want Ŀ and won't appreciate it if Madrid is better served than Barcelona, etc. etc.

We're no longer in the period where an ALLCAPS ASCII telegram was considered an acceptable way to render human language; people got used to digital publishing and want all the squiggles and diacritics that ASCII dispensed with.

Your "safe" reasonable Unicode subset will always be out of date and inconvenient (and don't get me started on monetary symbols: € was unknown a decade ago and is a must now, who knows even if the $ will be with us in a decade if China decides to let it sink, etc). Really, it's not worth optimizing this; it's a lot of work for little gain, and besides, people are touchy about this stuff, so it's a lot better to let dragons rest and not have people complain that $ancestralenemylanguage is better served than $ownlanguage and that Fedora maintainers are a bunch of American imperialists (you get the idea).

For the lucky languages that have many fonts available, only manual human sorting can identify the better shortlist to provide, and that requires work on the l10n side and work on comps language groups. An automated system like fontconfig is unlikely to provide the filtering you want in that case.
--- Comment #9 from Akira TAGOH tagoh@redhat.com 2011-02-03 03:06:45 EST ---
The problem, then, would be that fontconfig has a vague orthography, at least for Xhosa.
Paul Flo Williams paul@frixxon.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
                 CC|                            |paul@frixxon.co.uk
         Resolution|---                         |NOTABUG
        Last Closed|                            |2012-11-06 08:52:25
--- Comment #10 from Paul Flo Williams paul@frixxon.co.uk ---
For the specific example given, of Xhosa, the correct orthography does appear to be just the Latin alphabet. That is the information from this page:
http://www.panafril10n.org/index.php/PanAfrLoc/Xhosa
which gives an (incomplete) citation to the definitive source, which is "Xhosa: Terminologie en Spelreëls No. 3 / Terminology and Orthography No. 3" by Ernest Mdala, published by the South African government in 1972.
So, on the face of it, this is not a fontpackages or a fontconfig bug. If any problems with lang provides are found, I'd suggest they should be raised against fontconfig.