Hi Y'all,
<<<DO NOT CHANGE ANY OF YOUR FILES BASED ON THIS EMAIL!>>>
What follows is an approach to a thorny problem the translators are encountering. ALL DOCUMENT AUTHORS should read this too, because a small amount of re-editing will be necessary if this approach is adopted. I think it will be a one-line change to the "${PRI_LANG}/${DOCBASE}.xml" file.
Currently, each document hard-codes the filename needed to include the FDP entities:
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY % FEDORA-ENTITIES-EN SYSTEM "../../docs-common/common/fedora-entities-en.ent">
%FEDORA-ENTITIES-EN;
<!ENTITY DOCNAME "example-tutorial">
<!ENTITY DOCVERSION "0.14.1"> <!-- change version here -->
<!ENTITY DOCDATE "2006-01-21"> <!-- change revision date here -->
<!ENTITY DOCID "&DOCNAME;-&DOCVERSION; (&DOCDATE;)">
<!ENTITY BUG-NUM "000000"> <!-- use this only while in draft stage -->
]>
In addition, several other entities, such as "&DOCDATE;", must be manually edited each time the document is rebuilt.
This technique of using entities makes effective translations difficult because filename paths and entities are outside the scope of the text addressed by the translation tools. Thus, it is not currently possible to have completely automated I18N support.
By re-using the same tools and techniques already in place for the document body translation, it should be possible to:
1) Place all entity definitions for a single locale into an XML file described by a custom DTD. Utilize .POT and .PO files, in conjunction with an XSL stylesheet, to automatically derive non-English, aka ${OTHERS}, XML files, just as we already do for the document body files.
2) Instead of referencing a locale-specific filename in the <!DOCTYPE> declaration, reference a fixed filename "entities.ent" that will actually be a symbolic link (or an equivalent method) to the translated entity file produced above. The build system "Makefile.common" should be able to do this transparently to document authors.
3) Dynamic entities like "&DOCDATE;" can easily be implemented by "Makefile.common" and XSL stylesheet surgery. This will eliminate the in-document entity definitions in favor of entities that we can derive from the "rpm-info.xml" file.
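Step 2 above could be sketched roughly as follows. This is only an illustration in Python, not the actual "Makefile.common" rule; the function name link_entities and the "entities-<lang>.ent" naming scheme are assumptions made for the example.

```python
import os
import tempfile


def link_entities(common_dir, lang):
    """Point the fixed name entities.ent at the per-locale entity file."""
    target = "entities-%s.ent" % lang          # hypothetical naming scheme
    if not os.path.exists(os.path.join(common_dir, target)):
        raise FileNotFoundError(target)
    link = os.path.join(common_dir, "entities.ent")
    if os.path.lexists(link):                  # drop a stale link from an earlier build
        os.remove(link)
    os.symlink(target, link)                   # relative link, survives moving the tree
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for lang in ("en", "it"):
            with open(os.path.join(d, "entities-%s.ent" % lang), "w") as f:
                f.write('<!ENTITY FDP-LANG "%s">\n' % lang)
        link_entities(d, "it")
        with open(os.path.join(d, "entities.ent")) as f:
            print(f.read())
```

Because documents only ever name "entities.ent", switching locale is a matter of re-pointing the link before the build, which is why authors would not need to touch their files again.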
So far, I've sketched out a preliminary framework. The DTD, a sample entity definition XML file, an XSL stylesheet and a Makefile are available at my ftp://ftp.megacoder.com/pub/entities/ site.
Please take a look at this proof-of-concept and post your comments and suggestions.
Cheers
On Thu, 2006-02-23 at 03:12 -0600, Tommy Reynolds wrote:
What follows is an approach to a thorny problem the translators are encountering. ALL DOCUMENT AUTHORS should read this too, because a small amount of re-editing will be necessary if this approach is adopted. I think it will be a one-line change to the "${PRI_LANG}/${DOCBASE}.xml" file.
Currently, each document hard-codes the filename needed to include the FDP entities:
[...snip...]
By re-using the same tools and techniques already in place for the document body translation, it should be possible to:
Place all entity definitions for a single locale into an XML file described by a custom DTD. Utilize .POT and .PO files, in conjunction with an XSL stylesheet, to automatically derive non-English, aka ${OTHERS}, XML files. Just like we can do for the document body files.
Instead of referencing a locale-specific href in the <DOCTYPE> declaration, reference a fixed filename "entities.ent" that will actually be a symbolic link (or an equivalent method) to the translated entity file produced above. The building system "Makefile.common" should be able to do this transparently to document authors.
Dynamic entities like "&DOCDATE;" can easily be implemented by "Makefile.common" and XSL stylesheet surgery. This will eliminate the in-document entity definitions in favor of entities that we can derive from the "rpm-info.xml" file.
Might I suggest a fourth possibility (one which I'm not sure will work, and is therefore inherently less cool than what Tommy has done)?
4) Provide a custom DTD to be used for all docs files, which references an i18n tree somewhere in docs-common/common, for instance. This would "wrap" the DTD for our preferred DocBook (looks like V4.4 currently) and provide general ("parsed"?) entities for each language. So John Q. Public would use the following declarations for his original XML file in "en" language:
<!DOCTYPE article PUBLIC "-//Fedora//DTD Documentation V4.4//en" "http://fedoraproject.org/some/canonical/fdp-en.dtd">
That DTD would contain or reference (doesn't really matter which) the general entities for English. Our scripts could fix the rest because the tools are aware of DOCTYPE, only not entities that are added to the DTD after the initial declaration. (For instance, after running xml2po, a very simple XSL stylesheet could be used with xsltproc to rewrite the DOCTYPE and keep the rest of the infoset exactly the same.)
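The DOCTYPE rewrite described above could look something like the sketch below. It swaps in a plain Python text substitution instead of the XSL stylesheet and xsltproc mentioned in the message, purely to keep the example self-contained; the function name relocalize_doctype is hypothetical, and the identifiers follow the placeholder declaration shown earlier.

```python
import re

# Matches the start of a DOCTYPE with PUBLIC and SYSTEM identifiers,
# leaving any internal subset ("[ ... ]") that follows untouched.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+([\w.-]+)\s+PUBLIC\s+"[^"]*"\s+"[^"]*"')


def relocalize_doctype(xml_text, lang):
    """Return xml_text with its DOCTYPE pointing at the DTD for lang."""
    public_id = "-//Fedora//DTD Documentation V4.4//%s" % lang
    system_id = "http://fedoraproject.org/some/canonical/fdp-%s.dtd" % lang

    def replace(match):
        root = match.group(1)                  # keep the original root element name
        return '<!DOCTYPE %s PUBLIC "%s" "%s"' % (root, public_id, system_id)

    return DOCTYPE_RE.sub(replace, xml_text, count=1)


if __name__ == "__main__":
    source = (
        '<!DOCTYPE article PUBLIC "-//Fedora//DTD Documentation V4.4//en" '
        '"http://fedoraproject.org/some/canonical/fdp-en.dtd">\n'
        "<article/>\n"
    )
    print(relocalize_doctype(source, "it"))
```

A text substitution leaves entity references in the body unexpanded, which a parser-based rewrite generally would not, so this may or may not be the behavior wanted in the real toolchain.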
The DTDs themselves could be contained in docs-common somewhere and published to the canonical location to provide updates when necessary. Alternatively, we could use the same magic we use for the rpm-info DTD and simply make the SYSTEM URI point to the local docs-common (either in the user's CVS checkout, or /usr/share/fedora/doc/docs-common for people using the (soon to emerge) fedora-doc-common RPM).
Please, poke holes in this idea. Call me crazy. Call me a dreamer. Call me for dinner!
Uttered "Paul W. Frields" stickster@gmail.com, spake thus:
Might I suggest a fourth possibility (one which I'm not sure will work, and is therefore inherently less cool than what Tommy has done)?
- Provide a custom DTD to be used for all docs files, which references an i18n tree somewhere in docs-common/common, for instance. This would "wrap" the DTD for our preferred DocBook (looks like V4.4 currently) and provide general ("parsed"?) entities for each language.
I considered this step.
I also remembered there have been discussions (or requests) about FDP having its own DTD, perhaps an official subset of some DocBook version. The idea has been rejected before because we don't want to document and support a custom DocBook DTD version.
We are having enough trouble recruiting XML authors; I think having the appearance of our own DTD would steepen the learning curve when we should be flattening it. And also lessen the document reuse portability.
And then we'll get into the "it would be so neat to have <fdp-element-foo>" added to the wrapper discussions...
So I disregarded this approach and opted for the "virtual file" concept that very closely matches the other building infrastructure.
HTH
On Fri, 2006-02-24 at 14:29 -0600, Tommy Reynolds wrote:
Uttered "Paul W. Frields" stickster@gmail.com, spake thus:
Might I suggest a fourth possibility (one which I'm not sure will work, and is therefore inherently less cool than what Tommy has done)?
- Provide a custom DTD to be used for all docs files, which references an i18n tree somewhere in docs-common/common, for instance. This would "wrap" the DTD for our preferred DocBook (looks like V4.4 currently) and provide general ("parsed"?) entities for each language.
I considered this step.
I also remembered there have been discussions (or requests) about FDP having its own DTD, perhaps an official subset of some DocBook version. The idea has been rejected before because we don't want to document and support a custom DocBook DTD version.
I think we could easily draw the line at not customizing the DTD beyond providing some general entities, same as we do now with an include. That way there's no conflict, since DocBook stays DocBook.
We are having enough trouble recruiting XML authors; I think having the appearance of our own DTD would steepen the learning curve when we should be flattening it. And also lessen the document reuse portability.
I think the difference is trivial, but...
And then we'll get into the "it would be so neat to have <fdp-element-foo>" added to the wrapper discussions...
...on the other hand, having to fight this battle constantly would suck. It would be so easy to succumb to the Dark Side...
So I disregarded this approach and opted for the "virtual file" concept that very closely matches the other building infrastructure.
OK, I can live with that.
Uttered "Paul W. Frields" stickster@gmail.com, spake thus:
I considered this step.
I think the difference is trivial, but...
Well, I could be, well, (cough) w-w-wrong. Care to mini-hack one I could play with?
On Fri, 2006-02-24 at 15:27 -0600, Tommy Reynolds wrote:
Uttered "Paul W. Frields" stickster@gmail.com, spake thus:
I considered this step.
I think the difference is trivial, but...
Well, I could be, well, (cough) w-w-wrong. Care to mini-hack one I could play with?
I'll come right out and admit my first thought was to say "uh, erm, *mumble*, duh...", but instead I gamely took this the way it was undoubtedly intended, as an opportunity to rise to the occasion. And judging by how fast I did this, it's either (1) the obvious solution to this problem *and* world peace; (2) not as hard as it sounds; or (3) evidence I am FLAT-OUT ROCKING. (I was holding out for (3) but my wife is giving me funny looks, so I'm down to hoping for one of the other two.)
Grab this file:
http://paul.frields.org/images/fdp-en.dtd (use a frames-capable browser or just use the source, Luke)
Drop that into your docs-common/common/ folder. Then get a fresh copy of "mirror-tutorial" (a doc I can vouch for working with current build standards). Replace the DOCTYPE declaration in mirror-tutorial/en/mirror-tutorial.xml as follows:
<!DOCTYPE article PUBLIC "-//Fedora//DTD DocBook XML V4.4-Based Variant//en" "../../docs-common/common/fdp-en.dtd" [ ... ]>
For the "..." part, *REMOVE* the declaration and call for FEDORA-ENTITIES-EN, and leave everything else alone (the other entities are doc-specific, and there is no reason for people not to use them when they need them... for now... although we can probably get rid of these too using some cleverness).
The document should build fine. Now logic says we should be able to simply do XSLT magic on newly-born XML from PO, to replace the DOCTYPE declaration with the appropriate call to the langified DTD. That DTD is simply a wrapper like my fdp-en.dtd pointing to the appropriate entities file. So:
XML(orig) --[xml2po]--> POT --> PO(langXX) --[xml2po']--> XML(langXX) --[xsltproc]--> XML'(langXX)
Does that make sense? As far as the doc-specific entities go, like DOCNAME, DOCVERSION, etc., we should be able to write a fragment at build time (it doesn't have to validate to be included) for this non-language specific data. Unless someone has qualms about using *ANY* entities, which I would hope we're not against in principle.
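The build-time fragment for the doc-specific entities might be generated along these lines. This is a sketch only: the function name entity_fragment is hypothetical, and the values are passed in directly rather than read from "rpm-info.xml", whose layout is not shown in this thread.

```python
import xml.etree.ElementTree as ET


def entity_fragment(docname, version, date):
    """Return entity declarations for the non-language-specific document data."""
    lines = []
    for name, value in (("DOCNAME", docname),
                        ("DOCVERSION", version),
                        ("DOCDATE", date)):
        # These characters would need escaping inside a quoted entity value.
        if any(ch in value for ch in '"%&<'):
            raise ValueError("unsafe character in %s: %r" % (name, value))
        lines.append('<!ENTITY %s "%s">' % (name, value))
    # DOCID is composed from the others, as in the current documents.
    lines.append('<!ENTITY DOCID "&DOCNAME;-&DOCVERSION; (&DOCDATE;)">')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    subset = entity_fragment("example-tutorial", "0.14.1", "2006-01-21")
    # Quick check: use the fragment as an internal subset and expand &DOCID;.
    doc = "<!DOCTYPE article [\n%s]>\n<article>&DOCID;</article>" % subset
    print(ET.fromstring(doc).text)
```

Writing the fragment at build time would mean nobody has to hand-edit the version or revision date in the document source.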
What do you think?
Uttered "Paul W. Frields" stickster@gmail.com, spake thus:
I'll come right out and admit my first thought was to say "uh, erm, *mumble*, duh...", but instead I gamely took this the way it was undoubtedly intended, as an opportunity to rise to the occasion. And judging by how fast I did this, it's either (1) the obvious solution to this problem *and* world peace; (2) not as hard as it sounds; or (3) evidence I am FLAT-OUT ROCKING. (I was holding out for (3) but my wife is giving me funny looks, so I'm down to hoping for one of the other two.)
"Take the pebble from my hand, Grasshopper." "Time for you to go."
Ultra-cool solution. I just learned something, too.
We can use my entities DTD/XSL to maintain the entities in a centralized location and keep their translation process intact.
Nothing wrong with using entities, per se, it's just that we didn't have a good way for selecting the proper set based on the locale setting.
Now, keeping in mind your just-established world land speed record, for extra credit:
Can you build a CATALOG so that one need not have Internet access to use this technique?
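One possible shape for such a catalog, sketched in Python for illustration: it emits an OASIS XML catalog that maps a public identifier to a local file. The function name build_catalog is hypothetical, and the identifier and path are simply the ones from Paul's example, so treat them as assumptions about the final layout.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(entries):
    """Return an XML catalog mapping each public ID to a local URI."""
    ET.register_namespace("", CATALOG_NS)      # serialize with a default namespace
    root = ET.Element("{%s}catalog" % CATALOG_NS)
    for public_id, uri in entries:
        ET.SubElement(root, "{%s}public" % CATALOG_NS,
                      publicId=public_id, uri=uri)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_catalog([
        ("-//Fedora//DTD DocBook XML V4.4-Based Variant//en",
         "docs-common/common/fdp-en.dtd"),
    ]))
```

libxml2-based tools such as xsltproc and xmllint consult the catalogs named in the XML_CATALOG_FILES environment variable, so the build could point that variable at the generated file and resolve the DTD without network access.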
Good work: it reminds me of me.
Cheers!
Uttered Tommy Reynolds Tommy.Reynolds@MegaCoder.com, spake thus:
"Take the pebble from my hand, Grasshopper." "Time for you to go." Good work: it reminds me of me.
PS:
Karsten,
Give this man a raise.
Uttered Tommy Reynolds Tommy.Reynolds@MegaCoder.com, spake thus:
We can use my entities DTD/XSL to maintain the entities in a centralized location and keep their translation process intact.
OK, I've filled in the English prototype entities in "docs-common/common/entities-en.xml". Check the generated "entities-it.ent" file there and tremble ;-)
Uttered Tommy Reynolds Tommy.Reynolds@MegaCoder.com, spake thus:
OK, I've filled in the English prototype entities in "docs-common/common/entities-en.xml". Check the generated "entities-it.ent" file there and tremble ;-)
Firstly, I don't think I've broken anything as I've been checking stuff into CVS; if I have, I'm sure folk will let me know.
The basic infrastructure for translatable entities is mostly working. After I fix the egregious use of relative paths within the FDP entities, I'll issue a formal announcement.
At the moment, the "example-tutorial" works properly, but nothing else will until I get these relative paths corrected.
Uttered Tommy Reynolds Tommy.Reynolds@MegaCoder.com, spake thus:
At the moment, the "example-tutorial" works properly, but nothing else will until I get these relative paths corrected.
OK, I've fixed relative paths for the "docs-common/common" file references.
Both the "example-tutorial" and my locally-hacked "mirror-tutorial" build correctly.
If nobody complains, I'll issue an [ANN] to here and the f-trans-l to get the "docs-common/common/entities/entities-en.xml" translations started. Once those are done, we can then switch to using this new technique: Paul will have to revisit packaging and then we can do another [ANN] for the remaining doc authors.
Hi all, I have a question about the example of "Configurable Source Address for ICMP Errors" in http://fedoraproject.org/wiki/Docs/Beats/Networking
<quote> For example, the kernel receives an ICMP echo request on the interface eth0. Because the new sysctl option is enabled, this causes the ICMP echo reply to be sent out via eth1. The address of eth0 is used when the default behavior would use the address of eth1. </quote>
First of all, this new sysctl key refers to ICMP *Error* messages, which differ from an ICMP Echo Reply. So we should consider the case where some error occurs.
There is another point to discuss. IMHO, the former kernel also uses the eth1 interface to send out the reply (as an error) message in this case. The only change introduced by this new feature is the IP address used in the reply packet. The kernel routing scheme is unchanged in these new kernel releases.
My suggestion is: For example, the kernel receives a packet on the interface eth0 which causes an ICMP Error, and sends the ICMP Error message via the eth1 interface because of its routing table. If the new sysctl option is enabled, the address of eth0 is used as the source address of the ICMP Error, while the default behavior would use the address of eth1.
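The distinction can be modeled in a few lines. This is a toy sketch in Python, not kernel code: the function name icmp_error_reply and the addresses are made up for illustration, and the option is presumably net.ipv4.icmp_errors_use_inbound_ifaddr, although the release note does not name it.

```python
def icmp_error_reply(inbound_iface, route_iface, addresses, use_inbound_ifaddr):
    """Return (outgoing interface, source address) for an ICMP error.

    The outgoing interface is always the one chosen by the routing table;
    only the source address depends on the option.
    """
    src_iface = inbound_iface if use_inbound_ifaddr else route_iface
    return route_iface, addresses[src_iface]


if __name__ == "__main__":
    # Documentation-range addresses, chosen only for the example.
    addrs = {"eth0": "192.0.2.1", "eth1": "198.51.100.1"}
    print(icmp_error_reply("eth0", "eth1", addrs, False))  # ('eth1', '198.51.100.1')
    print(icmp_error_reply("eth0", "eth1", addrs, True))   # ('eth1', '192.0.2.1')
```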
On Mon, 2006-02-27 at 01:39 +0900, SEKINE tatz Tatsuo wrote:
Hi all, I have a question about the example of "Configurable Source Address for ICMP Errors" in http://fedoraproject.org/wiki/Docs/Beats/Networking
<quote> For example, the kernel receives an ICMP echo request on the interface eth0. Because the new sysctl option is enabled, this causes the ICMP echo reply to be sent out via eth1. The address of eth0 is used when the default behavior would use the address of eth1. </quote>
First of all, this new sysctl key refers to ICMP *Error* messages, which differ from an ICMP Echo Reply. So we should consider the case where some error occurs.
There is another point to discuss. IMHO, the former kernel also uses the eth1 interface to send out the reply (as an error) message in this case. The only change introduced by this new feature is the IP address used in the reply packet. The kernel routing scheme is unchanged in these new kernel releases.
My suggestion is: For example, the kernel receives a packet on the interface eth0 which causes an ICMP Error, and sends the ICMP Error message via the eth1 interface because of its routing table. If the new sysctl option is enabled, the address of eth0 is used as the source address of the ICMP Error, while the default behavior would use the address of eth1.
Tatsuo:
Thank you for the suggestion. I didn't like the way this was worded either, but wasn't sure how to change it because I wasn't familiar with the subject matter. I didn't want my change to inadvertently alter the *meaning* of the note. In the future, if you find any more problems like this, please file a bug in Bugzilla under product: Fedora Documentation, component: release-notes. Thanks again for your help; I'll change the notes appropriately.