XML2PO is weird!

Paul W. Frields stickster at gmail.com
Sun Mar 5 15:11:29 UTC 2006


On Sun, 2006-03-05 at 08:35 -0500, Daniel Veillard wrote:
> On Sun, Mar 05, 2006 at 11:25:14AM +0000, Miloš Komarčević wrote:
> > On 3/5/06, Tommy Reynolds <Tommy.Reynolds at megacoder.com> wrote:
> > > It seems that xml2po(1) will sometimes expand entities and will
> > > sometimes _not_ expand entities.  There is a "-k" or a
> > > "--keep-entities" that claims "Don't expand entities", but it lies.
> > > Evidently there is a fallback mode in xml2po(1) where *all* entities
> > > are expanded.  I know there is an "-e" or "--expand-all-entities"
> > > switch, but I'm not using it.
> > >
> > > For example, generate a .POT file:
> > >
> > > $ cd example-tutorial
> > > $ xml2po -o the.pot en_US/example-tutorial.xml
> > >
> > > and then apply it (UNCHANGED!) against the original XML file using
> > > various combinations of "-e", "-k" and no switches at all:
> > >
> > > $ xml2po -o junk-e.xml -e -p the.pot en_US/example-tutorial.xml
> > > $ xml2po -o junk-k.xml -k -p the.pot en_US/example-tutorial.xml
> > > $ xml2po -o junk-plain.xml    -p the.pot en_US/example-tutorial.xml
> > >
> > > and then compare the files using diff(1).  There are NO differences.
> > >
> > > Anybody care to explain?  Or suggest a work-around?
> > 
> > I've incidentally contacted the xml2po maintainer on a different
> > subject, and Danilo did indeed confirm that the -k option is currently
> > broken. He suggested somebody file a bug report at
> > 
> > http://bugzilla.gnome.org/enter_bug.cgi?product=xml2po
> > 
> > and he'll look at it if there's a lot of interest.
> 
>   Using a parameter entity in the internal subset to localize
> an XML document sounds extremely fragile to me, and a good way to
> get into trouble and non-interoperable behaviour.
> 
>    http://mail.gnome.org/archives/xml/2006-March/msg00028.html
> 
> I wonder who suggested that; IMHO it's close to being broken beyond repair.
> Asking to modify the structure while preserving parameter entities in
> the internal subset means maintaining the textual serialization and
> manipulating the document at the structure level at the same time, which
> is nearly impossible.

Aha! Some helpful information here.  It appears that our attempt to make
the translation workflow behave like the old single-language build
process may be misguided.  IOW, we're trying to fit a square peg in a
round hole.

One of the worst offenders is the BUG-REPORTING snippet, which attempts
to use entities from the internal subset of the document that includes
it (whether declared in the doc itself or pulled in through a SYSTEM
entity).  Usage like that will need to be eliminated... perhaps we'll
simply need to rely on translators to deal with the documents as parsed?
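
To make "the documents as parsed" concrete, here is a minimal sketch of
what a parse-then-serialize round trip does to an entity declared in the
internal subset.  It uses Python's standard xml.etree module rather than
xml2po itself, it uses a general entity rather than a parameter entity,
and the entity name DOCVERSION and the sample text are made up for
illustration.  It only shows the general effect Daniel describes, not
xml2po's actual code path.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a document that declares an entity in its
# internal subset; "DOCVERSION" is an illustrative name, not a real one.
SOURCE = """<!DOCTYPE article [
  <!ENTITY DOCVERSION "1.0.2">
]>
<article><para>This is version &DOCVERSION; of the tutorial.</para></article>"""


def round_trip(xml_text):
    """Parse the document into a tree, then serialize the tree back to text."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


result = round_trip(SOURCE)
print(result)
```

The parser replaces the entity reference with its value while building
the tree, so the serialized result carries the literal text and neither
the reference nor the DOCTYPE that declared it.  A tool that works on the
parsed structure has nothing left to "keep".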

We were hoping that these snippets could simply be translated once for
the entire repository and then pulled into the documents with XInclude.
But some of them, like our "bug-reporting.xml" snippet, were going to
take document-specific information such as the version number and date
of the document.  If we use XInclude and XPointer in the snippet, we may
be able to deal with that problem by linking it in the build directory.
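
As a rough sketch of the XInclude half of that idea, the following uses
Python's standard xml.etree.ElementInclude module to pull a shared
snippet into a document.  The snippet and document contents are
invented, and the in-memory FILES table stands in for the build
directory.  As far as I know this module does not evaluate XPointer
expressions, so the document-specific part (version, date) is not
covered here.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical file contents, kept in memory so the sketch is self-contained.
FILES = {
    "bug-reporting.xml": (
        "<section><title>Reporting Bugs</title>"
        "<para>File bugs in Bugzilla.</para></section>"
    ),
}

DOCUMENT = """<article xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Example Tutorial</title>
  <xi:include href="bug-reporting.xml"/>
</article>"""


def load_from_memory(href, parse, encoding=None):
    """Loader used in place of the default file loader; returns an element."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(FILES[href])


def build(document_text):
    """Parse the document and resolve its xi:include elements in place."""
    root = ET.fromstring(document_text)
    ElementInclude.include(root, loader=load_from_memory)
    return root


article = build(DOCUMENT)
print(ET.tostring(article, encoding="unicode"))
```

The snippet is translated and stored once, and each document only
carries the xi:include element, which is the part of the plan that does
not depend on entities at all.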

-- 
Paul W. Frields, RHCE                          http://paul.frields.org/
  gpg fingerprint: 3DA6 A0AC 6D58 FEC4 0233  5906 ACDB C937 BD11 3717
 Fedora Documentation Project: http://fedora.redhat.com/projects/docs/