wiki to xml
Christopher Curran
ccurran at redhat.com
Mon Mar 2 05:22:56 UTC 2009
Pascal wrote:
> Hi all,
>
> I was wondering how you get MediaWiki content into XML. Is it a
> home-made script? Simple? A sophisticated and heavy process?
>
> I didn't find any detailed information about it on the fedoraproject
> website.
>
> I am making some LaTeX/PDF files from the MediaWiki content of
> fedora-fr.org with an ugly PHP script, and I think it's time to look
> for something more efficient. So any advice is welcome :)
>
> I have already had a look at the wiki2xml extension[1], which could
> be a possibility.
>
>
> Regards,
> Pascal
>
> [1] http://toolserver.org/~magnus/wiki2xml/w2x.php
>
>
Unfortunately, those scripts are mostly snake oil. I've tried many of
them and even tried my hand at writing one myself, but there is no easy
way to convert wiki content to XML. The main issue is the conflicting
views over what constitutes good XML. Most of the scripts produce
horrendous XML.

Also bear in mind what you are trying to do. Wiki markup is a simple
markup language; DocBook XML is a rich one. Wiki markup has elements
related to formatting, while DocBook usually ignores formatting
metadata. Wiki sometimes has inline PHP, which just kills everything
(if you have ever tried to write something which parses code as well as
data, you will know it's easier to write a cross compiler). To
summarize, it is a nearly impossible task to do well.

All that said, it's much easier to go the other way, from XML to wiki.
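
To illustrate why that direction is more tractable, here is a minimal
Python sketch (not an existing tool) that maps a handful of DocBook
elements to MediaWiki markup. The element subset, the function names
inline_text, to_wiki and docbook_to_wiki, and the choice of <code> for
commands and filenames are all assumptions made for this example.
Anything outside the subset is silently dropped.

```python
import xml.etree.ElementTree as ET

# Inline DocBook elements mapped to (prefix, suffix) in wiki markup.
# This table is an illustrative assumption, not a complete mapping.
INLINE = {
    "emphasis": ("''", "''"),
    "command": ("<code>", "</code>"),
    "filename": ("<code>", "</code>"),
}


def inline_text(elem):
    """Flatten an element's mixed content into one line of wiki markup."""
    parts = [elem.text or ""]
    for child in elem:
        before, after = INLINE.get(child.tag, ("", ""))
        parts.append(before + inline_text(child) + after)
        parts.append(child.tail or "")
    # Collapse the indentation and line breaks of the XML source.
    return " ".join("".join(parts).split())


def to_wiki(elem, depth=1):
    """Return wiki markup lines for the block-level children of elem."""
    lines = []
    for child in elem:
        if child.tag == "title":
            bars = "=" * depth
            lines.append(f"{bars} {inline_text(child)} {bars}")
        elif child.tag == "para":
            lines.append(inline_text(child))
            lines.append("")  # blank line ends the paragraph
        elif child.tag == "itemizedlist":
            # Direct children only; nested lists are not handled.
            for item in child.findall("listitem"):
                lines.append("* " + inline_text(item))
            lines.append("")
        elif child.tag in ("section", "chapter"):
            lines.extend(to_wiki(child, depth + 1))
        # Every other element is skipped in this sketch.
    return lines


def docbook_to_wiki(xml_text):
    """Convert a small DocBook document (as a string) to wiki markup."""
    return "\n".join(to_wiki(ET.fromstring(xml_text)))
```

The mapping is nearly mechanical because each DocBook element says what
it is, so the converter only has to pick a wiki rendering for it. Going
from wiki to DocBook means guessing that meaning back from formatting,
which is where the scripts fall down.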
Chris