wiki to xml

Christopher Curran ccurran at redhat.com
Mon Mar 2 05:22:56 UTC 2009


Pascal wrote:
> Hi all,
>
> I was wondering how you get MediaWiki content into XML. Is it a
> home-made script? Simple? Or a sophisticated, heavyweight process?
>
> I didn't find any detailed information about it on the fedoraproject
> website.
>
> I am generating LaTeX/PDF files from the MediaWiki content of
> fedora-fr.org with an ugly PHP script, and I think it's time to move to
> something more efficient. So any advice is welcome :)
>
> I have already had a look at the wiki2xml extension[1], which could be a
> possibility.
>
>
> Regards,
> Pascal
>
> [1] http://toolserver.org/~magnus/wiki2xml/w2x.php
>
>   
Unfortunately those scripts are mostly snake oil. I've tried many of them
and even tried my hand at writing one myself, but there is no easy way to
convert wiki markup to XML. The main issue is the conflicting views over
what constitutes good XML. Most of the scripts produce horrendous XML.

Also bear in mind what you are trying to do. Wiki markup is a simple
markup language; DocBook XML is a rich markup language. Wiki markup has
elements related to formatting; DocBook usually ignores formatting
metadata. Wiki pages sometimes have inline PHP, which just kills
everything (if you have ever tried to write something which parses code
as well as data you will know it's easier to write a cross compiler). To
summarize, it is a near-impossible task to do well.

All that said, it's much easier to go the other way, from XML to wiki.
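
To give a rough idea of why that direction is easier, here is a minimal
sketch in Python. It is only an illustration, not a tool anyone on the
docs list actually uses. It assumes a small, namespace-free subset of
DocBook 4 (section, title, para, emphasis, ulink, itemizedlist) and the
function names inline() and to_wiki() are made up for this example.
Because every element already says what it means, each one maps to a
fixed piece of wiki markup and anything unknown is reduced to its text.

```python
import xml.etree.ElementTree as ET


def inline(elem):
    """Render an element's text and inline children as wiki markup."""
    parts = [elem.text or ""]
    for child in elem:
        body = inline(child)
        if child.tag == "emphasis":
            # By convention role="bold" means bold; plain emphasis is italic.
            quote = "'''" if child.get("role") == "bold" else "''"
            parts.append(quote + body + quote)
        elif child.tag == "ulink":
            parts.append("[%s %s]" % (child.get("url"), body))
        else:
            # Unknown inline element: keep its text, drop the tag.
            parts.append(body)
        parts.append(child.tail or "")
    # Collapse the line breaks and indentation found in XML source.
    return " ".join("".join(parts).split())


def to_wiki(elem, depth=2):
    """Return a list of wiki lines for a section element (hypothetical subset)."""
    lines = []
    for child in elem:
        if child.tag == "title":
            bar = "=" * depth
            lines.append("%s %s %s" % (bar, inline(child), bar))
        elif child.tag == "para":
            lines.append(inline(child))
        elif child.tag == "itemizedlist":
            for item in child.findall("listitem"):
                para = item.find("para")
                lines.append("* " + inline(para if para is not None else item))
        elif child.tag == "section":
            # Nested sections become deeper headings.
            lines.extend(to_wiki(child, depth + 1))
    return lines


if __name__ == "__main__":
    sample = (
        "<section><title>Install</title>"
        "<para>Run <emphasis>yum</emphasis> as root.</para>"
        "</section>"
    )
    print("\n".join(to_wiki(ET.fromstring(sample))))
```

The reverse mapping has no such table to lean on: a pair of '' in wiki
text only says "italic", and the converter has to guess which DocBook
element the author meant.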

Chris



