On Fri, 2004-02-27 at 11:15, Dave Pawson wrote:
My only other suggestion is to chunk the source into tiny bits,
then use a plain text to xml program,
and chunk the big bits via some other program into article or somesuch.
May I suggest html2db:
http://www.cise.ufl.edu/~ppadala/tidy/
It does a very nice, compliant conversion. Converting from HTML, it
can't know much more than to turn <pre /> into <literal />, but that
kind of thing is easy to fix. I've converted multi-page HTML into
DocBook *ML in just a few hours with a simple convert-and-edit pass. The
structure you get in the end is not the point; what matters is the chunks
of markup, which can then be put into a DocBook template.
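
As a rough sketch of the kind of fix-up I mean: assuming the converter
emitted <literal> for what were multi-line <pre> blocks, and assuming
<programlisting> is what you actually want there, something like the
following would retag them. The function name is made up for illustration,
and it relies only on Python's standard xml.etree.ElementTree; it is not
part of html2db.

```python
import xml.etree.ElementTree as ET


def promote_multiline_literals(docbook_xml: str) -> str:
    """Retag <literal> elements that span several lines as <programlisting>.

    Assumption: a <literal> containing a newline started life as an HTML
    <pre> block, so a block-level element suits it better.
    """
    root = ET.fromstring(docbook_xml)
    # Collect matches first so retagging cannot disturb the traversal.
    for elem in list(root.iter("literal")):
        text = "".join(elem.itertext())
        if "\n" in text:
            elem.tag = "programlisting"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<para>Run <literal>ls</literal> then:"
        "<literal>line1\nline2</literal></para>"
    )
    print(promote_multiline_literals(sample))
```

Short inline <literal> bits are left alone; only the multi-line ones get
retagged, and anything it misses can still be fixed by hand.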
Of course, it's nice to have an editor that can do e.g. sgml-tag-region
and tag creation/completion for manually marking up missed or incorrect
bits.
- Karsten
--
Karsten Wade : Tech Writer, RHCE : o: +1.831.466.9664
kwade(a)redhat.com :
http://rhea.redhat.com/ : c: +1.831.818.9995
Red Hat Applications : WAF, CMS, Portal Server