doc to html from a cmdline?

Paul W. Frields stickster at gmail.com
Mon Feb 9 14:56:07 UTC 2009


On Mon, Feb 09, 2009 at 12:33:58PM +0000, Sharpe, Sam J wrote:
> Steven W. Orr wrote:
>> I have about 75 doc files. If I bring one up in ooffice, I can save it
>> as a .html file with no problem. Is there a way to do it from the
>> command line? All this clickety is going to take me too long.
> sudo yum install wv
> for filename in *.doc;
>   do
>       htmlname=`echo "$filename" | sed -e 's/\.doc$/.html/'`
>       /usr/bin/wvHtml "$filename" "$htmlname"
>   done;
>
> Depending on what your Word files are, your conversion mileage may differ.

Or, since OpenOffice.org already gives you the results you like, you
could keep it as the renderer and script it instead of switching tools:

1. Make sure that the PyUNO bits for OpenOffice.org are installed.  On
Fedora 10, this is the "openoffice.org-pyuno" package (go figure).

2. Download this helpful script, which I found by googling:
   http://www.artofsolving.com/files/DocumentConverter.py

3. Do a batch conversion:

   for F in *.doc ; do
       H="$(basename "$F" .doc).html"
       python DocumentConverter.py "$F" "$H"
   done
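
If you want to try the loop in step 3 on a few files before letting it
loose on all 75, here is a rough sketch of the same thing wrapped in a
function.  The names convert_docs, CONVERTER and DRY_RUN are my own
inventions for illustration, not anything the script provides.  It only
adds a guard for "no .doc files here" and a dry-run switch.  My
understanding is that DocumentConverter.py talks to an OpenOffice.org
instance that is already running and listening on a socket, so check the
comments at the top of the script for how to start one; that part is not
shown here.

```shell
#!/bin/sh
# Sketch only: CONVERTER and DRY_RUN are hypothetical knobs, not part of
# DocumentConverter.py itself.
CONVERTER="${CONVERTER:-python DocumentConverter.py}"

convert_docs() {
    # $1: directory holding the .doc files (defaults to the current one)
    dir="${1:-.}"
    for F in "$dir"/*.doc ; do
        # If nothing matches, the glob stays literal; skip that case.
        [ -e "$F" ] || continue
        H="$dir/$(basename "$F" .doc).html"
        if [ -n "${DRY_RUN:-}" ] ; then
            echo "would convert: $F -> $H"
        else
            # CONVERTER is left unquoted on purpose so that
            # "python DocumentConverter.py" splits into command + argument.
            $CONVERTER "$F" "$H" || echo "failed: $F" >&2
        fi
    done
}
```

Setting DRY_RUN=1 and calling "convert_docs ." prints what would be
converted; unset it to run the real conversion.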

Note my little batch command in step 3 is very similar to Sam's loop
quoted above; I just happen to be using the "basename" command, which is
specified by POSIX and ships in coreutils, so any distro should have it.
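
For what it's worth, basename is not the only way to swap the extension.
A small comparison, using a made-up file name, of basename against the
shell's own suffix removal:

```shell
# Two ways to turn "name.doc" into "name.html"; the file name is made up.
F="docs/My Report.doc"

# basename strips the leading directory as well as the suffix.
H1="$(basename "$F" .doc).html"

# ${F%.doc} is plain POSIX shell suffix removal; the directory is kept.
H2="${F%.doc}.html"

echo "$H1"    # My Report.html
echo "$H2"    # docs/My Report.html
```

Either works for files in the current directory; the difference only
matters once paths are involved.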

-- 
Paul W. Frields                                http://paul.frields.org/
  gpg fingerprint: 3DA6 A0AC 6D58 FEC4 0233  5906 ACDB C937 BD11 3717
  http://redhat.com/   -  -  -  -   http://pfrields.fedorapeople.org/
  irc.freenode.net: stickster @ #fedora-docs, #fedora-devel, #fredlug