word documents

Dave Hunt dave.hunt2 at verizon.net
Sun Mar 16 02:41:02 EST 2003

There are Word document viewers for the Linux console.  The one I use is
called wv.  Another is called antiword.  No doubt there are more.
Because Word is a proprietary format and the specification is not
available, the authors of programs such as wv have had to
reverse-engineer it.  As a result, certain things in a Word
document may not decode as well as we'd like.  Nonetheless, I use wv
and get reasonable results when converting from Word to html.  The
resulting html source is quite bloated, but it's there.
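
For anyone who wants to script this, here is a rough sketch in Python.
It assumes the wv package's wvHtml script and antiword are installed
and on your PATH.  The helper names and file names are made up for
illustration; they are not part of either tool.

```python
import shutil
import subprocess


def word_to_html_cmd(doc_path, html_path):
    """Build the argument list for wv's wvHtml script (input, then output)."""
    return ["wvHtml", doc_path, html_path]


def word_to_text_cmd(doc_path):
    """Build the argument list for antiword, which prints text to stdout."""
    return ["antiword", doc_path]


def run_if_available(cmd):
    """Run cmd only if its program is on the PATH.

    Returns the program's stdout as a string, or None if the program
    is not installed.
    """
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # "report.doc" is a hypothetical file name.
    text = run_if_available(word_to_text_cmd("report.doc"))
    if text is None:
        print("antiword does not seem to be installed")
    else:
        print(text)
```

Building the argument list separately from running it makes it easy to
print the command and try it by hand at the shell first.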

For pdf conversion, there's pdftotext.  This is part of the xpdf
package, and may already be on your system.  To my surprise, it was
already on my stock installation of RH 7.2.  The one thing I don't like
about pdftotext's rendering is that hyperlinks get lost.  To preserve
the navigability of pdf documents, I visit <access.adobe.com> and submit
the url of a pdf document (assuming I've found it on the web) to the
form.  What comes back is a nice html rendering (links and all).
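
The local pdftotext step can be scripted along the same lines.  This is
only a sketch, assuming pdftotext is on your PATH; the helper name and
the file names are hypothetical.  The output is plain text, so
hyperlinks are still lost.

```python
import shutil
import subprocess


def pdf_to_text_cmd(pdf_path, txt_path=None, keep_layout=False):
    """Build the argument list for pdftotext.

    If txt_path is omitted, pdftotext picks the output name itself
    (the pdf name with a .txt extension).  keep_layout adds the
    -layout option, which tries to preserve the page layout.
    """
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")
    cmd.append(pdf_path)
    if txt_path is not None:
        cmd.append(txt_path)
    return cmd


if __name__ == "__main__":
    # "manual.pdf" and "manual.txt" are hypothetical file names.
    cmd = pdf_to_text_cmd("manual.pdf", "manual.txt", keep_layout=True)
    if shutil.which(cmd[0]) is None:
        print("pdftotext does not seem to be installed")
    else:
        subprocess.run(cmd, check=True)
```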

Hope this helps,


