word documents

Charles Crawford CCrawford at ACB.org
Sun Mar 16 11:22:05 EST 2003

Hmmm.   I never thought about the proprietary issue with MS-Word.  I wonder 
if we should not be talking with Microsoft to get at least the formatting 
info available?  Oh yeah, didn't Gates give the Chinese open source
Windows?  Hmmm.

-- charlie.
At 02:41 AM 03/16/2003 -0500, you wrote:
>There are Word document viewers for Linux console.  The one I use is
>called wv.  Another is called antiword.  No doubt, there are more.
>Because Word is a proprietary format, and the specification is not
>available, the authors of programs such as wv have had to
>reverse-engineer a bit.  Because of this, certain things in the Word
>document may not decode as well as we'd like.  Nonetheless, I use wv
>and get reasonable results when converting from Word to html.  The
>resulting html source is quite bloated, but, it's there.
>For pdf conversion, there's pdftotext.  This is part of the xpdf
>package, and may already be on your system.  Surprise, it was already
>on my stock installation of RH 7.2.  The one thing I don't like about
>pdftotext's rendering is that hyperlinks get lost.  To preserve the
>navigability of pdf documents, I visit <access.adobe.com>, and submit
>the url of a pdf document (assuming I've found it on the web) to the
>form.  What comes back is a nice html rendering (links and all).
>Hope this helps,
>Speakup mailing list
>Speakup at braille.uwo.ca
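
For anyone who wants to script the two conversions described in the quoted
message, here is a minimal Python sketch.  It assumes that wvHtml (the HTML
converter shipped with the wv package) and pdftotext are installed and on
the PATH, and that the files sit in the current directory.  The function
name conversion_command and the choice of output file names are illustrative
only and do not come from the original message.

```python
import subprocess
import sys
from pathlib import Path


def conversion_command(source):
    """Return the argument list for converting `source`, chosen by extension."""
    path = Path(source)
    suffix = path.suffix.lower()
    if suffix == ".doc":
        # wvHtml (part of wv) writes an HTML rendering to the named output file.
        return ["wvHtml", str(path), str(path.with_suffix(".html"))]
    if suffix == ".pdf":
        # pdftotext (part of xpdf) writes plain text; hyperlinks are lost.
        return ["pdftotext", str(path), str(path.with_suffix(".txt"))]
    raise ValueError(f"no converter known for {source!r}")


if __name__ == "__main__":
    # Convert every file named on the command line, stopping on the first failure.
    for name in sys.argv[1:]:
        cmd = conversion_command(name)
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)
```

Saved under a hypothetical name such as convert.py, it could be run as
"python3 convert.py report.doc paper.pdf".  antiword could be substituted
for wvHtml when plain text output is enough.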
