word documents
Charles Crawford
CCrawford at ACB.org
Sun Mar 16 11:22:05 EST 2003
Hmmm. I never thought about the proprietary issue with MS-Word. I wonder
if we should not be talking with Microsoft to get at least the formatting
info available? Oh yeah, didn't Gates give the
Chinese open source Windows? Hmmm.
-- charlie.
At 02:41 AM 03/16/2003 -0500, you wrote:
>There are Word document viewers for Linux console. The one I use is
>called wv. Another is called antiword. No doubt, there are more.
>Because Word is a proprietary format, and the specification is not
>available, the authors of programs such as wv have had to
>reverse-engineer a bit. Because of this, certain things in the Word
>document may not decode as well as we'd like. Nonetheless, I use wv
>and get reasonable results when converting from Word to html. The
>resulting html source is quite bloated, but, it's there.
>
>For pdf conversion, there's pdftotext. This is part of the xpdf
>package, and may already be on your system. Surprise, it was already
>on my stock installation of RH 7.2. The one thing I don't like about
>pdftotext's rendering is that hyperlinks get lost. To preserve the
>navigability of pdf documents, I visit <access.adobe.com>, and submit
>the url of a pdf document (assuming I've found it on the web) to the
>form. What comes back is a nice html rendering (links and all).
>
>
>Hope this helps,
>
>
>-Dave
>
>
>_______________________________________________
>Speakup mailing list
>Speakup at braille.uwo.ca
>http://speech.braille.uwo.ca/mailman/listinfo/speakup
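
For anyone who wants to script the conversions Dave describes, here is a
rough sketch in Python. It assumes the wv package provides the wvHtml
front end (called as "wvHtml in.doc out.html") and that xpdf's pdftotext
is called as "pdftotext in.pdf out.txt". The function names build_command
and convert are made up for illustration, and the sketch assumes the file
sits in the current directory. It has not been checked against every
version of these tools.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(source):
    """Return the converter command line for a .doc or .pdf file."""
    path = Path(source)
    suffix = path.suffix.lower()
    if suffix == ".doc":
        # Assumption: wvHtml (part of wv) takes "input.doc output.html".
        return ["wvHtml", str(path), str(path.with_suffix(".html"))]
    if suffix == ".pdf":
        # Assumption: pdftotext (part of xpdf) takes "input.pdf output.txt".
        return ["pdftotext", str(path), str(path.with_suffix(".txt"))]
    raise ValueError(f"no converter known for {source!r}")


def convert(source):
    """Run the converter if it is installed; return True on success."""
    command = build_command(source)
    if shutil.which(command[0]) is None:
        print(f"{command[0]} is not installed")
        return False
    return subprocess.run(command).returncode == 0
```

Calling convert("report.doc") would then try to write report.html next to
the original, and convert("paper.pdf") would try to write paper.txt. As
Dave notes, the text output from pdftotext will not keep the hyperlinks.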