Speakup Differential Pronounciation Of Strange Characters

Lukas Loehrer listaddr1 at gmx.net
Wed Jun 25 03:52:56 EDT 2008


I do not have a solution and I am not using speakup myself. However, I
can provide the following information:

About the line in the ssh man page, in particular the bit saying (‘.’):

It actually says: "left paren left single quotation mark dot right single quotation mark right paren"

The problem is the "left single quotation mark" and the "right single
quotation mark", which are Unicode code points U+2018 and U+2019.
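
For reference, the names of these code points can be checked with the
unicodedata module from the Python standard library. This is only a
minimal Python 3 sketch; the variable names are made up for illustration:

```python
import unicodedata

# The parenthesized, quoted period as it appears in the man page.
text = "(\u2018.\u2019)"

# Pair each character's code point with its official Unicode name.
names = [("U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]

for code_point, name in names:
    print(code_point, name)
```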

Which TTS are you using with speakup? I just checked, and the espeak
command line program handles these characters correctly when they
are encoded as UTF-8. I am using en_US.utf8 as my locale.
speech-dispatcher with the espeak output module works as well. Have
you tried using speakup with speech-dispatcher?

Here is a Python session showing what these characters look like in UTF-8:

Python 2.5.1 (r251:54863, Mar  7 2008, 04:10:12) 
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = u'(‘.’)'
>>> s
u'(\u2018.\u2019)'
>>> utf8 = s.encode("utf-8")
>>> utf8
'(\xe2\x80\x98.\xe2\x80\x99)'

So, each of the special quote characters is encoded as three bytes in UTF-8.
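
The same check can be sketched in Python 3, where byte strings print with
a b prefix. The last step is a guess rather than a diagnosis: if a console
that is not in UTF-8 mode treats each byte as a separate character, the
first byte 0xE2 is capital gamma in the classic PC code page 437, which
might be related to the "cap gamma" reported below. Whether speakup or the
console actually uses that table is an assumption that has not been
verified here.

```python
# The parenthesized, quoted period with the two typographic quotes.
text = "(\u2018.\u2019)"

# Encode to UTF-8 and show the raw bytes.
utf8 = text.encode("utf-8")
print(utf8)

# Number of UTF-8 bytes needed for each character.
byte_lengths = [len(ch.encode("utf-8")) for ch in text]
print(byte_lengths)

# Assumption: a non-UTF-8 console interprets each byte on its own.
# In code page 437, the lead byte 0xE2 maps to GREEK CAPITAL LETTER GAMMA.
first_quote_byte = utf8[1:2]
as_cp437 = first_quote_byte.decode("cp437")
print(repr(as_cp437))
```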

Best regards, Lukas

luke writes ("Speakup Differential Pronounciation Of Strange Characters"):
> Hello
> 
> If anyone pulls up the ssh manual page, and looks for the third occurrence 
> of the word "character" (I.E.
> 
> man ssh
> /character
> nn
> 
> ), you will see a line similar to this, at the top of your screen:
> 
> line.  The escape character followed by a dot (".") closes the
> 
> On my system, I assume for UTF-8 reasons, if I am reading the screen, 
> speakup reads the parenthesized quoted period as:
> 
> (O circumflex. O circumflex)
> 
> If I read through that character by character, I get:
> 
> left paren
> cap gamma
> null
> null
> dot
> cap gamma
> null
> null
> right paren
> 
> If I run "unicode_start" before looking at the page, the full screen 
> version is:
> 
> left paren. right paren
> 
> The character by character version is:
> 
> left paren
> null
> dot
> null
> right paren
> 
> I find every aspect of this to be strange.
> 
> I am no expert on terminal handling: does anyone have any thoughts on what 
> may be going on here?
> 
> First, why do the supposed quotation marks show up in this way at all, and 
> secondly, why is speakup pronouncing these things as "o circumflex" when 
> read as a phrase, but as "cap gamma null null" when read 
> individually?  I assume that it is reading them as punctuation.
> 
> Luke


