Several problems with the unicode support (was [patch 0/3] speakup: support 16bit unicode screen reading)

Tue Mar 14 18:47:53 EDT 2017

Hi,

  Samuel Thibault wrote:
Tue, Mar 14, 2017 at 10:19:57PM +0100

> Well, speakup is localizable since a long time, but one can only notice
> that it hasn't been actually localized since, while espeak does have
> everything in place, so why not just use that and be done.
> 
When I wrote that, I didn't know that soft/punct is different from
punclvl. Now I have set soft/punct to 1, and I get punctuation which I
can somewhat control with punclvl and readingpunc, but I can no longer
get rid of punctuation completely: even if I set punclvl and readingpunc
to 0, I still hear some symbols, like comma, dot, dash, etc.

> > And now you lose them in English, too.
> 
> I don't understand this. Is there perhaps yet another bug that wasn't
> fixed or reported?
> 
No, I mean if you use an English voice, but you don't use direct mode,
don't you want the unicode characters spoken?
It's worth noting that I sent that letter just before I saw that you had
sent a patch which treats characters above 256 as in direct mode.
So I have no other complaints about unicode reading.

> We'd have to think and code a bit about this. The kernel actually uses
> ucs-2 encoding, while people will probably rather feed the internal
> messages as utf-8 strings. But one has to know whether it's utf-8 or
> some 8bit character set which is being used. That question is actually
> related to pasting, for which we need to know the same :)
> 
Well, a byte order mark might be useful here. Or, if there's no BOM,
maybe assume UTF-8?
How did you know which 8-bit encoding was in use until now?
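
To make the BOM idea concrete, here is a rough userspace sketch of what I
have in mind. It is not speakup code: the names guess_encoding,
looks_like_utf8 and enum text_enc are made up for illustration, and the
fallback to an 8-bit character set when the bytes are not valid UTF-8 is my
own assumption about what would be reasonable.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; none of this is speakup code. */
enum text_enc { ENC_UTF8, ENC_UTF16LE, ENC_UTF16BE, ENC_8BIT };

/*
 * Return 1 if buf[0..len) looks like well-formed UTF-8.
 * Simplified check: it rejects invalid lead bytes, truncated sequences and
 * bad continuation bytes, but it does not reject surrogates or every
 * overlong 3- and 4-byte form.
 */
static int looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t n; /* number of continuation bytes expected */

        if (c < 0x80)
            n = 0;
        else if (c >= 0xC2 && c <= 0xDF)
            n = 1;
        else if (c >= 0xE0 && c <= 0xEF)
            n = 2;
        else if (c >= 0xF0 && c <= 0xF4)
            n = 3;
        else
            return 0; /* stray continuation byte or invalid lead byte */

        if (n > len - i - 1)
            return 0; /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= n; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;

        i += n + 1;
    }
    return 1;
}

/*
 * Guess the encoding of a message: trust a BOM if there is one, otherwise
 * assume UTF-8 when the bytes are valid UTF-8, and fall back to "some
 * 8-bit character set" when they are not. *skip receives the BOM length.
 */
enum text_enc guess_encoding(const unsigned char *buf, size_t len, size_t *skip)
{
    *skip = 0;

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *skip = 3;
        return ENC_UTF8;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *skip = 2;
        return ENC_UTF16LE;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *skip = 2;
        return ENC_UTF16BE;
    }
    return looks_like_utf8(buf, len) ? ENC_UTF8 : ENC_8BIT;
}
```

Plain ASCII comes out as UTF-8 here, which is harmless since the two are
identical in that range. A short 8-bit string can still pass for UTF-8 by
accident, so this is only a guess, not a guarantee.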

> espeak doesn't speak spaces unless strongly being told to do so :)
> 
Yes, that works. Thanks.

-- 
Best wishes,
Zahari