eSpeak -- features wish list
Hynek Hanke
hanke at brailcom.org
Fri Apr 14 07:20:47 EDT 2006
Hello Jonathan,
I'm happy that people like eSpeak so much; it seems to be a very
good technology. I'm going to add the config script for Speech
Dispatcher to the official distribution in the next release. You
asked what features you could add to make it more usable
for accessibility purposes.
I'm the main developer of Speech Dispatcher, a project that
tries to unify the access of free software accessibility tools
to speech synthesis engines.
Basically, what we want to do right now is to split Speech
Dispatcher into two parts: message dispatching (prioritization etc.)
and TTS API (access to synthesizers). For that purpose, we
developed a requirements document for the API, which also
more or less defines the capabilities we expect from the
synthesizers. You might want to look at the requirements document:
http://lists.freedesktop.org/archives/accessibility/2006-March/000078.html
It is still a draft and there will be some changes to it.
But the sub-part about SSML deals with the synthesis settings
capabilities which the users want or would like to have.
Of course I'm posting the link to this document merely as
a potential guideline for you. This API will be implemented by some
layer above the engine drivers and missing MUST HAVE and SHOULD HAVE
capabilities can still be emulated either in the engine drivers or in
the covering layer.
This API is being worked on by Brailcom (Speech Dispatcher), KDE and
Gnome. In fact, KDE is going to use Speech Dispatcher soon.
The things that would most help currently are:
1) The ability to return audio data instead of playing it itself.
(This would enable us to write a native driver for Dispatcher
or TTS API which could be a good improvement. Also it would
instantly solve the audio problems.)
2) Settings for signalling punctuation and capital letters.
(See TTS API requirements draft above, section SSML. This
doesn't mean this functionality needs to be implemented with
SSML or embedded markup. It can be a static setting passed to the
binary, e.g. espeak --punctuation="all".)
3) Some way of communicating with it other than running the binary
again for each message (which is more CPU expensive). See for
example how Flite works with Dispatcher (linking a library) or
how Festival works (provides a TCP/IP interface).
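
To make the three wishes above a bit more concrete, here is a rough
sketch in Python of the kind of interface a driver would find easy to
use. It is only an illustration: the class ToySynth, its punctuation
argument and its synthesize() method are made-up names, not part of
eSpeak or Speech Dispatcher, and the "audio" is just a placeholder tone
whose length depends on the text.

```python
import math
import struct


class ToySynth:
    """Hypothetical in-process synthesizer (not a real eSpeak API).

    It is created once and reused for many messages (wish 3).
    """

    SAMPLE_RATE = 8000  # samples per second, assumed for this sketch

    def __init__(self, punctuation="none"):
        # Wish 2: punctuation handling is a plain setting, no markup needed.
        if punctuation not in ("none", "all"):
            raise ValueError("punctuation must be 'none' or 'all'")
        self.punctuation = punctuation

    def _spoken_text(self, text):
        # With punctuation="all", a few punctuation marks are spelled out.
        if self.punctuation == "none":
            return text
        names = {".": " dot ", ",": " comma ", "?": " question mark "}
        return "".join(names.get(ch, ch) for ch in text)

    def synthesize(self, text):
        # Wish 1: return 16-bit mono PCM data instead of playing it.
        spoken = self._spoken_text(text)
        # Placeholder "speech": 10 ms of a 440 Hz tone per character.
        n = len(spoken) * self.SAMPLE_RATE // 100
        samples = (
            int(8000 * math.sin(2 * math.pi * 440 * i / self.SAMPLE_RATE))
            for i in range(n)
        )
        return struct.pack("<%dh" % n, *samples)


# The caller keeps one synthesizer alive and decides what to do with
# the returned audio (play it, queue it, discard it on cancel).
synth = ToySynth(punctuation="all")
for message in ("Hello, world.", "Second message"):
    audio = synth.synthesize(message)
```

The point is only the shape of the calls: the engine is loaded once,
settings are ordinary parameters, and audio output is left to the caller.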
I hope I didn't scare you very much :) Of course these are wishes
and some of them are rather longer-term wishes. I think you have done
great work!
Thank you,
Hynek Hanke