What should Festival do better? (was: Speech Dispatcher 0.5 Release Candidate 1 available)

Fri Jul 2 17:48:13 EDT 2004

On Fri, Jul 02, 2004 at 04:01:48PM -0400, Jacob Schmude wrote:
> However, the big issue for me is responsiveness. This, to put it frankly, 
> festival is not. It shuts up just fine, but it takes at least a second to 
> begin speaking. This is a 2.13ghz athlon I'm talking about, and it takes a 
> second to speak a short line. If this line happens to be of a long length 
> (such as an email message listing) festival takes 3 seconds or more to 
> start speaking.

If Festival is doing this on your machine, something is *really*
wrong.  I do agree that responsiveness is still worse than with
hardware synthesis or Flite when you are scrolling quickly through
lines or doing something similar. But the delays I'm tuning here right
now are on the order of tens of milliseconds, and I have a 1.5GHz machine.

(Also, the keys and characters get cached in the Festival output module
if you use Festival through Speech Dispatcher, so they don't even have
to be synthesized most of the time and typing input or moving a cursor
should be quite fast.)
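
To illustrate the idea only, here is a minimal Python sketch of that
kind of caching. It is not the actual output module code: the class
name, the stand-in synthesizer and the cache size are all made up for
the example. A real cache would presumably also have to key on voice,
rate and similar settings.

```python
from collections import OrderedDict
from typing import Callable, List

# Records every text that really had to be synthesized (illustration only).
synth_calls: List[str] = []


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a slow call to the synthesizer."""
    synth_calls.append(text)
    return text.encode("utf-8")


class SynthesisCache:
    """Small least-recently-used cache mapping text to synthesized audio."""

    def __init__(self, synthesize: Callable[[str], bytes], max_entries: int = 256):
        self._synthesize = synthesize
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, text: str) -> bytes:
        if text in self._entries:
            # Cache hit: no synthesis needed, just mark as recently used.
            self._entries.move_to_end(text)
            self.hits += 1
            return self._entries[text]
        # Cache miss: synthesize once and remember the result.
        self.misses += 1
        audio = self._synthesize(text)
        self._entries[text] = audio
        if len(self._entries) > self._max_entries:
            # Drop the least recently used entry.
            self._entries.popitem(last=False)
        return audio


if __name__ == "__main__":
    cache = SynthesisCache(fake_synthesize, max_entries=128)
    for key in "hello":
        cache.get(key)
    # The second "l" is served from the cache instead of being synthesized.
    print(cache.hits, cache.misses)
```

Since single keys and characters repeat constantly while typing, even
a small cache like this would avoid most round trips to the
synthesizer.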

I don't know what might cause such big delays on your machine, but it's
definitely not standard behavior for Festival and Speech Dispatcher.
Could you please send me a Festival output module debug log after
you install 0.5rc1?

Still, even on my machine, most of the delay in communication with
Festival is caused by issues in socket communication and the like, not
by a lack of performance of my computer, so I think this can be fixed.
However, if I downscale to 500MHz, a lack of performance becomes visible.

> After responsiveness gets fixed, what we need for festival is a 
> high-quality voice set. 
> Come to think of it, flite and FreeTTS could use 
> this as well.

I don't know about FreeTTS, but I've heard that importing Festival
voices into Flite is not a trivial job :( But I agree with you that we
need more (especially more languages!) and higher quality voices for
Festival. The question is who will work on it, since it requires quite
a lot of time to create a new voice. On the other hand, Festival is
very flexible in this regard. It uses a Lisp-like extension language,
and many interesting things can be done this way without the need to
modify C code in the core of Festival.

Regards, and thanks to everyone for an interesting discussion so far,
Hynek