Coded review.

Tuesday 27 June 2006

Screenshot TextAloud

Pic of the day: The program works by loading documents into what looks like a word processing program (here showing a random technical document) and then reading them aloud. You can also run it directly from Internet Explorer and Firefox. Sadly I can't take a picture of the sound, but there's a 30 day free trial.

TextAloud

Being the curious person that I am, I recently noticed an advertisement for TextAloud, a program that does indeed read text aloud. Some weeks ago I read about text-to-speech in my old friend Bjørn Stærk's blog. OK, perhaps that was a couple of months ago. Sometimes I talk to my computer and it mostly understands me (not now, though, because my throat is in poor shape, see last week's diaries about my health). So I figured, if it can understand me, I can probably understand it.

After listening to a number of available voices, I realized that only one of them sounded fairly easy to understand: Mike 16, which refers not to his age but to the kilohertz of sampling frequency, I think. For some reason the female voices are all harder to understand, but those are the most popular. I can understand that too: both men and women would prefer to be talked to by a woman, since women are less threatening. When you want to listen to your computer, the last thing you need is to feel threatened. But I'm more rational than that, right? So I stuck with the one easiest to understand, which is Mike. I had to buy the voice separately; the one that comes with the program is Mary, and she's too excited. I need something more quiet, stable and low-pitched.

So yeah, I bought the program and the pack of two extra voices (you can't get just one). Didn't set me back too much, $54.95 altogether. I installed it at work so it could read The Economist to me while I work with my other computer on the local network.

I am underwhelmed. It is passable, but it still sounds very robotic. The pronunciation is off and it is all too obvious that it is doing only simple parsing. It takes cues from a few words that tend to appear before pauses, but sometimes that guess is wrong and it sounds quite alien. It gets the relative length of pauses wrong with some types of punctuation, and it cannot handle the rise and fall of tone in longer sentences. With long words it clearly just wings the pronunciation and sometimes fails miserably. This is understandable since English is somewhat random in how words are put together, probably because of all the pilfering from other languages.
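To show what I mean by "simple parsing", here is a toy sketch in Python. It is emphatically not how TextAloud works inside (I have no idea about that); the function name naive_pauses and the pause lengths in the table are made up for illustration. It just assigns a pause to each punctuation mark without looking at context, which is roughly the kind of guessing that goes wrong.

```python
import re

# Hypothetical pause lengths in milliseconds (invented values, not from any real product).
PAUSE_MS = {",": 200, ";": 300, ":": 300, ".": 500, "?": 500, "!": 500}

def naive_pauses(text):
    """Split text into (chunk, pause_ms) pairs by looking at punctuation only."""
    # Each match is a run of non-punctuation followed by at most one punctuation mark.
    pieces = re.findall(r"[^,;:.?!]+[,;:.?!]?", text)
    result = []
    for piece in pieces:
        piece = piece.strip()
        if not piece:
            continue
        # The pause depends only on the last character; no context is considered.
        pause = PAUSE_MS.get(piece[-1], 0)
        result.append((piece, pause))
    return result

if __name__ == "__main__":
    for chunk, pause in naive_pauses("Dr. Smith paid $54.95, I think."):
        print(f"{chunk!r} then pause {pause} ms")
```

Run on "Dr. Smith paid $54.95, I think." this takes a full sentence-length pause after "Dr." and another in the middle of the price, because a period is a period as far as it can tell. A real program obviously does better than this, but the failures I hear are of the same family.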

Let us say, don't let this thing teach foreigners to speak English, or they won't get any jobs when they arrive. And this is considered to be one of the best programs.

Text-to-speech technology can benefit from another decade of progress, I think. But if you suddenly lose your eyesight or your voice or something, it beats silence by a wide margin. And it does have entertainment value. If you gather a bunch of your friends, I think you should break it out after two drinks or so... because that's how I am naturally. Easily amused but not yet crazy. (Though you may wonder about that last part, with me having spent several hours' pay on this.)

TextAloud is a product from NextUp.com.


I welcome e-mail. My handle is "itlandm" and I now use gmail.com.