Feature request : transcriber

Temlakos temlakos at gmail.com
Thu Dec 6 15:32:32 UTC 2012


On 12/06/2012 10:03 AM, Beartooth wrote:
>
> 	People I know are forever sending URLs that are just talking
> heads. I don't do talking heads.
>
> 	But iiuc there is now software that transcribes speech, though I
> don't know how well. Would it then not be possible to write software that
> would go to a site and transcribe what is said there? Even if it could do
> only one speaker, it could save a lot of us a lot of time.
>

Dragon Dictate, and Dragon Naturally Speaking, have been on the 
commercial (that is, shrink-wrap) market for a very long time. They 
transcribe /your/ speech. You sit at the console, hook up a microphone, 
and first read a Mark Twain short story. Then you can speak into the 
microphone, and the Dragon program will produce machine-readable text, 
in your word processor, from what you say.

I have not used Dragon for thirteen years. Back then, all I really had 
was Windows. (Linux on the desktop was then still young; graphical 
environments such as GNOME and KDE were only just getting started.) 
Also back then, I had to speak v-e-r-y 
s-l-o-w-l-y. Like, one syllable per second, or even one syllable every 
two seconds. Processors in those days were simply not fast enough for 
Dragon to keep up with, say, someone delivering a speech to a crowded 
lecture venue.

Today, they might be. That is, the Intel Core i7, or whatever equivalent 
AMD might have produced, might be. All I know is that Dragon have 
started to advertise on cable television, something they never did 
before. And they are pitching this program to housewives who think they 
might want to break into novel-writing. Ask any novelist; novel-writing 
is a /very/ text-intensive thing to do.

Three problems:

1. I'm still skeptical that even a modern processor can keep up with 
someone's natural pace of speech. Today the ads say that Dragon produces 
smooth word-processing documents and e-mails "three times faster than 
most people type." But: can they create that text as fast as most people 
/talk/? And do they mean three times faster than a professional office 
secretary can type? (The usual standard was about sixty words a minute 
on an old-fashioned impact typewriter. That is roughly the pace I had to 
slow my speech down to, thirteen years ago, when I last used Dragon.)

2. I have never seen an open-source implementation, version, or 
equivalent of Dragon. Nor has Dragon, to my knowledge, ported their 
software to Linux. They've ported it to Mac, and I've seen it offered as 
"shareware" (with a $99 suggested "license" fee).

3. I have never seen any claim that Dragon, or anything like it, can 
produce a transcript of a video. Even to implement that would be a 
challenge. Or it might not be: no one has explained to me what steps 
Dragon takes to produce text. If it writes your spoken words to a 
temporary audio file and then transcribes that file almost as soon as 
you have spoken, then it just might be able to transcribe an existing 
recorded speech, on audio or video. All it would have to do is recognize 
the codec. (Or else you convert your audio file to a codec that the 
transcription program can recognize.)
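
For what it's worth, the "convert your audio file" step could be 
sketched as below. This is only a sketch under assumptions of mine: I 
assume the ffmpeg command-line tool is installed, and that the 
transcription program wants mono, 16 kHz, 16-bit PCM WAV, which is a 
guess and not something I know about Dragon. The function names and 
file names are made up for illustration.

```python
import shlex
import subprocess
from pathlib import Path


def build_convert_command(source, target, sample_rate=16000):
    """Build an ffmpeg command that extracts mono 16-bit PCM WAV audio.

    Assumption: the target format (mono, 16 kHz, PCM) is a guess at what
    a transcription program might accept, not a documented requirement.
    """
    return [
        "ffmpeg",
        "-i", str(source),        # input: any audio or video file ffmpeg can read
        "-vn",                    # drop the video stream, keep audio only
        "-ac", "1",               # mix down to one channel (mono)
        "-ar", str(sample_rate),  # resample to the given rate in Hz
        "-acodec", "pcm_s16le",   # uncompressed 16-bit little-endian PCM
        str(target),
    ]


def convert(source, target):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_convert_command(source, target), check=True)
    return Path(target)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_convert_command("talking_head.mp4", "talking_head.wav")
    print(shlex.join(cmd))
```

Run as a script, it only prints the command it would use; call 
convert() to actually produce the WAV file. Whether any transcription 
program would then accept that file is a separate question.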

Now about your talking-head issue: if you've reached an advertisement, 
you might be able to get an instant transcript this way:

1. Navigate to the page.

2. As soon as the talking head starts talking, click the button that 
closes the page (or the tab).

3. A dialog will appear: "Are you sure you want to leave?" Hit "Cancel."

The page will still be there, only now you will see a printed 
transcript. You should then be able to select, copy and paste. I know 
you can still read it all. I've done it a few times.

This is a good question, actually. I'd love to see a "build" of 
something like Dragon. But that development might be way beyond the 
scope of the Fedora project.

Temlakos
