Reading Lips In Software 149
SEWilco writes "The Register points out that Intel has released code for reading lips from a video image, Audio Visual Speech Recognition (AVSR). They do point out that better results would probably be achieved by combining video and audio recognition processing. I don't know if they have any patents, we all know some prior "art" from 2001, er.. 1968. HAL's accomplishment was also mentioned by CNN during 2001 in an article about this group's work."
The only hope for privacy: (Score:5, Funny)
Men and women, boys and girls. All with really thick, dirty, obscuring mustaches.
What is this world coming to?
Re:The only hope for privacy: (Score:3, Funny)
Re:The only hope for privacy: (Score:1)
Re:The only hope for privacy: (Score:1)
Good or Evil? (Score:5, Insightful)
Go calculate [webcalc.net] something
finally! (Score:2)
I've been putting it off for far too long.
Re:finally! (Score:4, Funny)
Good plan... Oh wait, who would you talk to? Bad plan.
Re:Good or Evil? (Score:1)
I'd be scared of speaking to my computer now, tho - can you imagine a virus that uses your own webcam or whatever to see what you're saying when you're sitting in front of the screen?
I guess no more webcam sex for me. =(
Sigh... (Score:3, Interesting)
I'm thinking that the 'good' will outweigh the 'evil' here...
Re:Sigh... (Score:4, Interesting)
Re:Sigh... (Score:2, Interesting)
From the Reg article... "Intel's announcement implies that the system works better when coupled with facial recognition to identify 'known' speakers."
Doesn't this imply that, at least for the foreseeable future, this technology won't be easily used as some general Orwellian tool? It sounds as though it needs to 'learn' each speaker - much like voice recognition software has to be trained to your voice before it can be used accurately.
Re:Sigh... (Score:1)
Twenty years and why not hear everything?
Re:Good or Evil? (Score:2)
ig-pay atin-lay (Score:4, Funny)
geeze, that really wasn't worth the effort...
Re:ig-pay atin-lay (Score:1)
Re:ig-pay atin-lay (Score:1)
Re:Good or Evil? (Score:1)
This reminds me of the Seinfeld episode where George wants to "borrow" Jerry's deaf girlfriend to read people's lips. George and Jerry try to hide their lips from her as they discuss her lip-reading abilities, but she can read their lips anyway.
No new taxes! (Score:5, Funny)
Re:No new taxes! (Score:2)
[ducks]
What about changing what people say? (Score:3, Interesting)
Re:What about changing what people say? (Score:3, Informative)
I know others can lipread better than I can, but even in lipreading class they said that you won't be able to catch everything and have to fill in the blanks.
Just to note, not all Deaf people can lipread, and not all people can be lipread. Bushy mustaches and not moving your mouth when you talk are two big obstacles.
Re:What about changing what people say? (Score:1)
Woot, this is a godsend for us college students. (Score:5, Funny)
And well, beats manual note taking if the computer can read the board and his mouth and his voice.
Re:Woot, this is a godsend for us college students (Score:2, Insightful)
Re:Woot, this is a godsend for us college students (Score:2)
Re:Woot, this is a godsend for us college students (Score:1)
So computers can now talk to themselves (Re /.) (Score:5, Interesting)
http://cerboli.mit.edu:8000/research/mary101/resu
Re:So computers can now talk to themselves (Re /.) (Score:1, Funny)
Yep, it is called networking.
Re:So computers can now talk to themselves (Re /.) (Score:2, Interesting)
Re:So computers can now talk to themselves (Re /.) (Score:1)
Why not combine the whole works with some Animatronics (a la Disney) and make some robots?
Body language (Score:5, Funny)
Body language should be even easier than lip reading. I want to know if I'm wasting my time or whether I should invite her back to my place.
Re:Body language (Score:5, Funny)
Suchetha
Re:Body language (Score:1, Funny)
You: Did you know chickenz cross road to not be on old side?
Us: LOL!! You so funny!!
You: Black peoples are funny with watermelon in their cadillac!
Us: OMG - its funny cause its funny cause itr true!!!
You: Please, take my wife!
Us: Ha Ha ha ah ah aH
You: Nerd get no sex!!!! Ha ha ha
Us: Funny but true but sad but funny to
Re:Body language (Score:1)
Oh yeah THAT'll work (Score:4, Funny)
Frink: (With sarcasm detector) Are you kidding? This baby is off the charts mm-hai.
CBG: A sarcasm detector, that's a real useful invention.
(Sarcasm detector explodes)
Re:Body language (Score:2)
Some coding expertise... (Score:4, Insightful)
Re:Some coding expertise... (Score:2, Interesting)
Re:Some coding expertise... (Score:4, Insightful)
Re:Some coding expertise... (Score:2, Interesting)
Re:Some coding expertise... (Score:2)
Re:Some coding expertise... (Score:1)
Re:Some coding expertise... (Score:2)
Re:Some coding expertise... (Score:2)
Sound pressure waves cause the density of air to fluctuate, which would bend the path of a beam of light travelling through it.
basically, you'd need more than one laser in this situation, i think, you'd need l
Re:Some coding expertise... (Score:2)
Re:Some coding expertise... (Score:1)
Re:Some coding expertise... (Score:1)
Voice recognition that doesn't require training? (Score:1)
Unfortunately, my voice is not the one giving the lectures, and there are actually two or three different lecturers. Since training is impossible (AFAIK, at least), I'm wondering how far speech-to-text technology has come, especially in the open source community. Ca
Re: (Score:2)
Planet Express Delivery Ship (Score:3, Funny)
Fry, Leela, and Bender are hiding out in the shower discussing how to turn off the Planet Express Delivery Ship. The little red light is on, the screen is scrolling back and forth between the lips as Leela gives orders and Bender objects. Then the ship says, "Oh, if only I could read lips!"
Orwellian p0ssibilities (Score:2, Insightful)
Re:Orwellian p0ssibilities (Score:1)
Not that 2001 ended up being very accurate... (Score:5, Interesting)
Where there's life there's hope / Please pass the (Score:2)
As for computer lip reading, there's a chapter [mit.edu] in Hal's Legacy about this very topic.
Reason for this being released as open source (Score:2)
Re:Reason for this being released as open source (Score:2)
Re:Reason for this being released as open source (Score:2)
This just in... (Score:2)
Too late for me... (Score:4, Funny)
Oh yeah? Lip Read this! (Score:3, Funny)
Can it read this? (Score:3, Funny)
Anybody played with other languages? (Score:2)
Re:Anybody played with other languages? (Score:2)
Re:Anybody played with other languages? (Score:3, Funny)
Re:Anybody played with other languages? (Score:2)
But can pitch be lip-read? If not, would a system like this work at all for languages that use pitch as well as formants to distinguish between words?
Re:Anybody played with other languages? (Score:2)
Japanese lip reading is very hard though. For example, you can't tell if I'm saying, "tsu", "zu", "su" just by my lips. You can also go through "ra", "ri", "ru", "re", "ro" without moving your lips (and if you do,
Ha I've fooled them (Score:2)
fools...
ummm wait.
OpenCV under Linux? (Score:2)
I've investigated Intel's vision library, OpenCV, before... and it does appear to be available for Linux if you look hard enough... but I couldn't find any Linux applications using it to actually *do* something.
Has anyone had any success with OpenCV/Video4Linux?...
How do you think the court system would handle... (Score:3, Interesting)
Would this tool then be declared a "circumvention device" under the DMCA, or would the courts finally realize that code can be considered protected speech? The code was, after all, spoken in its original form in this case.
This same question could also be applied to audio-to-text converters as well. Maybe there's hope the DMCA will be declared unconstitutional after all.
Interesting food for thought...
David
Re:How do you think the court system would handle. (Score:1)
Re:How do you think the court system would handle. (Score:2)
Very good point... for that matter, how would the courts handle it even without this new technology? Even without programs that can read words from video, it is still theoretically possible (though maybe not practically possible) that someone could read the source code to DeCSS aloud onto a video tape, such that someone else at the receiving end could manually record that code into a source file and compile it.
(And if you wanted to be really ironic about it, you could always store the video on a DVD :-) )
Re:How do you think the court system would handle. (Score:1)
Prior Art (Score:4, Informative)
"A computer, examining a set of video images, to perform lip reading" is not patentable. HAL would be prior art for this; but it doesn't matter because there isn't any inventive step here anyway.
"A computer, processing a set of video images by locating what appears to be a set of lips, selecting recognizable points, using the movement of those points to track the deformation against a 3D model, comparing against a table of syllables to compute the probability of each particular syllable, and using knowledge about a language to determine which syllables are most likely to follow each other" could be patented. HAL would not be prior art for this, because there is no indication of how HAL performed the lip reading.
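To make the last two steps of that description concrete (a table of per-syllable probabilities, plus knowledge of which syllables tend to follow each other), here is a minimal Python sketch using Viterbi decoding. This is only an illustration of the general idea, not Intel's method or anything from the article; the function name `decode_syllables`, the toy probabilities, and the bigram table are all made up for the example.

```python
import math

def decode_syllables(frame_probs, bigram, floor=1e-6):
    """Return the most likely syllable sequence (Viterbi decoding).

    frame_probs: list of dicts, one per time step, mapping
                 syllable -> P(syllable | observed lip shape)  (hypothetical input)
    bigram:      dict mapping (previous, next) -> P(next | previous),
                 standing in for "knowledge about a language"
    floor:       small probability used for missing or zero entries
    """
    if not frame_probs:
        return []
    # best[s] = (log-probability, path) of the best sequence ending in s
    best = {s: (math.log(max(p, floor)), [s]) for s, p in frame_probs[0].items()}
    for probs in frame_probs[1:]:
        new_best = {}
        for s, p in probs.items():
            emit = math.log(max(p, floor))
            # Pick the predecessor that maximizes path score plus transition score
            score, path = max(
                (
                    (prev_score + math.log(max(bigram.get((prev, s), floor), floor)), prev_path)
                    for prev, (prev_score, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[s] = (score + emit, path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

# Toy example: "ma" and "pa" look nearly identical on the lips.
frames = [{"ma": 0.4, "pa": 0.6}, {"ma": 0.5, "pa": 0.5}]
bigram = {("ma", "ma"): 0.8, ("ma", "pa"): 0.1, ("pa", "ma"): 0.1, ("pa", "pa"): 0.1}
print(decode_syllables(frames, bigram))  # ['ma', 'ma']
```

In the toy data the first frame looks slightly more like "pa", but the language table makes "ma ma" the better overall guess, which is the "fill in the blanks" step a human lipreader does from context.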
Fox News (Score:2, Funny)
Re:Fox News (Score:2, Funny)
It's cool with me... (Score:2)
SF movies typically don't count as prior art... (Score:4, Informative)
patents are supposed to be on inventions, not ideas. (very) generally speaking, you have to demonstrate you know how to do something for it to count as prior art. actually building something counts, as does a patent application (since the patent application has to explain how the invention works at a reasonable level of detail, for an admittedly arguable legal definition of reasonable).
ianal, but the last i heard, a mention in a science fiction book or movie wouldn't typically be considered prior art. a person skilled in the art can't tell from 2001 how to make a computer read lips.
Stupid, offtopic and not funny. (Score:2)
Lisp ?? (Score:1)
First I thought, Jeeze... I can already read Lisp, emacs style...
Then... ohhhh... they mean Lisps... like a speech impediment... That would be cool, to read lisps.
But reading lips makes much more sense.
Actually, this could be a major breakthrough (Score:4, Interesting)
silence is golden (Score:1)
This could solve my fundamental beef with speech as an interface - privacy! Dictating email and documents would be great, if I didn't have to broadcast to everyone around me. Not to mention the annoyance of hearing the guy in the next cube complain to his girlfriend over IM...
Mouthing words silently takes some getting used to, but it has advantages. No more trying to type on a tiny PDA keyboard - etc. Obviously this is a ways off, but it seems doable.
cool (Score:1)
Still patentable? (Score:1)
"I don't know if they have any patents, we all know some prior "art" from 2001, er.. 1968. HAL's accomplishment was also mentioned by CNN during 2001 in an article about this group's work."
Is there not a difference between the idea and the way to implement the technical solution? Meaning they cannot patent the idea, but they can still patent the code itself, for the way the code works.
Just curious. What does everyone think?
I am unreadable (Score:1)
I don't like this at all! (Score:1)
The REAL THREAT of this is "them" using cameras to look at people from afar (or by whatever means) and eavesdropping on people when they can't get a microphone in..
You can be sure that H.L.S. will jump on this like white on rice...
They certainly aren't the first (Score:2, Informative)
video quality problems (Score:2)
But then on a DVD you'd just hit the subtitle button and problem sorted
Bah humbug... your brain already does this. (Score:2)
Sports (Score:2, Funny)
Soviet Russia (Score:1)
Lipreading is a myth, as is this code working. (Score:4, Informative)
The idea most people have of lipreaders, as in the movie See No Evil, Hear No Evil (the Richard Pryor and Gene Wilder comedy) or the Seinfeld lipreader episode, just really isn't possible. Many sounds such as "t" and "d" look exactly the same, and many such as "k" and "g" are not visible at all. The best lipreaders can really only get 2/3 of what is being said (if they are entirely Deaf, which many Deaf people are not; if your hearing loss is not total, lipreading can be far more effective), and that is with the person speaking slowly and facing them, plus human intuition (context). Throw in facial contortions (like yelling: "they can't hear me, so if I yell it will help"), low light, a bad angle, fast talking, etc., and the accuracy drops dramatically.
Computers lack the ability to figure out what word is being said based on context when the lips don't provide adequate information. They are also historically terribly poor at things like complex image recognition. What is registration script busting based on? Image recognition with noise in the image (i.e. type the word that appears into the next form box). And no one has even come close to a functional computer ASL interpreter, even though ASL is far easier to distinguish visually than speech.
I don't see the 40% word error rate it currently has improving much at all, and I'm guessing the video feed that figure comes from isn't anything like full-speed, non-exaggerated human speech.
Your fears of the video cameras on the streets logging your conversations are pretty unfounded.
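For anyone who wants to see the "t and d look the same" problem in code, here is a small Python sketch. Big assumptions: it uses spelling as a crude stand-in for sounds, and the `VISEME` table is a made-up, heavily simplified grouping (t/d together, p/b/m together, k/g invisible), not a real phonetic model.

```python
from collections import defaultdict

# Hypothetical, heavily simplified letter-to-lip-shape table.
# Letters not listed are assumed (unrealistically) to be fully distinguishable.
VISEME = {
    "t": "T", "d": "T",            # look the same on the lips
    "p": "P", "b": "P", "m": "P",  # all made with closed lips
    "k": "", "g": "",              # made in the back of the mouth, not visible
}

def viseme_key(word):
    """Collapse a word to what a camera could (roughly) see."""
    return "".join(VISEME.get(ch, ch) for ch in word.lower())

def group_lookalikes(words):
    """Group words that collapse to the same visible sequence."""
    groups = defaultdict(list)
    for word in words:
        groups[viseme_key(word)].append(word)
    # Keep only the groups where more than one word looks identical
    return {key: group for key, group in groups.items() if len(group) > 1}

print(group_lookalikes(["tie", "die", "pat", "bat", "mat", "sun"]))
# {'Tie': ['tie', 'die'], 'PaT': ['pat', 'bat', 'mat']}
```

Even with this toy table, several everyday words become indistinguishable, so anything reading lips alone has to lean on context to choose between them.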
Reading lithpth? (Score:2)
The Conversation (Score:1)
The idea of combining it with speech recognition in an adaptive fashion, using one source to cross-check the other, could open up a whole new area of privacy invasion.
Imagine this stuff running on all the CCTVs in the town where you live...
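A rough Python sketch of what "using one source to cross-check the other" could look like: a weighted log-linear combination of the per-word probabilities from an audio recognizer and a lip reader. This is just one simple way to fuse two guesses, assumed for illustration; the function name `fuse`, the weight, and the numbers are hypothetical, and I'm not claiming Intel's code works this way.

```python
import math

def fuse(audio, video, audio_weight=0.5, floor=1e-6):
    """Combine two word-probability tables into one normalized table.

    audio, video: dicts mapping word -> probability from each recognizer
    audio_weight: how much to trust audio (0.0 = video only, 1.0 = audio only)
    floor:        small probability for words one recognizer did not propose
    """
    scores = {}
    for word in set(audio) | set(video):
        a = max(audio.get(word, 0.0), floor)
        v = max(video.get(word, 0.0), floor)
        # Weighted sum of log-probabilities (a weighted geometric mean)
        scores[word] = audio_weight * math.log(a) + (1.0 - audio_weight) * math.log(v)
    top = max(scores.values())
    unnormalized = {word: math.exp(s - top) for word, s in scores.items()}
    total = sum(unnormalized.values())
    return {word: u / total for word, u in unnormalized.items()}

# In noise, "met" and "net" sound alike, but only "met" starts with closed lips.
audio = {"met": 0.45, "net": 0.55}
video = {"met": 0.9, "net": 0.1}
fused = fuse(audio, video)
print(max(fused, key=fused.get))  # met
```

Lowering `audio_weight` in a noisy room (or raising it when the camera angle is bad) is the adaptive part: each source covers for the other's weak spots.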
doesn't work well (Score:1)
Re:Copyrighted Prior Art (Score:3, Interesting)
Just in case anyone gets the wrong idea here, copyrighted works cannot be used to contravene a patent.
erm, yes they can. In fact, the firm I work for specializes in that very thing.
Re:Prior Art? (Score:5, Interesting)
No, he never did. If he had, he would almost certainly by now be far and away the richest man on the planet. Now, imagine if you will what Arthur Clarke might have done with a fortune that would make Gates green with envy... He'd have been on Mars twenty years ago.
Re:Prior Art? (Score:1, Funny)
--
I'm not a cowboy, just let me submit this comment and get back to work.
Re:Prior Art? (Score:2)
Re:Prior Art? (Score:2)
I see you also had the same thought I did. Clarke would have been the richest man, but he wouldn't be on this planet. If not on Mars, he'd at least be...in Clarke Orbit.
(I like his paper's discussion about whether radio frequencies might pass through the atmosphere: "..we have visual evidence that frequencies at the optical end of the spectrum pass th
Re:Prior Art? (Score:2)