Hands-On With Microsoft's Touchless SDK

Posted by Soulskill on Sat Oct 11, 2008 10:56 AM
from the but-are-they-open-source-tomatoes? dept.
snydeq writes "Fatal Exception's Neil McAllister takes Microsoft's recently released Touchless SDK for a test spin, controlling his Asus Eee PC 901 with a Roma tomato. The Touchless SDK is a set of .Net components that can be used to simulate the gestural interfaces of devices like the iPhone in thin air — using an ordinary USB Webcam. Although McAllister was able to draw, scroll, and play a rudimentary game with his tomato, the SDK still has some kinks to work out. 'For starters, its marker-location algorithm is very much keyed to color,' he writes. 'That's probably an efficient way to identify contrasting shapes, but color response varies by camera and is heavily influenced by ambient light conditions.' Moreover, the detection routine soaked up 64 percent of McAllister's 1.6GHz Atom CPU, with the video from the Webcam soon developing a few seconds' lag that made controlling onscreen cursors challenging. Project developer Mike Wasserman offers a video demo of the technology."
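To make the "keyed to color" point concrete, here is a minimal sketch of locating a marker by color, written in plain Python. It is not the Touchless SDK's code (the SDK is a set of .Net components). The function name `locate_marker`, the RGB distance test, and the tolerance value are all illustrative assumptions. The second call shows why ambient light matters: dimming the same frame pushes the tomato's pixels outside the tolerance and the marker is lost.

```python
def locate_marker(frame, target, tolerance=60):
    """Return the (x, y) centroid of pixels close to `target`, or None.

    frame:  list of rows, each row a list of (r, g, b) tuples (0-255).
    target: the (r, g, b) color the marker was "trained" on.
    """
    tol_sq = tolerance * tolerance
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            dr, dg, db = r - target[0], g - target[1], b - target[2]
            # Squared Euclidean distance in RGB space (hypothetical metric).
            if dr * dr + dg * dg + db * db <= tol_sq:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# Synthetic 6x4 frame: grey background with a 2x2 tomato-red patch.
GREY, RED = (128, 128, 128), (200, 30, 30)
frame = [[GREY] * 6 for _ in range(4)]
for y in (1, 2):
    for x in (3, 4):
        frame[y][x] = RED

marker = locate_marker(frame, target=(210, 40, 40))

# Same scene under much dimmer light: every channel drops to a third.
dimmed = [[(r // 3, g // 3, b // 3) for (r, g, b) in row] for row in frame]
marker_dimmed = locate_marker(dimmed, target=(210, 40, 40))

print(marker, marker_dimmed)
```

Scanning every pixel of every frame like this is also a plausible reason for high CPU use on a small machine, though the summary does not say where the SDK actually spends its time.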
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by Anonymous Coward on Saturday October 11 2008, @11:10AM (#25339847)

    Can it recognise that someone's about to pick up a chair?

  • Sounds a lot like the stuff developers have been doing with the Eyetoy since PS2... I wonder if this tech will show up on the 360, and they're just getting the kinks out now with this stuff? I don't know if people would use that practically since they would have to switch from having their hands up in the air to down on the keyboard/mouse for various things... Maybe it can be used for kiosks for people who worry about germs...
    • Maybe it can be used for kiosks for people who worry about germs...

      ... Conjures up mental image of shoppers at a suburban mall flailing in space at some computer kiosk ...

      I really wish you wouldn't have said that.

    • Sounds a lot like the stuff developers have been doing with the Eyetoy since PS2...

      Way before that... A guy did this at the University of Michigan back in 1991 or so. There was a little Mac science fair thing at the campus computer store in the Union, and this one guy blew all the other contestants away with his touchless interactive project. Wish I remembered his name.

  • by Sockatume (732728) on Saturday October 11 2008, @11:12AM (#25339863) Homepage
    While it's very vogueish to make comparisons with Apple products lately, Sony's Cambridge studio are the group that spring to mind when it comes to gestural webcam-based interfaces. On a related note, their original Eyetoy tech demos were similarly "keyed to color", using large foam props, although the end product worked on skintones and therefore was heavily dependent on good lighting and contrast. They patented a "wand" with coloured LEDs back in 2005 which provided a reasonable compromise between the two (a month or two before the Wii Controller popped up, and made it all look passe).
    • yea, while i appreciate ingenuity (on Sony's part, not MS doing this years later), the Wii-mote interface seems a much better solution--at least until MS can reduce the processor overhead to a reasonable level. for now though, using a hand-held device to track physical gestures seems like the most viable option.

      it's not necessarily a problem with using optical sensors (the Wii uses IR to track user gestures also), but the web-cam approach is too encumbered at the moment by the need for more advanced machine

  • I have an EEE PC (Score:4, Insightful)

    by gillbates (106458) on Saturday October 11 2008, @11:35AM (#25340003) Homepage Journal

    Running Linux. And the voice commands actually work!

    I'm not sure why I'd bother to chew up my battery with the webcam when I can just talk to the thing. If anything, it seems to me like the voice recognition would be far more promising than using the webcam.

    Okay, I know how this is going to sound, and I'm really not trying to troll, so please bear with me. I suppose there's a contingent of people who like the thought of waving their hands in the air to control their computer (Wii users?!), but I just don't see this going anywhere, especially because Microsoft is involved. If you look at their history, they typically get things wrong the first few times. Whatever promise this technology holds, I expect that:

    1. Any really cool technique will be patented by Microsoft and doomed to obscurity by their poor implementation of same; and
    2. It really is easier for most people to talk to their computer, or use the mouse/keyboard to control their computer, than it is to wave.
    • Re: (Score:2, Informative)

      I tried it out, and the drawing demo seems to be the most promising application. In the absence of a touch-screen monitor, this could be a lot better than an external touchpad. And there's definitely something neat about using a tomato to play snake. Still a long way to go, though...
    • Re: (Score:2, Informative)

      1. Any really cool technique will be patented by Microsoft and doomed to obscurity by their poor implementation of same; and

      From the license:
      "(B) Patent Grant- Subject to the terms of this license, including the license conditions and limitations in section 3, each contributor grants you a non-exclusive, worldwide, royalty-free license under its licensed patents to make, have made, use, sell, offer for sale, import, and/or otherwise dispose of its contribution in the software or derivative works of the contribution in the software."

        • That's an extraordinarily easy condition to meet given the copyleft nature of the license. It's in fact a much more succinct version of the same concept called out in section 11 of the GPLv3. Quoted in part from http://www.gnu.org/licenses/gpl-3.0.html [gnu.org]:

          "Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version."

          • Where "essential patent claims" and "contributor version" are defined at excruciating length over the course of the rest of the section.

            But does "contribution in the software" in the Microsoft license mean the same? What happens with code under both licenses if none of the original code is recognizable yet the patented functionality is preserved? As I understand it, the GPL has an all-encompassing "can't re-license under any other terms but this license" mechanism, and the license is tied to the body of licensed code, not its individual parts, patches or projects, therefore everything can be modified (up to being replaced) or re-used in other projects unless

  • I remember reading a year ago that some Toshiba Qosmios could recognize gestures. This [pocket-lint.co.uk] is not the article I read, but the first I found.

    Also, I don't think it's fair to kick Microsoft over this. It seems to be a bit of an experiment. I'd love to see this on Linux though, another step closer to the Minority Report world.

  • OpenCV (Score:1, Informative)

    by Anonymous Coward

    Intel made a very nice open source library for computer vision. It's called OpenCV [http://sourceforge.net/projects/opencvlibrary/] and can be used to track pixels (or hands, or heads...).

    I first saw it at PyCon Brasil 2008, with EHCI python bindings [http://www.slideshare.net/dannyxyz22/ehci-interao-com-computador-atravs-de-webcam-presentation].

    Microsoft's library is not a big deal... I made a script to switch KDE desktops using face movements with 45 lines of python script + ehci, including a lot of useless

  • by nan0 (620897) on Saturday October 11 2008, @12:05PM (#25340249)
    opencv [sourceforge.net] has nice python bindings, runs on mac, win & nix.
    openframeworks [openframeworks.cc] wraps c++ like processing [processing.org] wraps java, also has opencv bindings.

    MS appears to basically be doing optical flow & color tracking. the above libs can do those, and more, and are great for programmers and nonprogrammers alike. tho if you really hate code, you may rather use max/msp/jitter or gem/pd [puredata.info].
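As an illustration of the color tracking mentioned above, the sketch below matches pixels on hue instead of raw RGB, which is a common way to make a color key less sensitive to brightness. It uses only Python's standard `colorsys` module rather than OpenCV or any of the libraries named in the comment, and it is not a description of what the Touchless SDK does. The function names and thresholds are assumptions chosen for the example.

```python
import colorsys


def hue_matches(pixel, target_hue, hue_tol=0.05, min_sat=0.4, min_val=0.15):
    """True if an (r, g, b) pixel is saturated enough and near target_hue.

    Hue is in [0, 1) as returned by colorsys; thresholds are illustrative.
    """
    r, g, b = (channel / 255.0 for channel in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if s < min_sat or v < min_val:
        return False  # greys and near-black pixels carry no reliable hue
    diff = abs(h - target_hue)
    return min(diff, 1.0 - diff) <= hue_tol  # hue wraps around at 1.0


def hue_centroid(frame, target_hue):
    """Centroid (x, y) of all matching pixels, or None if nothing matches."""
    hits = [(x, y)
            for y, row in enumerate(frame)
            for x, pixel in enumerate(row)
            if hue_matches(pixel, target_hue)]
    if not hits:
        return None
    return (sum(x for x, _ in hits) / len(hits),
            sum(y for _, y in hits) / len(hits))


# Grey 6x4 frame with a 2x2 red patch, then the same frame at a third
# of the brightness.
GREY, RED = (128, 128, 128), (200, 30, 30)
bright = [[GREY] * 6 for _ in range(4)]
for y in (1, 2):
    for x in (3, 4):
        bright[y][x] = RED
dim = [[(r // 3, g // 3, b // 3) for (r, g, b) in row] for row in bright]

print(hue_centroid(bright, target_hue=0.0), hue_centroid(dim, target_hue=0.0))
```

In this toy case the red patch is found at the same position in both frames. Real webcam footage adds noise, white-balance shifts, and colored light, so hue matching reduces the lighting problem rather than removing it.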

  • This could be used to great effect with people that have handicaps that prevent the use of standard interfaces. Gestures that they CAN perform can be programmed to take the place of gestures they cannot, ones that we all take for granted.

  • Microsoft isn't alone in this, but I do get the impression that they have a few research units that they fund as window dressing, that are constantly presenting exciting demos of pretty cool stuff that never make it into actual products and never will.

    Like Detroit's "concept cars."

    Or Xerox PARC's Alto.

    Or a Fortune 500 company I worked at that collapsed with astonishing speed. Little groups were always coming up with amazing things, and higher-ups were always clucking admiringly over them, but the little gro

  • ...released Touchless SDK for a test spin, controlling his Asus Eee PC 901

    Although McAllister was able to draw, scroll, and play a rudimentary game with his tomato, the SDK still has some kinks to work out. 'For starters, its marker-location algorithm is very much keyed to color,' he writes. 'That's probably an efficient way to identify contrasting shapes, but color response varies by camera and is heavily influenced by ambient light conditions.' Moreover, the detection routine soaked up 64 percent of McAllister's 1.6GHz Atom CPU, with the video from the Webcam soon developing a few seconds' lag that made controlling onscreen cursors challenging.

    Perhaps a machine would be in order that didn't go to the extreme of energy-saving and low-quality manufacturing. Start from the top side and then work down.

    • Re: (Score:2, Insightful)

      Maybe he should try testing it on a real computer next time.... 64% of an underpowered device is not much to complain about.

        See my sig, I'm no MS apologist

      • I would say that the tested device represents the computing ability of the target market for the software.
      • LPF? (Score:5, Insightful)

        by gillbates (106458) on Saturday October 11 2008, @11:32AM (#25339993) Homepage Journal

        You know, someone should have really told these guys about this thing called a low-pass filter. It's very easily implemented in hardware (heck, most DSPs can do it rather handily), and uses very little power. A TI DSP would have no problem handling this kind of load.

        As for mediocre hardware, yes, the EEE is a little underpowered compared to a desktop. But, when you consider the fact that a 200 MHz DSP can encode NTSC video in realtime, chewing up 60% of the CPU is just poor implementation. That's ~1 GHz on a fully pipelined, superscalar processor, with a heatsink, to do what an embedded DSP can do with oh, say about 50-100 MHz of processing power, without a heatsink, using a RISC processor, running on AA batteries.

        And this is yet another reason I believe programmers should have to learn hardware. They wouldn't write code so inefficiently if they only understood the typical hardware engineer's approach to these problems.
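For readers unfamiliar with the term, a low-pass filter can be as small as the single-pole recurrence y[n] = y[n-1] + alpha * (x[n] - y[n-1]). The comment does not say where in the pipeline the filter would go, so the sketch below is only one reading: smoothing a jittery stream of marker x-coordinates. The function name `low_pass`, the `alpha` value, and the sample data are assumptions made for illustration.

```python
def low_pass(samples, alpha=0.25):
    """Single-pole IIR low-pass filter (exponential smoothing).

    alpha close to 0 smooths heavily (more lag); alpha = 1 passes the
    input through unchanged.
    """
    smoothed = []
    state = None
    for x in samples:
        # Seed the filter with the first sample, then move a fraction
        # `alpha` of the way toward each new sample.
        state = x if state is None else state + alpha * (x - state)
        smoothed.append(state)
    return smoothed


# Hypothetical marker x-coordinates jittering around 100 pixels.
jittery_x = [100, 104, 96, 103, 97, 101, 99]
smooth_x = low_pass(jittery_x)

print(max(jittery_x) - min(jittery_x))  # spread of the raw input
print(max(smooth_x) - min(smooth_x))    # much smaller spread after filtering
```

Each sample costs one subtraction, one multiplication, and one addition, which is the sense in which such a filter is cheap in hardware or on a DSP.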

        • It's not necessarily poor implementation as much as it is trying to use a hammer to turn a screw. Like you said, DSP's are made for the job. It wouldn't cost all that much to put a little DSP lovin' into a subnotebook in the form of a co-processor.

        • Re: (Score:1, Redundant)

          From the summary

          Moreover, the detection routine soaked up 64 percent of McAllister's 1.6GHz Atom CPU, with the video from the Webcam soon developing a few seconds' lag that made controlling onscreen cursors challenging.

          And your text

          A TI dsp would have no problem handling this kind of load.

          My only reaction was "oh shit, Microsoft software using a high amount of CPU for a given task?! SAY IT ISN'T SO!!!" Sorry ...

        • Uh... DSPs are specialized for that kind of stuff, whereas the Atom is just a cut down x86 processor, and a pretty crappy one at that. But I do agree with the rest of your post.
          • Okay, I know it's a little late to post this, but these are the numbers I'm getting from my EEE 900. I'm running a 3-tap FIR filter to average all the pixels in a dummy frame. This doesn't include the time it would take to pull the frame from the CMOS/CCD sensor.

            On battery alone:

            Resolution: 160 x 120 : 4223 frames, (422.300000 per second)
            Resolution: 320 x 240 : 849 frames, (84.900000 per second)
            Resolution: 640 x 480 : 303 frames, (30.300000 per second)
            Resolution: 720 x 480 : 269 frames, (26.900000 per
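The benchmark program behind those numbers is not shown in the thread. The following is a rough Python sketch of the same kind of measurement: a 3-tap FIR filter run over a dummy frame, timed to give frames per second. The names `fir3` and `frames_per_second`, the equal 1/3 taps, and the edge handling are assumptions. Interpreted Python will be far slower than compiled code, so its output should not be compared with the figures quoted above.

```python
import time


def fir3(pixels, taps=(1 / 3, 1 / 3, 1 / 3)):
    """Apply a 3-tap FIR filter along a flat sequence of pixel values.

    Edges are handled by repeating the first and last pixel.
    """
    n = len(pixels)
    if n == 0:
        return []
    t0, t1, t2 = taps
    out = [0.0] * n
    for i in range(n):
        left = pixels[i - 1] if i > 0 else pixels[0]
        right = pixels[i + 1] if i < n - 1 else pixels[-1]
        out[i] = t0 * left + t1 * pixels[i] + t2 * right
    return out


def frames_per_second(width, height, frames=3):
    """Time `frames` filter passes over a synthetic width x height frame."""
    frame = [(i * 7) % 256 for i in range(width * height)]  # dummy data
    start = time.perf_counter()
    for _ in range(frames):
        fir3(frame)
    elapsed = time.perf_counter() - start
    return frames / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    for w, h in [(160, 120), (320, 240)]:
        print(f"Resolution: {w} x {h} : {frames_per_second(w, h):.1f} per second")
```

As in the original post, this times only the filtering step and leaves out the cost of pulling frames from the camera.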

    • Troll???? I guess the truth is bothersome to some.
      • The problem with his statement is that you can go look at the code and see if it is/isn't inefficient for what it does. He was trolling with an inflammatory one liner designed to get people arguing over whether it's possible for us MS geeks to actually write decent software.

        Hint: the answer is 'yes'.

        • "He was trolling with an inflammatory one liner designed to get people arguing over whether it's possible for us MS geeks to actually write decent software.

          Hint: the answer is 'yes'."

          Then put your money where your mouth is...show us.

          Maybe your definition of 'decent' is different...

          • http://www.codeplex.com/touchless/SourceControl/DirectoryView.aspx?SourcePath=&changeSetId=25142 [codeplex.com]

            That was very hard. I had to spend a whole 10 seconds searching the internet.

            Now, burden of proof is on you: what is wrong with the source code, available there? Isn't this the basis of the OSS "many eyes" theory?

            • I humbly stand corrected.

              I noticed after I posted that this was being released as OSS, and cringed...alas, my fanboy-ism has led me astray again!

              I will say that this is a good thing, instead of the FUD I flung out earlier.

              Thank you for calling me on this.

              I would rather be ridiculed than just 'plain stupid', and much prefer to be corrected than dismissed.

              BTW, thanks for the link...I was knee-jerk wrong, all the way around.

              *hangs head, sheepishly*

        • He was trolling with an inflammatory one liner designed to ...

          Unless you can read minds, how do you know what the comment was designed to do? It appears that you are the one being the provocateur here.

    • Microsoft has a bunch of labs/departments dedicated to just making stuff until one catches on... a bit like Google, but instead of being "everyone", it's some departments. So they'll make stuff like this or DeepZoom, which may or may not catch on... others like Spec# may have more potential.

    • Sort of. Those cameras control the part of the interface that bills you extra if more than one person is in the room when "premium content" is being displayed and reports on you if you leave during commercials.

      I'm joking. For now.
    • My problem was the opposite, I had a really crappy narrative for my 'screenshot'...no video.

      Meh, no telling what these clowns (MS) are up to again, but I can take comfort in their consistency:
      1. It will be buggy
      2. It will have security issues
      3. It will cost mucho $$$$ to maintain licenses
      4. Steve Ballmer will do another 'Developers! Developers! Developers!' speech about it
      5. It WON'T run on Linux, or the MS codemonkeys aren't done yet
      6. It will cost mucho $$$$ to maintain licenses (did I say that already

    • Yes, but expect a 'vacation' at gitmo for trying this.

      Oh, and be VERY careful with the Roomba [irobot.com]!

      "Save Big with Robot Value Packs
      The smarter way to get it done"

      *note to self* Never (again) post [hic!] drunk!

    • Showing it the middle finger does the same thing.

      • You need to explore 'Help & Preferences' to avoid this.

        This was pounded into me when I read your post, and so help me, clicked on the 'parent' post you replied to.
        I know better, but did so anyway.
        I'm going to scrub my brain out with bleach now.

        • You need to explore 'Help & Preferences' to avoid this.

          with this..

          Moreover, the detection routine soaked up 64 percent of McAllister's 1.6GHz Atom CPU

          I can't stop laughing, I mean it took years for me to finally dump my beloved P3 1GHz, and I feel the pain of seeing your lightweight processor struggle under the load... Of what, a possible future UI element? Either way, to include that snippet in the summary was beyond hilarious. Methinks he should just go spend 50 bucks at Newegg and get a dual core AMD processor if processor usage in this case really does warrant such attention.