Encryption Security Programming United States IT Technology

USAF Wants To Find Steganographic Content 267

Bud Higgins writes "The U.S. Air Force has posted a Small Business Technology Transfer Program (STTR) solicitation in which they seek proposals for the automated detection of steganographic content. They seek an application that should run both unobtrusively in the background and in a manual mode, and provide the user the capability to scan all email attachments, downloaded materials and accessed files with an appropriate steganalysis algorithm, reporting any abnormal results (i.e. the presence of steganography). I personally don't think that is feasible, but maybe a good programmer can prove me wrong. A link to the solicitation AF04-T008 can be found here. For those who are not familiar with the SBIR/STTR program, it provides up to $850k for 3 years of research." This sounds very similar to what Niels Provos did over a several-year period at University of Michigan's CITI and released under a free license. I hope the USAF doesn't spend too much of my money without considering extending that research.
This discussion has been archived. No new comments can be posted.

  • Feasible? (Score:5, Informative)

    by jmv ( 93421 ) on Sunday January 11, 2004 @04:37AM (#7943585) Homepage
    ...reporting any abnormal results (i.e. the presence of steganography). I personally don't think that is feasible...

    I think it probably depends on where you hide the data. For instance, it's probably harder to hide data in the LSBs of an image than, e.g. a file that's supposed to be white noise ("Hey, my mic doesn't work, it only records noise. See for yourself"). Of course, the less data you encode, the harder it is to detect it.
    • Re:Feasible? (Score:5, Insightful)

      by RomulusNR ( 29439 ) on Sunday January 11, 2004 @04:51AM (#7943624) Homepage
      Uh, sure, the "this is supposed to be random noise" trick will work about as long as the average spam-filter-avoidance trick lasts.

      "The enemy is sending out an abnormally large amount of random noise data. Must just be having microphone trouble. Nothing to see here."

      Roger that.

      No +1, cause I've been drinking...
      • Re:Feasible? (Score:3, Informative)

        by interiot ( 50685 )
        You don't really need to send random noise though... small amounts of randomness (but large enough to hide data in) exist in bits of files that people send around... most notably sound, image, and movie files, which, lucky for us, are just the sort of files that strangers tend to pass around in abundance.
        • by eguaj ( 612494 ) on Sunday January 11, 2004 @08:22AM (#7944089)

          Why bother with cryptography/steganography/etc. when you can use slashdotography ?

          You simply post your message in clear form in the comments of a "highly trollistic" story, and your message will automatically become hidden and undetectable with all the noise surrounding it.

      • by Sycraft-fu ( 314770 ) on Sunday January 11, 2004 @06:55AM (#7943880)
        In audio that is. Say you decide to start hiding stuff in live performance music, as in fan recorded data. Much of that is distributed in 24-bit format since we are talking about hardcore people here. Well, this is good already, seeing as you aren't going to find 24-bit converters that really get 24 bits of SNR. So you have plenty of inherent noise to begin with. Add to that the noise of a concert and you've plenty to mask the signal with.
        • by Anonymous Coward
          You'd have to go around obtaining lots of original recordings. As with a one-time pad, with stego you can't use the same source twice, nor can you use a source that's already available. You need to be the sole source. Otherwise the enemy can do a binary comparison and see that there's something different, possibly hidden data.
        • Not quite that easy (Score:4, Interesting)

          by wirelessbuzzers ( 552513 ) on Sunday January 11, 2004 @12:38PM (#7945304)
          The problem with the LSBs of an image is that they aren't quite random. Unless the image is raytraced or otherwise artificially produced, there's a fair amount of order there. Even a raytraced image might not be quite random.

          The same holds with audio. For instance, encrypted data is white noise, but concert noise is "pink noise" which has a characteristic spectrum. The noise produced by converters is closer to white, but it isn't quite either. People like Niels Provos have been studying this for a while, trying to find out which bits they can change without altering the statistics of the image or audio, but with limited success. As of last year (don't know how it is this year), all published steganography schemes at least a few months old had been broken.
          • Ahh, but the noise of converters is white noise. So all you need are some cheap 24-bit converters, and there's no shortage of those, and you are good to go. You get some cheap portable that has an SNR of something like 102-105dB. Ok, well, that needs a maximum of 18 bits to actually encode that resolution. Now since there can be some signal below the noise floor, and since you want to be careful, take two more bits on that. That still leaves you 4 bits per sample to use that is going to essentially be pure whit
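The bit budget in the comment above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the common rule of thumb of about 6.02 dB of dynamic range per bit, and the names `bits_for_snr` and `spare_bits` are made up for illustration.

```python
import math

def bits_for_snr(snr_db: float) -> int:
    """Whole bits needed to cover a given SNR, assuming ~6.02 dB per bit."""
    db_per_bit = 20 * math.log10(2)  # about 6.02 dB
    return math.ceil(snr_db / db_per_bit)

def spare_bits(sample_bits: int, snr_db: float, guard_bits: int = 2) -> int:
    """Bits per sample left over after the real signal plus a safety margin."""
    return max(0, sample_bits - bits_for_snr(snr_db) - guard_bits)

for snr in (102, 105):
    print(snr, "dB ->", bits_for_snr(snr), "signal bits,",
          spare_bits(24, snr), "spare bits in a 24-bit sample")
```

Under that assumption a 105 dB converter needs 18 bits, so a 24-bit sample with two guard bits leaves 4 bits, which matches the figure quoted in the comment. Whether those bits are statistically white is a separate question that the arithmetic does not answer.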
      • Of course, i was just talking about the possibility to detect the steg itself, not the human factor... Even sending 24-bit music and all might sound suspect.

        The other important issue is whether the "enemy" knows what kind of steg you might use. That helps detection a lot.
  • Hrm (Score:5, Insightful)

    by Cave Dweller ( 470644 ) on Sunday January 11, 2004 @04:38AM (#7943592)
    Those of you paranoid enough will probably chime in with something along the lines of "Yeah, but Echelon probably has something like this built-in already!". Anyway, isn't the point of steganography to hide information in such a way that you *cannot reliably* tell whether the information was there in the first place?

    I'm not sure what they're looking for here; perhaps a better steganography algorithm?
    • Yeah, this is a stupid idea.

      It's in the league of the millions of requests PGP gets to decrypt user data because they forgot the password.

      Just asking the question implies a kind of ignorance that frankly I find worrisome given the responsibilities these guys have.
    • Re:Hrm (Score:5, Insightful)

      by johannesg ( 664142 ) on Sunday January 11, 2004 @06:26AM (#7943820)
      They might be looking for an algorithm that establishes just how random the "random bits" of a file are. For example, you would expect the least significant bits in a jpeg to be more or less random - any degree of organisation there could be a hidden text or something else.

      I would expect such an algorithm to have specific knowledge of various file formats, since randomness in a jpeg is not quite the same as randomness in for example a .EXE file.

      I would further expect that my approach would be soundly defeated by first encrypting the information to be hidden, since encrypted data looks a lot more random than normal data anyway.

      Personally I doubt it can be done. You might be able to defeat specific steganographic algorithms, but the general case cannot be solved. It would be a bit like having a universal decryption algorithm...

      • Re:Hrm (Score:4, Insightful)

        by Ugmo ( 36922 ) on Sunday January 11, 2004 @09:56AM (#7944355)
        I would further expect that my approach would be soundly defeated by first encrypting the information to be hidden, since encrypted data looks a lot more random than normal data anyway.

        It would still be somewhat valuable to know that encrypted messages were being sent even if you do not know what the content is. If you know bad guy #1 is posting some steg encoded pictures on his porn site and bad guy #2 visits it on a regular basis (along with 1000's of other non-bad guys) you could at least get a clue that something is up if bad guy#1 changes the frequency or number of his updates. In short, traffic analysis.

        If you cannot detect any kind of steg whatsoever, you can't even get this info.
      • Re:Hrm (Score:3, Interesting)

        by starm_ ( 573321 )
        Actually this is not a good method. The least significant bit of text is not less random than images. It is often even more random.

        I have read a paper on this and they used the opposite method to the one you propose. They assumed images have sections which are not very random. (Most images contain some areas with uniform color.) If the least significant byte of an image is very random compared to the other bytes it can indicate steganography.

        Of course you have to adjust the thresholds to account for the diff
      • Re:Hrm (Score:2, Funny)

        by gumpish ( 682245 )
        It would be a bit like having a universal decryption algorithm...

        No sweat. Didn't you see Sneakers?
      • Re:Hrm (Score:3, Interesting)

        For example, you would expect the least significant bits in a jpeg to be more or less random - any degree of organisation there could be a hidden text or something else.

        Actually, I would expect relatively little randomness in a compressed image, because removal of randomness (along with redundancy) is what compression is all about. And since well-encrypted data should appear random, you'd get further by testing for bits that are too random, rather than for hidden structure.

        • Re:Hrm (Score:3, Informative)

          by tftp ( 111690 )
          because removal of randomness (along with redundancy) is what compression is all about.

          I am afraid you have it backwards. Compression is removal of repetitive, guessable parts. The better you compress, the more random the output becomes. Perfectly compressed data consists of bits where each bit has no relation whatsoever to any other bit in this data.

          So it is perfectly possible to hide information in large data files. The original request is impossible, because you not just need to reliably extract the
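The claim that better compression yields more random-looking output can be illustrated by comparing byte entropy before and after compression. This is a rough sketch, not a rigorous randomness test: the word list and the helper `byte_entropy` are invented for the example, and `zlib` stands in for "a compressor".

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (maximum 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

# Hypothetical sample text: random words drawn from a small vocabulary.
rng = random.Random(0)
vocabulary = [b"air", b"force", b"image", b"noise", b"secret", b"signal",
              b"random", b"data", b"hidden", b"message", b"cover", b"file",
              b"detect", b"scan", b"pixel", b"audio"]
text = b" ".join(rng.choice(vocabulary) for _ in range(20000))
packed = zlib.compress(text, 9)

print(f"plain : {len(text)} bytes, {byte_entropy(text):.2f} bits/byte")
print(f"packed: {len(packed)} bytes, {byte_entropy(packed):.2f} bits/byte")
```

The compressed stream is shorter and its byte histogram is much flatter than that of the text, which is the sense in which compression removes guessable structure rather than randomness.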

      • They might be looking for an algorithm that establishes just how random the "random bits" of a file are. For example, you would expect the least significant bits in a jpeg to be more or less random - any degree of organisation there could be a hidden text or something else.

        Yes, but you'd be putting encrypted data into the LSBs. And encrypted data looks like random noise. So how could such an algorithm detect that? Maybe the answer is to use psychics [amazon.com].
  • Oh yeah? (Score:2, Interesting)

    by Mynkami ( 740099 )
    "They seek an application that should run both unobtrusively in the background and in a manual mode, and provide the user the capability to scan all email attachments, downloaded materials and accessed files with an appropriate steganalysis algorithm, reporting any abnormal results (i.e. the presence of steganography)."

    Suuuuure, Carnivore anyone?

    • Re:Oh yeah? (Score:5, Insightful)

      by Soko ( 17987 ) on Sunday January 11, 2004 @04:58AM (#7943634) Homepage
      Take off the tinfoil hat, dude. Checking all pics on the net for steganographic info is virtually impossible - just too much info to sort through in a reasonable time frame.

      They likely want this to scan documents leaving their internal network in an attempt to catch people who are sending out sensitive or secret info. To me this looks like the USAF is plugging a leak, not going on the hunt.

      Soko
      • Comment removed (Score:4, Interesting)

        by account_deleted ( 4530225 ) on Sunday January 11, 2004 @05:55AM (#7943767)
        Comment removed based on user account deletion
      • Re:Oh yeah? (Score:5, Insightful)

        by SlashdotLemming ( 640272 ) on Sunday January 11, 2004 @08:25AM (#7944094)
        They likely want this to scan documents leaving their internal network in an attempt to catch people who are sending out sensitive or secret info. To me this looks like the USAF is plugging a leak, not going on the hunt.

        That's exactly one of the reasons for the technology. The DoD has an obligation to protect sensitive information. There are a crazy number of hoops that need to be gone through to get unclassified info off of a classified system. They can't have people encoding stuff in pictures of Barney then walking away with it.

        I know the usual paranoids are up in arms about the AF doing this, but the same people would flood "The DoD is so stupid" if it were found out that people were abusing the technology to transport classified info.
        • Re:Oh yeah? (Score:3, Insightful)

          by dvdeug ( 5033 )
          There are a crazy number of hoops that need to be gone through to get unclassified info off of a classified system. They can't have people encoding stuff in pictures of Barney then walking away with it.

          Step number one is, even if it looks innocuous, don't let it through. Nobody is going to let you email or floppy a picture of Barney out of a classified system, because there's no reason to, and it might contain classified information. It doesn't matter what the steganography filter says, it won't go.
          • Step number one is, even if it looks innocuous, don't let it through. Nobody is going to let you email or floppy a picture of Barney out of a classified system, because there's no reason to, and it might contain classified information. It doesn't matter what the steganography filter says, it won't go.

            You can send email from classified systems. It's only to other classified systems though, because it's a closed network. Hack the Pentagon website all you want. You'll never get the meat, because it's not on the
  • SBIR/STTR program (Score:5, Informative)

    by Wavicle ( 181176 ) on Sunday January 11, 2004 @04:43AM (#7943605)
    I work for a company that is funded through a SBIR grant, so on behalf of the company I work for and to all tax paying Americans let me just say: Thank You!

    It really is an interesting government program. All the IP we generate with the money stays with us. However in the interest of equitable return to the taxpayer, we have decided to release all of our core software components GPL. (Okay, okay this also helps when it comes time for our semi-annual review, to show that we aren't just soaking the taxpayers.) We hope to turn a profit partially by our user interface components (non-core code that we are not releasing) and also through support.

    Trying to get one of these grants is highly competitive, but if you have a really good idea and don't want the vulture capitalists to "fund" you, this is a great program.
  • stego wrapped pgp (Score:3, Insightful)

    by Macgyver7017 ( 629825 ) on Sunday January 11, 2004 @04:49AM (#7943619)
    Maybe statistical analysis can determine if a given image or other medium is possibly hiding information. But if that information is encrypted, doesn't it look like random data without the key? Without knowing the key or even the cipher used to encrypt it... how can it be shown to actually be information? "That's just random noise/corruption in my images your honor... I don't know what you're talking about"
    • by Nonesuch ( 90847 ) on Sunday January 11, 2004 @05:17AM (#7943678) Homepage Journal
      Maybe statistical analysis can determine if a given image or other medium is possibly hiding information. But if that information is encrypted, doesn't it look like random data without the key?
      Yes. One quick-and-dirty test of the strength of a cryptographic algorithm or hash function is that the output appears random, and a small change in the input results in a large change in the output.

      If the steg'd data has obvious headers and block formatting, a weak algorithm could leave enough of a pattern in the output file to be detectable. And of course some applications of stego are used to embed cleartext data...

      Without knowing the key or even the cipher used to encrypt it... how can it be shown to actually be information? "That's just random noise/corruption in my images your honor... I don't know what you're talking about"
      Proponents of stego sometimes suggest its use in environments where even the suspicion of crypto is enough to risk persecution and/or prosecution.

      The other "trick" to detecting stego is that "normal" JPG/BMP/WAV/MP3/AVI/MPEG files tend to not actually show a high degree of random noise -- the seemingly random data in the LSB tends to have a pattern imposed by the encoder used and the input device.

      I'd guess that this problem is more of an issue on highly-processed information from clean sources. You wouldn't expect random noise on an MP3 file ripped off the latest pop album release, but it wouldn't be out of place on a .SHN "bootleg" recording of a TMBG live concert from a handheld DAT recorder...
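The "small change in the input results in a large change in the output" property mentioned above is easy to see with a standard hash. A minimal sketch using SHA-256 from Python's `hashlib`; the two messages and the helper `bit_difference` are arbitrary choices for the example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

h1 = hashlib.sha256(b"attack at dawn").digest()
h2 = hashlib.sha256(b"attack at dusk").digest()
changed = bit_difference(h1, h2)

print(f"{changed} of {len(h1) * 8} output bits differ")
```

For a good hash, roughly half of the 256 output bits should flip. As the comment says, this is only a quick-and-dirty check and does not prove that an algorithm is strong.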

    • Re:stego wrapped pgp (Score:5, Interesting)

      by Ronald Dumsfeld ( 723277 ) on Sunday January 11, 2004 @05:33AM (#7943713)
      Maybe statistical analysis can determine if a given image or other medium is possibly hiding information. But if that information is encrypted, doesn't it look like random data without the key? Without knowing the key or even the cipher used to encrypt it... how can it be shown to actually be information? "That's just random noise/corruption in my images your honor... I don't know what you're talking about"

      Statistical analysis can indeed detect where hidden information is placed into an image, usually by noticing that the balance of the image is off. In fact, using encrypted data is more likely to stand out because images are not usually populated with statistically random data.

      Here's a piece on scanning Usenet [xtdnet.nl] for hidden images. As a broadcast medium you'd expect it to be most frequently used as you can anonymously post material and it is well-nigh impossible to locate the intended recipient.
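One published form of the statistical analysis described above is the "pairs of values" chi-square test: overwriting LSBs with random bits tends to equalize the counts of each value pair (2k, 2k+1). The sketch below is illustrative only. The cover is synthetic and deliberately skewed, the function names are made up, and it is not claimed to be what stegdetect or any other real tool does.

```python
import random
from collections import Counter

def pov_chi_square(samples):
    """Chi-square statistic over value pairs (2k, 2k+1); low means equalized pairs."""
    counts = Counter(samples)
    stat = 0.0
    for even in range(0, 256, 2):
        a, b = counts[even], counts[even + 1]
        expected = (a + b) / 2
        if expected > 0:
            stat += (a - expected) ** 2 / expected
    return stat

def embed_lsb(samples, bits):
    """Overwrite the least significant bit of each sample with a payload bit."""
    return [(s & ~1) | b for s, b in zip(samples, bits)]

rng = random.Random(1)
# Synthetic cover (an assumption): LSB is 1 only about 20% of the time.
cover = [2 * rng.randrange(20) + (1 if rng.random() < 0.2 else 0)
         for _ in range(5000)]
payload_bits = [rng.getrandbits(1) for _ in cover]  # stands in for encrypted data
stego = embed_lsb(cover, payload_bits)

print("cover statistic:", round(pov_chi_square(cover), 1))
print("stego statistic:", round(pov_chi_square(stego), 1))
```

The statistic drops sharply after embedding because the random payload flattens each pair, which is why encrypted data can stand out instead of blending in. Real covers are less obligingly skewed than this one, so a practical detector needs format-specific models and thresholds.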
  • 1. Win contract.
    2. Base new software on Mr. Provos' work.
    3. Profit!!

    In an IT world where profit is linked to enterprise software, this will be a very interesting piece of work for somebody. Kudos to the winner. I would bid myself if I was a US citizen!
  • by argan0n ( 684665 ) on Sunday January 11, 2004 @04:51AM (#7943622) Homepage Journal
    As stegdetect [outguess.org] (last time I checked) easily fails on files created with steghide [sourceforge.net]
  • Wonder why Air Force (Score:4, Interesting)

    by Saeed al-Sahaf ( 665390 ) on Sunday January 11, 2004 @04:53AM (#7943628) Homepage
    The Air Force has always been at the forefront of technological thought within the military. I've been Air Force since 1984, and currently work in Information Management, although my first career field was Fire Fighting; I cross trained into IT in 1998. I work with many first class programmers and network guys, most of them classic "hackers". It does not surprise me they are looking at this.

    One thing that does surprise me is that they have allowed the Air Force guys to look at this at all, it seems much more like an Army or NSA thing.

    • One thing that does surprise me is that they have allowed the Air Force guys to look at this at all, it seems much more like an Army or NSA thing.

      The Air Force does quite a bit of intelligence work. They share some resources with the NSA, and give intel to the Army. Lately there's been a big push toward the idea of "information warriors," since we've proven that we can blow stuff up -- now we just need nerds that are bright enough to find the bad guys.

      Yes, this primarily is the domain of the NSA, but

  • pattern deviance (Score:3, Informative)

    by RomulusNR ( 29439 ) on Sunday January 11, 2004 @04:54AM (#7943630) Homepage
    I'd expect that a fair amount of first-order steg would be detectable by a process that examined all patterns in a data stream, and spotted that or those patterns that were UNLIKE the other patterns in the data, based on some heuristic.

    Of course, if you were to steg with an OTP or some such (i.e. your steg is based on deviance from a known data set), you'd more easily escape such detection.
  • by Anonymous Coward on Sunday January 11, 2004 @05:02AM (#7943644)
    In "Unification" (Star Trek episode 108), the cloaked Klingon ship that delivers Picard and Spock into Romulan territory sends a coded message to Enterprise that is piggybacked on surrounding Romulan transmissions. If the Romulans were not able to discover this in their time, what makes the USAF think they'll be able to do it now?
  • Interesting (Score:5, Insightful)

    by arvindn ( 542080 ) on Sunday January 11, 2004 @05:03AM (#7943647) Homepage Journal
    Looks like detection of steganographic content might be a significantly easier problem than decoding it. The reason is that normal compressed images don't have redundancy -- i.e, the image file size is no larger than it needs to be for the quality (information content) that it has. But embedding a message introduces redundancy, by an amount proportional to the capacity of the stego system. This can be detected, the programmer only needs to have a good grasp of the image format, domain transformation techniques etc.

    But I had this little idea. Suppose we "pollute" normal images with random data with say 1% redundancy. What I mean is, whenever you create an image you take some random data and steganographically embed it in the image. Write a gimp plugin or something so that the process is transparent and automatic. Your file only becomes 1% bigger, so it's no big deal. Not everyone needs to do this, just sufficiently many people so that the vast majority of the positives of stego detection systems are going to be false positives. As long as the message is encrypted before embedding, it is provably impossible to tell a genuine stego image from a false positive, assuming that the underlying encryption isn't broken. So you get a secure stegosystem with 1% efficiency "for free".

    [dons tinfoil hat]

    We'd all better soon start doing something like this, given where governments are going.

    /me runs off to patent office

    • "Not everyone needs to do this, just sufficiently many people so that the vast majority of the positives of stego detection systems are going to be false positives. "

      These aren't false positives; they show which people have used the defective encoder. You then take this portion of the images and look for deviation from the normal actions of this plugin. The remainder have a good chance of containing steganographic content.

      You haven't made it harder to find stegged images; you have cut down on the work neede
      • Re:Interesting (Score:3, Insightful)

        by Anonymous Coward
        Actually, if the plugin uses a good enough random source then it's not possible to distinguish the results from good steganography. That's kind of the point. The problem that the original poster is trying to solve is that good steganography is too good at looking like completely random data, and there's not that much completely random data when real-world codecs and image formats are involved...
    • Re:Interesting (Score:2, Interesting)

      by Anonymous Coward
      My guess is that they aren't so interested in decoding it. Well, they would like to be able to do that, but their main intent is probably to know when someone is sending an encoded image out of their network. That person would then get investigated for possible espionage. In fact, in a case like that, decoding it would be a hindrance to the Air Force. Here's an example:

      Suppose you work inside the Air Force and want to blow the whistle on them for some illegal acts. So you gather the incriminating docu
    • Actually, reliability varies inside normal detection needs. And the reason for this is demonstrated by the first sentence. The first letters of each word spell out your handle. What kind of program could automatically detect such things given the infinite varieties?

      You can at best detect a tiny subset of steganography algorithms, then along will come a smarter fish. Can you find the second hidden message in this post?
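Once the trick is known, extracting the acrostic from the first sentence of the comment above is trivial. A small sketch (the function name `acrostic` is made up); it does nothing to answer the harder question of how a scanner would know to look for this scheme among endless others.

```python
import re

def acrostic(sentence: str) -> str:
    """Join the first letter of every word in the sentence, lowercased."""
    words = re.findall(r"[A-Za-z']+", sentence)
    return "".join(word[0].lower() for word in words)

first_sentence = "Actually, reliability varies inside normal detection needs."
print(acrostic(first_sentence))  # arvindn
```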
    • Re:Interesting (Score:4, Interesting)

      by saforrest ( 184929 ) on Sunday January 11, 2004 @07:18AM (#7943910) Journal
      But embedding a message introduces redundancy, by an amount proportional to the capacity of the stego system.

      I don't think you mean 'redundancy' here, since the added data is obviously not redundant. It can't be, since it has to encode the steganographic message.

      I think you mean 'apparent redundancy', i.e. the container file would appear to be redundant to someone who doesn't know there's a secret message since it's larger than it needs to be.

      However, this problem can be avoided if the encoder simply chooses a steganographic method which does not increase container size. As a trivial example of this idea, consider

      this steganographic tool I wrote [forrest.cx] which is based on permuting HTML tag attributes.

      Clearly, tag attributes must have some fixed order when written into a file. My program simply permutes them in a specific way within the file, thus encoding content without increasing container size.

      The general idea is to make use of the existing redundancy of the container to encode data. The one caveat here is that the amount of container redundancy is bounded above by the size of the container, so there is a fixed maximum amount of data that can be encoded.
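The attribute-ordering idea can be sketched with the factorial number system: n attributes have n! possible orderings, so one tag can carry up to log2(n!) bits without growing the file. This is an illustration of the general idea only, not the linked tool's actual code; the function names and the attribute list are hypothetical.

```python
import math

def encode_permutation(items, value):
    """Return `items` in an order that encodes `value` (0 <= value < n!)."""
    pool = sorted(items)
    if not 0 <= value < math.factorial(len(pool)):
        raise ValueError("value does not fit in this many items")
    ordered = []
    for remaining in range(len(pool), 0, -1):
        index, value = divmod(value, math.factorial(remaining - 1))
        ordered.append(pool.pop(index))
    return ordered

def decode_permutation(ordered):
    """Recover the integer encoded by the order of `ordered`."""
    pool = sorted(ordered)
    value = 0
    for item in ordered:
        index = pool.index(item)
        value += index * math.factorial(len(pool) - 1)
        pool.pop(index)
    return value

attrs = ["alt", "height", "src", "width"]   # hypothetical <img> attributes
hidden = 13                                  # 4! = 24 orderings, about 4.58 bits
ordering = encode_permutation(attrs, hidden)
print(ordering, "->", decode_permutation(ordering))
```

This also makes the capacity caveat concrete: a tag with four attributes can never carry more than 24 distinct values, however the payload is prepared.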
    • Why are you interested in creating a smoke screen? I have always been of the opinion that anyone who wants to attempt to crack my encrypted material is welcome to try. If they succeed then it represents a security failure in the chain. Fine. Now we just have to improve the encryption or the means by which we keep our keys secret.

      In short, rather than creating a smokescreen of false positives for their system, why not take it as an incentive to improve steganography?
      • The reason is simple, to make sure that your communication will get through, that it will not be censured.

        Say you're in prison and want to organize your escape with some outsider. You know that ALL of your mail will be read by your guards. So if you encrypt your message and send the ciphertext as is, your guards will just keep it for themselves and never let the mail go to the recipient.

        However, if you hide the message in a letter that looks normal, then you're pretty sure that your mail will not be cens
    • Re:Interesting (Score:3, Informative)

      by Lumpy ( 12016 )
      oh hell it's easier than that.

      I wrote a program back in college that did better than that.

      your "hidden data" must be 1/16th the size of the total image size. I used tga files as they were very common back then.

      I simply encoded my data one bit at a time into the lsb of every other pixel. extremely small changes in the pixel color so it's undetectable by the human eye. and I'd bet that it's undetectable by every detection program out there. I even wrote in a function to specify the number of padding 0's
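The scheme described above (one payload bit in the LSB of every other pixel, so the payload is at most 1/16 of the cover) might look roughly like this. It is a reconstruction under assumptions, not the poster's program: the cover is treated as a flat buffer of 8-bit samples rather than a parsed TGA file, and `embed`, `extract` and `stride` are invented names.

```python
def embed(pixels: bytes, payload: bytes, stride: int = 2) -> bytearray:
    """Write payload bits (MSB first) into the LSB of every `stride`-th sample."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) * stride > len(pixels):
        raise ValueError("payload too large for this cover")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i * stride] = (out[i * stride] & 0xFE) | bit
    return out

def extract(pixels: bytes, n_bytes: int, stride: int = 2) -> bytes:
    """Read `n_bytes` of payload back out of the sample LSBs."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for k in range(8):
            value = (value << 1) | (pixels[(b * 8 + k) * stride] & 1)
        out.append(value)
    return bytes(out)

cover = bytes(range(256))        # stand-in for raw pixel data
stego = embed(cover, b"hi")
print(extract(stego, 2))         # b'hi'
```

Each sample changes by at most 1, which is why the result is invisible to the eye. Whether it is invisible to a statistical detector is exactly what other comments in this thread dispute.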
    • It doesn't matter what we do to make steganography more difficult to detect. The USAF (and the rest of the alphabet agencies) want this to screen their incoming and outgoing traffic. Any positives at all are bad for an internal agent, false or not. You're still under suspicion, right? The problem is that paranoids are always suspect; to a degree it is considered aberrant behavior because the general populace follows the 'innocent people have nothing to hide' philosophy.

      then again most of /. is not the gen

    • ok, my bullshit meter is off the chart here and I'm only on the second +5 comment here.

      Steganography hides *very small* messages in other *much larger* messages. By its definition it's impossible to detect. Here's why.

      First of all, any terrorist worth his 76 virgins first encrypts the message to be sent. Good ciphers produce output that is statistically random so there's a good probability that the new message to be hidden is in fact random.

      Now, you take a huge file, say a wav, or a bmp, and every f

  • I personally don't think that is feasible, but maybe a good programmer can prove me wrong.

    The "solution" can be implemented with the current laws and regulations, and I think the programmer is only a small part to make this system work. A lot of enforcement authorities have to come together and the current evidence suggests that they will come together. Of course, it is a moot point that by the time they figure this out, people would have learned to hide data in other creative ways - the eternal cat-and-rat game ...

    Consider this

    the automated detection of steganographic content.

    If Adobe (and others) could be forced to include in their code methods to detect currencies Slashdot | Photoshop CS Adds Banknote Image Detection, Blocking? [slashdot.org] and not disclose it till they were caught by some vigilant users, what makes us so smug that other major companies with "closed" software are not already in-bed-with-the-feds ? So, it is conceivable that the automatic detection may be going on and we wouldn't be any wiser.

    They seek an application that should run both unobtrusively in the background and in a manual mode,

    See the Adobe example of how such "spyware" can be forced to run "unobtrusively."

    and provide the user the capability to scan all email attachments, downloaded materials and accessed files with an appropriate steganalysis algorithm,

    Major Email providers like Yahoo and Hotmail already provide automatic scanning for virus, AOL is including automatic scanning for spyware, MicroTrend (?) already has Online Virus Scanning of your Hard Drive (!), and so under the threat of the Patriot Act (and its ilk) many of these companies can be forced to scan everything that goes in and out of their systems.

    reporting any abnormal results (i.e. the presence of steganography).

    This is the key. Now the threshold for "abnormal" has been reduced so much (almanac carriers as potential terrorists, CAPPS passenger detection based on names and 15 flights were cancelled last month based on this, anti-war protestors as possible terrorists and hence being tailed by the Feds etc.) that the problem of false alarms no longer dogs the current administration and law enforcement agencies.

    This is the crux. When the error threshold is reduced so much that the high rates of error are no longer problematic, then any solution (whether efficient or not) can be implemented. Who cares whether it works well or not. Till now the false alarms were the things that stopped such 1984-ish scenarios from unfolding. Once you accept high errors, and accept even high collateral damage as the price of doing "business," you can have a solution to almost anything implemented - whether it deserves to be implemented or not is a whole different issue. But who cares? You got nothing to hide - Right?

    • I can imagine terrorists or criminals starting to use open source software in the future because of this. Then some marketing or PR department of some large closed source company or any sworn enemy of open source (i.e. SCO) would start spouting FUD about open source and damage its credibility. Worse, it could push the government to regulate it.
  • Finally... (Score:4, Funny)

    by FooGoo ( 98336 ) on Sunday January 11, 2004 @05:13AM (#7943669)
    A use for the code I wrote to sort porn based on image content. I can see it now. Project JISM: Joint Image Statistical Modeling. And my mom said my chronic masturbation wouldn't get me anywhere.
    • by Tokerat ( 150341 ) on Sunday January 11, 2004 @05:31AM (#7943708) Journal

      by FooGoo (98336) on 05:13 AM EST -- Sunday January 11 2004

      A use for the code I wrote to sort porn based on image content. I can see it now. Project JISM: Joint Image Statistical Modeling. And my mom said my chronic masturbation wouldn't get me anywhere.
      Up all night doing research, I see? ;-)
    • Re:Finally... (Score:2, Interesting)

      I wonder if anyone has done a statistical analysis of spelling errors in emails by American youth. Talk about undetectable ways to hide a message in plain text!

  • by marcello_dl ( 667940 ) on Sunday January 11, 2004 @05:28AM (#7943702) Homepage Journal
    ... from steganographic content. Either he knows it's there (so he won't report it, surely) or he doesn't know (so he does not extract the potentially dangerous content). A scan for steganographic content should be performed by ISPs or by something like Carnivore.

    Anyway, the USAF initiative is more clever than it seems, because vital steganographic content (terrorist plans and so on) must be hidden in "popular" files, to make it hard for the good guys to find out the intended audience of the message. So a user-level scan might be somewhat helpful.

    It will also give a good excuse to people caught surfing for porn ("I am just helping out the USAF, dear!").
  • by graf0z ( 464763 ) on Sunday January 11, 2004 @05:28AM (#7943703)
    The basic problem with steganography is that it hides content in noise but compression reduces noise.

    It is easy to 'steganohide' content in uncompressed noisy files like tiff or wav. But that content gets destroyed by lossy compression, which is what most multimedia formats use (jpeg, mpeg, divx, mp3, ...). If it survives, it's called a watermark, but (un)fortunately nobody has yet found a watermark algorithm that is robust against lossy codecs plus some added noise.

    So you have to steganohide your content after compressing. But compressed files have much less noise, and that noise is not random but has statistical quirks. If you just hide your content as white noise and add it to the file, that's detectable, because it changes the statistical behaviour of the file!

    Instead you have to write a specific steganographic algorithm for each lossy compression format you want to hide content in! It has to respect the 'format noise character'. That's what Niels Provos did for pnm and jpeg with outguess [google.com].

    /graf0z.
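
    A minimal sketch of the first point above, assuming a toy 8-bit cover signal and a crude rounding step standing in for a real lossy codec (it is not JPEG or MP3, and the names embed_lsb, extract_lsb and lossy_quantize are made up for illustration). It only shows that bits hidden in the LSBs do not survive once the low-order detail is thrown away:

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(samples, count):
    """Read back the least significant bit of the first `count` samples."""
    return [s & 1 for s in samples[:count]]


def lossy_quantize(samples, step=8):
    """Toy stand-in for a lossy codec: round each sample to a multiple of step."""
    return [min(248, (s + step // 2) // step * step) for s in samples]


# Hypothetical 8-bit cover data and a 32-bit message.
cover = [(i * 37) % 256 for i in range(64)]
message = [1, 0, 1, 1, 0, 0, 1, 0] * 4

stego = embed_lsb(cover, message)
intact = extract_lsb(stego, len(message)) == message

recompressed = lossy_quantize(stego)
survived = extract_lsb(recompressed, len(message)) == message

print("readable before lossy step:", intact)
print("readable after lossy step: ", survived)
```

    Real codecs discard information in a transform domain rather than by rounding samples, which is why a format-aware embedder has to work on the compressed representation instead.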

  • I wonder if they've talked to this guy [dartmouth.edu]

    He claims to have a system which can detect modifications to photographic images.

    Any tampering with a photographic image causes detectable statistical changes. These changes can indicate that the image may have been edited to change the content or possibly that steganographic data has been added.

  • by freidog ( 706941 ) on Sunday January 11, 2004 @05:41AM (#7943735)
    paper (pdf) [xtdnet.nl] on detection of steganographic messages based on simple statistical analysis of the image. It seems to work well against 2 of the 3 major steganographic encodings they tried.
  • It is very clear this is an impossible task. All one needs to do is run a standard PUBLIC KEY ENCRYPTION - you can get the code from www.openssl.org - then stow the encrypted bits into the noise in the target file.

    It can be stowed as replaced low order bits where the address of the bit is generated via a hashing function.

    Even IF ( a really big IF here) it is possible to determine which bits were flipped (XOR) or stowed, one is still faced with knowing the arbitrary hashing function that was used.
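
    A rough sketch of the "address generated via a hashing function" idea, assuming SHA-256 over a shared key plus a counter as the hash (the comment does not say which function or key scheme is meant, so positions, embed and extract are illustrative names only). The encryption step is left out; in the scheme described, the bits would already be ciphertext:

```python
import hashlib


def positions(key: bytes, cover_len: int, count: int):
    """Derive `count` distinct sample indices from a shared secret key."""
    if count > cover_len:
        raise ValueError("message longer than cover")
    chosen, seen, counter = [], set(), 0
    while len(chosen) < count:
        digest = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        idx = int.from_bytes(digest[:8], "big") % cover_len
        if idx not in seen:  # skip collisions so no bit is overwritten twice
            seen.add(idx)
            chosen.append(idx)
        counter += 1
    return chosen


def embed(cover, bits, key):
    """Store each bit in the LSB of the sample chosen by the keyed hash."""
    out = list(cover)
    for idx, bit in zip(positions(key, len(out), len(bits)), bits):
        out[idx] = (out[idx] & ~1) | bit
    return out


def extract(stego, count, key):
    """Recompute the same positions from the key and read the LSBs back."""
    return [stego[idx] & 1 for idx in positions(key, len(stego), count)]


cover = list(range(256)) * 4          # hypothetical cover samples
bits = [1, 0, 1, 1, 0, 1, 0, 0]       # would be ciphertext bits in practice
stego = embed(cover, bits, b"shared secret")
print(extract(stego, len(bits), b"shared secret") == bits)
```

    Scattering the bits hides where they are, but it does not by itself make the cover's statistics look untouched, which is the objection raised elsewhere in this thread.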

    If one
  • by shadowcabbit ( 466253 ) * <cx.thefurryone@net> on Sunday January 11, 2004 @05:52AM (#7943759) Journal
    For any such system to work, it would have to basically be the greatest code-cracking machine on the face of the planet. More than that, though, would be the implications of false-positives. Let's say I send a photoshopped picture of, oh, I don't know, Natalie Portman to a buddy who works for the Air Force. The system, working under the operating parameters it's set to work with, picks up on a specific pattern of bits in the picture and determines that it's a coded message. The coded message is decoded to, inexplicably, reveal GPS coordinates, a date/timestamp, and the phrase "Free XXXXXX" (or some equally suspect verbiage). What would YOU think the "message" meant?

    Given enough processing power, even /dev/rand can produce terrorist messages. It's the million-monkey problem, except with thermonuclear weapons.
    • Given enough processing power, even /dev/rand can produce terrorist messages.

      It would have to be an enormous amount of power. Consider we limit the possibilities merely to the alphabet.

      To come up with the word 'the' would be reasonably commonplace. The odds are 1 in 27*27*27 (26 letters plus space), or 1 in 19683, that any three given outputs from a purely alphabetical /dev/urandom would give you that.
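
      The arithmetic above can be checked with a short sketch. It assumes each output is drawn uniformly and independently from the 27 symbols; the 11-character phrase is a made-up example, not something from the comment:

```python
from fractions import Fraction

ALPHABET_SIZE = 27  # 26 letters plus space, as in the comment


def chance_of(text: str) -> Fraction:
    """Probability that len(text) uniform draws spell out `text` exactly."""
    return Fraction(1, ALPHABET_SIZE ** len(text))


print(chance_of("the"))                    # 1/19683
# Hypothetical longer phrase: the odds shrink by a factor of 27 per character.
print(chance_of("hello world").denominator == 27 ** 11)
```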

      But the word 'the' is hardly a meaningful message. Let's consider 'The quick brown fox jumps over the lazy

    • Nope. (Score:2, Insightful)

      by mindstrm ( 20013 )
      The idea is to detect the likely presence of stego... not to decode it, that's an entirely different thing.

      Analyzing a jpg or png to statistically determine if it's "clean" or has a message in it is not all that difficult. Decoding that message is a totally unrelated feat, more likely reserved for cryptographers.
  • Think of a code consisting of selectively placed LOL, OMG, ROTFLMAO, HEH, WOW, SUXORZ, ROXORZ, C00L, WOOT and several dozen smileys, placed at random positions in a blog message and sent over some IM network. Indistinguishable from the billions of messages that cruise the network daily.
  • by jetmarc ( 592741 ) on Sunday January 11, 2004 @07:25AM (#7943925)
    > I personally don't think that is feasible

    Of course this is feasible! At least with today's steganography software.

    What the software does is overwrite apparently insignificant portions of the "container" data (the audio/picture/text/whatever file that transports the smaller hidden file). The steganographers say (rightfully) that, by encrypting the hidden data with a strong-enough algorithm, it is indistinguishable from random data. I.e., no one (without the key used for encryption) would be able to tell whether it's encrypted data or perfectly random data.

    However, the programmers of steganographic software now go one step further and say (wrongly!) that images and audio files carry random noise in their least significant bits (LSB). Certainly, the lowest of those 16 bits of CD quality audio does not carry much data. And granted, 16 bits give 96dB of dynamic range while analog master tapes (studio quality) only have about 80dB, and microphone technology hardly touches 96dB. The LSB of an audio wave file definitely is noisy, no doubt about that.

    But (big "BUT"), it is far from being perfectly random. In the LSB you might find 50Hz/60Hz hum from the building's electrical cabling. You might find characteristic noise that's typical for your brand of microphone, or even a kind of "noise fingerprint" that could be used to distinguish your microphone from others of the same brand (much like crime investigators can distinguish typewriters by analyzing the blackmail letter). Actually, an experiment showed that when cutting all but the LSB of a music wave file, the tune still remains recognizable!

    What the stego programmers do is replace that LSB (or even the 4 least significant bits) with perfectly (pseudo) random data. That's a difference! I can just cut all but the LSB and check if it statistically matches perfect random data (white noise) or if "some of" the music tune is "somehow" in there (e.g. by correlation, a DSP technique).

    The same applies for pictures. If the pictures were scanned, the lower bits will contain artefacts characteristic to the particular scanner used. Digital photos exhibit "signatures" of the CCD/CMOS chip used in the digicam. Etc.

    The steganographers know this, while the programmers of stegano software deliberately ignore it. It's a solvable problem, but infinitely difficult. If you know what the stegano-detection software is looking for, you can easily avoid it. Just encrypt your hidden data to "perfect random" and then transform it (by adding data, thus losing efficiency) to exhibit almost the same "fingerprint" signature as the data you are going to overwrite. In the case of an audio wave file, impress a bit of the tune on your data.

    But obviously, you can't reach perfection, because a 100% match means that you overwrite the original data with a 100% copy of it (-> you have stored 0 bytes of hidden data). Or you know how the detector works, what thresholds it uses to bin the file as "steganographic", and stay a little below the threshold. But that puts you on the risky side... Will they change the thresholds? Will they check for other characteristics as well, something that you didn't address in your steganographic software?

    That's why the steganographic programmers (not researchers!) ignore this problem. It has no practical solution. It's so much easier to just ignore it, and offer you the choice between 4 and 8 bits of hidden data per 16 bits of wave data (like e.g. "Scramdisk" does, a recommendable harddisk encryption software). This is better than nothing, but it is far from "not feasible" to detect!

    Marc
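
    The "cut all but the LSB and check it statistically" step can be sketched roughly as follows. This is a much cruder check than the correlation against the tune described above: it assumes a toy ramp signal instead of real audio, and it only measures how often neighbouring LSBs agree (lsb_repeat_rate and overwrite_lsb are illustrative names):

```python
import random


def lsb_repeat_rate(samples):
    """Fraction of adjacent samples whose least significant bits are equal.

    Independent fair coin flips give about 0.5; values far from 0.5 mean
    the LSB plane still carries structure from the original signal.
    """
    lsbs = [s & 1 for s in samples]
    same = sum(a == b for a, b in zip(lsbs, lsbs[1:]))
    return same / (len(lsbs) - 1)


def overwrite_lsb(samples, rng):
    """Replace every LSB with a pseudo-random bit, as naive stego tools do."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


# Toy "recording": a slow ramp, so the LSB only flips every 4th sample.
clean = [(i // 4) % 65536 for i in range(20000)]
stego = overwrite_lsb(clean, random.Random(0))

print("clean LSB repeat rate:", round(lsb_repeat_rate(clean), 3))
print("stego LSB repeat rate:", round(lsb_repeat_rate(stego), 3))
```

    The clean signal's LSBs agree with their neighbours far more often than chance, while the overwritten ones sit near 0.5. A real detector would need a model of what untouched LSBs look like for the specific microphone, scanner or codec.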
    • Actually, an experiment showed that when cutting all but the LSB of a music wave file, the tune remains still recognizable!

      Many years ago (10+), just out of interest in crypto, I XOR'ed a raw audio file (my own speech) with pseudo random data (all bits, from LSB to MSB). The result was one very noisy audio file with the speech still audible! I thought "WTF!?"

      I figured that since, on average, 50% of bits would be toggled, some of the audio information would still be present in a form a human could recogn
      • Thus spake Shanep: Many years ago (10+), just out of interest in crypto, I XOR'ed a raw audio file (my own speech) with pseudo random data (all bits, from LSB to MSB). The result was one very noisy audio file with the speech still audible! I thought "WTF!?"

        Your thought ("WTF!?") was right on target. I don't know what you actually did, but it clearly wasn't XOR the audio file with anything resembling random bits. If you XOR a message with truly random bits, the result will consist of truly random bits
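
        A small sketch of that claim, assuming a stretch of constant 8-bit "silence" as the audio and bytes from Python's secrets module as the key (an OS-provided cryptographic generator, not truly random bits). It checks that the XOR output shows no trace of the input's bias and that the same key undoes it:

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings, one-time-pad style."""
    if len(key) != len(data):
        raise ValueError("key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def ones_fraction(data: bytes) -> float:
    """Share of bits set to 1; close to 0.5 for random-looking data."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))


audio = bytes([128] * 50000)              # hypothetical, highly structured input
key = secrets.token_bytes(len(audio))
mixed = xor_bytes(audio, key)

print("ones in input: ", ones_fraction(audio))
print("ones in output:", round(ones_fraction(mixed), 3))
print("round trip ok: ", xor_bytes(mixed, key) == audio)
```

        If speech stays audible after such a step, the likely culprits are a key that was shorter than the data, a weak generator, or an XOR that never touched the high-order bits.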

  • as an excuse to automatically screen US-inbound emails and then levy an extortionate fee to process a visitor's visa?

    Or do you think all of the emails will just go somewhere else instead?
  • by dirt_puppy ( 740185 ) on Sunday January 11, 2004 @07:28AM (#7943934)
    As others stated, (as always in cryptography) if the stegging user isn't stupid (meaning he would encode before stegging), the data to be stegged would be as random as the data that you steg it in. There is no way to tell one set of random data from another set of random data. I think they do it to discover stupid spies.
    • by JKR ( 198165 ) on Sunday January 11, 2004 @07:58AM (#7944028)
      The problem is that emailing streams of random data around looks pretty suspicious. You want to hide random-looking data in a NON-random stream (that has a legitimate purpose, e.g. an image file). THAT's why you can detect it.

      Even random data has to fit in. For example, it used to be the case that the A/D stage of some cheap sound cards was so noisy that recording from line-in gave you a 16 bit audio sample stream with the bottom 4 bits effectively random (like dithering but much, much worse). However, the noise (while random in nature) was shaped in a particular way, so if you just hid your encrypted secrets in those 4 bits it would be obvious that the "noise" wasn't appropriate.

      Jon.

  • People are saying that adding stegged content to a (compressed) file adds redundancy, which can be detected.

    I also read here that compressing the data and adding it, would still add redundancy. Is this correct?

    What about compressing, then encrypting the data? I always thought that compression and encryption both attempt to minimise the entropy of a set of data. How can it be detected if it's random?
  • They might try compressing the images - an image with a large amount of non-random text hidden within it should compress somewhat more than a standard compressed image.
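
    A quick sketch of the effect this relies on, using zlib on raw payloads rather than on actual image files (a deliberate simplification; the text payload is a made-up example). Plain text compresses heavily, while random-looking data, which is what an encrypted payload resembles, does not compress at all:

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more redundancy)."""
    return len(zlib.compress(data, 9)) / len(data)


random_payload = os.urandom(4096)                                   # stands in for ciphertext
text_payload = (b"meet at the usual place at noon " * 128)[:4096]   # hypothetical plaintext

print("random payload ratio:", round(compressed_ratio(random_payload), 3))
print("text payload ratio:  ", round(compressed_ratio(text_payload), 3))
```

    So the test would at best catch unencrypted, uncompressed payloads; anything encrypted first would not stand out this way.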
  • What if instead of trying to hide something in a specific image for example, you gave the steganographic software a selection of say 100 images and got it to choose which one would be most suitable to hide the data so it was hardest to find. While it might take a lot of processing power to do this for a large selection it would make finding a lot harder. Oh wait we're supposed to be making it easier :P, how about banning all steganographic software and research under the PATRIOT III act and then only criminal
  • so, like - wow! - this sure has spawned an interesting debate about how hiding messages within random data 'disguised' as plain old emails could be possible, or maybe not, and maybe someone could find it and filter it out... wow, impressive.

    So, why do we want to look for such messages? Are terrorists from the middle east supposed to be passing messages around with this technology that even the finest scientists at Slashdot's secret underground laboratory can't even seem to agree would be possible? Here I a
    • One of the largest problems the DOD has faced (and still does) is internal leaks from its own people. For 30 years, employees of the DOD took information and sensitive data out all the time and sold it to other countries. So yes, there is a real and valid reason for this sort of software. Spy and counterspy is still a big part of the world, it's just gone more high-tech.
  • Steganography was a problem during World War II. Mail was subject to inspection and censorship. There were concerns about espionage and attempts to evade censorship. Mail was checked for invisible ink and anything else that might be used to hide messages. Some people used steganography to save money. Since there were special subsidized postal rates for mailing newspapers, messages could be sent by using a pin to poke holes in the paper, spelling out the characters of the message. Some soldiers tried to evad
  • by dpbsmith ( 263124 ) on Sunday January 11, 2004 @10:40AM (#7944549) Homepage
    In these days when the FBI thinks possession of an almanac makes you suspicious...what happens to you if some half-baked experimental steganography-detection program looks at billions of .jpgs, gets to an image you've included in an eBay auction description, and detects some not-quite-decodable signal just above the noise that it interprets as "there's definitely something hidden in that image, even though we can't tell what?"

    How do you prove that you're innocent?

    How do you prove that your image does NOT contain steganography?

    Worse yet, suppose you are using steganography--say, a watermark to prevent people from stealing your image. Will the FBI believe what you tell them is the decoded content?

    I mean, a few decades ago some nutcase analyzed Shakespeare's First Folio and decided that it was printed in a mixture of two slightly different fonts that constituted a binary code with a message proving that it had been written by Sir Francis Bacon. (No kidding). That proves that it's easy for someone who's looking for steganography to find it, whether it's there or not.
  • If the data is compressed and/or encrypted prior to being stegged, then even if the data is correctly extracted, it will be impossible to determine whether it is actual data or just noise.

  • Conceptually, the execution bounds for looking for these "hidden" messages seem not too different from trying to find the prime factors of a large number. Take an image, and distill it into two parts, one of which is a hidden message you know nothing about, and the other is the final image with the hidden message removed.

  • by JohnQPublic ( 158027 ) on Sunday January 11, 2004 @11:39AM (#7944935)

    The original poster doesn't believe that it's possible to detect steganographic content. There have been lots of technical follow-ups that suggest it might be possible, but almost nobody has mentioned the funding issue. The task is most likely possible simply because there's been an STTR solicitation published. Many of the STTR and SBIR solicitations are designed by their authors to fund existing projects known to the authors. These "solicitations" provoke very few proposal submissions, occasionally even just the one from the expected recipient of the funds.

    Don't get me wrong - this isn't a scam. The funding groups are usually genuinely interested in having what they specify developed, sometimes wind up buying lots of it once the development is complete, and in most cases all qualified bidders are truly considered. It's just that the solicitations are often written so narrowly that only a select few bidders can qualify.

    But hey, at least the bidders are required to be small businesses, not like those Halliburton contracts for Iraq!

  • Detecting encrypted steganography would be difficult. It would involve statistical analysis of the "unimportant" bits of a known good media sample (be it image, audio, even an executable) and comparing it to the suspect message.

    This would involve a tremendous database on the part of the USAF. More importantly, if the people using the steganography had a similar database (and code that could encrypt their hidden text to match the properties of the "known good"s), then the messages would be undetectable.

    A b
  • wouldn't a stego'd image be indistinguishable from one that had been recompressed?
  • OK, let's take a look at this situation. If the sensitive/secret information is protected the way it should be (i.e. separate computers on networks in separate rooms, etc.) and I [Mr. Bad Airman] want to get this kewl info fired off to my handlers in Al Qaeda, what are my options? Even if I could send information over the internet from one of these computers, which I shouldn't be able to, how am I going to be able to run stego software if I can't load any programs on these systems (which I sure as hell shou
