
Morphing Code to Prevent Reverse Engineering? 507

Posted by michael
from the not-as-think-as-you-easy-it-is dept.
ptolemu writes "Cringely's latest article discusses a new obfuscation technique currently being researched called PSCP (Program State Code Protection). An informative read that concludes with some interesting insight on the software giants that heavily depend on this kind of technology."

  • by theMerovingian (722983) on Friday February 20, 2004 @01:32PM (#8341337) Journal

    delete all the white space, and comment in Hungarian
    • by AntiOrganic (650691) on Friday February 20, 2004 @01:34PM (#8341363) Homepage
      Just name all of your variables in Hungarian notation like Microsoft. No one will have any idea what the fuck is going on even if your entire source code leaks.
      • Hungarian is the old standard. Microsoft is now encouraging a pascal/camel case type notation. New focus: readability.
        • Re:do what i do (Score:3, Interesting)

          by stratjakt (596332)
          Never saw the unreadability of hungarian notation, myself. Like any syntax, it takes some getting used to, but I find it so useful to see a variable called giInstanceCount and know it's a global integer, or miInstanceCount for a class level, etc.

          The scope tag is probably more useful than the data type.
          • Re:do what i do (Score:5, Insightful)

            by angst_ridden_hipster (23104) on Friday February 20, 2004 @03:22PM (#8342908) Homepage Journal
            The problem with Hungarian, of course, is that it lies.

            It's like the comments. They tell you what the programmer *meant* to do, not what he or she did.

            Similarly, Hungarian notation tells you the *intended* scope, type, etc, but the compiler may have a very different view of things.
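A minimal Python sketch of the "Hungarian lies" point. The name giInstanceCount comes from the comment further up; the helper function and the values are hypothetical and only illustrate that nothing forces the prefix to stay true.

```python
# The "gi" prefix claims "global integer", but nothing enforces it.
giInstanceCount = 0  # starts out honest: a global int


def load_count_from_config(raw_value):
    """Hypothetical helper: stores whatever it is handed, unchecked."""
    global giInstanceCount
    giInstanceCount = raw_value  # a str slips in; the name still says "int"


load_count_from_config("3")

name_claims_int = "giInstanceCount".startswith("gi")
actually_int = isinstance(giInstanceCount, int)
print(name_claims_int, actually_int)
```

The name keeps advertising an integer while the runtime holds a string, which is the gap between intended and actual type that the comment describes.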

    • Comment?

    • by kfg (145172) on Friday February 20, 2004 @01:48PM (#8341557)
      Ok, I tried that. It really works.

      In fact, it obfuscated my Python code so badly even the interpreter couldn't figure out what the hell it meant.

      Maybe I need to improve my Hungarian.

      KFG
    • by BobGregg (89162) on Friday February 20, 2004 @02:02PM (#8341709) Homepage
      >>delete all the white space, and comment in Hungarian

      Ha, you laugh. At my first job, the documentation for our product (a medical management system) was written by the original software developer - who was Hungarian. Screen after screen, there were pages filled with explanations like this:

      LOBExpCode. This is the LOBExpCode for the system. Enter your LOBExpCode here.
      NGFTSMapC. This is the NGFTSMapC for the system. Enter your NGFTSMapC here.

      And so on. And no, no data dictionary. Occasionally there would be half-pages of attempted explanation in extremely broken English. Even our own developers couldn't tell what half the stuff did. So that's one form of code obfuscation...

    • by Anonymous Coward
      and comment in Hungarian

      As a Finn, I must propose our language as a viable alternative for obfuscation purposes. Please allow me to demonstrate:

      Tama koodi ei toimi ja siina on ladonoven kokoinen aukko - mutta ei Linus sita tajua.

      • by __past__ (542467) on Friday February 20, 2004 @03:14PM (#8342793)
        You are aware that hungarian and finnish are closely related languages, right? And that basically no other language is closely related to either of them? (IIRC, basque is, but its chances to become an official UNO language are pretty slim either)

        Writing code and/or comments in finnish or hungarian would be the ultimate obfuscation technique. People who know english have a bigger chance to guess what words could mean if they were written in persian.

        • by Arker (91948) on Saturday February 21, 2004 @02:37AM (#8347886) Homepage

          You are aware that hungarian and finnish are closely related languages, right? And that basically no other language is closely related to either of them? (IIRC, basque is, but its chances to become an official UNO language are pretty slim either)

          Umm no. You're way off, sorry.

          Finnish and Hungarian are related, but not very closely. They're both Finno-Ugric languages, but the relation is roughly as distant as that between, say, German and Greek for instance. And probably less apparent, since German has quite a few Greek loan-words, particularly in scientific fields, but Hungarian and Finnish don't borrow from each other noticeably.

          Other Finno-Ugric languages include Mansi, Khanty, Udmurts, and Mordvin, the balto-finnic languages (or dialects, depending on who you ask) which includes Finnish, Estonian, Karelian, Izhora, Veps, Vod, and Liv; and the closely related Saami languages spoken in the far north of Sweden, Norway, Finland, and northwest Russia.

          This group is in turn more distantly related to the Samoyedic languages spoken in parts of Siberia.

          Basque isn't closely enough related to any of these for linguists to have established any relationship, although many have suspected there was one and put a lot of time and energy into trying to find evidence of one.

  • I've done mostly server-side work where:

    - the jar files were secure because they were on the server and
    - bytecode optimization and jar size was the least of our problems

    Obfuscation seems to be useful only for client-side Java applications that contain super-secret valuable algorithms. I mean, who cares if somebody decompiles your code to see how you did sortable JTables or whatever?
    • I agree. With my experience at a company that develops in Java, those that use Java become lethargic and lazy such that the actual code itself is typically very uninteresting (as Java does all of the "optimizations" that a developer in other languages could tool around with). Beyond that, most of the developers in my company are obsessed with performance because Java crushes our performance.

      But then again, our software isn't on 90% of all computers or whatever, so I guess we're less worried about exploi
      • by Unoti (731964) on Friday February 20, 2004 @01:56PM (#8341646) Journal
        You mentioned lazy programmers, and having Java crush your performance. In my experience with Java, perceiving that the language is crushing performance is often a symptom of the programmers becoming lethargic and lazy. Perhaps you're having the same experience and don't realize it?
        • The two things probably work in coordination. From my experience, Java does generally represent a performance drop from, say, C++. And so I think that's true regardless.

          I think on top of that, Java does so much stuff, like garbage collection, that programmers don't need to worry about it. But the Java optimizations are always implemented conservatively. If I did my own garbage collection, I could free the memory as soon as I'm done with it, but under Java GC is done only periodically, and only sweeps
          • by Jorrit (19549) on Friday February 20, 2004 @02:21PM (#8341953) Homepage

            If I did my own garbage collection, I could free the memory as soon as I'm done with it


            Or you can do like many C++/C projects do and simply forget to free the memory (i.e. memory leaks).

            Greetings,
          • by radish (98371) on Friday February 20, 2004 @04:07PM (#8343494) Homepage
            The two things probably work in coordination. From my experience, Java does generally represent a performance drop from, say, C++. And so I think that's true regardless.


            I disagree, the theory of an optimising JIT compiler (such as HotSpot) disagrees, and a whole bunch of studies disagree. But you are entitled to your opinion.

            If I did my own garbage collection, I could free the memory as soon as I'm done with it, but under Java GC is done only periodically, and only sweeps items that fulfill certain qualities (so it might not get everything as soon as it should).


            Firstly, there are a number of GC strategies available, you only describe the most common. Secondly, what is the performance hit of an object staying around for too long? As soon as memory becomes tight the JVM will become more aggressive with GCs in order to free up space. If memory is not tight however, then NOT doing a GC is actually more performant (a no-op is always faster than a dealloc). Thirdly, Java GC is not really "conservative". An object is eligible for collection when (and only when) the reference count is zero. Off the top of my head I can't think of any reasonable argument that says it is safe to dealloc an object to which there is a live reference. If you know that the object is dead - just null the reference and voila - it's eligible for GC. In the apps I am responsible for the number one overriding priority is reliability - memory leaks, buffer overflows and magically vanishing objects are way too much of a risk and hence C++ is not a suitable environment for us.
            • by Dukael_Mikakis (686324) <andrewfoerster.gmail@com> on Friday February 20, 2004 @06:13PM (#8345155)
              I've coded projects in both Java and C++ (and benchmarked them, actually), and in my experience (which is just that) the C++ ran more quickly than Java. You're entitled to disagree. Where I work we use OptimizeIt which does help things out, but our software still runs absolutely dreadfully (I won't deny that likely much of it is the programming itself). But I still stand by my contention that C++ allows you to run faster than Java. It allows you greater control (directly) and doesn't impose any of the overhead of hierarchy that Java does.

              I enjoy Java and program in Java and will confess that the stuff they include is usually useful (our software would probably be fscked if we didn't have GC or any of these other features); they just degrade performance (and I believe they have to). I would love to hear your response.

              When I describe the mark and sweep method, it is the most common, and will likely be the most frequently used. However check here [javaperfor...tuning.com] for an analysis of the other types. If garbage collection were a lightweight, trivial process, then why would Java need to implement 6 different schemes?

              Incidentally, we tried testing the various different schemes here and it was a mess trying to get anything out of it.

              Yeah, all you have to do is null the object and it'll be collected. Keep in mind, though, that in C++ you just do a delete (or a dealloc) and it's gone, you don't need to scan the whole environment doing reference counts and then doing the corresponding deallocs.

              I agree that Java is fine, and it's sturdy, and it's a delight to use, it's just that (all the way up to the great-grandparent) I think he got it right that Java programmers are (rightfully so) more concerned about all these optimizations (why do you think they're necessary?) than about any sort of run-time security.

              Again, just my opinion.
    • by Tassach (137772) on Friday February 20, 2004 @01:44PM (#8341495)
      Java works best as a server-side language; it's well suited to that role.

      If you need a tamper-resistant client-side binary, don't use Java. It's that simple. A good engineer understands many different tools and selects the best one for the job.

      • by Grishnakh (216268) on Friday February 20, 2004 @02:56PM (#8342498)
        If you need a tamper-resistant client-side binary, don't use Java. It's that simple. A good engineer understands many different tools and selects the best one for the job.

        You're obviously not living in the "real world". Here, an engineer uses the tool that the PHB management selects for him, based on buzzwords, what competitors are doing, and what schmoozing vendors have sold to them.

      • by __past__ (542467) on Friday February 20, 2004 @03:20PM (#8342894)
        If you need a tamper-resistant binary, don't use anything that can be executed natively by any known processor architecture either. Compiled code for the JVM might be slightly easier to understand than code compiled for the x86 arch, but there are good decompilers for both.

        If a CPU can understand your code, so can a human. The solution to this problem is licenses, not obfuscators.

    • Yes (Score:5, Insightful)

      by Tim Macinta (1052) * <twm@alum.mit.edu> on Friday February 20, 2004 @02:18PM (#8341897) Homepage
      Obfuscation seems to be useful only for client-side Java applications that contain super-secret valuable algorithms. I mean, who cares if somebody decompiles your code to see how you did sortable JTables or whatever?
      There are plenty of good reasons to use an obfuscator on code targeted at the client-side. Retroguard [retrologic.com] will strip out unnecessary information from your class files and will rewrite variable, class, and method names, usually to a substantially shorter size. This can save enough space in the deployment size to make obfuscation worthwhile for the space savings alone in environments where every byte counts (particularly, J2ME/MIDP).

      Obfuscation does also provide a speed bump to those attempting to disassemble your code. Without obfuscation, anybody with a casual interest could just glance at your code using javap, etc. Retroguard [retrologic.com] fits seamlessly enough into the build process that adding a simple level of protection to the code is usually simple and transparent.
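To make the size argument concrete, here is a rough Python sketch of name-shortening. It is not Retroguard's actual algorithm or interface; the identifiers and both functions are invented for illustration, and a real obfuscator rewrites class files rather than a list of strings.

```python
import itertools
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... in order."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def build_rename_map(identifiers):
    """Map each distinct identifier to the next unused short name."""
    names = short_names()
    mapping = {}
    for ident in identifiers:
        if ident not in mapping:
            mapping[ident] = next(names)
    return mapping


# Hypothetical identifiers standing in for names stored in a class file.
identifiers = [
    "CustomerAccountManager",
    "calculateMonthlyInterest",
    "pendingTransactions",
]
mapping = build_rename_map(identifiers)
chars_before = sum(len(name) for name in identifiers)
chars_after = sum(len(mapping[name]) for name in identifiers)
print(mapping)
print(chars_before, "->", chars_after)
```

Every reference to a renamed symbol shrinks as well, which is where the savings come from on J2ME/MIDP-sized targets, and the meaningless names are also what slows down a casual javap reader.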

      • Re:Yes (Score:5, Insightful)

        by tcopeland (32225) * <tom@@@thomasleecopeland...com> on Friday February 20, 2004 @02:32PM (#8342120) Homepage
        > in environments where every byte counts

        Right, yup, obfuscation reduces class file size. Certainly, that can be important in some environments.

        > anybody with a casual interest could just
        > glance at your code using javap,

        Sure. But what will they learn? How the code processes MouseEvent.MOUSE_CLICKED? How you use sockets? How you show that nifty splash screen? I mean... who cares?

        Going off topic now, but, anyhow, nifty [pensamos.com]!
        • by Tim Macinta (1052) * <twm@alum.mit.edu> on Friday February 20, 2004 @02:56PM (#8342489) Homepage
          Sure. But what will they learn? How the code processes MouseEvent.MOUSE_CLICKED? How you use sockets? How you show that nifty splash screen? I mean... who cares?
          "Super-secret" algorithms aside, it's not so much that I'm worried they will learn how to do what I did, it's more that I don't want people reusing my code without permission. I've found people who have copied pages from my website almost verbatim, and even one person who blatantly plagiarized a page of mine and changed "Copyright Tim Macinta" to "Copyright his name"! Granted, that's a little easier to do with HTML than with Java, but if it's simple to protect against by using an obfuscator, then why not? I really wouldn't mind people learning from my code, it is actually reusing the code without permission and without attribution that I am protecting against. I'm not just being paranoid - I had somebody email me the source code to an applet I wrote once (which he apparently decompiled) with a note along the lines of "Ha, ha! Now I have your source!" I don't know what motivates these people, but there are enough of them out there that I have since started obfuscating almost everything as a basic precaution.
          Going off topic now, but, anyhow, nifty! [pensamos.com]
          Thanks. It's not as off-topic as you think. I used obfuscation with MMB (which is a client side Java app) to prevent plagiarism and reduce the size (barely). Nothing about it is "super-secret", although it is a little more complex than just routing MouseEvents around.
  • by geoffspear (692508) * on Friday February 20, 2004 @01:33PM (#8341352) Homepage
    It's not the ability to reverse engineer code that creates security problems; if it were, open source code, which you don't even need to reverse engineer, would be much less secure. The problem is just badly written code.

    This technique might be interesting for stopping people from stealing your closed source code, but as far as security goes it's pretty much worthless. 99% of the vulnerabilities in MS's code were found before their code was leaked, and if you believe them, even the major exploit found after it was leaked had more to do with bad code than someone finding the existing problem by reading the code.

    • by meta-monkey (321000) * on Friday February 20, 2004 @01:40PM (#8341435) Journal
      There are reasons beyond "theft" for wanting to obfuscate your code.

      For instance, consider Quake. Quake is a great deal of fun, so long as everybody is playing fair. However, when somebody cracks the game and develops an aimbot (they're real), it's not fun anymore. Even if Quake were open source, some kind of run-time obfuscation would be great just to help prevent cheaters.

      I recall reading about an exploit for Age of Empires (or was it Age of Kings...) where in a networked game, you could run a monitor program that would let you see what resources your opponent had. Then, by watching changes in their resource supply, you could guess what units they were building. That was automated for you, of course. "Ah, they keep spending 45 wood and 25 gold, they must be building archers! I should build cavalry."

      Anyway, even when we're not talking about greedy corporations protecting their intellectual property rights, there are still good reasons for keeping what's going on in your program hidden from prying eyes.
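A small Python sketch of the resource-watching trick described above. Only the archer cost (45 wood, 25 gold) comes from the comment; the other table entry and the function name are made up, and a real monitor would read the values out of the game's memory or network traffic instead of taking dictionaries.

```python
# The archer cost is from the comment above; the second entry is hypothetical.
UNIT_COSTS = {
    "archer": {"wood": 45, "gold": 25},
    "hypothetical_spearman": {"food": 35, "wood": 25},
}


def guess_unit(before, after):
    """Return the unit whose cost equals the observed resource drop, or None."""
    spent = {res: before[res] - after[res] for res in before if before[res] > after[res]}
    for unit, cost in UNIT_COSTS.items():
        if cost == spent:
            return unit
    return None


# Two consecutive snapshots of the opponent's stockpile.
snapshot_1 = {"wood": 200, "gold": 100, "food": 50}
snapshot_2 = {"wood": 155, "gold": 75, "food": 50}
print(guess_unit(snapshot_1, snapshot_2))
```

The inference needs nothing but the opponent's resource totals, so the leak is the client receiving those totals at all; obfuscating the client only delays finding them.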
      • by Macrobat (318224) on Friday February 20, 2004 @01:55PM (#8341628)
        Details of Quake reverse-engineering can be found here. [catb.org] But I'm not so sure obfuscation would have helped in this case. It seems like Quake's design just put too much information in the hands of the client systems; it might have taken a day or two extra to decode, but the question is, why was such data allowed to be controlled by the client in the first place?
    • by Dukael_Mikakis (686324) <andrewfoerster.gmail@com> on Friday February 20, 2004 @01:48PM (#8341549)
      It's just like the axiom about divorce that goes something like "It's not the fact that divorce is legal that's killing our marriages, it's the bad marriages that are causing so much divorce."

      Because of the n millions of lines of code in Redmond it's certainly daunting to actually go through and make good code out of the mess, rather than the obscurity.

      The fact that there's an open vulnerable port is a flaw, and the FIX is to make the port secure, rather than to shift its address every five seconds or whatever, which is only a Band-Aid.

      MS is just lucky that the bulk of its customers don't truly know what's going on, otherwise the business model they have wouldn't work.

      I.e. since I'm not a doctor, my doctor can prescribe whatever for me, or insist that I do whatever, and I'll take it as scripture. If what he recommends is the stupidest thing in the world, or he's blatantly a horrible doctor, I would have no idea and suffer the consequences. If I were also a doctor, though, I'd be able to call shenanigans the very second he did something wrong. That's why educating the consumer is the most crucial point of this whole issue.
  • Enough (Score:4, Funny)

    by Tebriel (192168) on Friday February 20, 2004 @01:33PM (#8341356)
    The code I write is obfuscated enough as it is. I'm my own anti-piracy mechanism.
  • by Kenja (541830) on Friday February 20, 2004 @01:33PM (#8341359)
    Wonder Twins power, ACTIVATE!

    Form of, illegible code.
    Shape of, encrypted executables.

    Not sure where the monkey fits into all of this.

  • Won't work (Score:4, Insightful)

    by sosume (680416) on Friday February 20, 2004 @01:34PM (#8341365) Journal
    It just won't work. Any code that can be run can be reverse engineered. So-called sophisticated coding techniques only lead to unreadable code..
    • Re:Won't work (Score:5, Insightful)

      by Chairboy (88841) on Friday February 20, 2004 @01:42PM (#8341469) Homepage
      > So-called sophisticated coding techniques only lead to unreadable code..

      That IS the point, I'm sure you realize.
      • Re:Won't work (Score:3, Insightful)

        by Golthur (754920)
        Yes, but this is a simplistic solution that still won't work. Once programs obfuscate themselves as they run, someone's going to make an automated tool to de-obfuscate them - e.g. a custom VM that just dumps the bytecode on the path of execution to a file as it executes it.

        Automation just breeds counter-automation. It's an arms race, and I don't really think JITO (Just In Time Obfuscation) is the answer.
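The "custom VM that dumps what it executes" idea can be sketched with a toy stack machine in Python. The instruction set, the program, and the trace format are all invented here; this is not the JVM or CLR, only the shape of the attack.

```python
def run(program, trace=None):
    """Execute a list of (op, arg) tuples on a tiny stack machine.

    If a trace list is supplied, every instruction is appended to it
    at the moment it executes.
    """
    stack, pc = [], 0
    while pc < len(program):
        op, arg = program[pc]
        if trace is not None:
            trace.append((pc, op, arg))  # dump the instruction as it runs
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JMP":
            pc = arg
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown op {op!r}")
    return stack


# Slots 2 and 3 are junk that is jumped over and never executed.
program = [
    ("PUSH", 2), ("JMP", 4),
    ("PUSH", 999), ("ADD", None),
    ("PUSH", 3), ("ADD", None), ("HALT", None),
]
trace = []
result = run(program, trace)
print(result)
print([pc for pc, _, _ in trace])
```

The trace contains only the instructions on the real execution path, so decoy code that never runs drops out of the dump by itself.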
    • Re:Won't work (Score:5, Insightful)

      by jfengel (409917) on Friday February 20, 2004 @01:50PM (#8341575) Homepage Journal
      Sure, you can reverse engineer it. But is it worth the effort?

      Most of the time it's not even worth reverse engineering unencrypted code, because it's really hard. There are open source projects that go undone because people don't want to expend the effort.

      The trick is not to make it impossible, but to make it hard enough that it isn't done. That level is different for different projects, but it's always finite.
    • Re:Won't work (Score:5, Interesting)

      by Deadstick (535032) on Friday February 20, 2004 @02:00PM (#8341690)
      I remember a primitive attempt at this in the copy protection routine for dBase III way back when the DMCA was but an industry wet dream. This was the ProLok Disk, the one with the laser burn.

      There was a section of code hidden by about forty layers of byte-by-byte XORing against bytes looked up in a table. At each level, it would intercept the Debug and Single Step interrupts, XOR the next layer, and jump into it. In those floppy-only days, it had to be reverse engineered a layer at a time, each step producing a disk with one less layer. Approximately the 40th disk had the actual copy-test code...which turned out to be pirated code!

      This was also before BIOS shadowing in RAM, and the BIOS executed straight from ROM. The test for the laser burn required hooking into it, which of course they couldn't do in ROM. Instead of working out their own shadowing routine they copied some 700 bytes of the IBM Fixed Disk BIOS, inserted their hooks, and then made a weaselly attempt to cover their tracks by interchanging logical-shift with arithmetic-shift instructions wherever it was guaranteed that nothing would go through the carry bit.

      And all that meshugass was there only to hide the publisher's own piracy...the copycrack consisted of a two-byte change elsewhere on the disk.

      rj
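For readers who have not seen layered XOR protection, here is a rough Python sketch of the scheme as described above. The tables and payload are made up, and the debug-interrupt interception is not modeled; it only shows the byte-by-byte XOR against table lookups, applied forty times.

```python
from functools import reduce


def xor_layer(data, table):
    """XOR each byte with a table byte; applying it twice restores the input."""
    return bytes(b ^ table[i % len(table)] for i, b in enumerate(data))


def wrap(payload, tables):
    """Apply every layer in order, as a protector would at build time."""
    for table in tables:
        payload = xor_layer(payload, table)
    return payload


def peel(blob, tables):
    """Remove the layers one at a time, outermost first."""
    for table in reversed(tables):
        blob = xor_layer(blob, table)
    return blob


# Forty made-up 16-byte tables (hypothetical values, not the real ones).
TABLES = [bytes((i * 7 + layer * 13) % 256 for i in range(16)) for layer in range(40)]

# XOR is associative and commutative, so all forty layers fold into one table.
COMBINED = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*TABLES))

secret = b"copy-test code goes here"
blob = wrap(secret, TABLES)
print(peel(blob, TABLES) == secret, xor_layer(blob, COMBINED) == secret)
```

In this simplified model the forty layers are mathematically no stronger than one. The real cost to the cracker came from the interrupt hooks and the floppy-at-a-time workflow, not from the XOR itself.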
    • Re:Won't work (Score:5, Interesting)

      by mugnyte (203225) * on Friday February 20, 2004 @02:00PM (#8341691) Journal
      Well, that seems a bit simplistic. However, when I take a look at running code, there are several things that don't jibe with the article:

      One forms logical boxes around things. For instance, a good cracker knows to identify the boundary between the JIT and the bytecode, know where the security check call is made, and what threads are monitoring the heap and garbage collector.

      When cracking, you initially "freeze" the code, the machine, the stack, and the registers. You're working at such a low level, it begins with a step-by-step of understanding how everything fits together.

      For example: Imagine the .NET framework itself in a sandbox. You watch as the OS is fed an EXE, it identifies the type and starts to run it. The CLR (potentially) starts up and checks permissions, loads all the JIT stuff, etc. Then, bytecode is churned. You are stepping one instruction at a time, interrupting at each of the CLR's instructions. One notices the buffer used for the JIT and the "feed" going into it. Tools are written that do this watching and drop items (portions of files, by instruction) into logical "components" and each pass becomes a little clearer. By running the application and matching behavior to component, you begin to learn how the application is designed.

      Also, you look at the program file itself. THIS is what the article seems to be saying: the bytecode is obfuscated...without context clues you're not going to discern how it works. But you can snap up context many times with a cracking tool. In this article, they seem to imply that each snapshot will be different by scrambling the variable names, or program locations. By seeing how all the names have been crammed, a pattern develops.

      Also, I take issue that .NET can be both "open" and yet "secure". Unless the bytecode SPEC is proprietary and unable to be reverse engineered (it is neither, hence the "open" nature of it), one can form a CLR that processes valid-yet-obfuscated code and rebuild a logical image of how the program is designed. I can take the MONO bytecode runtime and have it start to partition the code into blocks and examine, for the calls each block makes to the layer beneath it, what it is allegedly doing.

      Lastly, what makes these tools immune from reverse-engineering themselves? If I know the patterns this DASH-stuff uses, I can begin to reverse them. Unless there's one-way hashing or hardware/networked keys flying around, everything to solve the puzzle is right there, for me and my friends to examine at our leisure. This is done today by virus writers to try to avoid detection by checkers; they know how they work.

      If this tool becomes actually as valuable as he claims, then I expect its own design (stolen or RE'd) to appear in the cracker circles like any other.

      But perhaps I'm missing something?
  • zzzzzzz (Score:3, Insightful)

    by SparafucileMan (544171) on Friday February 20, 2004 @01:36PM (#8341391)
    *shrug* You still have control over the computer. Just load something of your own making before your OS loads the obfuscator. Interrupt 13, anyone?
  • easy to do (Score:3, Funny)

    by Anonymous Coward on Friday February 20, 2004 @01:37PM (#8341405)
    write really bad code. you don't see anyone reverse engineering Windows, do you?
  • by bc90021 (43730) * <bc90021.bc90021@net> on Friday February 20, 2004 @01:38PM (#8341408) Homepage
    The problem with Microsoft's code being readable is that there are only Microsoft people reading it. Half the time they wouldn't see the forest for the trees (since they are so involved with it all the time anyway), and the other half they would miss things that other people might pick up.

    With Open Source, *everyone* gets to look at the code, so there are many eyes, and the bugs get shallower.
    • by dtio (134278) on Friday February 20, 2004 @01:56PM (#8341650)
      Nonsense. They don't see the forest for the trees? I beg your pardon?

      You're assuming that there is a Microsoft way to look at code and that every MS developer is a robot brainwashed to think that way. MS hires very capable and brilliant people; you couldn't tell the difference between a bunch of NT kernel hackers and a bunch of Linux kernel hackers. Both groups are extremely knowledgeable and manufacture high quality code.

      MS has the biggest industrial infrastructure in the world of quality assurance. Every developer should go through an internship in Redmond to see this.

      Large software *is* complex, period. Given a finite amount of talent and time, bugs are dependent on the size of the project; it really doesn't make a difference whether your code is open source or not.

      How many people do you think actually look at open source code to look for bugs?

      Moral: if MS releases buggy, exploitable and potentially unsafe code, it is *not* because they are sloppy or because proprietary code is inherently worse than open source; it is because large software is complex and takes a lot of time to do it right [joelonsoftware.com].

  • It's ironic (Score:5, Insightful)

    by Dukael_Mikakis (686324) <andrewfoerster.gmail@com> on Friday February 20, 2004 @01:38PM (#8341409)
    The medical profession deals with viruses by identifying our weaknesses, and exposing them to the viruses (the ultimate "reverse engineering"?). If there were a biological DMCA, developing vaccines would certainly violate it on the illegality of "hacking into the body".

    With software, though, people still insist on trying to hide and pretend as if there were no viruses out there and that we would be impervious to them.

    Can we finally just open all of our code so we can vaccinate it against all these exploits?
  • by mveloso (325617) on Friday February 20, 2004 @01:38PM (#8341418)
    This looks vaguely like self-modifying code, like back in the old days of copy protection.

    The thing I don't understand about the article (and how it describes the PSCP process) is this: how will this make reverse engineering more difficult?

    When you're starting to crack something, you work backwards from system calls, library calls, and known behaviors. "Known behaviors" are, well, patterns of code that people (or compilers) use to do things. Anyone good at low-level stuff can probably identify the compiler used to build the code. Likewise, if you think about something enough, you can probably figure out three or four ways to do something, and look for that pattern in the code.

    PSCP prevents this...how? By making this process happen as the program runs? How else do you reverse engineer something?

    Anyway, it sounds like this thing sits right before the .net runtime engine (or maybe it's loaded and spews bytecode to the runtime), in which case it can be removed...or the output intercepted.

    What am I not getting here?
    • by pc-0x90 (547757) on Friday February 20, 2004 @01:48PM (#8341551)
      Java (and subsequently .Net) bytecode made a reverse engineer's life a bit easier on the whole, because of the way it could be decompiled into source that was extremely similar to the original. All this seems like it would do is remove that benefit and cause the reverse engineer to approach it the same old way one would approach a compiled C program (as you described, with a debugger and hooks on syscalls). Or bust out a new type of disassembler to emulate traces, and dump that to an assembly listing. But you're right, it's not really that mind blowing if the reverse engineer has worked on non-java/non-.net binaries before.
    • by El (94934) on Friday February 20, 2004 @01:54PM (#8341625)
      It makes reverse engineering more difficult because you can't disassemble the whole program at once, only the currently running portion. And you don't know what the boundaries between the currently running portion and the obfuscated byte codes are. However, if you just TRACE the running code, you should get a pretty good idea of how it executes under normal operation -- it's not like the actual algorithm changes every iteration. Granted, you probably won't know how it handles most exceptions and boundary conditions, but who cares?
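The trade-off in tracing (you see the normal path, not the exception handling) can be shown with Python's own tracing hook, `sys.settrace`. The traced function and the helper name are made up for illustration; tracing a .NET or Java program would use that platform's debugger or profiler interfaces instead.

```python
import sys


def parse_age(text):
    try:
        age = int(text)
    except ValueError:
        age = -1  # error path: invisible in a trace unless bad input occurs
    return age


def trace_lines(func, *args):
    """Run func(*args) and return the line offsets (from its def) that executed."""
    seen = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


normal_run = trace_lines(parse_age, "42")
error_run = trace_lines(parse_age, "oops")
print(sorted(normal_run))
print(sorted(error_run - normal_run))  # lines only the failing input reveals
```

A trace built only from well-formed input never shows the `except` branch, which is the "who cares?" gap the comment mentions: the tracer learns the algorithm as exercised, not the whole program.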
    • by hazee (728152) on Friday February 20, 2004 @01:58PM (#8341665)
      Yeah, and self-modifying-code was eventually abandoned because it played havoc with the then-new CPU caches and pipelines.

      Have these people learned nothing?
    • by plierhead (570797) on Friday February 20, 2004 @02:06PM (#8341751) Journal
      From what I can make out (could be wrong) it blasts out a myriad of possible branches and loops that appear to be program logic but are actually executed solely to confuse the reverse engineer.

      If so, this raises a couple of issues:

      • Code bloat - very much an issue for a technology targeting client-side apps; and
      • It negates Cringely's open source point:

        And there is even an Open Source aspect to this new form of protection: It can be used as a new form of attribution. Who wrote what part of that Open Source program? Copyright notices and comments can be removed, but the PSCP code renaming signature can't be.

        Open source code which can only be submitted while obfuscated (thus preserving its signature) is not open source any more, so I don't buy this as a benefit of the technology.

      I think the main dangers to people protecting their source code in the medium term will remain what they are now: incompetence and conspiracy from within (witness the win2k code leakage).

      • Open source code which can only be submitted while obfuscated (thus preserving its signature) is not open source any more, so I don't buy this as a benefit of the technology.

        Yes, I was puzzled by his statements about watermarking open-source code. You would still have to distribute the original, unobfuscated source to allow people to make changes. The GPL even explicitly forbids distributing obfuscated code. It says something like the code must be distributed in the "preferred format for making chang
  • by Speare (84249) on Friday February 20, 2004 @01:39PM (#8341422) Homepage Journal

    Just like all the hubbub over proprietary signal encryption to "protect" digital audio streams, all you need here would be the CPU-equivalent of the old Analog Out jack.

    Break it down to the Universal Turing Machine and tape analogy. The program code is the tape, and the state of the machine is in the tape-executing device. If the tape were to somehow morph itself dynamically, and yet execute properly by morphing to a well-designed program at the moment it is read for execution, all you have to do is to watch the read/write head of the UTM itself.

    If they find ways to monkey around with bytecodes so that they're shifted around between disk and executor, just run it with a special version of the executor. Shouldn't be hard... what the unencrypted bytecodes are capable of accomplishing is standardized. Execute the code once, and take "notes" of what is being accomplished. Run through a code coverage test suite, even a crude black-box analysis, and you should get an unscrambled bytecode equivalent.

    It just doesn't make sense. If obfuscation, i.e. obscurity, is your only security, it is no security at all.
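
    A minimal sketch of the "special executor" idea, in Python. Everything here is hypothetical: the four-opcode bytecode, the XOR mask standing in for the morphing step, and the names scramble, run and notes are invented for illustration and have nothing to do with real .NET IL, JVM bytecode, or PSCP itself. The point is only that whatever decodes an instruction for execution can also write it down:

```python
# Toy illustration: a made-up bytecode, not real JVM bytecode or .NET IL.
PUSH, ADD, MUL, HALT = range(4)


def mask(key, pc):
    """Position-dependent XOR mask standing in for the 'morphing' step."""
    return (key + pc) & 0xFF


def scramble(program, key):
    """Store every (opcode, operand) pair XOR-ed with its mask."""
    return [(op ^ mask(key, pc), arg ^ mask(key, pc))
            for pc, (op, arg) in enumerate(program)]


def run(scrambled, key, notes=None):
    """Execute scrambled code; if `notes` is a dict, log each decoded instruction."""
    stack, pc = [], 0
    while True:
        op, arg = scrambled[pc]
        op, arg = op ^ mask(key, pc), arg ^ mask(key, pc)  # decode just in time
        if notes is not None:
            notes[pc] = (op, arg)  # the "special executor" taking notes
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


program = [(PUSH, 2), (PUSH, 3), (ADD, 0), (PUSH, 4), (MUL, 0), (HALT, 0)]
scrambled = scramble(program, key=0x5A)

notes = {}
result = run(scrambled, key=0x5A, notes=notes)
recovered = [notes[pc] for pc in sorted(notes)]
```

    Under these assumptions, one run with a notes dict yields the original listing for every instruction that executed. Code paths that never run stay hidden, which is why a coverage test suite matters.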

  • Wow (Score:5, Insightful)

    by Anonymous Coward on Friday February 20, 2004 @01:39PM (#8341424)
    Cringely has really outdone himself this time. I can't even follow this poorly thought out mess. He seems to totally misunderstand every single concept he touches on.

    Compilation to bytecode and an "interpreted language" are NOT THE SAME THING. Both the CLR and a compiled java class are effectively machine code for a machine that doesn't exist. These abstract machines have machine code that reveals *MORE* information to a disassembler/reverse engineer than, say, x86 or PPC assembly, but it is still far, far from being source code. This is reaction one that I have. The rest of the article is so confused I don't even know how to respond to it.
    • Re:Wow (Score:5, Informative)

      by edwdig (47888) on Friday February 20, 2004 @02:08PM (#8341780)
      Having worked with Java bytecodes when I took compilers, I will say that you can get really close to the original program by looking at the bytecodes. You can't tell if someone used a while loop or a for loop, but you can still reconstruct the loop from the code.

      The Java Virtual Machine is a stack machine - there are no CPU registers. There's a separate memory store for local variables. That tends to make it easy to tell exactly what data is being operated on at any given time.

      I've seen Java decompilers that return very clear, readable code.
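
      To illustrate why stack-machine code is so easy to read back, here is a rough Python sketch. The mnemonics (load, const, mul, add, store) are made up for the example and only loosely modeled on JVM instructions; the decompile function is hypothetical and handles nothing but straight-line arithmetic:

```python
# Toy stack-machine "decompiler". The instruction names are invented for this
# sketch; they are not real JVM opcodes.
BINARY = {"add": "+", "sub": "-", "mul": "*"}


def decompile(code):
    """Replay the stack symbolically and emit source-like assignments."""
    stack, statements = [], []
    for instr in code:
        op = instr[0]
        if op == "load":          # push a named local variable
            stack.append(instr[1])
        elif op == "const":       # push a literal
            stack.append(str(instr[1]))
        elif op in BINARY:        # pop two operands, push the combined expression
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {BINARY[op]} {right})")
        elif op == "store":       # pop the expression into a named local
            statements.append(f"{instr[1]} = {stack.pop()}")
        else:
            raise ValueError(f"unknown instruction {op}")
    return statements


code = [
    ("load", "price"),
    ("load", "qty"),
    ("mul",),
    ("const", 5),
    ("add",),
    ("store", "total"),
]
statements = decompile(code)
```

      Because every operand is named when it is pushed, the expression falls out of the stack order directly. With register-allocated native code that bookkeeping has to be inferred instead.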
  • by Jacek Poplawski (223457) on Friday February 20, 2004 @01:39PM (#8341428)
    Reverse engineering is good, and each coder should try it. This is the way to learn how someone else's code works when that code is closed source. I don't think you can fool an experienced assembler coder by messing the code around.
    Think about R.E. as a game. It's like cracking, but it's good. And it's about creating, not about destroying.
    • by El (94934) on Friday February 20, 2004 @02:00PM (#8341687)
      Actually, even stepping through the execution of your own code can give you insights into how the compiler operates, and allow you to write more efficient code. For example, I knew a C programmer who used to declare const strings and arrays local to each of his functions. I had to point out to him that these get copied onto the stack every time the function is entered, severely slowing program execution -- and he might want to consider making these static! Of course, I wouldn't have noticed this if I hadn't been stepping through his code in the first place.
  • the dark side (Score:5, Interesting)

    by musikit (716987) on Friday February 20, 2004 @01:40PM (#8341437)
    How come for every new technology that comes out that is supposed to "secure" us, I can think of a way it can be used "maliciously"?

    Ex.: I write YourDoom.A using this new code-morphing obfuscator. How exactly are anti-virus programs supposed to 1. remove this? 2. identify this?

    Given the numerous VB/Outlook bugs, and considering that .NET is so "young", can't you see this being used to create a perpetual virus that can't be removed? You wouldn't even be able to ID the bug that caused this virus to run itself.
  • by Didion Sprague (615213) on Friday February 20, 2004 @01:41PM (#8341441)
    I don't know the answer to what I'm about to ask. I'm a writer, not a programmer, but as I was reading Cringely's column -- especially toward the end when he talks about how PSCP can be used in DRM to really (really, really) obfuscate a watermark -- I got to thinking: couldn't this theory of PSCP be used to further obscure (or encrypt -- whatever you want to call it) P2P networking?

    And maybe this is already being done -- or maybe this is just pure stupidity on my part for asking the question -- but couldn't this sort of "morph-as-you-go" theory be used to obfuscate -- and essentially hide -- a network path used to get (or put) a piece of data? Kinda like BitTorrent -- but in a much more severe, much more shifty way? You're getting the data -- eventually -- and you're both downloading and uploading as you go -- but the paths through which your current bit of data is being retrieved are both unknown until you visit them and obscured once you leave them?

  • by chammel (19734) on Friday February 20, 2004 @01:41PM (#8341443)
    Once the virus writers get a hold of this, viruses will be much harder to catch, unless anti-virus writers start looking more for virus-like activity.
    • by nacturation (646836) <nacturation AT gmail DOT com> on Friday February 20, 2004 @02:06PM (#8341752) Journal
      Once the virus writers get a hold of this, viruses will be much harder to catch, unless anti-virus writers start looking more for virus-like activity.

      Of course, virus writers have been using this since the early 1990s. One particular virus called Ontario III [nai.com] (there might be others before it) used this trick. An interesting part from the virus writeup: "The Ontario III virus uses a very complex form of encryption with no more than two bytes remaining constant in replicated samples."
  • performance (Score:5, Insightful)

    by happyfrogcow (708359) on Friday February 20, 2004 @01:41PM (#8341452)
    When a computer program runs, the computer can follow millions of paths to get the job done. We leverage those millions of paths and transform them into billions of paths instead

    Millions of paths implies some sort of jump instruction; whether or not that translates to millions of function calls, I don't know. Assume it does. Then instead of making millions of function calls, you're making billions of function calls. Going from millions to billions is a large step, bigger than just swapping an "m" for a "b" in marketingspeak. So are they planning on passing this performance hit to the legitimate consumer? No thanks, I'll take my Free source code and like it.
  • by warlockgs (593818) on Friday February 20, 2004 @01:44PM (#8341508)
    Would code that was changing itself while running (polymorphic) be nailed by a heuristically-scanning anti-virus program? I would hate to develop something, and then all of a sudden get seriously bad press for releasing what seems to act like a virus. Just food for thought.
  • Great. (Score:5, Insightful)

    by Anonymous Coward on Friday February 20, 2004 @01:47PM (#8341535)
    So legitimate software is going to take on the functionality that virus software has been using for years? And companies are patenting these techniques as if they are somehow new? Virus writers are the true innovators here. They pioneered the infamous Mutation Engine. I would not consider off the shelf software that used those techniques innovative; in fact I find it creepy. Honestly, if the time wasted trying to protect so-called intellectual property was used instead to invent things to simplify our lives, we (as in humanity) would be better off.
  • Top 12 Things A Klingon Programmer Would Say

    12. Specifications are for the weak and timid!

    11. This machine is a piece of GAGH! I need dual processors if I am to do battle with this code!

    10. You cannot really appreciate Dilbert unless you've read it in the original Klingon.

    9. Indentation?! -- I will show you how to indent when I indent your skull!

    8. What is this talk of 'release'? Klingons do not make software 'releases'. Our software 'escapes' leaving a bloody trail of designers and quality assurance people in its wake.

    7. Klingon function calls do not have 'parameters' -- they have 'arguments' -- and they ALWAYS WIN THEM.

    6. Debugging? Klingons do not debug. Our software does not coddle the weak.

    5. I have challenged the entire quality assurance team to a Bat-Leth contest. They will not concern us again.

    4. A TRUE Klingon Warrior does not comment his code!

    3. By filing this SPR you have challenged the honor of my family. Prepare to die!

    2. You question the worthiness of my code? I should kill you where you stand!

    1. Our users will know fear and cower before our software. Ship it! Ship it, and let them flee like the dogs they are!
  • by nicophonica (660859) on Friday February 20, 2004 @01:49PM (#8341572)
    I have worked on a couple of projects where the 'higher ups' (COO, CEO) were obsessed with the value of the intellectual property that their code represented. Woe be to the developer that tried to explain to them that their code was crap, written by a team of programmers obviously just learning VB and trying to write it like a dumbed down version of Java. Most of the programming was developing solutions to straightforward programming problems, which they still implemented in nearly the worst possible way.

    Yet, I have no doubt that if someone came up to them and warned them about the dangers of IP theft and showed them this solution, they would bite.

    If they really wanted to do maximum damage to their competition they should have just released the source code and hoped their competitors tried to use that as guidance.

    There are probably some rare instances when a specialized software technique is developed and you want to keep its implementation specifics secret. I have yet to run into a single instance of this after many years in the industry.

  • Hmm... (Score:5, Informative)

    by arvindn (542080) on Friday February 20, 2004 @01:50PM (#8341578) Homepage Journal
    I wonder if they've seen the proof of the impossibility of obfuscating programs [nec.com]?
  • by kyz (225372) on Friday February 20, 2004 @01:50PM (#8341580) Homepage
    There is nothing new under the sun. These Java and .NET obfuscators are just the same old anti-SoftICE sections, which were just the same old Amiga/Atari copylocks, which were just the same Spectrum/C64 turboloaders, and so on.

    Every single one of these is broken. Almost all good programmers are capable of deciphering the standardised, retail-boxed algorithm used for the obfuscation, and can easily un-obfuscate it. Are all the Java variables named "a"? Diddums! You don't have a Java decompiler with the option to ignore that simple tweak.

    All that matters is:

    1) How important is the code behind the obfuscation?

    2) How much time and effort is the reverse engineer willing to spend?

    If you use a company's retail-box obfuscator, anyone with the "'Brand X obfuscator' deobfuscator v1.0" can get straight at your code. It's a technological arms race, nothing more.
  • I don't love Microsoft, but I think this article makes several claims without backing them up or offering any explanation as to their merits. Such as:

    1. .NET, on the other hand, is Microsoft's chosen successor to Visual BASIC, and effectively exposes source code at the very heart of Microsoft consumer and enterprise applications.
    2. If .NET is Such a Security Nightmare (It Is)...

    And "You can write a program in C# or Visual Basic.NET." while factually accurate, ignores Delphi.NET, C++ managed code using the CLR, and other implementations of the CLR (COBOL, etc).

    I think the basic premise of the article, where if someone is using your objects it is obviously a bad thing/security breach, is flawed. If you need to secure your objects, SECURE them! Seal them, see who is calling you, etc.

    Lastly, as shown by previous posts, obfuscation is not the end-all panacea to security. In my opinion, it's barely a detour. Otherwise, Open Source literally could not be secure.

  • Just one question (Score:5, Insightful)

    by carlmenezes (204187) on Friday February 20, 2004 @01:52PM (#8341593) Homepage
    Seems to me that stuff like this would make it quite difficult to debug once an application has been released - also, how would things like a memory dump on application crash help to debug anything here?
  • by zz99 (742545) on Friday February 20, 2004 @01:53PM (#8341604)
    I have found that most code generation tools (the kind you program bubbles and arrows in, like this one [telelogic.com]) will give you C code that looks like it's been obfuscated on purpose.
    E.g. all states and variables are in an array called n[][] and the program is basically a big loop.

    Quite impossible to know what's going on.
  • by BlueFall (141123) on Friday February 20, 2004 @01:53PM (#8341612)
    It sounds to me like the author of the article is talking about two completely different issues. The first is code decompilation and static obfuscation. The second is about runtime obfuscation.

    In theory, if you don't run the binary you have, you don't need to worry about it modifying itself. The same techniques that work on obfuscated byte code now should work on the binary. Now if you were trying to reverse engineer a program by running it and tracing it, that's where PSCP seems like it would help.
  • by Sporkinum (655143) on Friday February 20, 2004 @01:58PM (#8341667)
    If it changes how it executes every time, it sounds like it would be a fantastic way to introduce unreproducible bugs.

    I'm sure this would make QA testing a nightmare.
  • by El (94934) on Friday February 20, 2004 @02:06PM (#8341746)
    Let's start a software company based on an algorithm that promises to compress any string of bits into a 1 bit smaller string of bits, and thus by multiple invocations can compress any string of bits into a single bit... Then let's see if we can get Cringely to recommend this technology!
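
    For anyone who missed why that pitch cannot work, a small counting sketch in Python. The helper names (bit_strings, find_collision) and the drop_last_bit "compressor" are invented for the example; any other candidate function could be plugged in and a collision would still turn up, since there are 2^n inputs of length n but only 2^(n-1) outputs of length n-1:

```python
from itertools import product


def bit_strings(n):
    """All bit strings of length n, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def find_collision(compress, n):
    """Return two distinct n-bit inputs that compress to the same output, if any."""
    seen = {}
    for s in bit_strings(n):
        out = compress(s)
        if out in seen:
            return seen[out], s
        seen[out] = s
    return None


def drop_last_bit(s):
    """A stand-in 'compressor' that makes every input one bit shorter."""
    return s[:-1]


inputs = bit_strings(3)    # 8 possible inputs
outputs = bit_strings(2)   # only 4 possible outputs
collision = find_collision(drop_last_bit, 3)
```

    Two inputs that share an output cannot both be recovered, so the scheme is lossy before it is even applied a second time.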
  • by Xeger (20906) <slashdotNO@SPAMtracker.xeger.net> on Friday February 20, 2004 @02:07PM (#8341767) Homepage
    Okay: for argument's sake we'll say that PSCP works like a charm to prevent nasty evil crackers from debugging my program effectively.

    We'll ignore the obvious problem presented by the fact that your .NET program's IL instructions are JITted into machine code at runtime, thereby making it pointless to modify the IL -- unless these people are invalidating the JITted code every time the IL code changes (is that even allowed?) or providing a translator between IL and machine code that inserts code-morphing instructions into the OUTPUT machine code (which I seriously doubt).

    We'll ignore the fact that instructions which modify other code are generally very easy to spot -- because they must refer to regions of a program's address space where code resides -- and it should be easy to find these code morphing instructions and turn them into no-ops.

    We'll set both of those tricky issues aside and focus on the crux of the matter. How does this PSCP protect the program *before* it starts running? When the cracker gets his hands on my juicy .NET assembly which is bursting with code, how does PSCP prevent him from taking the assembly apart and dissecting? Answer: PSCP *doesn't* provide any such protection.

    So, the school of crackers that likes to use a debugger to deduce program behavior may find themselves having trouble. But in the worst case, all I need to do is run the morphing code in a debugger, record the location of the program counter at the point in the program's execution in which I'm interested, and then consult the corresponding section of the program code that resides in the original .NET assembly. I may need to hunt back and forth for a little while to find the specific place I'm interested in, but I can't imagine that PSCP is able to change offsets or locations of instructions by very much.

    Think about it: if PSCP wanted to change the location of a jump target, for example, it would need to track down every other instruction in the program that jumped to that instruction, and modify the jump to point to the new location of the jump target.
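
    A rough Python sketch of that bookkeeping, on an invented four-instruction machine (set, jnz, nop, halt). None of this reflects how PSCP or any real obfuscator works; insert_nop and run are hypothetical helpers that only show what goes wrong when an instruction is moved and the jumps are not patched:

```python
# Toy machine: one accumulator, absolute jump targets. Invented for illustration.
def run(program):
    """Execute the program and return the accumulator at `halt`."""
    acc, pc = 0, 0
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "set":
            acc = arg
        elif op == "jnz":       # jump to absolute index `arg` if acc is non-zero
            if acc != 0:
                pc = arg
        elif op == "nop":
            pass
        elif op == "halt":
            return acc
        else:
            raise ValueError(f"unknown instruction {op}")


def insert_nop(program, index):
    """Insert a nop at `index`, retargeting every jump at or past that point."""
    patched = []
    for op, arg in program:
        if op == "jnz" and arg >= index:
            arg += 1            # the target instruction moved down by one slot
        patched.append((op, arg))
    patched.insert(index, ("nop", None))
    return patched


program = [("set", 1), ("jnz", 3), ("set", 7), ("halt", None)]

naive = list(program)
naive.insert(0, ("nop", None))      # shifts code but leaves the jump stale

original_result = run(program)
naive_result = run(naive)           # the stale jump now lands on ("set", 7)
patched_result = run(insert_nop(program, 0))
```

    Even in this toy, every jump has to be visited for a single inserted instruction. Computed or indirect jumps, which this sketch does not model, would make the patching harder still.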
  • by kompiluj (677438) on Friday February 20, 2004 @02:08PM (#8341775)
    Check out:
    Retrologic awarded Java byte code obfuscator (Open Source! and free!) [retrologic.com]
    not free but you can try before you buy [condensity.com]
    ZelixKlassMaster [zelix.com] Yet Another Java Byte Code Obfuscator (YAJBCO)
    But I'm not sure they really work - they just provide a level of security similar to classical machine code. Btw. the MyDoom virus was BurnEye encrypted - so what?
  • Yawn (Score:4, Insightful)

    by ENOENT (25325) on Friday February 20, 2004 @02:12PM (#8341817) Homepage Journal
    WHOOOO CAAAARRRES???

    Yeah, users demand that their executables should change randomly at runtime. I'm sure that there can never be any bugs introduced by this process. Applications won't randomly crash for no reason...

    Oh, wait. I guess this is MSFT. They wouldn't care about random crashes, data corruption, security holes, or any of that boring stuff.

  • by pimpinmonk (238443) on Friday February 20, 2004 @02:13PM (#8341823) Homepage
    In Soviet Russia, the code modifies YOU!

    Imagine the ramifications of that statement. Actually it's kind of true--my increasingly bad sleep patterns and worsening ability to attract women are probably direct results of coding! But hey, at least I can't get reverse-engineered (that sounds like sodomy, so I think it's a Very Good Thing(TM))
  • Cloakware (Score:4, Informative)

    by roady (30728) on Friday February 20, 2004 @02:14PM (#8341839)
    Cloakware [cloakware.com] also has some nice obfuscation technologies [cloakware.com]
  • by dr2chase (653338) on Friday February 20, 2004 @02:24PM (#8341995) Homepage
    Speaking as a former bytecode-to-native compiler writer, I can assure you that if someone writes a compiler from Your Favorite Intermediate Language (YFIL) to native code, then someone can crack it. Every (stupid) obfuscator trick out there, the compiler has to tolerate in its quest for verifiable, compilable, optimizable code.

    Examples of Stupid Obfuscator Tricks include:

    • Scrambling exception ranges so they don't nest.
    • Inserting non-structured GOTOs
    • Inserting never-executed exits from synchronized blocks
    There are others, these are just the ones that I recall. A compiler (static or JIT, it does not matter) must deal with all of these.

    There are two outs that I know of. One is to only use interpreted code and morph it on the fly (still seems vulnerable to an observant interpreter, but perhaps the amount of necessary observations can be made extravagantly large), the other is to require use of a "trusted" compiler (which, in turn, requires use of a "trusted" OS to prevent substitution of an untrusted compiler, which in turn requires "trusted" hardware to prevent substitution of an untrusted OS).

  • by fatgeekuk (730791) on Friday February 20, 2004 @02:29PM (#8342085) Journal
    Ok, you are a member of a development department responsible for applications that need to be certified and tightly controlled. (maybe big pharma)

    The FDA come a knocking and start asking about the checks in place that ensure that the code that you write and document is the code that actually gets performed.

    FDA Auditor: So, this code specified in this document. Can you please show me how you ensure that this code is actually performed when you run the program here that you say is the one that this document references.

    IT Guy: Sorry, can't.

    FDA Auditor: Why not?

    IT Guy: because the code that gets run is different every time it is run, and indeed during a single run it changes.

    FDA Auditor: So, what you're saying is that you cannot guarantee that the application specified in all these documents is the application code that actually runs.

    IT Guy: Yep, that's about it...

    Oh, now at this point in the discussion it gets serious.

    Who on this list actually thinks that dynamic code obfuscation like they propose is actually worth a damn?

    What happens when this mutating mess gets it wrong?

    Who is to blame?

    Come on now, this is stupid, this is the worst form of pandering to corporate paranoia.

    This is true snakeoil.

    These are all just Turing machines.
  • by tqbf (59350) on Friday February 20, 2004 @02:35PM (#8342180) Homepage

    Like Ultra-Wide-Band networking and enterprise XML integration, this column fits a Cringely mold of writing an entire article about the business plan of one small company most people haven't heard of, and passing it off as an important insight about the IT industry as a whole. It works for the most part because there are a lot of neat-sounding business plans out there. Every start-up company in the world has a story about how their vision, fully realized, would shake up the entire industry. It makes for great column-fodder, but provides poor analyses.

    If you read the whole column here twice, you immediately become aware of the fact that Cringely's entire "argument" turns on the idea that security rests on keeping source code secret. Because "interpreted" code "always" discloses code secrets, "interpreted" platforms like .NET will require the intellectual property wrapped up in schemes like PSCP. Therefore, the "inventors" of PSCP hold an important position on the chess-board of the entire IT industry. Microsoft and Sun will launch bidding wars to ensure they control the PSCP IP.

    Of course this is just crazy-talk. Just for a moment, leave aside the argument that something like PSCP can really prevent reverse engineering. In the post-PSCP world, all security rests in a distributed repository of millions of lines of source code "locked up" in an organization that spans 45 buildings and untold tens of thousands of people in Redmond. You can't keep source code secret. Closed source is a speed-bump to dedicated attackers, who will break into networks, find corrupt insiders, or even get janitor temp positions in order to get the code.

    Nobody working in security seriously believes that the source code for Windows 2000 wasn't floating around the computer underground years before the most recent disclosure. 'Twas ever thus: most of the SunOS and Solaris exploits that powered attackers in the mid-90's were derived from stolen Sun source code. Stolen source trees have always been the most stable currency in the computer underground for exactly that reason. What you do with the compiled product of that code makes no difference if the blueprints are already in enemy hands.

    I'm not sure it's even worth confronting Cringely's argument (that PSCP is a strategic technology that is crucial to .NET security) head-on, but I think I can make a decent response simply by evoking video game copy protection. Companies went through all sorts of contortions to devise copy-protection schemes. Kids with the Microsoft Macro Assembler bible thwarted them, because, just like in the DRM/Media battle, when you control the entire player architecture, it is impossible to completely secure the content. Regardless of whether PSCP makes it harder to grep out the cookie cutter exploit from the .NET IR, the payoff in the "battle" between code-obfuscation and exploit generation is much higher than the payoff to defeat copy protection, and nobody has ever won the copy protection battle.

    Cringely is right every once in a while (business plans occasionally do pan out!), like with Eolas and Burst. I normally wouldn't care enough to comment, but this time he's inadvertently promoting a damaging and popular misconception in his article.

  • by Tarwn (458323) on Friday February 20, 2004 @02:41PM (#8342259) Homepage
    I found myself less informed after reading that article, less intelligent perhaps as well.

    First of all, anyone that intends to write an article about a "new" software engineering theory or theoretical application needs to make sure they not only understand what they are talking about, but they also choose to collect quotes from people who know what they are talking about.
    Here's a hint, if the person says "leverages" in a serious tone of voice they are either a sales-person or only received information from the sales team.

    Now, beyond the other comments I could add, such as bad definitions of the framework, and the author's inability to name more than 2 examples of languages available to interact with that framework, there seems to be a large problem with the research content. There isn't any.

    I could likely spend 20 to 30 minutes researching background information on the internet and still have a more solid article, simply because I would have real information.

    The information provided in this article appears to be the result of carefully skimming sales brochures. There is no real information on the processes involved, reverse engineering, or numbers involved in terms of performance.
    We find out that there are "...billions of paths..." but this is just marketing talk, obvious for its lack of detail. Reverse engineering is detailed as something used by hackers (in the newer, negative sense) to find holes in code. There is no mention of the other side, i.e. reverse engineering old software when the original developers are not available and no one felt documentation or up-to-date source code was necessary, among many other valid and legal reasons for reverse engineering. There is a brief comment about the extra resource usage, but it is considered negligible (in comparison to...?) and in fact this process is also mentioned as having no negative impact. TANSTAAFL.

    All in all this sounds like something that will be overhyped, overused, and in the end more of a pain than anything else. Clueless managers everywhere will demand all of the code use this new and impervious format when there are many easier ways to prevent security loss without the so far unknown problems with this new method (not to mention security holes in the obfuscation method itself).

    Now when people try to reverse engineer code to look for security holes they won't find them because the holes were swept under the carpet. I may stand up for MS more often than deride them, but the kindest way I can say this is that this new method of obscurity is a little less than bright. Just as I wouldn't use anything beta from MS, you can bet that I won't be using this technology either. I prefer solid code, testing, and a solid license. By the time they have finished reverse engineering version 1, the next version will be underway, leaving them just as far behind as before.
  • Snake Oil (Score:3, Interesting)

    by roman_mir (125474) on Friday February 20, 2004 @02:52PM (#8342411) Homepage Journal
    What a load of crock, this is definitely aimed at the CIOs and the so called CTOs of some large corporate IT divisions and it for sure will be another big buzz in the next round of 'architectural' meetings. And some of the so-called architects (I have a few in mind) will without doubt push this shit forward as another excuse for taking up space in their positions, when in fact they are totally useless and constantly steal other people's ideas.....

    sorry, I am bitter :)

    On the other hand byte code obfuscators will not stop anyone who wants to disassemble. I remember about 9 or 8 years ago I was disassembling a simple DOS com file (anyone remember inclogo.com, with an INCLOGO word being printed in a 320x200 graphics mode with some simple 256 byte coloring and shading that changed in a loop? It looked cool, so I wanted to find out how it worked.) Couldn't figure out the machine code for some reason, so loaded it into the Turbo Debugger :) good old days, and followed the trace. Well, duh, it was self-modifying. In a simple loop it would modify the entire execution tree by XORing every byte with the next byte. All of a sudden a portion of the code that was obfuscated was readable machine code again. So all I had to do was write a C program that read the com file and spat out the deobfuscated version.
    Ta da!

    done.
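
    A sketch of what such a deobfuscator could look like, in Python rather than C. The exact scheme inclogo.com used is not given above, so this assumes one plausible reading: the decoder walks forward and XORs each byte with the still-encoded byte after it, leaving the last byte untouched. The function names and the sample bytes are made up for the example:

```python
def deobfuscate(data: bytes) -> bytes:
    """Forward pass: XOR each byte with the (still encoded) byte that follows it."""
    out = bytearray(data)
    for i in range(len(out) - 1):
        out[i] ^= out[i + 1]    # out[i + 1] has not been decoded yet
    return bytes(out)


def obfuscate(data: bytes) -> bytes:
    """Inverse of deobfuscate: work backwards from the unchanged last byte."""
    out = bytearray(data)
    for i in range(len(out) - 2, -1, -1):
        out[i] ^= out[i + 1]    # out[i + 1] is already encoded
    return bytes(out)


# Sample payload: the x86 bytes for "mov ax, 13h / int 10h" (set VGA mode 13h).
plain = bytes([0xB8, 0x13, 0x00, 0xCD, 0x10])
hidden = obfuscate(plain)
restored = deobfuscate(hidden)
```

    Whatever the real transform was, the program has to carry its own decoder, so copying that loop into a standalone tool is all the "cracking" required.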

    How is this going to be any different? Code cannot be obscured to the hardware and a cracker works at the hardware level.

    Doesn't this sound to you like the commercials for the 'new and improved 1024 bit encryption' that some time ago was put out by one Israeli company (there was an article on /. about this, anyone remember?)

    This is snake oil.
  • by Bugmaster (227959) on Friday February 20, 2004 @03:02PM (#8342582) Homepage
    I agree that this real-time code-mutating thingy is pretty cool. But... why? You take a moderate-to-major performance hit (all that code won't morph itself), and for what? To stop people from reverse-engineering your program? Why not just write it in a secure manner to begin with, so that reverse-engineering is not a threat? It works for OpenSSH, after all, it can work for anyone else too.
  • by SmurfButcher Bob (313810) on Friday February 20, 2004 @03:24PM (#8342952) Journal
    Yep, harkens back to the failures of the old Apple ][ era.

    Self modifying code did little more than provide an extra 30 minutes of amusement.

    It didn't stop any of us back then, it sure as hell won't stop anyone now. Apparently, these idiots have never heard of things like Soft-ICE.

    Reverse engineering isn't hard, it's just tedious without the source. OTOH, we've been doing it for decades without source... it's only recently that we've had the luxury of (sometimes) having it. Regardless, these boneheads seem to confuse "reverse engineering" with "decompiling" - the two have nothing to do with each other.

    "Changes variable names"... rofl, that's really gonna screw up DEBUG, isn't it...
  • by dpbsmith (263124) on Friday February 20, 2004 @03:45PM (#8343206) Homepage
    ...that obfuscator had better be completely bug-free.

    Just suppose that every once in a while the obfuscated version of the code just isn't exactly 100% functionally equivalent to all the others.

    How are you ever going to debug that?

    It's far worse than a bug in a compiler optimizer.

    Worse yet, this could even be used to attack competitors. Let's say the obfuscator has the ability to distinguish code from different vendors in some way... (well, for example, let's suppose the code is signed). It could subtly sabotage the products of certain vendors so that they seemed to be buggy or unreliable... and the victim would never know what had happened or have any way of knowing what had happened (assuming the victim could not reverse the obfuscation).
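
    To make the equivalence worry concrete, here is a small differential-testing sketch in Python. Everything in it is hypothetical: `original` stands in for the unobfuscated routine, `variant_ok` and `variant_buggy` stand in for two outputs of an obfuscator, and the input domain is kept tiny so it can be checked exhaustively.

```python
def original(x: int) -> int:
    """Reference routine (made up for illustration)."""
    return (x * 3 + 7) % 256


def variant_ok(x: int) -> int:
    """A rewritten form that computes the same thing for 0 <= x < 256."""
    return ((x << 1) + x + 7) & 0xFF


def variant_buggy(x: int) -> int:
    """A rewritten form that is wrong for exactly one input."""
    if x == 200:
        return (x * 3 + 8) % 256  # off-by-one slipped in by the rewriter
    return ((x << 1) + x + 7) & 0xFF


def find_mismatches(reference, candidate, inputs):
    """Return every input on which the candidate disagrees with the reference."""
    return [x for x in inputs if reference(x) != candidate(x)]


if __name__ == "__main__":
    print(find_mismatches(original, variant_ok, range(256)))
    print(find_mismatches(original, variant_buggy, range(256)))
```

    This only works when the input space is small enough to enumerate and you still hold the reference. With real programs, and with a fresh variant generated on every run, neither is true, which is exactly the debugging problem described above.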

  • by alispguru (72689) <bane AT gst DOT com> on Friday February 20, 2004 @04:06PM (#8343480) Journal
    The original article quotes the code-morphing guy as saying:

    And the increase in processing overhead is trivial. PSCP, if done right, costs almost nothing.

    I don't believe it. This stuff can't cost "almost nothing" if it works with threads. If you have multiple paths of execution running through the same code, and the code is being dynamically morphed as the threads run, then either:

    The morpher is fully thread-aware, to keep morph operations for thread A from pulling the rug out from under thread B (or C, D, ...). This implies extra semaphores, locking, unlocking, and the overhead of handling them.

    The morpher is not fully thread-aware, and every so often the morpher for one thread will clobber another thread.

    Am I missing something here?
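
    A toy Python sketch of the first option, to show where the cost comes from. Nothing here is taken from PSCP: `MorphingRoutine` and `run_demo` are invented names, and the "code" being morphed is just an operation plus an operand that get rewritten in two separate steps.

```python
import threading


class MorphingRoutine:
    """Computes x + 10 through one of two equivalent encodings."""

    def __init__(self):
        self._lock = threading.Lock()
        self._op = "add"
        self._operand = 10

    def morph(self):
        # The rewrite touches two fields; the lock keeps callers from
        # seeing the new op paired with the old operand.
        with self._lock:
            if self._op == "add":
                self._op = "sub"
                self._operand = -10
            else:
                self._op = "add"
                self._operand = 10

    def call(self, x):
        # Every call pays for the same lock, morphing or not.
        with self._lock:
            if self._op == "add":
                return x + self._operand
            return x - self._operand


def run_demo(calls=2000, morphs=2000):
    """Run one worker thread against one morpher thread and collect results."""
    routine = MorphingRoutine()
    results = []

    def worker():
        for i in range(calls):
            results.append(routine.call(i))

    def morpher():
        for _ in range(morphs):
            routine.morph()

    threads = [threading.Thread(target=worker), threading.Thread(target=morpher)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    out = run_demo()
    assert out == [i + 10 for i in range(2000)]
```

    Drop the lock from `call` and you are in the second option: a caller can occasionally see a half-rewritten routine. Keep it and every call pays the synchronization cost, which is hard to square with "almost nothing".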

  • Huh? (Score:4, Insightful)

    by JacobO (41895) on Friday February 20, 2004 @06:05PM (#8345071)
    I can't help but feel like there's something I should already know (but don't) when reading Cringely's material. The articles that I have just read (linked and related) seem to go into some detail about a topic (obfuscation, interpreters, high tech secrets) but then without any good reason he expects us to believe that we are somehow "vulnerable" because some module of code can be reverse engineered. Perhaps we are to believe that because of .NET we are all going to have our secrets stolen.

    The result is that nearly every emerging Microsoft product is vulnerable, including the OS itself

    Now, it seems to me that the only conclusion being drawn is that my OS is vulnerable because someone can reverse engineer its code, as if understanding it makes it less secure. Is Linux any less secure than Windows because everyone has access to its source code? Isn't this really an issue for people who "need" to keep their source code from prying eyes so their IP is not stolen?

    This one is quite confounding:

    Microsoft is absolutely committed to .NET, yet .NET as it stands today is very vulnerable to security lapses

    What is a "security lapse" and why does lack of good obfuscation tools allow it? Am I vulnerable without tried and trusted security through obscurity?

    Looking further back at the article on .NET from November 8, 2001, there is an interesting theory on how .NET is Microsoft's way of tracking all "calls" through "Windows' communication system" (whatever that is) to record any use of non-MS services so the third-party provider can be summarily squished.

    Watch out everybody, the black helicopters are circling overhead.

