Morphing Code to Prevent Reverse Engineering?

ptolemu writes "Cringely's latest article discusses a new obfuscation technique currently being researched called PSCP (Program State Code Protection). An informative read that concludes with some interesting insight on the software giants that heavily depend on this kind of technology."
This discussion has been archived. No new comments can be posted.

  • by pc-0x90 ( 547757 ) on Friday February 20, 2004 @02:48PM (#8341551)
    Java (and subsequently .Net) bytecode made a reverse engineer's life a bit easier on the whole, because it could be decompiled into source that was extremely similar to the original. All this seems likely to do is remove that benefit and force the reverse engineer to approach it the same old way one would approach a compiled C program (as you described, with a debugger and hooks on syscalls), or bust out a new type of disassembler to emulate traces and dump that to an assembly listing. But you're right, it's not really that mind-blowing if the reverse engineer has worked on non-Java/non-.Net binaries before.
  • Hmm... (Score:5, Informative)

    by arvindn ( 542080 ) on Friday February 20, 2004 @02:50PM (#8341578) Homepage Journal
    I wonder if they've seen the proof of the impossibility of obfuscating programs [nec.com]?
  • Re:Virtual Machine? (Score:1, Informative)

    by Anonymous Coward on Friday February 20, 2004 @02:51PM (#8341584)
    There's already an excellent virtual machine debugger used for exactly this purpose by a few crackers.

    Self-modifying code is ENTIRELY obsolete. Has been for ten years. Sorry.
  • by zz99 ( 742545 ) on Friday February 20, 2004 @02:53PM (#8341604)
    I have found that most code generation tools (the kind you program bubbles and arrows in, like this one [telelogic.com]) will give you C code that looks like it's been obfuscated on purpose.
    E.g. all states and variables are in an array called n[][] and the program is basically a big loop.

    Quite impossible to know what's going on.
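For illustration, here is a minimal sketch of the shape such generated code tends to take, transliterated to Python. The table `n`, its column layout, and the three-state example are all hypothetical and are not the output of any real tool.

```python
# Hypothetical transition table in the style of generated code:
# n[state] = [next_state_on_tick, next_state_on_reset, output_code]
n = [
    [1, 0, 10],
    [2, 0, 20],
    [0, 0, 30],
]

def run(events):
    """Drive the table with a list of events (0 = tick, 1 = reset)."""
    s = 0
    out = []
    for e in events:          # the "one big loop"
        out.append(n[s][2])   # emit the output code for the current state
        s = n[s][e]           # jump to whatever the table says comes next
    return out, s
```

Nothing in the loop says what states 0, 1 and 2 mean; that knowledge lived in the diagram, which is why the generated source reads as if it were obfuscated.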
  • Re:Wow (Score:1, Informative)

    by Anonymous Coward on Friday February 20, 2004 @02:58PM (#8341670)
    These abstract machines have machine code that reveals *MORE* information to a disassembler/reverse engineer than, say, x86 or PPC assembly, but it is still far, far from being code.

    Have you ever seen decompiled java bytecode? It's almost indistinguishable from the original source code. The problem with x86 assembly is that each instruction doesn't map 1:1 with the source. With java bytecode, it *is* close enough to a 1:1 mapping that perfectly legible code can be produced from a regular class file.
  • Re:easy to do (Score:2, Informative)

    by DarkAurora ( 324657 ) on Friday February 20, 2004 @03:05PM (#8341743)
    Um... yeah [winehq.org]
  • by nacturation ( 646836 ) <nacturation AT gmail DOT com> on Friday February 20, 2004 @03:06PM (#8341752) Journal
    Once the virus writers get hold of this, viruses will be much harder to catch, unless anti-virus writers start looking more for virus-like activity.

    Of course, virus writers have been using this since the early 1990s. One particular virus called Ontario III [nai.com] (there might be others before it) used this trick. An interesting part from the virus writeup: "The Ontario III virus uses a very complex form of encryption with no more than two bytes remaining constant in replicated samples."
  • by BiggsTheCat ( 460227 ) on Friday February 20, 2004 @03:06PM (#8341753)
    Deobfuscation is in NP. That is, for any type of obfuscation, there is a method to reduce it to a deobfuscated copy. It may take more than polynomial time, but it can be done.

    Read the paper [nec.com] if ya don't believe me.
  • by kompiluj ( 677438 ) on Friday February 20, 2004 @03:08PM (#8341775)
    Check out:
    Retrologic awarded Java byte code obfuscator (Open Source! and free!) [retrologic.com]
    not free but you can try before you buy [condensity.com]
    ZelixKlassMaster [zelix.com] Yet Another Java Byte Code Obfuscator (YAJBCO)
    But I'm not sure they really work - they just provide a level of security similar to classical machine code. Btw. the MyDoom virus was BurnEye encrypted - so what?
  • Re:Wow (Score:5, Informative)

    by edwdig ( 47888 ) on Friday February 20, 2004 @03:08PM (#8341780)
    Having worked with Java bytecodes when I took compilers, I will say that you can get really close to the original program by looking at the bytecodes. You can't tell if someone used a while loop or a for loop, but you can still reconstruct the loop from the code.

    The Java Virtual Machine is a stack machine - there are no CPU registers. There's a separate memory store for local variables. That tends to make it easy to tell exactly what data is being operated on at any given time.

    I've seen Java decompilers that return very clear, readable code.
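To make the stack-machine point concrete, here is a toy interpreter sketched in Python. It is not the JVM: the opcode names are borrowed from JVM mnemonics for flavor only, and the `execute` function and `program` listing are hypothetical.

```python
def execute(code, local_vars):
    """Interpret a tiny stack-machine program.

    code: list of (opcode, operand) pairs; local_vars: list of ints.
    """
    stack = []
    for op, arg in code:
        if op == "iload":        # push local variable number arg
            stack.append(local_vars[arg])
        elif op == "iconst":     # push the constant arg
            stack.append(arg)
        elif op == "iadd":       # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "imul":       # pop two values, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "istore":     # pop into local variable number arg
            local_vars[arg] = stack.pop()
        else:
            raise ValueError("unknown opcode: " + op)
    return local_vars

# Roughly what "z = (x + y) * 2" could compile to, with x, y, z in slots 0, 1, 2.
program = [
    ("iload", 0), ("iload", 1), ("iadd", None),
    ("iconst", 2), ("imul", None), ("istore", 2),
]
```

Every instruction names the slot or constant it touches, so reading the listing back into an expression is mostly mechanical. With a register machine you would first have to work out which register held which variable at each point.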
  • by Anonymous Coward on Friday February 20, 2004 @03:11PM (#8341804)
    the question is, why was such data allowed to be controlled by the client in the first place?

    Because Quake is old, and when it was released, most players were still on 28.8 modem connections. In other words: a lot of stuff was left on the client for the purpose of saving bandwidth.

    The Quake 1 source wasn't released until a couple years later. So there was no reason to obfuscate it, since it wasn't out there* (yeah don't get me going on the history of id Software).

    * wasn't _supposed_ to be out there. There was a copy of at least part of the Quake 1 source floating around the warez channels for a while, courtesy of crack dot com.
  • Cloakware (Score:4, Informative)

    by roady ( 30728 ) on Friday February 20, 2004 @03:14PM (#8341839)
    Cloakware [cloakware.com] also has some nice obfuscation technologies [cloakware.com]
  • Re:Wow (Score:1, Informative)

    by Anonymous Coward on Friday February 20, 2004 @04:11PM (#8342737)
    I actually did this to write code to plug in to a commercial Java application. The documentation for writing plugin modules was so poorly written it would have been impossible without decompiling some of their existing modules.

    Although I'm no expert at Java bytecodes, I didn't have any problems, and the only tools I used came with the Sun JDK.

    On the other hand, some other code we got from them was put through a C obfuscator and it was almost impossible to reverse engineer. I gave up. Of course now that they provide unobfuscated code I'm able to make improvements to it for our project.
  • Re:Wow (Score:2, Informative)

    by pjt33 ( 739471 ) on Friday February 20, 2004 @04:28PM (#8343011)
    You can't tell if someone used a while loop or a for loop

    Depends. Some versions of javac vary the position of the test (start or end of the loop) according to the loop construct.

  • by graxrmelg ( 71438 ) on Friday February 20, 2004 @04:49PM (#8343249)
    The GPL says "The source code for a work means the preferred form of the work for making modifications to it." Some obfuscated derivative of the source code doesn't count.
  • by cxvx ( 525894 ) on Friday February 20, 2004 @05:40PM (#8343962) Homepage
    Thirdly, Java GC is not really "conservative". An object is eligible for collection when (and only when) the reference count is zero.

    Minor nitpick: that is not completely correct.
    An object can still have references to it and still be eligible for gc'ing. That is what Weak- and SoftReferences are used for.

    From the API docs [sun.com]:

    Soft reference objects, which are cleared at the discretion of the garbage collector in response to memory demand. Soft references are most often used to implement memory-sensitive caches.

    Suppose that the garbage collector determines at a certain point in time that an object is softly reachable. At that time it may choose to clear atomically all soft references to that object and all soft references to any other softly-reachable objects from which that object is reachable through a chain of strong references. At the same time or at some later time it will enqueue those newly-cleared soft references that are registered with reference queues.
  • Re:do what i do (Score:3, Informative)

    by NoOneInParticular ( 221808 ) on Friday February 20, 2004 @06:28PM (#8344654)
    Why on earth would you like to search for "i"? It's there, it's local scope, and usually only valid inside loops. Do:

    for (int i = 0; ...;...)
    do_stuff
    // i is not valid here
    for (long i =...;...;...)
    do_stuff

    Why would you need to search for "i" in code set up in this way?

  • by j.leidner ( 642936 ) <leidnerNO@SPAMacm.org> on Friday February 20, 2004 @06:33PM (#8344709) Homepage Journal
    "PSCP" is not new at all. Some Commodore C= 64 games back in 1982 were delivered in an encrypted form. Only a short window around the program counter's (PC) current position would be decrypted on demand by an interrupt-driven procedure.

    Empirical study shows that such protection mechanisms are very weak.
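A minimal sketch of the windowed-decryption idea, and of one reason it tends to be weak, assuming a single-byte XOR cipher and a window that starts at the program counter. `KEY`, `WINDOW`, and all function names are hypothetical; the original C64 schemes are not documented in this thread.

```python
KEY = 0x5A          # hypothetical single-byte XOR key
WINDOW = 4          # number of bytes decrypted at the "program counter"

def encrypt(plain):
    """Produce the encrypted image that would ship on disk."""
    return bytes(b ^ KEY for b in plain)

def decrypt_window(image, pc):
    """Return only the plaintext bytes in the window starting at pc."""
    return bytes(b ^ KEY for b in image[pc:pc + WINDOW])

def dump_by_tracing(image):
    """What an attacker can do: step through and record every window."""
    recovered = bytearray()
    for pc in range(0, len(image), WINDOW):
        recovered += decrypt_window(image, pc)
    return bytes(recovered)
```

The program has to decrypt itself in order to run, so anyone who can observe execution (a debugger, an emulator) can collect the windows one at a time and stitch the plaintext back together.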

  • by IBitOBear ( 410965 ) on Friday February 20, 2004 @08:32PM (#8345804) Homepage Journal
    Most game exploits could be stopped outright if every so often the well-known memory maps of the active data sets were MD5(ed) and transmitted to the server. As the hit-points and unit-statuses (like the unkillable peon hack for Starcraft) are well-understood by the server, the faults can be easily detected and removed.

    Remember that most game hacks involve an exterior program that twiddles the in-game parameters after the session is up and running. If the changes were treated as a proper database update journal then things are easy. As the server and the client "play their journal" out at one another a "checksum" operation can be requested and the two memory maps had better match. The errors don't have to be "corrected" after all, they just need to be punished.
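The journal-plus-checksum idea can be sketched as follows. This is an illustration under assumptions, not any game's actual protocol: the state layout (a dict of units, each a dict of integer fields), the journal entry format, and all function names are hypothetical. `hashlib.md5` is the standard-library MD5.

```python
import hashlib

def apply_journal(state, journal):
    """Replay (unit, field, delta) entries onto a dict-of-dicts game state."""
    for unit, field, delta in journal:
        state[unit][field] += delta
    return state

def state_digest(state):
    """MD5 over a canonical (sorted) rendering of the state."""
    canonical = repr(sorted((u, sorted(f.items())) for u, f in state.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

def client_is_consistent(server_state, client_digest):
    """Compare the client's reported digest with the server's own copy."""
    return state_digest(server_state) == client_digest
```

Both sides replay the same journal, and the server asks for a digest at a point of its choosing. A client whose memory was patched from outside will report a digest that does not match, unless the cheat also keeps a clean shadow copy of the state to hash.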

    This isn't un-crackable, but it is un-crackable in pseudo-realtime. The theoretical cracker would have to have, essentially, a second game engine running to maintain "the image that ought to be there" along with the engine of the real game. Then there would have to be a reconciler of some sort. At a minimum the machine doing the hack would have to be at least three times (yes, oversimplified math 8-) as powerful as the gaming experience requires. (That is, if the hacker wants to watch untextured wireframes "kill each other" at 4 frames a second... he could probably devise a cheat. 8-)

    Even so, as the server side applies the remote journal, some very simple integer checks can be made (c.f. if ((StartingHP + RepairHP) Turns) then EjectCheater(); if ((Pedometer / Turns) > MaxSpeed) then EjectCheater();).
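A sketch of what such server-side plausibility checks might look like. The parameter names echo the comment, but the hit-point rule used here (a unit cannot hold more hit points than it started with plus what it repaired) is only one guess at the kind of bound intended, since the first condition did not survive intact in the text.

```python
def is_plausible(starting_hp, repair_hp, reported_hp, pedometer, turns, max_speed):
    """Return False if the reported numbers could not have been reached fairly.

    All parameter names are hypothetical. The HP bound is an assumed rule,
    not a reconstruction of the original check.
    """
    # A unit should never report more HP than it started with plus repairs.
    if reported_hp > starting_hp + repair_hp:
        return False
    # Average distance per turn should never exceed the unit's top speed.
    if turns > 0 and pedometer / turns > max_speed:
        return False
    return True
```

A real server would need tolerance for lag and lost packets on top of this, so the thresholds would be looser than the exact bounds shown.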

    Online game hacks almost invariably exploit the kinds of design errors that come from hiring programmers who have only ever programmed games. Simple distributed data integrity checks (and a suspicious mind, and an understanding of why windows programs are never secure) could pretty much cut them down to nil.

    (And before anybody starts narfing, I fully understand that, what with the distributed processing model, the above math would need "fudge factors" and some adaptability. These, too, are techniques that are well understood by people who work with distributed processing and data collection and synchronization tasks. Lossy environment and everything. This also wouldn't involve any real CPU hardship if designed correctly. Compared to the time to compute and render a frame, doing an MD5 over the domain of core data every few seconds isn't that hard to schedule. And it wouldn't necessarily have to be even as strong as an MD5. But gawd people, these games aren't even doing a data domain XOR... They don't get to cry over it when people do an exterior memory image patch hack. It's like leaving your car running with the doors open in Flatbush and then whining when it gets stolen. 8-)
  • Re:Hmm... (Score:2, Informative)

    by Yumpee ( 32901 ) <terabaap@yumpee.org> on Friday February 20, 2004 @09:06PM (#8346116) Homepage
    Proof that there exist some functions with unobfuscatable properties (for some definition of unobfuscability) need not imply that practical obfuscation is not possible.

    Christian Collberg [arizona.edu] has done some very interesting work on obfuscating programs at a high-level by densely intertwining their control flow and data accesses with a parallel heap-pointer-intensive computation. Sort of like a separate thread, with the key point that lots of dynamically allocated memory must be used to defeat analysis. This both obfuscates the original program and also helps in tamper-proofing (the original program can be modified by the compiler to rely on the values computed by this alternate thread).

    Separating the main program from the inserted "thread" is much harder than checking and skipping some branch instructions in a decompiler or SoftICE. Static pointer analysis is an NP-hard problem for compilers, which makes de-obfuscation of this kind probably not practical.

    Y.
  • Funny? Redundant! (Score:3, Informative)

    by DarkHelmet ( 120004 ) * <mark AT seventhcycle DOT net> on Saturday February 21, 2004 @02:16AM (#8347632) Homepage
    http://polls.slashdot.org/comments.pl?sid=95819&cid=8208868 [slashdot.org]

    Give credit where credit is due. Granted, for all I know the one I linked is ripped too, but still...

    Time to filter out the new redundant / trolls, relevant or not.
