Programming IT Technology

Morphing Code to Prevent Reverse Engineering?

ptolemu writes "Cringely's latest article discusses a new obfuscation technique currently being researched called PSCP (Program State Code Protection). An informative read that concludes with some interesting insight on the software giants that heavily depend on this kind of technology."
This discussion has been archived. No new comments can be posted.

  • by tcopeland ( 32225 ) * <tom&thomasleecopeland,com> on Friday February 20, 2004 @02:32PM (#8341349) Homepage
    I've done mostly server-side work where:

    - the jar files were secure because they were on the server and
    - bytecode optimization and jar size was the least of our problems

    Obfuscation seems to be useful only for client-side Java applications that contain super-secret valuable algorithms. I mean, who cares if somebody decompiles your code to see how you did sortable JTables or whatever?
  • by geoffspear ( 692508 ) * on Friday February 20, 2004 @02:33PM (#8341352) Homepage
    It's not the ability to reverse engineer code that creates security problems; if it were, open source code, which you don't even need to reverse engineer, would be much less secure. The problem is just badly written code.

    This technique might be interesting for stopping people from stealing your closed source code, but as far as security goes it's pretty much worthless. 99% of the vulnerabilities in MS's code were found before their code was leaked, and if you believe them, even the major exploit found after it was leaked had more to do with bad code than someone finding the existing problem by reading the code.

  • Won't work (Score:4, Insightful)

    by sosume ( 680416 ) on Friday February 20, 2004 @02:34PM (#8341365) Journal
    It just won't work. Any code that can be run can be reverse engineered. So-called sophisticated coding techniques only lead to unreadable code.
  • zzzzzzz (Score:3, Insightful)

    by SparafucileMan ( 544171 ) on Friday February 20, 2004 @02:36PM (#8341391)
    *shrug* You still have control over the computer. Just load something of your own making before your OS loads the obfuscator. Interrupt 13, anyone?
  • by bc90021 ( 43730 ) * <`bc90021' `at' `bc90021.net'> on Friday February 20, 2004 @02:38PM (#8341408) Homepage
    The problem with Microsoft's code being readable is that there are only Microsoft people reading it. Half the time they wouldn't see the forest for the trees (since they are so involved with it all the time anyway), and the other half they would miss things that other people might pick up.

    With Open Source, *everyone* gets to look at the code, so there are many eyes, and the bugs get shallower.
  • It's ironic (Score:5, Insightful)

    by Dukael_Mikakis ( 686324 ) <andrewfoerster AT gmail DOT com> on Friday February 20, 2004 @02:38PM (#8341409)
    The medical profession deals with viruses by identifying our weaknesses, and exposing them to the viruses (the ultimate "reverse engineering"?). If there were a biological DMCA, developing vaccines would certainly violate it on the illegality of "hacking into the body".

    With software, though, people still insist on trying to hide and pretend as if there were no viruses out there and that we would be impervious to them.

    Can we finally just open all of our code so we can vaccinate it against all these exploits?
  • by mveloso ( 325617 ) on Friday February 20, 2004 @02:38PM (#8341418)
    This looks vaguely like self-modifying code, like back in the old days of copy protection.

    The thing I don't understand about the article (and how it describes the PSCP process) is this: how will this make reverse engineering more difficult?

    When you're starting to crack something, you work backwards from system calls, library calls, and known behaviors. "Known behaviors" are, well, patterns of code that people (or compilers) use to do things. Anyone good at low-level stuff can probably identify the compiler used to build the code. Likewise, if you think about something enough, you can probably figure out three or four ways to do something, and look for that pattern in the code.

    PSCP prevents this...how? By making this process happen as the program runs? How else do you reverse engineer something?

    Anyway, it sounds like this thing sits right before the .NET runtime engine (or maybe it's loaded and spews bytecode to the runtime). If so, it can be removed...or the output intercepted.

    What am I not getting here?
  • by Speare ( 84249 ) on Friday February 20, 2004 @02:39PM (#8341422) Homepage Journal

    Just like all the hubbub over proprietary signal encryption to "protect" digital audio streams, all you need here would be the CPU-equivalent of the old Analog Out jack.

    Break it down to the Universal Turing Machine and tape analogy. The program code is the tape, and the state of the machine is in the tape-executing device. If the tape were to somehow morph itself dynamically, and yet execute properly by morphing to a well-designed program at the moment it is read for execution, all you have to do is to watch the read/write head of the UTM itself.

    If they find ways to monkey around with bytecodes so that they're shifted around between disk and executor, just run it with a special version of the executor. Shouldn't be hard... what the unencrypted bytecodes are capable of accomplishing is standardized. Execute the code once, and take "notes" of what is being accomplished. Run through a code coverage test suite, even a crude black-box analysis, and you should get an unscrambled bytecode equivalent.

    It just doesn't make sense. If obfuscation, i.e. obscurity, is your only security, it is no security at all.
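
The "watch the read head" argument above can be sketched with a toy stack machine. This is only an illustration under loud assumptions: the opcodes, the XOR scrambling, and the names `scramble` and `run` are all made up here and have nothing to do with how PSCP actually works. The tape is stored scrambled and each cell is decoded only at the moment it is fetched, yet an executor that takes "notes" still recovers the plain instruction stream.

```python
# Toy stack machine. Opcodes, key, and function names are all hypothetical.
PUSH, ADD, MUL, HALT = 1, 2, 3, 0


def scramble(cells, key=0x5A):
    """XOR every cell with a position-dependent byte (applying it twice restores the input)."""
    return [cell ^ ((key + i) & 0xFF) for i, cell in enumerate(cells)]


def run(tape, key=0x5A, notes=None):
    """Execute a scrambled tape, decoding each cell only at fetch time.

    If `notes` is a list, every decoded instruction is appended to it.
    """
    stack, pc = [], 0

    def fetch():
        nonlocal pc
        value = tape[pc] ^ ((key + pc) & 0xFF)  # decode at the "read head"
        pc += 1
        return value

    def note(*instruction):
        if notes is not None:
            notes.append(instruction)

    while True:
        op = fetch()
        if op == PUSH:
            arg = fetch()
            stack.append(arg)
            note("PUSH", arg)
        elif op in (ADD, MUL):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == ADD else left * right)
            note("ADD" if op == ADD else "MUL")
        elif op == HALT:
            note("HALT")
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op} at cell {pc - 1}")


plain = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]  # (2 + 3) * 4
tape = scramble(plain)
notes = []
print(run(tape, notes=notes))  # 20
print(notes)
```

The notes only cover the path that was actually executed, which is why the comment suggests running a coverage test suite to see the rest.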

  • Wow (Score:5, Insightful)

    by Anonymous Coward on Friday February 20, 2004 @02:39PM (#8341424)
    Cringely has really outdone himself this time. I can't even follow this poorly thought out mess. He seems to totally misunderstand every single concept he touches on.

    Compilation to bytecode and an "interpreted language" are NOT THE SAME THING. Both a CLR assembly and a compiled Java class are effectively machine code for a machine that doesn't exist. These abstract machines have machine code that reveals *MORE* information to a disassembler/reverse engineer than, say, x86 or PPC assembly, but it is still far, far from being source code. That is my first reaction. The rest of the article is so confused I don't even know how to respond to it.
  • by Jacek Poplawski ( 223457 ) on Friday February 20, 2004 @02:39PM (#8341428)
    Reverse engineering is good, and every coder should try it. It is the way to learn how someone else's code works when that code is closed source. I don't think you can fool an experienced assembly coder by messing the code around.
    Think of R.E. as a game. It's like cracking, but it's good. And it's about creating, not about destroying.
  • by Anonymous Coward on Friday February 20, 2004 @02:40PM (#8341436)
    Two words:

    "trade secrets".

    If someone can reverse engineer a software DVD player, then he can reimplement it without paying for the trade secret from the DVD CCA. In addition, the implementation can leave out the no-skip "feature" and region coding, which are part of the deal when you buy the trade secret.
  • by chammel ( 19734 ) on Friday February 20, 2004 @02:41PM (#8341443)
    Once the virus writers get hold of this, viruses will be much harder to catch, unless anti-virus writers start looking more for virus-like activity.
  • performance (Score:5, Insightful)

    by happyfrogcow ( 708359 ) on Friday February 20, 2004 @02:41PM (#8341452)
    When a computer program runs, the computer can follow millions of paths to get the job done. We leverage those millions of paths and transform them into billions of paths instead

    Millions of paths implies some sort of jump instruction; whether or not that translates to millions of function calls, I don't know. Assume it does. Then instead of making millions of function calls, you're making billions of function calls. Going from millions to billions is a large step, bigger than just swapping an "m" for a "b" in marketingspeak. So are they planning on passing this performance hit to the legitimate consumer? No thanks, I'll take my Free source code and like it.
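
To put a rough number on that step, here is a back-of-the-envelope sketch. It takes the comment's own assumption that one path costs about one function call, and the 5 ns per call figure is purely hypothetical (not a measurement of anything).

```python
NS_PER_CALL = 5  # hypothetical cost of one call, in nanoseconds


def slowdown(calls_before, calls_after):
    """How many times more calls are made after the transformation."""
    return calls_after / calls_before


def seconds(calls, ns_per_call=NS_PER_CALL):
    """Total time spent on `calls` calls at the assumed per-call cost."""
    return calls * ns_per_call / 1e9


print(slowdown(10**6, 10**9))  # 1000.0
print(seconds(10**6))          # a million calls
print(seconds(10**9))          # a billion calls
```

Whatever the real per-call cost is, the ratio stays at a factor of 1000.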
  • Re:Won't work (Score:5, Insightful)

    by Chairboy ( 88841 ) on Friday February 20, 2004 @02:42PM (#8341469) Homepage
    > So-called sophisticated coding techniques only lead to unreadable code..

    That IS the point, I'm sure you realize.
  • by Anonymous Coward on Friday February 20, 2004 @02:47PM (#8341534)
    This is a problem only to closed source systems, GNU/Linux is free software, and thus there is nothing to reverse-engineer.

    Another great thing about my GNU/Linux boxen (besides being free as in speech) is that they don't get virii and BSODs all the time like my roommate's M$ Windows^H^H^H^Hblows. So it's open *and* secure.
  • Great. (Score:5, Insightful)

    by Anonymous Coward on Friday February 20, 2004 @02:47PM (#8341535)
    So legitimate software is going to take on the functionality that virus software has been using for years? And companies are patenting these techniques as if they are somehow new? Virus writers are the true innovators here. They pioneered the infamous Mutation Engine. I wouldn't consider off-the-shelf software that used those techniques innovative; in fact, I find it creepy. Honestly, if the time wasted trying to protect so-called intellectual property was used instead to invent things to simplify our lives, we (as in humanity) would be better off.
  • It's just like the axiom about divorce that goes something like "It's not the fact that divorce is legal that's killing our marriages, it's the bad marriages that are causing so much divorce."

    Because of the n millions of lines of code in Redmond it's certainly daunting to actually go through and make good code out of the mess, rather than the obscurity.

    The fact that there's an open vulnerable port is a flaw, and the FIX is to make the port secure, rather than to shift its address every five seconds or whatever, which is only a Band-Aid.

    MS is just lucky that the bulk of its customers don't truly know what's going on, otherwise the business model they have wouldn't work.

    I.e. since I'm not a doctor, my doctor can prescribe whatever for me, or insist that I do whatever, and I'll take it as scripture. If what he recommends is the stupidest thing in the world, or he's blatantly a horrible doctor, I would have no idea and suffer the consequences. If I were also a doctor, though, I'd be able to call shenanigans the very second he did something wrong. That's why educating the consumer is the most crucial point of this whole issue.
  • by nicophonica ( 660859 ) on Friday February 20, 2004 @02:49PM (#8341572)
    I have worked on a couple of projects where the 'higher ups' (COO, CEO) were obsessed with the value of the intellectual property that their code represented. Woe betide the developer who tried to explain to them that their code was crap, written by a team of programmers obviously just learning VB and trying to write it like a dumbed-down version of Java. Most of the programming was developing solutions to straightforward programming problems, which they still implemented in nearly the worst possible way.

    Yet, I have no doubt that if someone came up to them and warned them about the dangers of IP theft and showed them this solution, they would bite.

    If they really wanted to do maximum damage to their competition they should have just released the source code and hoped their competitors tried to use that as guidance.

    There are probably some rare instances when a specialized software technique is developed and you want to keep its implementation specifics secret. I have yet to run into a single instance of this after many years in the industry.

  • Re:Won't work (Score:5, Insightful)

    by jfengel ( 409917 ) on Friday February 20, 2004 @02:50PM (#8341575) Homepage Journal
    Sure, you can reverse engineer it. But is it worth the effort?

    Most of the time it's not even worth reverse engineering unencrypted code, because it's really hard. There are open source projects that go undone because people don't want to expend the effort.

    The trick is not to make it impossible, but to make it hard enough that it isn't done. That level is different for different projects, but it's always finite.
  • by kyz ( 225372 ) on Friday February 20, 2004 @02:50PM (#8341580) Homepage
    There is nothing new under the sun. These Java and .NET obfuscators are just the same old anti-SoftICE sections, which were just the same old Amiga/Atari copylocks, which were just the same Spectrum/C64 turboloaders, and so on.

    Every single one of these is broken. Almost all good programmers are capable of deciphering the standardised, retail-boxed algorithm used for the obfuscation, and can easily un-obfuscate it. Are all the Java variables named "a"? Diddums! You don't have a Java decompiler with the option to ignore that simple tweak.

    All that matters is:

    1) How important is the code behind the obfuscation?

    2) How much time and effort is the reverse engineer willing to spend?

    If you use a company's retail-box obfuscator, anyone with the "'Brand X obfuscator' deobfuscator v1.0" can get straight at your code. It's a technological arms race, nothing more.
  • by no soup for you ( 607826 ) <jesse.wolgamott@noSPaM.gmail.com> on Friday February 20, 2004 @02:51PM (#8341592) Homepage

    I don't love microsoft, but I think this article makes several claims without backing them up or offering any explanation as to their merits. Such as:

    1. .NET, on the other hand, is Microsoft's chosen successor to Visual BASIC, and effectively exposes source code at the very heart of Microsoft consumer and enterprise applications.
    2. If .NET is Such a Security Nightmare (It Is)...

    And "You can write a program in C# or Visual Basic.NET." while factually accurate, ignores Delphi.NET, C++ managed code using the CLR, and other languages targeting the CLR (COBOL, etc).

    I think the basic premise of the article, where if someone is using your objects it is obviously a bad thing/security breach, is flawed. If you need to secure your objects, SECURE them! Seal them, see who is calling you, etc.

    Lastly, as shown by previous posts, obfuscation is not the end-all panacea to security. In my opinion, it's barely a detour. Otherwise, Open Source literally could not be secure.

  • Just one question (Score:5, Insightful)

    by carlmenezes ( 204187 ) on Friday February 20, 2004 @02:52PM (#8341593) Homepage
    Seems to me that stuff like this would make it quite difficult to debug once an application has been released - also, how would things like a memory dump on application crash help to debug anything here?
  • by BlueFall ( 141123 ) on Friday February 20, 2004 @02:53PM (#8341612)
    It sounds to me like the author of the article is talking about two completely different issues. The first is code decompilation and static obfuscation. The second is about runtime obfuscation.

    In theory, if you don't run the binary you have, you don't need to worry about it modifying itself. The same techniques that work on obfuscated byte code now should work on the binary. Now if you were trying to reverse engineer a program by running it and tracing it, that's where PSCP seems like it would help.
  • by Macrobat ( 318224 ) on Friday February 20, 2004 @02:55PM (#8341628)
    Details of Quake reverse-engineering can be found here. [catb.org] But I'm not so sure obfuscation would have helped in this case. It seems like Quake's design just put too much information in the hands of the client systems; it might have taken a day or two extra to decode, but the question is, why was such data allowed to be controlled by the client in the first place?
  • by Unoti ( 731964 ) on Friday February 20, 2004 @02:56PM (#8341646) Journal
    You mentioned lazy programmers, and having Java crush your performance. In my experience with Java, perceiving that the language is crushing performance is often a symptom of the programmers becoming lethargic and lazy. Perhaps you're having the same experience and don't realize it?
  • by dtio ( 134278 ) on Friday February 20, 2004 @02:56PM (#8341650)
    Nonsense. They don't see the forest for the trees? I beg your pardon?

    You're assuming that there is a Microsoft way to look at code and that every MS developer is a robot brainwashed to think that way. MS hires very capable and brilliant people; you couldn't tell the difference between a bunch of NT kernel hackers and a bunch of Linux kernel hackers. Both groups are extremely knowledgeable and produce high-quality code.

    MS has the biggest industrial quality assurance infrastructure in the world. Every developer should go through an internship in Redmond to see this.

    Large software *is* complex, period. Given a finite amount of talent and time, bugs depend on the size of the project; it really doesn't make a difference whether your code is open source or not.

    How many people do you think actually look at open source code to look for bugs?

    Moral: if MS releases buggy, exploitable and potentially unsafe code, it is *not* because they are sloppy or because proprietary code is inherently worse than open source; it is because large software is complex and takes a lot of time to do right [joelonsoftware.com].

  • Re:Won't work (Score:3, Insightful)

    by alan_dershowitz ( 586542 ) on Friday February 20, 2004 @02:57PM (#8341652)
    The GPL explicitly states source code must not be obfuscated.

    Personally though, I think advanced obfuscation would make it REALLY easy for closed source applications to conceal swiped GPL code. I'm sure it's already happened, but up to this point, you could do some binary comparisons, or trace it at runtime. With the stuff the article was talking about, you couldn't do that anymore. What's to stop companies from violating the GPL license? Their own sense of ethics. Yeah, we're in trouble.

    By the way, if someone's interested in investigating a possible GPL violation, take a look at the Dolphin Gamecube emulator. Their last version had error messages from a GPL powerPC emulation core, and the binary is obfuscated :-/
  • by hazee ( 728152 ) on Friday February 20, 2004 @02:58PM (#8341665)
    Yeah, and self-modifying-code was eventually abandoned because it played havoc with the then-new CPU caches and pipelines.

    Have these people learned nothing?
  • by Sporkinum ( 655143 ) on Friday February 20, 2004 @02:58PM (#8341667)
    If it changes how it executes every time, it sounds like it would be a fantastic way to introduce unreproducible bugs.

    I'm sure this would make QA testing a nightmare.
  • by happyfrogcow ( 708359 ) on Friday February 20, 2004 @03:05PM (#8341736)
    hehe. i resisted star trek for 24 years, just this year started watching the reruns. imagining worf reading this is pretty funny.

    man i feel like a big dork.
  • by plierhead ( 570797 ) on Friday February 20, 2004 @03:06PM (#8341751) Journal
    From what I can make out (could be wrong) it blasts out a myriad of possible branches and loops that appear to be program logic but are actually executed solely to confuse the reverse engineer.

    If so, this raises a couple of issues:

    • Code bloat - very much an issue for a technology targeting client-side apps; and
    • It negates Cringely's open source point:

      And there is even an Open Source aspect to this new form of protection: It can be used as a new form of attribution. Who wrote what part of that Open Source program? Copyright notices and comments can be removed, but the PSCP code renaming signature can't be.

      Open source code which can only be submitted while obfuscated (thus preserving its signature) is not open source any more, so I don't buy this as a benefit of the technology.

    I think the main dangers to people protecting their source code in the medium term will remain what they are now: Incompetence and conspiracy from within (witness the win2k code leakage)
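
A minimal sketch of the "branches and loops that only exist to confuse" reading above, using an opaque predicate (a condition whose outcome never varies but is not obvious at a glance). This is a guess at the general idea, not PSCP's actual transformation, and the function names are hypothetical.

```python
def price_with_tax(cents, rate_percent):
    """The plain version: one line of real logic."""
    return cents + cents * rate_percent // 100


def price_with_tax_obfuscated(cents, rate_percent):
    """Same result, padded with a decoy branch and a decoy loop."""
    # Opaque predicate: n * (n + 1) is always even, so the else arm never runs.
    if (cents * (cents + 1)) % 2 == 0:
        total = cents
    else:
        total = cents ^ 0x5F3759DF  # decoy, unreachable
    for k in range(3):
        # k * k + k is always even too, so this subtraction never happens.
        if (k * k + k) % 2 == 1:
            total -= k
    return total + cents * rate_percent // 100


print(price_with_tax(1999, 8), price_with_tax_obfuscated(1999, 8))
# Compiled size of each function body, to make the bloat visible.
print(len(price_with_tax.__code__.co_code),
      len(price_with_tax_obfuscated.__code__.co_code))
```

Both functions return the same value for every input, but the padded one carries more compiled code, which is the bloat concern in the first bullet.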

  • by Mathi€u ( 165818 ) on Friday February 20, 2004 @03:08PM (#8341777)
    I agree: changing the code, even through an automated process, implies testing! So it seems to me that obfuscating the source will double the amount of testing required...

    "Is this a bug in the code morphing program or in the original application?" - wow, testing will get even funnier :).
  • Yawn (Score:4, Insightful)

    by ENOENT ( 25325 ) on Friday February 20, 2004 @03:12PM (#8341817) Homepage Journal
    WHOOOO CAAAARRRES???

    Yeah, users demand that their executables should change randomly at runtime. I'm sure that there can never be any bugs introduced by this process. Applications won't randomly crash for no reason...

    Oh, wait. I guess this is MSFT. They wouldn't care about random crashes, data corruption, security holes, or any of that boring stuff.

  • by qnxdude ( 520409 ) * on Friday February 20, 2004 @03:13PM (#8341827)
    All it takes is a code-following disassembler; I use one for reverse engineering obfuscated firmware as a regular part of my job. Eventually the processor has to run the code. If you do a just-in-time disassembly, it doesn't matter how they fusk with the code, you can still understand it.
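
One way to read "code-following" is recursive traversal: start at the entry point and follow control flow, so junk planted between real instructions is never decoded. The sketch below is static and works on a made-up instruction set (the `PROGRAM` layout and `follow_code` are hypothetical); the just-in-time variant the comment mentions would instead decode at run time, which also handles computed jumps that this version cannot follow.

```python
# Each cell is (mnemonic, operand); the index is the address.
PROGRAM = [
    ("load", 7),       # 0
    ("jmp", 3),        # 1  jumps over the junk at 2
    ("junk", 0xDEAD),  # 2  never executed
    ("jz", 6),         # 3  conditional: continues at 4 or at 6
    ("add", 1),        # 4
    ("ret", None),     # 5
    ("sub", 1),        # 6
    ("ret", None),     # 7
    ("junk", 0xBEEF),  # 8  never executed
]


def follow_code(program, entry=0):
    """Return the sorted addresses reachable from `entry` by following control flow."""
    reachable = set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in reachable or not 0 <= addr < len(program):
            continue
        reachable.add(addr)
        op, arg = program[addr]
        if op == "jmp":
            worklist.append(arg)       # unconditional: only the target
        elif op == "jz":
            worklist.append(arg)       # branch taken
            worklist.append(addr + 1)  # branch not taken
        elif op != "ret":
            worklist.append(addr + 1)  # ordinary instruction: fall through
    return sorted(reachable)


print(follow_code(PROGRAM))  # [0, 1, 3, 4, 5, 6, 7]
```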
  • Yes (Score:5, Insightful)

    by Tim Macinta ( 1052 ) * <twm@alum.mit.edu> on Friday February 20, 2004 @03:18PM (#8341897) Homepage
    Obfuscation seems to be useful only for client-side Java applications that contain super-secret valuable algorithms. I mean, who cares if somebody decompiles your code to see how you did sortable JTables or whatever?
    There are plenty of good reasons to use an obfuscator on code targeted at the client-side. Retroguard [retrologic.com] will strip out unnecessary information from your class files and will rewrite variable, class, and method names, usually to a substantially shorter size. This can save enough space in the deployment size to make obfuscation worthwhile for the space savings alone in environments where every byte counts (particularly, J2ME/MIDP).

    Obfuscation does also provide a speed bump to those attempting to disassemble your code. Without obfuscation, anybody with a casual interest could just glance at your code using javap, etc. Retroguard [retrologic.com] fits seamlessly enough into the build process that adding a basic level of protection to the code is usually simple and transparent.
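
The space-saving effect of renaming can be illustrated on plain text. This sketch is not how Retroguard works (it rewrites names inside class files, not source text); `shrink`, `short_names`, and the sample snippet are hypothetical, and the sketch ignores collisions with names or keywords that already exist.

```python
import itertools
import re
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... in order."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def shrink(source, names):
    """Replace each identifier in `names` with the next unused short name."""
    if not names:
        return source, {}
    gen = short_names()
    mapping = {name: next(gen) for name in names}
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in mapping) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source), mapping


SOURCE = """\
int customerAccountBalance = 0;
int pendingTransactionCount = 3;
customerAccountBalance = customerAccountBalance + pendingTransactionCount;
"""

shrunk, mapping = shrink(SOURCE, ["customerAccountBalance", "pendingTransactionCount"])
print(shrunk)
print(len(SOURCE), "->", len(shrunk), "characters")
```

The same mapping is also what makes the result harder to read, which is the speed-bump side of the argument.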

  • Re:Won't work (Score:3, Insightful)

    by Fulcrum of Evil ( 560260 ) on Friday February 20, 2004 @03:20PM (#8341931)

    The only thing even close is "the source code for a work means the preferred form of the work for making modifications to it." I don't see that as explicitly ruling out obfuscation.

    Try telling the judge that you name all your variables v0001, v0002, v001, and so on. Simply stated, obfuscation is something you do specifically to stop somebody from understanding the code.

  • by Jorrit ( 19549 ) on Friday February 20, 2004 @03:21PM (#8341953) Homepage

    If I did my own garbage collection, I could free the memory as soon as I'm done with it


    Or you can do like many C++/C projects do and simply forget to free the memory (i.e. memory leaks).

    Greetings,
  • Re:Yes (Score:5, Insightful)

    by tcopeland ( 32225 ) * <tom&thomasleecopeland,com> on Friday February 20, 2004 @03:32PM (#8342120) Homepage
    > in environments where every byte counts

    Right, yup, obfuscation reduces class file size. Certainly, that can be important in some environments.

    > anybody with a casual interest could just
    > glance at your code using javap,

    Sure. But what will they learn? How the code processes MouseEvent.MOUSE_CLICKED? How you use sockets? How you show that nifty splash screen? I mean... who cares?

    Going off topic now, but, anyhow, nifty [pensamos.com]!
  • Re:Won't work (Score:3, Insightful)

    by Golthur ( 754920 ) on Friday February 20, 2004 @03:33PM (#8342137)
    Yes, but this is a simplistic solution that still won't work. Once programs obfuscate themselves as they run, someone's going to make an automated tool to de-obfuscate them - e.g. a custom VM that just dumps the bytecode on the path of execution to a file as it executes it.

    Automation just breeds counter-automation. It's an arms race, and I don't really think JITO (Just In Time Obfuscation) is the answer.
  • Re:Resource Waste (Score:3, Insightful)

    by plover ( 150551 ) * on Friday February 20, 2004 @03:34PM (#8342150) Homepage Journal
    No, doing this to an entire program would obfuscate the critical sections, and that's the entire point. A reverse engineer can't just pop in, debug the stuff that looks complex (and therefore must be hiding the good stuff) and pop out again. That engineer will have to start with step one, every single time.

    Remember, computers are now large enough and fast enough that there are plenty of cycles going to waste anyway. This theory is that if those idle cycles are spent rearranging the code, the reverse engineers' lives will be more miserable, and therefore the precious code is safer.

  • by djtack ( 545324 ) on Friday February 20, 2004 @03:40PM (#8342241)
    Open source code which can only submitted while obfuscated (thus preserving its signature) is not open source any more, so I don't buy this as a benefit of the technology.

    Yes, I was puzzled by his statements about watermarking open-source code. You would still have to distribute the original, unobfuscated source to allow people to make changes. The GPL even explicitly forbids distributing obfuscated code. It says something like: the code must be distributed in the "preferred format for making changes".
  • Re:do what i do (Score:3, Insightful)

    by B'Trey ( 111263 ) on Friday February 20, 2004 @03:41PM (#8342264)
    I'm not a fan of Hungarian notation but this is quite simplistic. Is InstanceCount an int, a long or a short? Or is it a pointer to one of the above? Is FirstName a C-style string (ie a char *) or is it an instance of class String? Is DateReceived an int holding a Unix-style number-of-seconds-from-some-starting-date, is it a string holding the date (and in what format?) or is it an instance of class Date?

    Hungarian notation was designed for large, multi-developer projects where you're frequently working on or with code you didn't originally write and the answer to questions like the above aren't necessarily obvious or quickly answered. It's one thing to say that HN is ugly or introduces other problems of its own (a stand I agree with) but it's another to say that the problem it addresses is non-existent or is easily solved by descriptive variable names.
  • by Grishnakh ( 216268 ) on Friday February 20, 2004 @03:56PM (#8342498)
    If you need a tamper-resistant client-side binary, don't use Java. It's that simple. A good engineer understands many different tools and selects the best one for the job.

    You're obviously not living in the "real world". Here, an engineer uses the tool that the PHB management selects for him, based on buzzwords, what competitors are doing, and what schmoozing vendors have sold to them.

  • Re:do what i do (Score:5, Insightful)

    by Bugmaster ( 227959 ) on Friday February 20, 2004 @03:59PM (#8342542) Homepage
    I have always thought that if you need Hungarian notation to tell if "userName" is an integer or a string, then your program is too messy. Instead of using "lpszfoobarName", it would be better to organize the program to use smaller functions (methods, whatever) and more compact modules.

    Hungarian notation is only truly useful in classic Win32 programming, because by now it's really its own programming language based loosely on C, where lpszfoobar takes the place of strong typing. But, if you're starting a project from scratch, you don't need to support legacy LPARAM/WPARAM/WPARAM_which_is_really_LPARAM, and thus there's no need for hungarian notation. Especially if you use a strongly-typed OOP language such as Java, and, AFAIK, C#.

  • Re:Won't work (Score:2, Insightful)

    by alan_dershowitz ( 586542 ) on Friday February 20, 2004 @04:02PM (#8342586)
    Quoth the GPL: "The source code for a work means the preferred form of the work for making modifications to it."

    Quoth Dictionary.com: "obfuscate: To make so confused or opaque as to be difficult to perceive or understand"

    If you alter the source to make it hard to read, you broke this rule. To merely say "obfuscate" is actually _more_ vague than what the GPL says, in my opinion. My code is already obfuscated, simply because I'm a lazy programmer.
  • Re:do what i do (Score:5, Insightful)

    by __past__ ( 542467 ) on Friday February 20, 2004 @04:07PM (#8342683)
    i is always an integer with local scope, used as a counter in a loop and/or an index into an array or a similar collection. j, k and l are the same, if you need more than one variable that would qualify for being "i". This convention is perfectly clear and has been used for more than 40 years; calling "i" "index", "count" or "currentEmployeeIndex" does not carry any interesting surplus information. The same could be said for "n", which always is an integral number denoting the number of elements in some collection to operate on.

    tmp is less clear, but it certainly would have local scope, and only exists because of shortcomings in the implementation language (like not having a primitive operation for swapping the values of two variables without introducing a temporary variable), but no real significance in the problem domain.

    These variable names are perfectly acceptable and clear - unless you abuse them, of course, but you can abuse all naming schemes. Nothing stops you from calling a global integer m_pszHelloKitty.

    Hungarian notation on the other hand is problematic because a) it is just a non-functional workaround for the weak typing in C and C++ (and their habit of making type errors crash your program in random unrelated places, or just corrupt your data) and b) there aren't actually enough rules, and if there were, nobody could remember them all. "iSomeInteger" and "sSomeString" are pretty common, but if you happen to use more interesting types, or even a whole C++ class hierarchy, it just doesn't work anymore. The only use of Hungarian Notation is to make clueless middle managers happy, similar to a long-winded format for mandatory comments preceding any trivial function or multi-page e-mail disclaimers. Source code is readable when you can actually read it out loud and people would understand what's going on, not if you encrypt redundant information in variable names.
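
The point about `i`, `j`, and `tmp` can be shown side by side. This is just an illustrative sketch with hypothetical function names; Python is used because it happens to have a built-in way to swap two values, so the temporary disappears.

```python
def reverse_with_tmp(items):
    """Reverse a list in place the way a language without a swap primitive forces you to."""
    i, j = 0, len(items) - 1
    while i < j:
        tmp = items[i]  # exists only to carry a value across the swap
        items[i] = items[j]
        items[j] = tmp
        i += 1
        j -= 1
    return items


def reverse_with_swap(items):
    """Same loop, but tuple assignment swaps the values, so no tmp is needed."""
    i, j = 0, len(items) - 1
    while i < j:
        items[i], items[j] = items[j], items[i]
        i += 1
        j -= 1
    return items


print(reverse_with_tmp([1, 2, 3, 4]))   # [4, 3, 2, 1]
print(reverse_with_swap([1, 2, 3, 4]))  # [4, 3, 2, 1]
```

In both versions `i` and `j` need no longer names: they are local indices and nothing else.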

  • by Anonymous Coward on Friday February 20, 2004 @04:09PM (#8342713)
    I find the topic to be just as useful as discussing the need for lawyers. Why can't companies strive for (accurate | stable | faster | extensible | portable | open) code rather than put more and more effort into secrecy, needless complexity and proprietary bases?

    I would love to see any of the first set come first.

    All we need is an obfuscation bug to end up detecting a false compromise situation and cause your entire platform to come crashing down.

    It's just another layer of red tape to allow Microsoft and other paranoia-bound entities to stunt progress.

  • Re:do what i do (Score:5, Insightful)

    by Zooks! ( 56613 ) on Friday February 20, 2004 @04:12PM (#8342755)
    Of course, if you don't know what the type of a variable is you can also just look at the type declaration.

    Unless you're using something like BASIC where variables just suddenly appear out of the ether I really can't see how Hungarian notation is necessary. Especially in an age where we have advanced editors with split windows, and powerful search tools like glimpse, cscope, and ctags.

    Besides, why should I trust some agglutinated letters on a variable name when I can do the same thing the compiler will do and look at the type declaration and be totally _sure_ of the type of the variable? What if some doofus changed the type of the variable in the declaration but was too lazy to update all the instances of Hungarian notation? Hungarian notation can only lead to a code maintenance nightmare!
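
    A minimal sketch of how a prefix goes stale, in Python rather than C. The prefix table (i, s, f) and the variable names are assumptions made up for illustration, not any real Hungarian dialect, and the check assumes every name actually follows the convention:

```python
# Hypothetical prefix table for this sketch; real Hungarian dialects differ.
PREFIX_TYPES = {"i": int, "s": str, "f": float}


def stale_prefixes(variables):
    """Return names whose Hungarian prefix disagrees with the value's actual type."""
    stale = []
    for name, value in variables.items():
        expected = PREFIX_TYPES.get(name[0])
        if expected is not None and type(value) is not expected:
            stale.append(name)
    return sorted(stale)


# Someone changed the types behind iInstanceCount and sFirstName
# but never renamed the variables.
variables = {
    "iInstanceCount": 3.0,   # prefix says int, value is a float
    "sFirstName": b"Ada",    # prefix says str, value is bytes
    "fRatio": 0.5,           # prefix still matches
}
result = stale_prefixes(variables)
print(result)
```

    The names keep claiming the old types until somebody checks them against the real thing, which is exactly the job the declaration already does.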
  • by __past__ ( 542467 ) on Friday February 20, 2004 @04:14PM (#8342793)
    You are aware that Hungarian and Finnish are closely related languages, right? And that basically no other language is closely related to either of them? (IIRC, Basque is, but its chances of becoming an official UN language are pretty slim too)

    Writing code and/or comments in Finnish or Hungarian would be the ultimate obfuscation technique. People who know English have a bigger chance of guessing what words could mean if they were written in Persian.

  • Crap logic (Score:3, Insightful)

    by arrianus ( 740942 ) on Friday February 20, 2004 @04:17PM (#8342837)
    Cringely somehow equates difficulty of reverse-engineering with security (in the sense of buffer overflows, etc.). Other than weak arguments about security-by-obscurity, it holds no water. The NSA has automated analysis tools that look for buffer overflows and the like. Plenty of attacks come about with people just throwing random packets at a machine, and seeing what crashes it. In addition, in spite of the well-publicized NT source release, Microsoft licenses Windows source to universities and other organizations, and it is fairly widespread. Anyone who really cares can get it.
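
    A rough sketch of the "throw random packets at it" approach, which needs neither source nor reverse engineering. The packet format and the deliberately buggy parse_packet are hypothetical stand-ins, not any real protocol:

```python
import random


def parse_packet(packet: bytes) -> bytes:
    """Toy parser: the first byte declares the payload length (made-up format)."""
    if not packet:
        raise ValueError("empty packet")
    length = packet[0]
    payload = packet[1:]
    out = bytearray()
    # Bug on purpose: trusts the declared length instead of checking it.
    for i in range(length):
        out.append(payload[i])
    return bytes(out)


def fuzz(target, rounds=1000, seed=0):
    """Feed random byte strings to target and collect the inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(rounds):
        packet = bytes(rng.randrange(256) for _ in range(rng.randrange(0, 8)))
        try:
            target(packet)
        except ValueError:
            pass  # rejected cleanly, not counted as a crash
        except Exception as exc:  # anything else counts as a crash here
            crashes.append((packet, type(exc).__name__))
    return crashes


crashes = fuzz(parse_packet)
print(len(crashes), "crashing inputs found")
```

    The fuzzer never looks inside parse_packet, so obfuscating the parser would not change what it finds.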

    Very few people will reverse-engineer source code to make a competing product. With the exception of file formats and the like (Word format, DeCSS, etc.), it is generally much faster to reinvent than it is to reverse engineer -- this is often true even when you have the original source code, with comments. I guess the only other place I can think of where reverse-engineering might make sense is highly optimized algorithms (3D rendering, video compression, etc.), but even there, it's sketchy as to whether there is any real benefit.

    He goes on to talk about how source code watermarks are impossible to remove. Quite frankly, I've never seen a watermark in a non-lossy data format that's impossible to remove. They just take different amounts of time and effort.

    I used to think this guy had a clue, or some insight once in a while. This article is just so confused, and wrong in so many ways, that it seems Cringely has no grasp of basic technology. Damn, it sucks.
  • by __past__ ( 542467 ) on Friday February 20, 2004 @04:20PM (#8342894)
    If you need a tamper-resistant binary, don't use anything that can be executed natively by any known processor architecture either. Compiled code for the JVM might be slightly easier to understand than code compiled for the x86 arch, but there are good decompilers for both.

    If a CPU can understand your code, so can a human. The solution to this problem is licenses, not obfuscators.

  • I think it pretty much is impossible in RTS, in a client-client model, at least.

    Then change the client-client model; that is where more research can be done.

    If you still wanted to stay with a two person model you could send your data to a third party playing the same game, who would verify your data, and you would verify theirs. And you could massively distribute this.

    One thought would be that if one person accused you of cheating, you'd have to get your data verified by more people around the country, and so on, so you couldn't say just one person had a vendetta against you.

    This is a solvable problem. You just have to get past the old "playing across town over the modem" model.
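
    A minimal sketch of the third-party verification idea, assuming every peer replays the same inputs and publishes a digest of the resulting game state. The peer names, the state fields and the simple majority rule are all assumptions for illustration:

```python
import hashlib
from collections import Counter


def state_digest(state: dict) -> str:
    """Hash a game state in a canonical order so honest peers agree on it."""
    canonical = repr(sorted(state.items())).encode()
    return hashlib.sha256(canonical).hexdigest()


def find_suspects(reports: dict) -> list:
    """Given {peer: digest}, flag peers that disagree with the majority digest."""
    majority_digest, _ = Counter(reports.values()).most_common(1)[0]
    return sorted(peer for peer, digest in reports.items() if digest != majority_digest)


honest_state = {"gold": 100, "units": 12}
cheated_state = {"gold": 9999, "units": 12}

# Three verifiers recompute the state honestly; one peer reports a doctored one.
reports = {
    "alice": state_digest(honest_state),
    "bob": state_digest(honest_state),
    "carol": state_digest(honest_state),
    "mallory": state_digest(cheated_state),
}
suspects = find_suspects(reports)
print(suspects)
```

    Widening the pool of verifiers after an accusation just means adding more entries to reports before the vote.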
  • Re:do what i do (Score:5, Insightful)

    by angst_ridden_hipster ( 23104 ) on Friday February 20, 2004 @04:22PM (#8342908) Homepage Journal
    The problem with Hungarian, of course, is that it lies.

    It's like the comments. They tell you what the programmer *meant* to do, not what he or she did.

    Similarly, Hungarian notation tells you the *intended* scope, type, etc, but the compiler may have a very different view of things.

  • by SmurfButcher Bob ( 313810 ) on Friday February 20, 2004 @04:24PM (#8342952) Journal
    Yep, harkens back to the failures of the old Apple ][ era.

    Self modifying code did little more than provide an extra 30 minutes of amusement.

    It didn't stop any of us back then, it sure as hell won't stop anyone now. Apparently, these idiots have never heard of things like Soft-ICE.

    Reverse engineering isn't hard, it's just tedious without the source. OTOH, we've been doing it for decades without source... it's only recently that we've had the luxury of (sometimes) having it. Regardless, these boneheads seem to confuse "reverse engineering" with "decompiling" - the two have nothing to do with each other.

    "Changes variable names"... rofl, that's really gonna screw up DEBUG, isn't it...
  • Re:do what i do (Score:4, Insightful)

    by jelle ( 14827 ) on Friday February 20, 2004 @04:27PM (#8342988) Homepage
    "Is InstanceCount an int, a long or a short? Or is it a pointer to one of the above? Is FirstName a C-style string"

    Those are questions that the editor/GUI should be able to answer without the need to add typing work for the programmer. I'm sure there are a lot of variables with erroneous Hungarian notation, either because of programmer error, or programmer misunderstanding, or a 'forgot to update that' type of thing... Usually no information is better than misinformation.

  • by dpbsmith ( 263124 ) on Friday February 20, 2004 @04:45PM (#8343206) Homepage
    ...that obfuscator had better be completely bug-free.

    Just suppose that every once in a while the obfuscated version of the code just isn't exactly 100% functionally equivalent to all the others.

    How are you ever going to debug that?

    It's far worse than a bug in a compiler optimizer.

    Worse yet, this could even be used to attack competitors. Let's say the obfuscator has the ability to distinguish code from different vendors in some way... (well, for example, let's suppose the code is signed). It could subtly sabotage the products of certain vendors so that they seemed to be buggy or unreliable... and the victim would never know what had happened or have any way of knowing what had happened (assuming the victim could not reverse the obfuscation).

  • by alispguru ( 72689 ) <bob@bane.me@com> on Friday February 20, 2004 @05:06PM (#8343480) Journal
    The original article quotes the code-morphing guy as saying:

    And the increase in processing overhead is trivial. PSCP, if done right, costs almost nothing.

    I don't believe it. This stuff can't cost "almost nothing" if it works with threads. If you have multiple paths of execution running through the same code, and the code is being dynamically morphed as the threads run, then either:

    The morpher is fully thread-aware, to keep morph operations for thread A from pulling the rug out from under thread B (or C, D, ...). This implies extra semaphores, locking, unlocking, and the overhead of handling them.

    The morpher is not fully thread-aware, and every so often the morpher for one thread will clobber another thread.

    Am I missing something here?
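
    A toy sketch of the first option, assuming the "morphing code" can be modeled as two fields that must always change together. MorphingTable and its XOR invariant are invented for illustration and have nothing to do with how PSCP actually works:

```python
import threading


class MorphingTable:
    """Stand-in for morphing code: two fields that must stay consistent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._key = 0
        self._encoded = 42 ^ 0  # invariant: encoded ^ key == 42

    def morph(self, new_key):
        with self._lock:  # the locking overhead the comment is talking about
            self._key = new_key
            self._encoded = 42 ^ new_key

    def run(self):
        with self._lock:  # readers pay for the lock as well
            return self._encoded ^ self._key


def worker(table, rounds, bad):
    for _ in range(rounds):
        value = table.run()
        if value != 42:
            bad.append(value)


def morpher(table, rounds):
    for key in range(rounds):
        table.morph(key)


table = MorphingTable()
bad = []
threads = [threading.Thread(target=morpher, args=(table, 5000))]
threads += [threading.Thread(target=worker, args=(table, 5000, bad)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("inconsistent reads:", len(bad))
```

    With the lock in place every read sees a consistent pair. Dropping it removes the cost, but then a reader can land between the two assignments in morph, which is the second option.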

  • by radish ( 98371 ) on Friday February 20, 2004 @05:07PM (#8343494) Homepage
    The two things probably work in coordination. From my experience, Java does generally represent a performance drop from, say, C++. And so I think that's true regardless.


    I disagree, the theory of an optimising JIT compiler (such as HotSpot) disagrees, and a whole bunch of studies disagree. But you are entitled to your opinion.

    If I did my own garbage collection, I could free the memory as soon as I'm done with it, but under Java GC is done only periodically, and only sweeps items that fulfill certain qualities (so it might not get everything as soon as it should).


    Firstly, there are a number of GC strategies available; you only describe the most common. Secondly, what is the performance hit of an object staying around for too long? As soon as memory becomes tight the JVM will become more aggressive with GCs in order to free up space. If memory is not tight, however, then NOT doing a GC is actually more performant (a no-op is always faster than a dealloc). Thirdly, Java GC is not really "conservative". An object is eligible for collection when (and only when) it is no longer reachable through any live reference. Off the top of my head I can't think of any reasonable argument that says it is safe to dealloc an object to which there is a live reference. If you know that the object is dead - just null the reference and voila - it's eligible for GC. In the apps I am responsible for the number one overriding priority is reliability - memory leaks, buffer overflows and magically vanishing objects are way too much of a risk and hence C++ is not a suitable environment for us.
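
    A small sketch of the "null the reference" point. It is written in Python, not Java, so it only illustrates the general idea: CPython's collector is not the JVM's, and the Node class is made up for the example. It shows that once nothing live refers to a pair of objects, they can be collected even though they still point at each other:

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


a = Node()
b = Node()
a.other, b.other = b, a   # a cycle: each object is still referenced by the other
probe = weakref.ref(a)    # lets us observe the object without keeping it alive

a = None                  # "null the reference"
b = None
gc.collect()              # ask the collector to run now instead of waiting
collected = probe() is None
print(collected)
```

    Without the explicit gc.collect() call the objects would simply sit there until the collector got around to them, which costs memory but no CPU time.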
  • Re:do what i do (Score:5, Insightful)

    by __past__ ( 542467 ) on Friday February 20, 2004 @05:38PM (#8343923)
    Long names are good
    Sorry, but I can't agree with that without further qualification.

    The language I use most of the time is Common Lisp, which started as a compromise between several Lisp dialects that have evolved since the late '50s, together with new functionality designed in the '90s. This led to standard function names ranging from cdr and rplacd to update-instance-for-redefined-class. While there are more or less consistent rules explaining either of these, I think they are all bad. The trick is to come up with names that are both unambiguous and short. This is very hard (and the fact that ANSI CL defines nearly 1000 names, all in one package, doesn't help). In fact, I often have the feeling that coming up with good names for my functions, classes etc. is harder than their actual implementation. But it is also more important, because that is what others will have to use and understand, and communication between humans is a more serious issue in programming than communication with the computer.

    Besides, any sensible IDE would allow you to search for an exact name or at least a regular expression, so that a search for "i" would not find all mentions of "update-instance-for-redefined-class".

  • Re:do what i do (Score:2, Insightful)

    by the_diesel ( 754959 ) on Friday February 20, 2004 @05:40PM (#8343960)
    For the love of God and all that is Holy, never use the letter 'l' as a variable. Why? k = 1; l = 2; m = l + k; depending on your font, it may be very hard to figure that out, especially if you are skimming. Also, it's better to double up, as in: ii = 1; jj = 2; which makes searches and replaces easier. Still, you don't want: kk = 1; ll = 2; mm = ll + kk;
  • I call bullshit (Score:3, Insightful)

    by heironymouscoward ( 683461 ) <heironymouscoward@yah3.14oo.com minus pi> on Friday February 20, 2004 @06:17PM (#8344521) Journal
    The article describes the encryption technique as a way of signing open source code. But pseudo-randomly changing all the program's variable names, in the source code, apart from being impossible to do at 'runtime' (it's source, remember), makes the open source code aspects null.

    How can you submit a kernel patch that contains mangled code?

    Bah. A useless article that hypes a junk technology designed to solve a false problem created by a weak solution to a weakness in a marketing-driven architecture that answers what is, anyway, a pretty simple question... how to write software people can use.
  • by edgedmurasame ( 633861 ) on Friday February 20, 2004 @06:20PM (#8344557) Homepage Journal
    Well, some people value their "Intellectual Property", and the results they bring to people. As for protecting something, you should go as far as you need to make the pool of people who can easily defeat it as small as possible. That's how Ubisoft worked in the past with their products, and they only went far enough to get the sales numbers out. They knew they were going to get cracked - Ubisoft just delayed it while they sold to the people who would buy it. I don't exactly like the idea of IP, but I don't like some other ideas either, and I live with both and deal with both when things go Horribly Wrong(TM).
  • Huh? (Score:4, Insightful)

    by JacobO ( 41895 ) on Friday February 20, 2004 @07:05PM (#8345071)
    I can't help but feel like there's something I should already know (but don't) when reading Cringely's material. The articles that I have just read (linked and related) seem to go into some detail about a topic (obfuscation, interpreters, high tech secrets) but then without any good reason he expects us to believe that we are somehow "vulnerable" because some module of code can be reverse engineered. Perhaps we are to believe that because of .NET we are all going to have our secrets stolen.

    The result is that nearly every emerging Microsoft product is vulnerable, including the OS itself

    Now, it seems to me that the only conclusion being drawn is that my OS is vulnerable because someone can reverse engineer its code, as if understanding it makes it less secure. Is Linux any less secure than Windows because everyone has access to its source code? Isn't this really an issue for people who "need" to keep their source code from prying eyes so their IP is not stolen?

    This one is quite confounding:

    Microsoft is absolutely committed to .NET, yet .NET as it stands today is very vulnerable to security lapses

    What is a "security lapse" and why does lack of good obfuscation tools allow it? Am I vulnerable without tried and trusted security through obscurity?

    Looking further back at the article on .NET from November 8, 2001, there is an interesting theory on how .NET is Microsoft's way of tracking all "calls" through "Windows' communication system" (whatever that is) to record any use of non-MS services so the third-party provider can be summarily squished.

    Watch out everybody, the black helicopters are circling overhead.
  • by mindstrm ( 20013 ) on Friday February 20, 2004 @10:28PM (#8346609)
    Quake is a poor example, because it's not about security, though the client-server model is a parallel. This is a case of putting processing on the client end because it's faster to do so, and the overall system is far more workable. Yes, it does mean people can cheat in certain ways... that didn't stop the game from being a wild success.

    A better example would be online gambling... if the javascript on the website or the java client or flash client are responsible for validating data, and the server just takes what it is told, you have a disaster waiting to happen. Client side validation should be done for aesthetic and usability purposes only, not security.
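
    A minimal sketch of the gambling case, assuming a hypothetical bet message with player, amount and claimed_balance fields. None of these names come from a real system:

```python
def client_validate(bet: dict) -> bool:
    """Runs on the client: fine for usability, trivial to bypass or forge."""
    return 0 < bet["amount"] <= bet["claimed_balance"]


def server_place_bet(bet: dict, balances: dict) -> bool:
    """Runs on the server: ignores client claims and checks its own records."""
    balance = balances.get(bet["player"], 0)
    amount = bet["amount"]
    if not isinstance(amount, int) or amount <= 0 or amount > balance:
        return False
    balances[bet["player"]] = balance - amount
    return True


balances = {"mallory": 50}
forged = {"player": "mallory", "amount": 1000, "claimed_balance": 1_000_000}

client_ok = client_validate(forged)             # the client-side check is fooled
server_ok = server_place_bet(forged, balances)  # the server refuses the bet
honest_ok = server_place_bet({"player": "mallory", "amount": 20}, balances)
print(client_ok, server_ok, honest_ok, balances)
```

    The client check can stay for a nicer UI, but the balance only ever changes inside server_place_bet.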

  • by Power Luser ( 751304 ) on Saturday February 21, 2004 @12:33AM (#8347227)
    Why does the moron get space on /. at all? Surely people can see the glaring errors, the ridiculous assertions and the "I'm at the center of the tech universe, so if I happen to have a half-baked idea about something then it must be so!" attitude that Cringely articles reek of.

    I feel dirty after reading them. God help the world if Enderle and Cringely ever start working together.

