Programming Security

Mystery of Duqu Programming Language Solved

wiredmikey writes "Earlier this month, researchers from Kaspersky Lab reached out to the security and programming community in an effort to help solve a mystery related to 'Duqu,' the Trojan often referred to as 'Son of Stuxnet,' which surfaced in October 2011. The mystery rested in a section of code written in an unknown programming language and used in the Duqu Framework, a portion of the Payload DLL used by the Trojan to interact with Command & Control (C&C) servers after the malware has infected a system. Less than two weeks later, Kaspersky Lab experts now say with a high degree of certainty that the Duqu Framework was written using a custom object-oriented extension to C, generally called 'OO C,' and compiled with Microsoft Visual Studio Compiler 2008 (MSVC 2008) with special options for optimizing code size and inline expansion."

  • Source Code? (Score:4, Insightful)

    by deemen ( 1316945 ) on Monday March 19, 2012 @11:57AM (#39403843)
    How did they deduce it was an unknown programming language? By looking at the compiled machine code? How could they tell this wasn't just regular C?
  • Re:Source Code? (Score:5, Insightful)

    by Baloroth ( 2370816 ) on Monday March 19, 2012 @12:09PM (#39403973)

    There are certain characteristics to the way C++ behaves (the manner in which you pass parameters, etc.). Mainly, through having looked at lots and lots of code samples, they can say what they expect the compiled code to look like. If they know C++ compiled code looks like x, regular C looks like y, and this looked like z, it can't be either of them. Essentially, the code did things you simply can't do in C++ or C (even Objective-C) by itself. The problem is, that method only allows you to compare to known languages.

    It's basically like identifying an animal by footprint. Once you know a deer leaves a certain kind of footprint, you can identify more deer by examining footprints. But you can't identify an unknown animal that way: if you haven't seen a given footprint before, you won't know what animal it is, only what general characteristics it has (weight, etc.)

  • Re:Source Code? (Score:5, Insightful)

    by Sarten-X ( 1102295 ) on Monday March 19, 2012 @12:15PM (#39404027) Homepage
    Knowing the language and techniques used can speed up analysis of future variants found, because they'll know what patterns to look for first.
  • by j33px0r ( 722130 ) on Monday March 19, 2012 @12:44PM (#39404405)


    Why did the authors of Duqu use OO C? While there is no easy explanation why OO C was used instead of C++ for the Duqu Framework, Kaspersky experts say there are two reasonable causes that support its use [More control over the code & Extreme portability]. These two reasons indicate that the code was written by a team of experienced ‘old-school’ developers

    Why OO C? Because it worked, because they knew how to use it, because they knew it would throw Kaspersky for a loop, because they thought it was cool. There are many, many reasons and they do not all have to be logical.

    Kaspersky experts might want to consider that the programming wheel of life may have turned and that what was once old-school is now new-school. Who's to say that the underestimated script-kiddies cannot grow up to be formidable adults with a whole new bag of tricks?
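For readers who have not seen the style, here is a minimal sketch of what "object-oriented C" usually means: plain structs that carry a pointer to a table of function pointers, with the object passed explicitly as the first argument. This is a generic illustration only, not a reconstruction of Duqu's code or of whatever custom extension its authors used. Every name in it (Shape, Rect, Square, shape_area, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "base class": every object begins with a pointer to a
   table of function pointers, which plays the role of a C++ vtable. */
typedef struct Shape Shape;

typedef struct ShapeVtbl {
    int (*area)(const Shape *self);   /* "virtual method" slot */
} ShapeVtbl;

struct Shape {
    const ShapeVtbl *vtbl;
};

/* "Derived classes": the base object is the first member, so a pointer
   to the derived struct can be converted to a pointer to the base. */
typedef struct Rect {
    Shape base;
    int w, h;
} Rect;

typedef struct Square {
    Shape base;
    int side;
} Square;

static int rect_area(const Shape *self)
{
    const Rect *r = (const Rect *)self;
    return r->w * r->h;
}

static int square_area(const Shape *self)
{
    const Square *s = (const Square *)self;
    return s->side * s->side;
}

static const ShapeVtbl rect_vtbl = { rect_area };
static const ShapeVtbl square_vtbl = { square_area };

/* "Constructors": install the method table, then set the fields. */
void rect_init(Rect *r, int w, int h)
{
    r->base.vtbl = &rect_vtbl;
    r->w = w;
    r->h = h;
}

void square_init(Square *s, int side)
{
    s->base.vtbl = &square_vtbl;
    s->side = side;
}

/* Dynamic dispatch: the caller only knows it has a Shape; the table
   decides which implementation actually runs. */
int shape_area(const Shape *self)
{
    assert(self != NULL && self->vtbl != NULL);
    return self->vtbl->area(self);
}
```

A caller would write something like `Rect r; rect_init(&r, 3, 4); shape_area(&r.base);`. Because the programmer, not the compiler, decides where the method table lives and how the object pointer is passed, the compiled output need not match the patterns a C++ compiler produces, which fits the "footprint" argument made earlier in the thread.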

  • Re:Source Code? (Score:2, Insightful)

    by UnknownSoldier ( 67820 ) on Monday March 19, 2012 @03:39PM (#39406511)
    I can tell you have never taught another programmer nor learned the benefits of reverse engineering so you can write better code! e.g. I used to work on a professional C/C++ compiler for consoles. Customers would sometimes ONLY provide assembly code and it was your job to figure out why the compiler was generating invalid code.

    Here is a perfect example -- a friend of mine was taking a CS course and the assembly code the prof provided was absolute shit -- a perfect example of how NOT to write code. I cleaned up the assembly code into properly commented assembly and then provided a mid-level source. By having the 3 versions to compare against, my friend was able to get a better handle on reading and writing assembly code, understand how a compiler would translate a mid-level language to a low-level language, learn some good commenting styles, etc.

    First, the original crap assembly provided by the Prof:
    0000 RD R5 Inpt // Read the no. of integers to be added from the input buffer
    0004 MOVI R6 0 // Set a counter to reg-6 and initialize to 0
    0008 MOVI R1 0 // Set the Zero register to its value
    000C MOVI R0 0 // Clear Accumulator
    0010 LDI R10 Inpt // Load address of input buffer into reg 10
    0014 LDI R13 Temp // Load address of temp buffer into reg 13
    0018 LOOP1: ADDI R10 4 // Point to the next address of input buffer by adding 4
    001C RD R11 (R10) // Load the content(data) of address in reg-10 in reg-11
    0020 ST (R13) R11 // Store the data in the address pointed to by reg-13
    0024 ADDI R13 4 // Point to the next address of temp buffer
    0028 ADDI R6 1 // Increment the counter
    002C SLT R8 R6 R5 // Set reg-8 to 1 if reg-6 < reg-5, and 0 otherwise
    0030 BNE R8 R1 LOOP1 // Branch if content of Reg- 8 and Reg-1 is not equal
    0034 MOVI R6 0 // Reset the counter to Zero
    0038 LDI R9 Temp // Loading the address temp into reg 9
    003C LOOP2: LW R7 0(R9) // Loads the content of the address in reg-9 in reg-7, reg-9 is B-reg. 0 is the offset
    0040 ADD R0 R0 R7 // Add the content of accumulator with reg-7 and stored in acc.
    0044 ADDI R6 1 // Incrementing the counter by 1
