Open Source KDE Software

Are You Sure This Is the Source Code? 311

Posted by timothy
from the not-as-simple-as-md5-sum dept.
oever writes "Software freedom is an interesting concept, but being able to study the source code is useless unless you are certain that the binary you are running corresponds to the alleged source code. It should be possible to recreate the exact binary from the source code. A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."
This discussion has been archived. No new comments can be posted.

  • Bogus argument (Score:5, Insightful)

    by Beat The Odds (1109173) on Thursday June 20, 2013 @02:22PM (#44063025)
    "Exact binaries" is not the point of having the source code.
    • Re:Bogus argument (Score:5, Informative)

      by Anonymous Coward on Thursday June 20, 2013 @02:29PM (#44063133)

      The guy who submitted that article is the person who wrote it. Awesome "work", editors.

      • Re:Bogus argument (Score:5, Insightful)

        by icebike (68054) on Thursday June 20, 2013 @02:54PM (#44063457)

        But to his credit, he did say a "simple analysis", although on reading TFA it seems he omitted the word "minded" from the middle of that phrase.

        Virtually all of his findings are traced to differences in date and time and chosen compiler settings and compiler vintage.
        Unless he can find large blocks of inserted code (not merely data segment differences) he is complaining about nothing.

        He is certainly free to compile all of his system from source, and that way he could be assured he is running
        exactly what the source said. But unless and until he reads AND UNDERSTANDS every line of the source he is
        always going to have to be trusting somebody somewhere.

        It's pretty easy to hide obfuscated functionality in a mountain of code (in fact it seems far too many programmers pride
        themselves on their obfuscation skills). I would worry more about the mountain he missed while staring at the
        mole-hill his compile environment induced.
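
The distinction drawn above, between a few bytes of date/time noise and large blocks of inserted code, can be made concrete by listing where two binaries actually differ. This is only a minimal sketch: the two byte strings are hypothetical stand-ins for real binaries, and `differing_ranges` is an illustrative helper, not part of any existing tool.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) byte ranges where a and b differ.

    Bytes are compared position by position over the common length; a
    length mismatch is reported as one final range covering the extra bytes.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges


# Hypothetical "binaries": same code, different embedded build date.
build_one = b"\x7fELF...code...built 2013-06-19...more code"
build_two = b"\x7fELF...code...built 2013-06-20...more code"

for start, end in differing_ranges(build_one, build_two):
    print(start, end, build_one[start:end], build_two[start:end])
```

A couple of short ranges sitting inside a date string is the mole-hill; long differing runs in the code sections would be the thing to worry about. Note that a positional comparison like this stops being meaningful once bytes are inserted or removed, since everything after the insertion shifts.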

        • Re:Bogus argument (Score:5, Informative)

          by Lumpy (12016) on Thursday June 20, 2013 @03:07PM (#44063641) Homepage

          There are very talented people that can hide things in only a few lines of code. See http://ioccc.org/ [ioccc.org] for some examples that will make your skin crawl.

          • Re:Bogus argument (Score:5, Informative)

            by frost_knight (885804) <winter@frostmarch.com> on Thursday June 20, 2013 @03:47PM (#44064061) Homepage

            For true malice there's also The Underhanded C Contest [xcott.com].

            From their home page: "The goal of the contest is to write code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform at its apparent function. To be more specific, it should do something subtly evil."

            • by Rakarra (112805)

              Oh man. I sort of feel sorry for the runner-up of the 2009 contest. From the evaluation: "The bug is plausibly deniable as poor coding, and rests on your caffeine-addled inability to notice a ‘0’ instead of a ‘\0’ when testing for end-of-string. The comparison in safe_strcmp has unnecessary terms, which achieves two evil goals: first, it sets up a pattern that fools your eyes, and second, it looks just amateurish enough that the bug, if found, looks like a sophomoric mistake rather th

          • Re:Bogus argument (Score:4, Insightful)

            by donaldm (919619) on Thursday June 20, 2013 @06:58PM (#44065739)

            There are very talented people that can hide things in only a few lines of code. See http://ioccc.org/ [ioccc.org] for some examples that will make your skin crawl.

            True, but any programmer who works in a professional way should document their code so that it is maintainable. Programmers who think their code should be hard to read because that is a good way of keeping their job eventually come down to earth with a thud when their manager tells them, "The door is over there, please watch your fingers on the way out". Hard-to-read code is usually thrown out and a fresh start made, since that is sometimes much quicker, especially if the System Designer (not the programmer) has documented the concept properly. On a more serious note, companies that don't have a well-documented overview design and well-documented code are asking for trouble down the line.

        • Re:Bogus argument (Score:5, Informative)

          by Andy Dodd (701) <atd7&cornell,edu> on Thursday June 20, 2013 @03:14PM (#44063735) Homepage

          Yeah. Unfortunately, the issues he presents here DO make it more difficult to prove that someone is providing a binary that could NOT have possibly originated from the provided source code.

          As an example, the kernel source initially released for the Samsung GT-N8013 (USA Wifi Note 10.1) was not what was used to build the binaries in question.

          The "difficult to prove but obvious" part: any kernel built from the provided source had a massively broken wifi driver that would completely stop functioning, usually within 5-10 minutes, requiring the module to be removed and reinserted. Pulling the wifi module source from a different Samsung tarball (such as a GT-I9300 release) would result in a working driver. But how do you prove the source provided is correct?
          In the case of the N8013, we were lucky - Samsung changed a bunch of debug printk()s slightly in their released binary. Small stuff, not functionally relevant, such as typo fixes and capitalization differences in their touchscreen driver's debug printk()s - but at least provable to be different.

          So we could prove that the kernels didn't match, but couldn't necessarily prove that the biggest functional problem was due to a source difference.

          We asked Samsung to provide source that corresponded to the UEALGB build for that device, and their response was, "That build is a leak and hence we are not obligated to provide source for it." Effectively admitting that the provided source was not meeting the requirements imposed by the GPL for that build, and then claiming that the software build preinstalled on every device sold in the USA for the first 1-2 months after launch was a "leak" and thus they didn't have to provide source for it.

          Needless to say, between that and other situations, that was my last Samsung device.

        • Re:Bogus argument (Score:5, Informative)

          by Hatta (162192) on Thursday June 20, 2013 @03:20PM (#44063815) Journal

          But unless and until he reads AND UNDERSTANDS every line of the source he is
          always going to have to be trusting somebody somewhere.

          Even if he reads and understands every line of the source, he's still trusting someone. He has to read and understand every line of the source code of the compiler he is using, and the compiler that compiled that compiler, and so on.

          Reflections on trusting trust [bell-labs.com] is almost 30 years old now. It should be well known.

      • by briancox2 (2417470) on Thursday June 20, 2013 @02:59PM (#44063527) Homepage Journal
        This looks like the shortest, most concise piece of FUD I've ever seen.

        I wonder if next week I could get a story published that says, "I don't know if Microsoft is spying on you through your webcam. So it could be true."
    • Re:Bogus argument (Score:5, Insightful)

      by CastrTroy (595695) on Thursday June 20, 2013 @02:33PM (#44063207) Homepage
      Ok, maybe not exact binaries, but what if you can't even make a binary at all, or if you do make one, how do you ensure it's functioning the same? That's the problem that many people have with open source code written in languages that you can only compile with a proprietary compiler. Take .Net for instance. It's possible to write a program that is open source, and yet you're at the mercy of Microsoft to be able to compile the code. Even when I download Linux packages in C, it's often the case that I can't compile them, because I'm missing some obscure library that the original developer just assumed I had. "What good is code if you are unable to compile it" is right up there with "what use is a phone call if you are unable to speak". Some code only works with certain compilers, or with certain flags turned on in those compilers. Simply having the source code doesn't mean you have the ability to actually use the source code to make bug fixes should the need arise.
      • Re: (Score:3, Insightful)

        by Anonymous Coward

        To borrow from The Watchmen:

        Who compiles the compiler?

      • Re:Bogus argument (Score:5, Insightful)

        by ZahrGnosis (66741) on Thursday June 20, 2013 @02:51PM (#44063417) Homepage

        If you're worried about the lineage of a binary then you need to be able to build it yourself, or at least have it built by a trusted source... if you can't, then either there IS a problem with the source code you have, or you need to decide if the possible risk is worth the effort. If you can't get and review (or even rewrite) all the libraries and dependencies, then those components are always going to be black-boxes. Everyone has to decide if that's worth the risk or cost, and we could all benefit from an increase in transparency and a reduction in that risk -- I think that was the poster's original point.

        The real problem is that there's quite a bit of recursion... can you trust the binaries even if you compiled them, if you used a compiler that came from binary (or Microsoft)? Very few people are going to have access to the complete ground-up builds required to be fully clean... you'd have to hand-write assembly "compilers" to build up tools until you get truly useful compilers then build all your software from that, using sources you can audit. Even then, you need to ensure firmware and hardware are "trusted" in some way, and unless you're actually producing hardware, none of these are likely options.

        You COULD write a reverse compiler that's aware of the logic of the base compiler and ensure your code is written in such a way that you can compile it, then reverse it, and get something comparable in and out, but the headache there would be enormous. And there are so many other ways to earn trust or force compliance -- network and data guards, backups, cross validation, double-entry or a myriad of other things depending on your needs.

        It's a balance between paranoia and trust, or risk and reward. Given the number of people using software X with no real issue, a binary from a semi-trusted source is normally enough for me.

        • Re:Bogus argument (Score:4, Insightful)

          by CastrTroy (595695) on Thursday June 20, 2013 @03:01PM (#44063555) Homepage
          I'm not really even talking from a trust point of view, but more the other point of open source software, which is, "if there's a bug in the code, you can fix it yourself". Without even going down that whole tangent of recursively verifying the entire build chain, there's the problem of being able to even functionally compile the source code so that you can make fixes when you need to.
      • Re:Bogus argument (Score:5, Insightful)

        by oGMo (379) on Thursday June 20, 2013 @03:00PM (#44063547)

        Simply having the source code doesn't mean you have the ability to actually use the source code to make bug fixes should the need arise.

        And yet, it still means that you can fix it, or even rewrite it in something else, if you want. Not having the source code means this is between much-more-difficult and impossible. The lesson here should be that everything we use should be open source, including compilers and libraries, not "well in theory I might have problems, so screw that whole open source thing .. proprietary all the way!"

      • by houghi (78078)

        how do you ensure it's functioning the same?

        If you have any reason to doubt that it might not do what it says it does, do not use the binary, but compile it yourself.
        If you are unable to compile it AND you do not trust the binary, don't run it. Read and rewrite the code so it does work.
        If you are unable to rewrite the code, do not trust the binary AND are unable to compile. Look for an alternative.
        Or hire somebody who is able to write the code for you so that you are able to read it, compile it and change

      • Bad choice of target (Score:5, Informative)

        by ray-auch (454705) on Thursday June 20, 2013 @04:25PM (#44064475)

        Bad choice of target - .Net does actually have multiple compilers available, including open source. But more to the point for this discussion, it has multiple DEcompilers available, including open source.

        Want to know what that nasty MS compiler put in your .Net binary ? - run it through ILSpy.

        Don't trust the ILSpy binary - decompile it with itself, or with a.n.other decompiler.

        In fact, because .Net decompiles so well, the problem of this article (binaries don't compare) just doesn't occur. Want to check your .Net binary against the supposed source ? - easy (well, a hell of a lot easier than with C++). Build your binary from the source, decompile both binaries and compare the two sets of decompiled source. It works, it is consistent and reliable, and it is one hell of a lot more useful at showing up differences than comparing two binaries.
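
The last step described above (compare the two sets of decompiled source) can be sketched with Python's standard `difflib`. The decompilation itself is not shown, and no decompiler command line is assumed here: the two strings below are made-up stand-ins for what a decompiler might emit for the official binary and for your own build, and `source_diff` is an illustrative helper name.

```python
import difflib


def source_diff(decompiled_official: str, decompiled_local: str) -> list[str]:
    """Return unified-diff lines between two decompiled source texts."""
    return list(difflib.unified_diff(
        decompiled_official.splitlines(),
        decompiled_local.splitlines(),
        fromfile="official-binary",
        tofile="local-build",
        lineterm="",
    ))


# Hypothetical decompiler output: the official binary contains a call
# that the build from the published source does not.
official = "int Add(int a, int b)\n{\n    Log(a, b);\n    return a + b;\n}\n"
local = "int Add(int a, int b)\n{\n    return a + b;\n}\n"

for line in source_diff(official, local):
    print(line)
```

An empty diff means the two decompiled texts are identical; anything else points at the exact lines to go and read.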

    • by lgw (121541)

      "Exact binaries" is not the point of having the source code.

      The use case is "we're using this binary in production, which we didn't build ourselves". That's how open source is generally used in practice, after all - you download the binaries for your platform, and you (maybe) archive the source away somewhere just in case.

      Isn't that the strongest practical use case for Open Source in the business world? Sure, you don't plan on maintaining it yourself but you could if you have to. The problem is, if the source doesn't match the object, you can't just fix a bug - y

      • by Chuckstar (799005)

        No. The strongest practical use case for Open Source in business is that the Open Source version is some combination of better/cheaper than alternate versions, with "better" including the fact that Open Source projects often get updated faster when security bugs (and sometimes other bugs) are found. The possibility of bringing development fully in-house is not a practical solution for 99.99% of businesses. (I'm exaggerating a little, but not much).

    • by jythie (914043)
      Yeah.. it really strikes me that the person is exaggerating the importance of a narrow set of use cases. Reproducible builds are nice, and in some cases important, and in an ideal case compiling should be sufficiently deterministic that one should be able to recreate any given binary, but I would not say that is the 'point' of having access to source code.
    • by OakDragon (885217)
      Already at 5, Insightful, so please enjoy this virtual "+1 Insightful"...
    • Re:Bogus argument (Score:5, Interesting)

      by Aaron B Lingwood (1288412) on Thursday June 20, 2013 @03:07PM (#44063631)

      "Exact binaries" is not the point of having the source code.

      You are correct. However, it is a method to confirm that you have received the entire source code.

      The point being made is that a binary could always contain functions that are malicious, buggy or infringe on copyright while the supplied source does not.

      Case Study:

      A software company (let's call them 'Macrosift') takes over project management of a GPL'd document conversion tool. Macrosift contribute quite a bit of code and the tool really takes off. Most users are obtaining this tool from either the Macrosift-controlled repository or a Macrosift partner-controlled repository as a pre-compiled binary. It can even convert all kinds of documents flawlessly into Macrosift's Orifice 2015 new extra standard format which no other tool seems to be able to do.

      Newer versions of OpenOffice, LibreOffice, JoeOffice come out and this tool just doesn't seem to be doing the job. Sure, it converts perfectly from everything into MS .xsf but doesn't work so well the other way and won't work at all between some office suites. The project gets forked by the community to make it feature complete. The project managers start by compiling the source, and to their surprise, the tool will not work as well as the binary did. After a year passes, the community realizes they've been had. By painstakingly decompiling the binary, they discover that the function that converts to MS proprietary .xsf is different from that in the source. Another hidden function is discovered in the binary that introduces errors and file bloat after a certain date if the tool is being used solely on non-MS documents.

      How else can I ascertain whether you have supplied me with THE source code for THIS binary if I can not produce said binary with provided source code?

      • Re:Bogus argument (Score:5, Informative)

        by mrogers (85392) on Thursday June 20, 2013 @05:08PM (#44064825)

        The latest alpha release [torproject.org] of the Tor Browser [torproject.org] uses a deterministic build process for exactly that reason: users of open source software (or the small minority of users with the necessary technical skills) should be able to check that the published binaries match the published source exactly - no malware, no easter eggs, no backdoors. If someone detects a mismatch, they can alert the rest of the community.

        Mike Perry, who spent six weeks getting deterministic builds working for Tor, has some interesting thoughts [stanford.edu] on why this is an important issue for security tools, even if the users completely trust the developers.

        I'd like to see more open source projects following Tor's lead. Gitian [gitian.org] is a deterministic build tool that might help - it enables multiple people to build a binary from the same source and check that they get identical results.
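
The "multiple people build and check that they get identical results" idea can be sketched in a few lines. This is not how Gitian itself is implemented; it is only an illustration of the comparison step, assuming each builder publishes the SHA-256 of their output. The builder names and byte strings are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a build output."""
    return hashlib.sha256(data).hexdigest()


def dissenting_builders(reported: dict[str, str]) -> list[str]:
    """Return the builders whose digest differs from the most common one.

    Agreement is only a signal, not proof: a mismatch says someone should
    investigate, not who is right.
    """
    if not reported:
        return []
    majority_digest, _ = Counter(reported.values()).most_common(1)[0]
    return sorted(name for name, digest in reported.items()
                  if digest != majority_digest)


# Hypothetical reports from three independent builders.
reports = {
    "alice": sha256_hex(b"browser build output"),
    "bob": sha256_hex(b"browser build output"),
    "carol": sha256_hex(b"browser build output plus something extra"),
}
print(dissenting_builders(reports))
```

With a deterministic build process the list should be empty; any name in it is the cue to alert the rest of the community.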

    • Re:Bogus argument (Score:5, Informative)

      by aristotle-dude (626586) on Thursday June 20, 2013 @03:55PM (#44064131)

      "Exact binaries" is not the point of having the source code.

      Uh, you must not have worked in a shop that does continuous integration automated builds? Do you really think QA should be handed binaries that you compile and have them trust them?

      The problem is that GCC will always give you a different binary every time you compile from the same source. This makes it impossible to verify that the binary you received comes from the source you claim to have used. You can get around this by never accepting binaries from anywhere but the automated build machine, but it would still be useful to be able to test that a build you received was built from the code you expect.

      There were several reasons why Apple moved away from the GCC tool chain to LLVM and Clang but one of the abilities of the LLVM stack is that you can actually get identical binaries from the same source compiled on different machines at different times.

      • by c++0xFF (1758032)

        First off, [Citation Needed]. This is simply not true from my experience. I've done this many times with GCC and produced identical output (or so diff says). One caveat: make sure you start from a clean directory structure each time, because your Makefile might list dependencies in different orders for the linker if not everything is recompiled, and I think that can produce different result. But this is the build system presenting different input to the compiler, not the compiler itself producing differ

      • Re: (Score:3, Informative)

        by gawbl (941021)

        I used to work on GCC, and the randomness you describe would have made it impossible to find bugs.

        GCC is deterministic. If you feed it the same input and launch it with the same options, it generates the same output. GCC developers would never tolerate random behavior.

        Is it possible that you have address randomization turned on in your OS? I used to use watchpoints & similar in the heap, and this would only work if randomization (ASLR/PAX) is disabled.
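
The "same input, same options, same output" claim is easy to test for any build step: run it several times and compare digests. A rough sketch follows, where `build` is any callable returning the output bytes. Both builders here are hypothetical stand-ins (no compiler is actually invoked); a real one might run the compiler through `subprocess` and read back the output file.

```python
import hashlib
import itertools
from typing import Callable


def is_reproducible(build: Callable[[], bytes], runs: int = 3) -> bool:
    """True if every run of `build` yields byte-identical output."""
    digests = {hashlib.sha256(build()).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def deterministic_build() -> bytes:
    # Stand-in for a compiler fed the same source and the same flags.
    return b"compiled output for fixed source and flags"


_build_counter = itertools.count()


def stamped_build() -> bytes:
    # Stand-in for a build whose *input* changes between runs, e.g. an
    # embedded timestamp or a different link order from the build system.
    return b"compiled output, build #" + str(next(_build_counter)).encode()


print(is_reproducible(deterministic_build))
print(is_reproducible(stamped_build))
```

If a check like this fails for a real build, the parent comments suggest looking at what the build system feeds the compiler before blaming the compiler itself.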

    • Re:Bogus argument (Score:4, Insightful)

      by UnknownSoldier (67820) on Thursday June 20, 2013 @03:59PM (#44064173)

      Exactly.

      I've recompiled Vim because I wanted to fix Vim's broken design of being unable to distinguish between TAB and Ctrl-I, doesn't support CapsLock remap, and wanted a smaller executable not needing all the bells and whistles of the kitchen sink.

      I've recompiled Notepad++ due to a bug (couldn't select a font smaller than 8 pts because the array was hard-coded in two different places. WTF?)

      If you want to be able to quickly tell the quality of an open source project, see how easy it is to follow the directions to even produce an executable. Most open source projects have shitty docs on how to even compile it.

    • by AJWM (19027)

      Sometimes it's exactly the point of having the source code.

      Take voting machines for example. I used to work for a company that certified same. This involved obtaining everything that the vendor didn't write (compilers, OS, libraries, etc) from the 3rd party vendors (Microsoft, etc) including Linux from Scratch for the linux-based systems, then compiling it all (thus creating a "trusted build") and comparing the binaries.

      No exact match, no certification. (This was after the vendor's source code went thro

  • by intermodal (534361) on Thursday June 20, 2013 @02:24PM (#44063047) Homepage Journal

    Given the scale of most modern programs' codebase, good luck actually reviewing the code meaningfully in the first place. That said, if you're really that concerned about the code matching the source, run a source-based distro like Gentoo or Funtoo. For most practical purposes, though, users find binary distributions like Debian/Ubuntu or the various Red Hat-based systems to be more effective in regards to their time.

    • We frequently discover a bug and need to fix it without upversioning the whole package (which could result in other incompatibilities with the rest of the system).

      So we track down the code for the version we're using, get it building from source with suitable config options, and then fix the bug. In the simple case the bugfix is present in a later version and we can just backport it. In the tricky case you need to get familiar enough with the code to fix it (and hopefully in a way that the upstream mainta

    • You can also get the source packages from debian/ubuntu and compile it yourself, all in one command:
      apt-get -b source packagename

      Source debs have also the good habit of putting the modifications to the upstream package in a separate diff.

  • by Chrisq (894406) on Thursday June 20, 2013 @02:24PM (#44063049)
    If you are that paranoid study the source code then recompile
    • by gl4ss (559668)

      If you are that paranoid study the source code then recompile

      yeah if he is bothering to read through it he should quite easily be bothered enough to compile it as well.. that's what he was going to do anyhow to compare.

      also, you could clone the compile chain of popular linux distros as well, without fuss. it's not like they hide their build system behind closed doors.

    • by Lumpy (12016)

      If you are truly paranoid you write it yourself.

    • by MrEricSir (398214)

      If you're really going to be paranoid, how do you know your machine isn't compromised? I hope you're doing a bit-for-bit comparison on your hard drive twice a day to make sure there are no file changes you didn't approve, and that you've desoldered the top off your CPU and put it under a high power microscope to ensure the circuits haven't been changed.

  • touch o' hyperbole (Score:5, Insightful)

    by ahree (265817) on Thursday June 20, 2013 @02:27PM (#44063099)

    I'd suggest that "severely limiting the whole point of running free software" might be a touch of an exaggeration. A huge touch.

    • by gmuslera (3436)

      It's a big point anyway: independent auditing. That someone, somewhere, could say that the binary that my distribution gave me had a backdoor instead of the code they published (i.e. because they were forced by law to do so and not disclose it), and that I could even check or rebuild it. With closed source you don't have that freedom; it is even against the law to try to find that out. And in the current US-pushed cyberwar state of things (they are trying this kind of thing already [slashdot.org]), to have the possibility of independent auditing of

  • by Microlith (54737) on Thursday June 20, 2013 @02:28PM (#44063123)

    A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."

    No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.

    • by Shoten (260439) on Thursday June 20, 2013 @02:48PM (#44063375)

      A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."

      No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.

      There's another point too...which incidentally is the whole point of running a distro like Gentoo...that you can compile the binary exactly to your specifications, even sometimes optimizing it for your specific hardware. I don't get at all this idea he has about "reproducible builds;" if he builds the same way on the same hardware, he'll get the same binary. But what he's doing is comparing builds in distros with ones he did himself...and the odds that it's the same method used to create the binary are very low indeed.

      If he's concerned about precompiled binaries having been tampered with, he's looking at the wrong protective measure. Hashes and/or signing are what is used to protect against that...not distributing the source code alongside the compiled binary files. If you look at the source code and just assume that a precompiled binary must somehow be the same code "just because," you're an idiot.
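
For reference, the hash check mentioned above amounts to very little code. A minimal sketch, assuming the publisher lists a SHA-256 digest next to the download; the file contents and digests here are made up, and `matches_published` is an illustrative name.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunks so large binaries are fine."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """Compare the file's digest with the digest the publisher listed."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())


# Hypothetical download and published digests.
with tempfile.TemporaryDirectory() as tmp:
    binary = Path(tmp) / "tool.bin"
    binary.write_bytes(b"pretend this is a downloaded binary")
    published = hashlib.sha256(b"pretend this is a downloaded binary").hexdigest()
    ok = matches_published(binary, published)
    tampered = matches_published(binary, "0" * 64)

print(ok, tampered)
```

A match only tells you the file is the one the publisher hashed. It says nothing about whether that file was built from the published source, which is the separate question the article is asking.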

      • by cecom (698048)

        The whole point is that the distro build is supposed to be 100% reproducible, with the exception of things like timestamps and signatures. And it is with Debian, as he found out. But not the other distros he tried. And that is a real problem.

        Why? naive people might ask. Because that is the only way to verify that a binary is what it claims to be. And it is the only way to reliably support and diagnose something. It is shocking how few people on Slashdot realize that.

  • Not a concern (Score:5, Insightful)

    by gweihir (88907) on Thursday June 20, 2013 @02:29PM (#44063141)

    If you need to be sure, just compile it yourself. If you suspect foul play, you need to do a full analysis (assembler-level or at least decompiled) anyways.

    The claim that this is a problem is completely bogus.

  • by erroneus (253617) on Thursday June 20, 2013 @02:30PM (#44063153) Homepage

    It's a fair argument. If you are not compiling your binaries, how do you know what you have is compiled from the source you have available?

    Truth? You don't. If you suspect something, you should investigate.

    • by SirGarlon (845873)

      If you suspect something, you should investigate.

      And on an open-source OS, you can.

    • by jdunn14 (455930)

      Sorry to tell you, but Ken Thompson talked about how you pretty much have to trust someone back in 1984: http://cm.bell-labs.com/who/ken/trust.html [bell-labs.com]

      If no one else, you have to trust the compiler author isn't pulling a fast one on you....

    • It's a fair argument. If you are not compiling your binaries, how do you know what you have is compiled from the source you have available?

      Truth? You don't. If you suspect something, you should investigate.

      You're right, of course. But that's not quite the (non) argument he was making, I think.
      My understanding was that he wanted to check how easy it was to get the same result if he compiled the publicly available source and compared it to the objects.
      Turns out the results are slightly different due to datestamps etc., but no biggie.

      Anyway, in a production environment you should be compiling from source, since - security concerns aside - that's the only way to be sure you've got the correct source for your objects.

  • by tooslickvan (1061814) on Thursday June 20, 2013 @02:32PM (#44063167)
    I have recompiled all my software from the source code and verified that the binaries match but for some reason there's a Ken Thompson user that is always logged in. How did Ken Thompson get into my system and how do I get rid of him?
    • by tepples (727027)

      I have recompiled all my software from the source code and verified that the binaries match

      How many different compilers did you use? Did you try any cross-compilers, such as compilers on Linux/ARM that target Windows/x86 or vice versa?

      How did Ken Thompson get into my system

      See bunratty's comment [slashdot.org].

      and how do I get rid of him?

      See replies to bunratty's comment.

  • by jedidiah (1196) on Thursday June 20, 2013 @02:32PM (#44063173) Homepage

    > Are You Sure This Is the Source Code?

    Yes. Yes I am sure. I built it myself. It even includes a few of my own personal tweaks. It does a couple of things that the normal binary version doesn't do at all.

    • But given that the optimization phase of compiling/building can be significant, and there are lots of different optimization options, why wouldn't you be better off leaving that up to the code maintainers?

  • by vikingpower (768921) <exercitussolus&gmail,com> on Thursday June 20, 2013 @02:33PM (#44063185) Homepage Journal
    1) Submitter is the one who wrote the blog post 2) No cross-reference, no references, no differing opinions at all 3) "severely limiting the whole point of running free software" is more than a bit of an exaggeration
    • I honestly don't understand the blog post, I'm not severely limited in any way. I somehow feel the user doesn't even know how to compile software and doesn't know anything about Open Source. It doesn't matter if the binary is the same, maybe his is compiled with different flags than mine or maybe I added a patch.

      This honestly smells of someone out to discourage usage of Open Source.

      • by arth1 (260657)

        This honestly smells of someone out to discourage usage of Open Source.

        Please run this statement through Hanlon's Razor.

        (Or, to put it another way, I don't think you're deliberately misleading when you use the word "honestly". The alternative is much more likely.)

  • Trust (Score:5, Insightful)

    by bunratty (545641) on Thursday June 20, 2013 @02:33PM (#44063193)

    I took a graduate-level security class from Alex Halderman (of Internet voting fame) and what I came away with is that security comes down to trust. To take an example, when I walk down the street, I want to stay safe and avoid being run over by a car. If I think that the world is full of crazy drivers, the only way to be safe is to lock myself inside. If I want to function in society, I have to trust that when I walk down the sidewalk that a driver will not veer off the road and hit me.

    When you order a computer, you simply trust that it doesn't have a keylogger or "secret knock" CPU code installed at the factory. It's exactly the same with software binaries, of course. In the extreme case, even examining all the source code will not help [win.tue.nl]. You must trust!

    • by jdunn14 (455930)

      So very true. In the end it all comes down to trust and as I posted above (before noticing yours) Thompson explained it extremely well.

    • Maybe those just aren't good examples, but both have way more than simple trust involved. There's a huge disincentive to perpetrate either of those actions. In the case of a driver, there's car repairs, court costs, plus the downstream effects; running down a pedestrian, especially one on a sidewalk is a life altering action that no sane individual would perform just on a lark. In the case of an insecure computer, the company would be ruined if it came out that they were doing this to all the systems it
    • by mbone (558574)

      You have an odd notion of trust. And of security, for that matter.

      Blindly trust nothing except the laws of physics. Everything else is subject to investigation and verification. Just because verification is difficult or may fail is no excuse for not trying. By being vigilant, you can approach security, although you will never fully get there.

      When I walk down the sidewalk, for example, I pay attention to the surroundings. How much attention is based on prior experience and knowledge of how likely drivers (

    • Re:Trust (Score:4, Interesting)

      by Kjella (173770) on Thursday June 20, 2013 @03:30PM (#44063903) Homepage

      So your argument is that there will always be risk, so there's no point in managing or minimizing it? To continue your car analogy, even if I'm at a pedestrian crossing I don't really trust cars to stop and I always throw a glance to make sure they've noticed me. An uncle of mine was witness to a horrible accident, old lady got run over in broad daylight in the middle of a well-marked crossing, perpetrator was an old half-blind fool who should have lost his license already or had and didn't care. Doesn't help the old lady one bit no matter how much they punish him anyway. You always trust lots of people: you trust the factory that built the brakes on your car and the mechanic who serviced them, and you trust the people who built the bridge that it won't collapse out from under you, but only because you lack any other practical alternative.

      With software you do have more and better choices, not perfect choices but it's a helluva lot harder for the NSA to place a spy bug in Linux than in Windows where they can just show up with a national security letter that is both instructions and gag order and violating either can land you in jail. If there are reasonable ways to prove that these are the exact versions and compiler settings used to produce this binary, then that is much stronger than trust. Trust is something that can be betrayed, while reproducible steps are something you can verify. In science, if one scientist told you "here are the steps of my experiment, feel free to reproduce my results" and the other said "I can't show you the data but the results are correct, trust me", who would you trust?

  • by 0dugo0 (735093) on Thursday June 20, 2013 @02:33PM (#44063199)

    ..are a bitch. The number of hoops that e.g. the Bitcoin developers jump through to prove they didn't mess with the build is large. Running a specific OS build in emulators with fake system time and whatnot. No easy task.

  • by RichMan (8097) on Thursday June 20, 2013 @02:37PM (#44063247)

    I do IC design. Logical Equivalency Checking is a well-worn tool. You can futz about with the logic in a lot of different ways. LEC means we can do all sorts of optimization and still guarantee equivalent function. We can even move logic from cycle to cycle and have it checked that things are logically equivalent.

    You run two compilers on the same source code, you won't get the same code. You run two different versions of the compiler on the same code, you won't get the same code. You run the same compiler with different options, you won't get the same code. They should however all be logically equivalent.
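To make the idea of "different code, same function" concrete, here is a minimal sketch in Python. It is not how real LEC tools work (they do not enumerate inputs like this); it is a brute-force truth-table comparison of two hypothetical implementations of a 2-to-1 multiplexer. The function names `mux_reference`, `mux_optimized`, and `logically_equivalent` are made up for illustration.

```python
from itertools import product

def mux_reference(sel, a, b):
    """Reference form: a 2-to-1 multiplexer written as an if/else."""
    return a if sel else b

def mux_optimized(sel, a, b):
    """Restructured form: the same multiplexer built from AND/OR/NOT."""
    return (sel and a) or ((not sel) and b)

def logically_equivalent(f, g, n_inputs):
    """Compare f and g on every combination of n_inputs boolean inputs.

    Returns (True, None) if they always agree, otherwise
    (False, first_input_tuple_where_they_differ).
    """
    for inputs in product([False, True], repeat=n_inputs):
        if bool(f(*inputs)) != bool(g(*inputs)):
            return False, inputs
    return True, None

equivalent, counterexample = logically_equivalent(mux_reference, mux_optimized, 3)
print(equivalent, counterexample)
```

Exhaustive enumeration only scales to a handful of inputs, which is why this is a toy. The point is just that equivalence is checked on behavior, not on the bytes of the implementation.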

  • Unless I'm missing something pretty profound, even having the exact *source* won't always result in the exact binary. My understanding (and I could be wrong about this) is that you can take a well written program and plug it into multiple compilers. GCC may be one of the most popular options, but it's not the only one.

    But compilers all optimize differently. GCC 3.x optimizes somewhat differently than GCC 4.x. You can tweak this behavior by manually setting compiler flags, or you can compile binaries that ex

  • by SuperBanana (662181) on Thursday June 20, 2013 @02:51PM (#44063415)

    This is a problem that doesn't exist. You establish a chain of evidence and authority for the binaries via signing and checksums, starting with the upstream. Upstream publishes source and there's signing of the announcement which contains checksums. Package maintainer compiles the source. The generated package includes checksums. Your repo's packages are signed by the repo's key.

    You can, at any point in time with most packaging systems, verify that every single one of your installed binaries' checksums match the checksums of the binaries generated by the package maintainer.

    If you don't trust the maintainer to not insert something evil, download the distro source package and compile it yourself.

    If you suspect the distro source package, all you have to do is run a checksum of the copy of the upstream tarball vs the tarball inside the source package, and then all you need to do is review the patches the distro is applying.

    If you suspect the upstream, you download it and spend the next year going through it. Good luck...
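The tarball comparison step above can be sketched in a few lines of Python. This is only an illustration, assuming you already have both files on disk; the helper names `sha256_of_file` and `same_contents` are hypothetical, SHA-256 is my choice of hash, and nothing here checks a signature, which is a separate step.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_contents(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

You would call `same_contents` with the path to the upstream tarball and the path to the tarball extracted from the distro source package (both paths are whatever you downloaded, nothing standard). A match says the bytes are identical; it says nothing about whether the upstream itself is trustworthy.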

  • by ElitistWhiner (79961) on Thursday June 20, 2013 @03:03PM (#44063581) Journal

    Finally, someone gets it. The backdoor is never where you're looking for it.

  • by mrr (506) on Thursday June 20, 2013 @03:05PM (#44063613)

    I work in the gaming (Gambling) industry.

    Many states require us to submit both the source code and build tools required to make an exact (and I mean 'same md5sum') copy of the binary that is running on a slot machine on the floor.. to an extent that would blow you away.

    They need to be able to go to the floor of a casino, rip out the drive or card containing the software, take it back to THEIR office, and build another exact image of the same drive or SD card.

    md5sum from /dev/sda and /dev/sdb must match.

    I can tell you the amount of effort that goes into this is monumental. There can be no dynamically generated symbols at compile time. The files must be built, compiled, and written to disk exactly the same every time. The filesystem can't have modify or creation times because those would change.

    This is a silly idea for open source software, the only industry I've seen apply it is perhaps the least-open one in the world.
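To give a feel for the determinism requirements described above, here is a small Python sketch, not the process any gaming vendor actually uses. It packs files into an uncompressed tar archive with sorted entries and fixed timestamps, owners, and modes, so that two runs over the same contents produce byte-identical output. The function name `build_deterministic_tar` and the sample file names are hypothetical, and it compares with SHA-256 rather than md5sum.

```python
import hashlib
import io
import tarfile

def build_deterministic_tar(files, fixed_mtime=0):
    """Pack {name: bytes} into an uncompressed tar with normalized metadata."""
    buffer = io.BytesIO()
    # USTAR avoids pax extended headers, keeping the layout simple.
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):          # fixed entry order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = fixed_mtime        # no real timestamps
            info.mode = 0o644               # fixed permissions
            info.uid = info.gid = 0         # no builder-specific owner
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

# Same contents supplied in a different order still give the same bytes.
first = build_deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
second = build_deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
```

A real disk image has many more sources of variation (compiler output, filesystem layout, build paths), so this only illustrates the general approach: pin down every piece of metadata that would otherwise differ between runs.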

  • by taara (687286) on Thursday June 20, 2013 @03:10PM (#44063685)
    One example being Philips TV or BluRay built on Linux. When asked for source code, it is provided, but there is no way to ensure that the source code is for the device, because the provided binaries are encrypted and signed.
  • What difference does it make?

    Do you think you're smart enough to detect tampering by reading source code?

    To detect tampering run strings on the binary and pipe it to grep. If the following string appears 1.3.6.1.4.1.981 you are fucked.
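For anyone without `strings` and `grep` at hand, the check above can be approximated in Python. This is a rough sketch: the helper names `printable_strings` and `contains_marker` are made up, and the pattern only covers printable ASCII runs of four or more characters, which is an approximation of what `strings` reports by default.

```python
import re

def printable_strings(data, min_length=4):
    """Yield runs of printable ASCII found in a bytes object."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    for match in re.finditer(pattern, data):
        yield match.group().decode("ascii")

def contains_marker(data, marker):
    """True if any printable run in data contains the marker text."""
    return any(marker in text for text in printable_strings(data))
```

You would read the binary with `open(path, "rb").read()` and pass the result together with the marker string, e.g. `contains_marker(data, "1.3.6.1.4.1.981")`. It only finds the marker if it is stored as plain ASCII text in the file.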

  • by WOOFYGOOFY (1334993) on Thursday June 20, 2013 @03:44PM (#44064033)

    Not only is it limited in that way, which itself is an interesting fact, but it's limited in a lot of other ways also.

    For one, source code is often bad, as in impenetrable, just off the top of my head-

    * Realms of private, non-API / SPI code which is effectively *how the program actually works* which is also completely undocumented.

    * Grotesque architectural errors made by (affordable) beginners which have nevertheless been cast in stone by exposing them publicly (God classes filled with global variables, etc. )

    * Telegraphic and/or misleading method and variable names, e.g. VariablesWithMissingVowels, also known as Varwmvwls, which nevertheless often serve as the ONLY documentation for that variable or method,

    * Unfortunate architectural decisions made early on by experienced programmers who may be proud of those decisions (tunneling package private methods out to "friend classes", thus subverting the purpose of package private classes and making the source code scope modifiers effectively an unreliable indicator of source code scope, for instance)

    * 500-1000 line methods with some or all of the above characteristics.

    * Just massive code bases- I am facing one with literally half a million classes right now...That's right almost 450,000 classes, in a code base that is deliberately architected to defy built-in scoping rules of the language, so virtually anything could call anything ...

    And on and on.

    All of these things will never be fixed for reasons we all understand, I presume, but reflect on what this implies for open source. It implies that the much vaunted idea that more developers will iteratively make the code base better over time is a fiction with respect to the actual quality of the code base itself.

    No team is going to stop adding features and create more work for itself in the form of resolving conflicts for the sake of enabling their program to do what it already can do.

    This doesn't even get into the whole ego thing.

    Worse still, anything exposed as public in any way may have a million clients depending on it and change effectively becomes impossible, open source or not. All things public, or even more precisely all things reachable in the code base by "outsiders" through any device found in the host language whatsoever, intended or otherwise, are effectively unchangeable.

    In lieu of a successful campaign to stop development and do a rewrite, only a fork will make any of the above better. Forks are becoming more common, but they fail to sustain their branching a high percentage of the time (57%) and anyway presume the power TO fork, and on a large project this is harder to achieve.

    The net effect is, open source code bases fail to live up to one of the major promises of open source, iterative improvement of the code base.

    It's true that some people may fix bugs that they are motivated for external reasons to correct, and it's helpful to look at the code base if you're writing a plugin through a public API, but the code itself is often awful, and this awfulness, often produced because of limited time and resources, has the ironic effect of driving away many times those resources in the form of all the would-be developers who are just turned off. For those who do partake, the existing code has the effect of wasting many multiples of the time originally *saved* as each new developer struggles to make sense of the impenetrable code base.

    In my experience there is no easy fix or even a pricey one. Original authors are quick to fix on the (self serving) idea that whatever documentation exists *ought* to be enough and anyone who still has questions must be an *idiot*. Wasting time incrementally slogging around this code becomes some sort of test that the dev is *serious* and *smart* when the reality is more like smart, serious devs came, saw and left without saying a word.

    Code quality is only subjective at the edges. Undocumented code should not exist. F

  • by Skapare (16644) on Thursday June 20, 2013 @04:32PM (#44064551) Homepage

    ... not only is this the source code for the binary I am running, but also that the build system actually works. This is because not only might I want to make changes to the source to improve it, but I might want to do so in a hurry to fix a security hole. Since I might need to rebuild and run the built binary, I might as well test and make sure what the build system built really runs. So I just install the binary I built. Then I know for sure. Who needs the distributed binary (it might have a root kit in it).
