Are 64-bit Binaries Slower than 32-bit Binaries? 444

JigSaw writes "The modern dogma is that 32-bit applications are faster, and that 64-bit imposes a performance penalty. Tony Bourke decided to run a few tests on his SPARC to see if 64-bit binaries really do run slower than 32-bit binaries, and what the actual performance disparity would ultimately be."
This discussion has been archived. No new comments can be posted.
  • by (1337) God ( 653941 ) on Saturday January 24, 2004 @12:06AM (#8072822)
    I read this piece yesterday. Here's a tip for those of you who currently work on, or may need to work on, building an x86-to-x86_64 cross-compiler under Linux.

    One of my tight friends, Dan Kegel (cute pic of him here [kegel.com], oh and he works for Google, so he's super-smart and rich! :-*), has something called the CrossTool at http://kegel.com/crosstool [kegel.com] that should be of major help to anyone working with 64-bit Linux systems.

    You may even be able to list it as COTS on your project even though it's free as in beer. In any case, I've tried it, it's sweet, you should try it, it works great for what it does, just like most *nix apps. I prefer having one small tool that does something really well to one large software package that does a bunch of things really crappily.

    Anyway, stop by Dan's page and say hi. Tell him I sent ya ;-)
  • Short answer (Score:0, Interesting)

    by Anonymous Coward on Saturday January 24, 2004 @12:07AM (#8072831)
    From reading the article, the answer is: Sometimes, depending.
  • Moving more data (Score:5, Interesting)

    by Sean80 ( 567340 ) on Saturday January 24, 2004 @12:10AM (#8072844)
    I'm no expert in this specific area, but I remember a conversation from a few years back about the 32-bit versus the 64-bit version of the Oracle database. The guy I was speaking with was pretty knowledgeable, so I'll take his word as truth for the sake of this post.

    In his explanation, he said something on the order of "if you want speed, use the 32-bit version of the binaries, because otherwise the computer physically has to move twice as much data around for each operation it does." Only if you want the extra memory-mapping capability of a 64-bit binary, he said, would you need to use the 64-bit version.

    I suppose in summary, though, it depends on exactly what your binary does.
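    To make the data-movement point concrete, here's a toy illustration (the struct is made up; exact sizes depend on the compiler's data model, but ILP32 for 32-bit builds and LP64 for 64-bit Unix builds are typical):

        #include <stdio.h>

        struct node {
            int          value;  /* 4 bytes either way          */
            struct node *next;   /* 4 bytes on ILP32, 8 on LP64 */
            struct node *prev;   /* likewise                    */
        };

        int main(void)
        {
            /* typically prints 12 when built 32-bit, 24 when built 64-bit
             * (the int gets padded so the pointers stay aligned) */
            printf("sizeof(struct node) = %zu\n", sizeof(struct node));
            return 0;
        }

    Twice the bytes per node means half as many nodes fit in each cache line, which is where the extra data movement comes from.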

  • gcc? (Score:5, Interesting)

    by PineGreen ( 446635 ) on Saturday January 24, 2004 @12:11AM (#8072847) Homepage
    Now, gcc is known to produce shit code on sparcs. I am not saying 64 is always better, but to be honest, the stuff should at least have been compiled with Sun CC, possibly with the -fast and -fast64 flags...
  • by Anonymous Coward on Saturday January 24, 2004 @12:14AM (#8072867)
    The surmise that ALL 64 bit binaries are slower than 32 is incorrect...

    At this stage of development for the various 64-bit architectures, there is very likely a LOT of room for improvement in the compilers and other related development tools and giblets. Sorry, but I don't consider gcc to be necessarily the bleeding edge in terms of performance on anything. It makes for an interesting benchmarking tool because it's usable on many, many architectures, but in terms of its (current) ability to create binaries that run at optimum performance, no.

    I worked on DEC Alphas for many years, and there was continuing progress in their compiler performance during that time. And, frankly, it took a long time, and it probably will for IA64 and others. I'm sure some Sun SPARC-64 users or developers can provide some insight on that architecture as well. It's just the nature of the beast.
  • by gvc ( 167165 ) on Saturday January 24, 2004 @12:15AM (#8072877)
    I recall being very disappointed when my new VAX 11/750 running BSD 4.1 was much slower than my PDP 11/45 running BSD 2.8. All the applications I tested: cc, yacc, etc. were faster on the 16-bit PDP than the 32-bit VAX.

    I kept the VAX anyway.
  • by martinde ( 137088 ) on Saturday January 24, 2004 @12:21AM (#8072904) Homepage
    My understanding is that when you switch an Athlon64 or Opteron into 64-bit mode, you suddenly get access to more general-purpose registers than x86 normally has. So the compiler can generate more efficient code in 64-bit mode, making use of the extra registers and so forth. I don't know if this makes a difference in real-world apps or not, though.
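    One rough way to check this yourself, if you have an AMD64 box handy (the file name is made up, the flags are standard gcc, and whether spills actually show up depends on the compiler): compile a register-hungry function both ways and eyeball the generated assembly for stack traffic.

        /* regs.c: 32-bit x86 exposes 8 general-purpose registers,
         * x86-64 exposes 16.
         *   gcc -m32 -O2 -S regs.c   (look for spills in regs.s)
         *   gcc -m64 -O2 -S regs.c
         */
        long mix(long a, long b, long c, long d,
                 long e, long f, long g, long h)
        {
            long t1 = a * b + c;
            long t2 = c * d + e;
            long t3 = e * f + g;
            long t4 = g * h + a;
            return t1 * t2 + t3 * t4 + (t1 ^ t3) + (t2 ^ t4);
        }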
  • by Grey Ninja ( 739021 ) on Saturday January 24, 2004 @12:21AM (#8072906) Homepage Journal
    The guy seemed to have his conclusion written before he started... Or at least that's how it seemed to me. When he was doing the SSL test, he said that the 64-bit version was ONLY about 10% slower. Now I might be far too much of a graphics programmer... but I would consider 10% to be a rather significant slowdown.

    The other thing that bothered me of course was when he said that the file sizes were only 50% bigger in some cases... sure, code is never all that big, but... still...
  • by yecrom2 ( 461240 ) on Saturday January 24, 2004 @12:31AM (#8072951)
    The main product I work on, which was designed in a freaking vacuum, is so tightly tied to Wintel that I've had to spend the greater part of a year gutting it and making it portable. Kind of. We currently use 1.5 gig of RAM for the database cache; if we go any higher, we run out of memory.
    We tried Win2k3 and the /3GB switch, but we kept having very odd things happen.
    This database could very easily reach 500 gig, but anything above 150 gig and performance goes in the toilet.

    My solution...

    Get a low-to-midrange Sun box that can handle 16+ gigs of RAM and has a good disk subsystem. But that's not currently an option. Like I said, this thing was designed in a vacuum: the in-memory data structures are the network data structures, and they're all packed on 1-byte boundaries. Can you say SIGBUS? A conversion layer probably wouldn't be that hard, if it weren't built as ONE FREAKING LAYER!
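    For anyone who hasn't been bitten by this, here's a minimal sketch of the trap (field names made up; the real structures are hairier). On x86 a misaligned load quietly works; on SPARC it raises SIGBUS. The usual portable fix is to memcpy fields out of the packed wire format into naturally aligned variables:

        #include <stddef.h>
        #include <stdint.h>
        #include <string.h>

        #pragma pack(push, 1)            /* wire format: 1-byte boundaries */
        struct wire_rec {
            uint8_t  tag;
            uint32_t key;                /* sits at offset 1: misaligned */
        };
        #pragma pack(pop)

        uint32_t read_key(const unsigned char *buf)
        {
            uint32_t key;
            /* NOT: *(const uint32_t *)(buf + 1) -- SIGBUS on SPARC */
            memcpy(&key, buf + offsetof(struct wire_rec, key), sizeof key);
            return key;
        }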

    Sorry, I had to rant. Anyway, a single 64 bit box would enable us to replace several IA32 servers. For large databases, 64bits is a blessing.

    Matt
  • This guy is a tool (Score:5, Interesting)

    by FunkyMarcus ( 182120 ) on Saturday January 24, 2004 @12:34AM (#8072970) Homepage Journal
    First, anyone with half a brain already knows what his "scientific" results prove. Second, anyone with two thirds of a brain has already performed similar (but probably better) tests and come to the same conclusion.

    And third, OpenSSL uses assembly code hand-crafted for the CPU when built for the 32-bit environment (solaris-sparcv9-gcc) and compiles C when built for the 64-bit environment (solaris64-sparcv9-gcc). Great comparison, guy.

    Apples, meet Oranges (or Wintels).

    Mark
  • Something is wrong. (Score:5, Interesting)

    by DarkHelmet ( 120004 ) <mark&seventhcycle,net> on Saturday January 24, 2004 @12:34AM (#8072972) Homepage
    Maybe it's me, but how the hell is OpenSSL slower in 64 bit?

    It makes absolutely no sense. Operations concerning large integers were MADE for 64 bit.

    Hell, if they made a 1024-bit processor, it'd be something like OpenSSL that would actually see the benefit of having datatypes that big.

    Something is wrong, horribly wrong with these benchmarks. Either OpenSSL doesn't have proper support for 64 bit data types, this fellow compiled something wrong, or some massive retard published benchmarks for the wrong platform in the wrong place.

    Or maybe I'm just on crack.
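    For the flavor of why the big-number math should love wide registers, here's a toy multi-precision add (nothing like OpenSSL's actual BN routines, just the general idea): with 64-bit limbs a 1024-bit addition takes 16 limb operations, while with 32-bit limbs it takes 32, plus twice the carry handling.

        #include <stdint.h>

        #define LIMBS 16   /* 16 x 64-bit limbs = 1024 bits */

        /* r = a + b, least-significant limb first; returns the final carry */
        uint64_t bn_add_toy(uint64_t r[LIMBS], const uint64_t a[LIMBS],
                            const uint64_t b[LIMBS])
        {
            uint64_t carry = 0;
            for (int i = 0; i < LIMBS; i++) {
                uint64_t sum = a[i] + carry;
                carry = (sum < carry);   /* carry out of a[i] + carry */
                sum += b[i];
                carry += (sum < b[i]);   /* carry out of sum + b[i]   */
                r[i] = sum;
            }
            return carry;
        }

    Which makes the result doubly suspicious; as others have noted, the 32-bit OpenSSL build gets hand-tuned assembly while the 64-bit build falls back to plain C.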

  • by CatGrep ( 707480 ) on Saturday January 24, 2004 @12:36AM (#8072983)
    We've got an Itanic box at work that has WinXP 64bit edition on it so we can build & test some 64bit Windows binaries.

    It's the slowest box in the place! Open a terminal (oops, command shell, or whatever they call it on Windoze) and do a 'dir' - it scrolls so slowly that it feels like I'm way back in the old days when I was running a DOS emulator on my Atari ST box.

    Pretty much everything is _much_ slower on that box. It's amazingly bad, and I've tried to think of reasons for it: was XP 64-bit built with debugging options turned on when they compiled it? But even if that were the case, it wouldn't account for all of it; I'd only expect that to slow things down by maybe 20%, not by almost an order of magnitude.
  • Forward thinking (Score:5, Interesting)

    by Wellmont ( 737226 ) on Saturday January 24, 2004 @12:45AM (#8073025) Homepage
    Well, considering that manufacturers have been working like crazy to produce both 64-bit hardware and software, one can see that there is still some work to be done in the field.
    What most of the posts, and the test itself, are "concluding" is that 64-bit has to be slower overall, even in the end when 64-bit computing finally reaches its true breadth. However, once the bottlenecks in the pipeline (in this case the cache) and the remaining problems are removed, you can actually move a 64-bit block in the same time it takes to move a 32-bit block.
    Building two 32-bit pipes takes up more space than building one 64-bit pipe in the end, no matter which way you look at it and no matter what kind of applications or processes it's used for.
    However, the big thing that could change this theory is hyper-compressed carbon chips, which should replace silicon chips within a decade (and that's a fairly conservative estimate).
  • A Makefile? (Score:3, Interesting)

    by PissingInTheWind ( 573929 ) on Saturday January 24, 2004 @12:48AM (#8073036)
    From the article:
    [...] you'll likely end up in a position where you need to know your way around a Makefile.

    Well, duh. What a surprise: compiling for a different platform might require some Makefile tweaking.

    Am I the only one who thinks this was a dummy article, wasting a spot that could have gone to much more interesting articles about 64-bit computing?
  • by Anonymous Coward on Saturday January 24, 2004 @12:49AM (#8073048)
    but more to the point, why would you advertise your sexuality in a technical post? I mean, if a straight person (and believe me, there is no such thing as "bi", you're either straight, queer, or in denial about being queer) were to post "I love vagina" in every post, you'd rightfully make fun of him.

    But we're supposed to care that you consider a man's ass a sexual input? Stop looking for so much attention and let's talk computers.
  • by Anonymous Coward on Saturday January 24, 2004 @12:53AM (#8073063)
    My understanding of low-level languages may not be comprehensive, but I am aware that for (let's use the simplest example I am familiar with) MIPS there are a number of registers, commonly known as $s0-$s7, whose contents must be preserved across calls; a subroutine that uses them has to save them (typically to the stack) and restore them before returning, so as not to lose the data stored therein.

    for example:

        sw $s0, 0($sp)   # spill $s0 to the stack before reusing it

    having more registers would allow you to bypass this step of writing the data out to memory; you could keep the value in one of the new, non-volatile registers instead, removing the need for perhaps 5 instructions on each subroutine call and return.

    rather than the store word instruction (sw) you could just:

        move $t9, $s0    # register-to-register copy
                         # (pseudo-instruction for: addu $t9, $zero, $s0)

    and the value would not be lost on the return to the calling function.

    further to this, and I'm not sure that the Intel x86 behaves the same way, when you wish to load a large number into a register (in MIPS, anything that won't fit in an instruction's 16-bit immediate field, even though the register itself is 32 bits), you in fact have to perform TWO operations to load ONE number.

    basically you load the most significant bits first:

        # number = 0xFFFFFFF0
        lui $t0, 0xFFFF       # load 0xFFFF into the upper half of the
                              # register; the immediate field is only
                              # 16 bits wide
        ori $t0, $t0, 0xFFF0  # OR in the lower 16 bits of the number
                              # (ori rather than addi, since addi would
                              # sign-extend the immediate)

    on the basis that x86 shares some of these traits, 64-bit must be faster GIVEN even ground with compilers and so forth; that even ground is assumed here (EVEN THOUGH THAT IS NOT THE CASE today), because otherwise it's all pissing in the wind.

    if this has errors, forgive me, this is not my area of specialty by a long stretch.

    -Archfile

  • Re: OSNews (Score:2, Interesting)

    by dant ( 25668 ) * on Saturday January 24, 2004 @12:55AM (#8073072) Journal
    But what's the fsking point?

    News flash: 64-bit apps are usually slightly slower than 32-bit ones. Duh. Any developer who's been around 64-bit environments for more than a few weeks knows this. It's not like there's some subtle magic going on here; bigger pointers mean more data to schlep around.

    I think the parent's complaint is that this is sort of like a cursory analysis indicating that triangular wheels aren't quite as good as round ones. If you really needed to be told this, you aren't in the audience the article sounds like it's trying to address.

    Certainly, many applications need 64 bits to operate. That doesn't mean it's the best tool for all jobs. The tone of the article sounds like it's exploring some big question that nobody's thought about before, and that's just silly.
  • by DeathPenguin ( 449875 ) * on Saturday January 24, 2004 @12:56AM (#8073074)
    If it makes you feel better, programs from 1995 tend to run a lot faster on modern hardware. Gzip a kernel on a 66MHz Pentium and then on a 2GHz Opteron and you'll see what I mean.
  • by hobuddy ( 253368 ) on Saturday January 24, 2004 @01:07AM (#8073121)

    You can see official AMD benchmark results for various programs running on Windows XP 32-bit edition vs. Windows XP 64-bit edition beginning on page 36 of this PDF [sun.com]. The results have three columns: time in seconds on WinXP 32-bit with a 32-bit executable, time in seconds on WinXP 64-bit with a 32-bit executable, and time in seconds on WinXP 64-bit with a 64-bit executable.

    The results are quite impressive, but I'm not sure we can trust AMD, since obviously they want AMD64 to look good.

  • Re:gcc? (Score:3, Interesting)

    by ctr2sprt ( 574731 ) on Saturday January 24, 2004 @01:21AM (#8073167)
    gcc is known to produce shit code on computers. I find these benchmarks interesting not because of what they say about the hardware, but because of what they say about gcc. It would make me nervous if my 64-bit platform of the future were tied to gcc. I hope for AMD's sake that they are working very hard either on producing their own compiler (maybe they have and I just haven't heard about it) or making gcc stop sucking quite so hard.
  • by trb ( 8509 ) on Saturday January 24, 2004 @01:26AM (#8073180)
    Yes, and programs compiled for 16-bit PDP-11 running on the VAX-11/780 in "compatibility mode" were faster than the same programs compiled for 32-bit VAX native mode running on the same VAX. It makes sense, they were doing pretty much the same stuff, and fetching half as much data. But of course, 11's had limited address space, and the VAX address space was relatively huge.
  • by LoadWB ( 592248 ) on Saturday January 24, 2004 @01:26AM (#8073181) Journal
    The article mentions tweaking the LD_LIBRARY_PATH...

    I was told a long time ago by a number of people I considered to be Solaris gurus -- not to mention in a number of books, Sun docs, etc. -- that the LD_LIBRARY_PATH variable was not only heading towards total deprecation, but introduced a system-wide security issue.

    In its stead, we were supposed to use the "crle" command to set our library paths.

    On all of my boxes I use crle and not LD_LIBRARY_PATH and everything works as expected.

    Any pro developers or Solaris technical folks that can comment on this?
  • Re: OSNews (Score:4, Interesting)

    by Endive4Ever ( 742304 ) on Saturday January 24, 2004 @01:26AM (#8073184)
    I put NetBSD on most of my Sparc hardware, because then I can run and build from the same exact source tree of packages as I use on my Intel boxes, and run a kernel built from exactly the same source.

    Which brings up a point: both NetBSD/sparc and NetBSD/sparc64 will run on an Ultra 1, which is a 64-bit machine. Why doesn't somebody install each NetBSD port on two separate Ultra 1 machines? Then the benchmark comparison can be between the normal apps that build on both systems, running in parallel on two identical machines. It's exactly the same codebase except for the 32- or 64-bitness.
  • by Bruha ( 412869 ) on Saturday January 24, 2004 @01:36AM (#8073222) Homepage Journal
    When we get solid-state hard drives, and if they're as reliable and fast as regular RAM, then RAM will be gone and the SSD will take over. So in essence your machine may just allocate itself a huge chunk of the drive as its own memory space.

    Imagine a machine that can grab 16 gigs for its own memory usage, with your video card having a huge chunk for itself also, along with terabits of information space if things pan out well enough.
  • by j3110 ( 193209 ) <samterrell&gmail,com> on Saturday January 24, 2004 @02:01AM (#8073337) Homepage
    Ummm... I beg to differ on the reasons...

    Most 64/32-bit hybrid machines probably just split the arithmetic/logic units in half (it only takes a single wire to connect them anyhow). Having an extra ALU around will surely push more 32-bit numbers through the pipe. It's not going to gain as much as a 64-bit-optimized application would from having the combined, full-width operations when it needs them, though.

    I'm beginning to wonder these days how much CPU speed even matters. We have large applications that can't fit in cache, paging from RAM that isn't anywhere near the speed of the CPU, and hard drives that manage only 40MB/s on a good day/sector, with latency averaging around 5-6ms. CPU is the least of my worries. As long as the hard disk is being utilized properly, you'll probably not see significant differences between processor speeds. I'm a firm believer that people think 500MHz is slow because the hard drive in the machine was too slow. Unless you are running Photoshop, SETI, raytracing, etc., you probably wouldn't notice if I replaced your 3GHz processor with a 1GHz one.
  • by sirsnork ( 530512 ) on Saturday January 24, 2004 @02:06AM (#8073363)
    How exactly did you get 2 x 32bit processors running 64bit code?
  • 6502? (Score:4, Interesting)

    by tonywestonuk ( 261622 ) on Saturday January 24, 2004 @02:09AM (#8073374)
    By what method is a processor judged to be 16,32 or 64 bits?...

    The 6502 had 8-bit data but a 16-bit address bus, and was considered an '8-bit'.
    The 68000 had a 16-bit data bus and 32-bit address registers (24 external address lines); it was called a 16-bit.

    So why can't we just increase the address size of processors to 64 bits while keeping the data size at 32 bits, and add some new 64-bit addressing modes? The processor can still be 32-bit for data, so the size of programs can stay the same... Store program code in the first 4 gigs of memory (zero page!), and the pointers within that code can remain 32 bits, but have the option of using bigger 64-bit pointers to point at data in the rest of the 2^64-byte space. This should give the best of both worlds of 32 vs. 64 bit.
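    You can even fake that scheme today in plain C, by keeping bulk data in one big region and addressing it with 32-bit offsets from a base pointer (all names below are made up for illustration):

        #include <stdint.h>
        #include <stdio.h>
        #include <stdlib.h>

        static char *heap_base;   /* base of the data region            */
        typedef uint32_t ref32;   /* 32-bit "pointer": offset from base */

        static void *deref(ref32 r) { return heap_base + r; }

        int main(void)
        {
            heap_base = malloc(1 << 20);   /* 1 MB region for the demo */
            if (heap_base == NULL) return 1;

            ref32 r = 64;                  /* "allocate" an int at offset 64 */
            *(int *)deref(r) = 42;

            printf("value=%d, ref32=%zu bytes, raw pointer=%zu bytes\n",
                   *(int *)deref(r), sizeof(ref32), sizeof(void *));
            free(heap_base);
            return 0;
        }

    The in-memory structures stay 32-bit-sized while the process itself can live in a 64-bit address space.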
  • Re:Moving more data (Score:2, Interesting)

    by nikster ( 462799 ) on Saturday January 24, 2004 @02:35AM (#8073453) Homepage
    "if you want speed, use the 32-bit version of the binaries, because otherwise the computer physically has to move twice as much data around for each operation it does."

    if that were true, 16-bit would be even faster than 32. this is not the way electron shuffling works.

    i think it's more a question of standardization: the entire PC world has been sworn in on 32 bit, and has optimized away the last little bottleneck to perform best on 32-bit data (buses, registers, etc.) throughout the entire machine, but probably most notably in the memory subsystems...

    there are always specialized apps which will benefit from 64/32/16 bit operations, but for the majority of apps, the memory optimizations will be the only factor.

  • by afidel ( 530433 ) on Saturday January 24, 2004 @03:13AM (#8073572)
    AND you can get the best of both worlds, because data sizes do NOT have to be 64-bit just to use the 64-bit registers. In fact, here is the gist of an AMD manual on driver porting for 64-bit Windows on AMD64 platforms:
    *Stupid freaking AMD engineer made the PDF world-accessible but then went and encrypted it so that I can't cut and paste, print it, or do anything with it*
    Basically, ints and longs remain 32-bit, while pointers and long longs are 64-bit by default.
    The paper can be found at AMD's website [amd.com]
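    You can check which data model your own compiler uses with a few sizeofs; on most 64-bit Unix systems (LP64) long is 8 bytes, while on 64-bit Windows (LLP64) it stays 4:

        #include <stdio.h>

        int main(void)
        {
            printf("int:       %zu\n", sizeof(int));       /* 4 on both       */
            printf("long:      %zu\n", sizeof(long));      /* 8 LP64, 4 LLP64 */
            printf("long long: %zu\n", sizeof(long long)); /* 8 on both       */
            printf("void *:    %zu\n", sizeof(void *));    /* 8 on both       */
            return 0;
        }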
  • Re:Moving more data (Score:4, Interesting)

    by soundsop ( 228890 ) on Saturday January 24, 2004 @04:47AM (#8073848) Homepage

    In implementation terms, it takes some time to charge up the address bus, so you increase bandwidth and execution speed by charging up address n, but doing a quick read of n+1, n+2, n+3, and more on the latest CPUs. You only have to wiggle the two low-order address lines for the extra reads, so you don't pay the pre-charge penalty that you would for access randomly in memory.

    This is incorrect. It has nothing to do with charging the address lines. Loading multiple sequential locations is slow on the first access and fast on the subsequent bytes because a whole memory row (made of multiple words) is read at once. This full memory row (typically around 1kbit) is transferred from the slower capacitive DRAM storage to faster transistor-based flip-flops. The subsequent sequential words are already available in the flip-flops so it's faster to route them off-chip since the slow DRAM access is avoided.
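    A crude way to feel this effect, if you're curious (only a sketch; timings vary wildly by machine, and caches muddy the waters too): stream through a buffer once sequentially, then again with a large stride so that nearly every access lands in a different row.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>

        #define N (64u * 1024 * 1024)

        int main(void)
        {
            unsigned char *buf = malloc(N);
            if (buf == NULL) return 1;
            memset(buf, 1, N);

            unsigned long sum = 0;
            clock_t t0 = clock();
            for (size_t i = 0; i < N; i++)       /* sequential: row hits */
                sum += buf[i];
            clock_t t1 = clock();
            printf("sequential: %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

            t0 = clock();
            for (size_t s = 0; s < 4096; s++)    /* 4K stride: mostly new rows */
                for (size_t i = s; i < N; i += 4096)
                    sum += buf[i];
            t1 = clock();
            printf("strided:    %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

            return (int)(sum & 1);   /* keep the loops from being optimized away */
        }

    Both loops touch the same 64MB; only the access order differs.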

  • Re:OSNews = UnNews? (Score:2, Interesting)

    by llzackll ( 68018 ) on Saturday January 24, 2004 @08:39AM (#8074390)
    It shouldn't take longer to read 64 bits than 32 on a 64-bit architecture. Theoretically, a 32-bit machine reads 32 bits in a certain number of clock cycles, and a 64-bit machine should read 64 bits in the same number of clock cycles. This doesn't necessarily mean faster execution times on 64-bit, though. A lot of it depends on the compiler and the OS.

    Also, if you just rebuild an application that was designed around 32 bits in 64-bit mode, you probably aren't going to notice much improvement, if any at all.

    I noticed the article used GCC, which probably hasn't caught up on 64-bit yet. I'm pretty sure Sun has its own compiler, which would probably produce better 64-bit code than GCC on a Sun box.
  • Re: OSNews (Score:3, Interesting)

    by be-fan ( 61476 ) on Saturday January 24, 2004 @03:39PM (#8076261)
    If the compiler sucks, then it would suck equally for 32-bit and 64-bit binaries! They use the same code generator!
  • by Valdrax ( 32670 ) on Saturday January 24, 2004 @05:57PM (#8077211)
    This is modded Insightful?

    You've completely missed the entire point of the test. This has nothing to do with your next purchase decision -- it's purely designed to test whether or not the common claim that using 64-bit values decreases performance due to memory latency is true. This test makes no claims whatsoever that it has anything to do with whether or not you should be using a 64-bit setup. RTFA.

    The "obsolete architecture" is one of the few where 64-bit and 32-bit operations have no inherent performance advantage on the processor, unlike the Opteron and Itanium processors where 64-bit mode has several advantages over 32-bit mode (extra registers or not being emulated). This makes it a perfect testbed for evaluating this claim. The speed of the processor has absolutely no relevance to the question at hand (with the exception of testing memory access starvation on system with a greater CPU to bus clock difference).

    It's a shame you're too wrapped up in a "buy, buy, buy" mindset to consider the value of curiosity and of testing commonly held beliefs.
