Software

Are 64-bit Binaries Slower than 32-bit Binaries? 444

JigSaw writes "The modern dogma is that 32-bit applications are faster, and that 64-bit imposes a performance penalty. Tony Bourke decided to run a few of tests on his SPARC to see if indeed 64-bit binaries ran slower than 32-bit binaries, and what the actual performance disparity would ultimately be."
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • by Transcendent ( 204992 ) on Saturday January 24, 2004 @12:11AM (#8072846)
    Aren't there certain optimizations and, in general, better coding for most 32-bit applications (at the lowest level of the code) because people have used them for so long? Couldn't it just be that we need to refine coding for 64-bit processors?

    Most "tech gurus" I've talked to at my university about the benefits of 64-bit processing say that it comes in part from the increased number of registers (allowing you to use more at the same time, reducing the number of cycles needed). Could time allow us to write more efficient kernels, etc., for 64-bit processors?

    So either the code isn't good enough, or perhaps there's another physical limitation (longer pipelines, etc) on the chip itself? Correct me if I'm wrong.
  • by inode_buddha ( 576844 ) on Saturday January 24, 2004 @12:14AM (#8072869) Journal
    that this whole discussion will compare apples and oranges to death? I could make a fruit salad out of these.
    Seriously though, there's *so* many other factors involved:

    How much cache is ideal for hello.c?
    How many branches does it need? Is the prediction worth a shit?
    Does hello.c run faster at 2 GHz?

    THINK before you post please so my hair doesn't hurt so much... thx.

  • by renehollan ( 138013 ) <[rhollan] [at] [clearwire.net]> on Saturday January 24, 2004 @12:15AM (#8072883) Homepage Journal
    *cough* wider data busses *cough*. 'Course this does mean that 64-bit code on systems with 32-bit-wide data paths will be slower, but, like, migration always involves speed bumps. I remember the a.out-to-ELF transition pains of Linux.
  • Re: OSNews (Score:5, Insightful)

    by rainwalker ( 174354 ) on Saturday January 24, 2004 @12:27AM (#8072929)
    Your "analysis" may be valid, but it's really not applicable. The title of the story is, "Are 64-bit Binaries Really Slower than 32-bit Binaries?" The author takes a 64-bit machine, compiles a few programs, and tests the resulting binaries to see which is faster. I'd say that the review is aptly titled and an interesting point to think on. Certainly he didn't compile every open source program known to mankind, as it sounds like he missed some pet app of yours. OpenSSL might be kind of arbitrary, but gzip and MySQL seem like reasonable apps to test. Like the last page says (you *did* RTFA, right?), if you don't like his review, go write your own and get it published.
  • by leereyno ( 32197 ) on Saturday January 24, 2004 @12:32AM (#8072958) Homepage Journal
    The point of a 64-bit architecture boils down to two things really, memory and data size/precision.

    An architecture with 32-bits of address space can directly address 2^32 or approximately 4 billion bytes of memory. There are many applications where that just isn't enough. More importantly, an architecture whose registers are 32-bits wide is far less efficient when it comes to dealing with values that require more than 32 bits to express. Many floating point values use 64 bits and being able to directly manipulate these in a single register is a lot more efficient than doing voodoo to combine two 32-bit registers.

    So, if you have an problem where you're dealing with astronomical quantities of very large (or precise) values, then a 64-bit implementation is going to make a very big difference. If you're running a text editor and surfing the web then having a wider address bus and wider registers isn't going to do squat for you. Now that doesn't mean that there may not be other, somewhat unrelated, architectural improvements found in a 64-bit architecture that a 32-bit system is lacking. Those can make a big difference as well, but then you're talking about the overall efficiency of the design, which is a far less specific issue than whether 64-bits is better/worse than 32.

    Lee
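    The "voodoo" of combining two 32-bit registers mentioned above can be sketched in C. This is an illustrative emulation of a 64-bit add using 32-bit halves with manual carry propagation, not code from any particular library:

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Emulate a 64-bit add using only 32-bit operations: add the low
     * halves, detect the carry via unsigned wraparound, then fold the
     * carry into the sum of the high halves. A 64-bit register does
     * this in one instruction. */
    static void add64_emulated(uint32_t a_hi, uint32_t a_lo,
                               uint32_t b_hi, uint32_t b_lo,
                               uint32_t *r_hi, uint32_t *r_lo)
    {
        *r_lo = a_lo + b_lo;
        uint32_t carry = (*r_lo < a_lo);  /* wrapped around => carry out */
        *r_hi = a_hi + b_hi + carry;
    }

    int main(void)
    {
        uint32_t hi, lo;
        add64_emulated(0x00000001u, 0xFFFFFFFFu,   /* 0x1FFFFFFFF */
                       0x00000000u, 0x00000001u,   /* + 1         */
                       &hi, &lo);
        printf("%08X%08X\n", hi, lo);              /* prints 0000000200000000 */
        return 0;
    }
    ```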
  • Retarded article. (Score:1, Insightful)

    by Anonymous Coward on Saturday January 24, 2004 @12:38AM (#8072994)
    Short answer: No.

    Medium answer: If you're not a programmer, yes. Expect about the same speed, but maybe slightly less.

    Long answer: Direct comparisons like this are in no way valid because the code is identical. It's the same algorithm running at the same clockspeed. Your compiler can't program. Think about this: There's only so much space taken up by a logical operation. The question:

    "is this bit set to one? if yes, do this.. if no, do that"

    ..does not get any faster just because of the size of the register the single bit is contained in. It's still bound by the clock speed. Programmers can rewrite algorithms to do certain things in parallel, but it's probably not worth it unless it's a big memory operation, multimedia app, game, or graphics package. For those it will be much better.

    Which is why Intel is more concerned with clockspeed than number of bits.
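    The single-bit test described above can be sketched in C. The point is that the source (and the resulting AND-plus-branch) is identical whether the register holding the flags is 32 or 64 bits wide. The `check` helper here is hypothetical, purely for illustration:

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Testing one bit costs the same single AND plus branch whether the
     * containing register is 32 or 64 bits wide; register width does not
     * speed up a one-bit decision. */
    static const char *check(uint64_t flags, unsigned bit)
    {
        return (flags & (1ULL << bit)) ? "do this" : "do that";
    }

    int main(void)
    {
        printf("%s\n", check(0x10u, 4));   /* bit 4 is set: prints "do this" */
        printf("%s\n", check(0x10u, 3));   /* bit 3 is clear: prints "do that" */
        return 0;
    }
    ```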
  • by FunkyMarcus ( 182120 ) on Saturday January 24, 2004 @12:43AM (#8073019) Homepage Journal
    Maybe it's me

    It's you.

    OpenSSL in the 32-bit environment as the guy configured it was doing 64-bit arithmetic. Just because the guy had 32-bit pointers doesn't mean that his computer wasn't pushing around 64-bit quantities at once. It's called a "long long".

    In fact, as he had OpenSSL configured, he was using some crafty assembly code for his 32-bit OpenSSL builds that even used 64-bit registers. His 64-bit builds were using plain old compiled C.

    But he didn't even know that.

    Big whoop.

    Mark
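    The "long long" point can be illustrated with a sketch (not actual OpenSSL code): a widening 32x32 -> 64-bit multiply of the kind bignum code depends on compiles fine for a 32-bit target; the compiler simply emits a multi-instruction sequence for it, so even a "32-bit" build is pushing 64-bit quantities around.

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* A 32x32 -> 64-bit widening multiply. On a 32-bit target this
     * becomes a widening multiply instruction (or several ordinary
     * ones) plus carry handling; the C source is identical either way. */
    static uint64_t mul_wide(uint32_t a, uint32_t b)
    {
        return (uint64_t)a * b;   /* "long long" arithmetic in 32-bit code */
    }

    int main(void)
    {
        printf("%llu\n",
               (unsigned long long)mul_wide(0xFFFFFFFFu, 0xFFFFFFFFu));
        /* prints 18446744065119617025 */
        return 0;
    }
    ```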
  • by T-Ranger ( 10520 ) <jeffw@cheMENCKENbucto.ns.ca minus author> on Saturday January 24, 2004 @12:49AM (#8073043) Homepage
    GCC's primary feature is, has always been, and likely will be for a long time: portability. GCC runs on everything.

    If you want FAST code you should use the compiler from your hardware vendor. The downside is that they might cost money, and they almost definitely implement things in a slightly weird way. Weird when compared to the official standard, weird when compared to the de facto standard that is GCC.

    I thought this was common knowledge, at least amongst people who would be trying to benchmark compilers...

  • by superpulpsicle ( 533373 ) on Saturday January 24, 2004 @12:49AM (#8073047)
    Why are we comparing mature 32-bit software with 64-bit software in its infancy?

  • Re: OSNews (Score:4, Insightful)

    by chunkwhite86 ( 593696 ) on Saturday January 24, 2004 @12:55AM (#8073070)
    Your "analysis" may be valid, but it's really not applicable. The title of the story is, "Are 64-bit Binaries Really Slower than 32-bit Binaries?" The author takes a 64-bit machine, compiles a few programs, and tests the resulting binaries to see which is faster.

    How can you be certain that this isn't simply comparing the efficiency of the compilers - and not the resulting binaries???
  • by swordgeek ( 112599 ) on Saturday January 24, 2004 @12:58AM (#8073083) Journal
    64-bit binaries run slower than 32? That's certainly the dogma in the x86 world, where 64-bit is in its infancy. That was the belief about Solaris/Sparc and the HP/AIX equivalents FIVE YEARS AGO maybe.

    Running benchmarks of 32 vs. 64 bit binaries in a 64 bit Sparc/Solaris environment has shown little or no difference for us, on many occasions. If the author had used Sun's compiler instead of the substantially less-than-optimal gcc, I expect that his 20% average difference would have disappeared.
  • by renehollan ( 138013 ) <[rhollan] [at] [clearwire.net]> on Saturday January 24, 2004 @01:06AM (#8073117) Homepage Journal
    Well yes, but there is an advantage to smaller pointers when you can get away with them *if the processor has native support for them*. While exploited in small object allocators, it isn't always the case that the CPU can gallop through instructions as fast as they can be fed to it, multiple functional units notwithstanding. Though, clearly this is an issue only with data and pointers at the closest cache level to the processing units.

    So, memory bandwidth remains an issue, and I concede the point.

    Still, bus widths tend to optimize around typical transfer patterns, and pointers tend to grow to be "always big enough" -- the cases where we tailor pointers to fit smaller constraints are quite specialized. It's more convenient to have one pointer size -- does anyone remember the four memory models that the Microsoft C compiler used to support (probably still does)? tiny (16-bit data and code pointers), small (16-bit data, 32-bit code, IIRC), large, and huge? 'Course that isn't a perfect comparison because of the brain-dead segmented x86 memory architecture, but you get the idea. It was (is) a pain.

    But, bus widths and memory capacities will grow to the point where the 64 bit code of tomorrow will be as fast as the 32 bit code of today, and the need to optimize further will occur only in esoteric bits of code.

    Besides, with 64 bits, you can do fun things, like allocate different objects in different virtual memory spaces and use the memory management system to catch wild-pointer bugs (because no two different objects need be adjacent in the logical memory space).

    On the whole the advantages outweigh the disadvantages, and the performance penalties will be moot quite shortly.
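    A quick way to see which data model a build targets is to print the basic type sizes (a sketch; a typical 64-bit Solaris or Linux build is LP64, so `int` stays 4 bytes while `long` and pointers grow to 8):

    ```c
    #include <stdio.h>

    /* Under ILP32, int/long/pointer are 4/4/4 bytes; under LP64 they
     * are 4/8/8. The doubled pointer size is exactly the per-object
     * overhead being weighed above. */
    int main(void)
    {
        printf("int:     %zu\n", sizeof(int));
        printf("long:    %zu\n", sizeof(long));
        printf("pointer: %zu\n", sizeof(void *));
        return 0;
    }
    ```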

  • by KalvinB ( 205500 ) on Saturday January 24, 2004 @01:18AM (#8073155) Homepage
    between precision and speed.

    It's not surprising that 64-bit processors are rated much slower than 32-bit ones. The fastest 64-bit AMD is rated at 2.0 GHz while the fastest 32-bit AMD is 2.2 GHz.

    If you use a shovel you can move it very fast to dig a hole. If you use a backhoe you're going to move much slower but remove more dirt at a time.

    Using modern technology to build a 386 chip would result in one of the highest clock speeds ever but it would be practically useless. Using 386 era technology to build a 64 bit chip would be possible but it'd be massive and horribly slow.

    I'm still debating whether or not to go with 64-bit for my next system. I'd rather not spend $700 on a new system so I can have a better graphics card and then have to spend several hundred more shortly after to replace the CPU and MB again. But then again, 64-bit prices are still quite high and I'd probably be able to be productive on 32-bit for several more years before 32-bit goes away.

    Ben
  • Re: OSNews (Score:5, Insightful)

    by Guuge ( 719028 ) on Saturday January 24, 2004 @01:34AM (#8073209)
    News flash: 64-bit apps are, usually, slightly slower than 32-bit ones. Duh. Any developer who's been around 64-bit environments for more than a few weeks knows this. It's not like there's some subtle magic going on here; bigger pointers mean more data to schlep around.

    That is the sort of "obvious" conventional wisdom that the article is questioning. In fact, 64-bit architecture means a lot more than pointer size, and merely counting bits is no way to estimate performance.
  • The video drivers are probably not optimized for 64-bit at all. In fact, I wouldn't be surprised if the box doesn't have native drivers at all, and is using MS's standard SVGA/VESA drivers. Those drivers are slow and any PC using them is going to feel horribly sluggish, even if it has a 3 GHz P4.
  • by Brandybuck ( 704397 ) on Saturday January 24, 2004 @01:52AM (#8073286) Homepage Journal
    You're right. A 2GHz 6502 would be a screamer. But the drawbacks are numerous. When the world finally went to 32bit, I jumped for joy. Not because I thought stuff would be faster, but because I could finally use a flat memory space large enough for anything I could conceivably want. Integers were now large enough for any conceivable use. Etc, etc.

    Of course, my conceptions back then might be getting a bit dated now. But not too terribly much. 32 bits will probably be the optimum for general use for quite some time. There aren't too many applications that need a 64-bit address space. Not too many applications need 64-bit integers. We'll need 64-bit sometime, but I don't see the need for it in *general* purpose computing for the remainder of the decade. (Longhorn might actually need a 64-bit address space, but that's another story...).

    Remembering back to the 80286 days, people were always running up against the 16 bit barrier. It was a pain in the butt. But unless you're running an enterprise database, or performing complex cryptoanalysis, you're probably not running up against the 32 bit barrier.

    But of course, given that you're viewed as a dusty relic if you're not using a box with 512Mb video memory and 5.1 audio to calculate your spreadsheets, the market might push us into 64 bit whether we need it or not.
  • Old news (Score:2, Insightful)

    by t0ny ( 590331 ) on Saturday January 24, 2004 @01:54AM (#8073295)
    Ah, this has all been heard before when we switched from 16-bit to 32-bit programs.

    The fact is as true as it was then: some applications are going to run faster just because 32-bit compilers are more 'mature'. Once the newer method becomes mainstream, you will see either the same speed, or a gain in speed.

    Needless to say, the guy in the other post who drew an analogy with an abacus had it right: something small is obviously going to execute faster. We aren't switching to 64-bit processors so we can run

    10 print "64-bit is k3wl"
    20 goto 10

    The more complex applications of the coming generation, as well as the ability to move large amounts of data from memory to CPU, are what is driving the move.

  • by Anonymous Coward on Saturday January 24, 2004 @01:54AM (#8073299)
    Adding more (or more complex) features to a CPU rarely speeds it up by itself; however, it might allow the next generation of CPUs to scale beyond the current generation.

    Both in terms of direct CPU performance and for the software that runs on it.

    This has happened a bunch of times during history. Remember the introduction of MMUs, for instance? It definitely slows down the software running on the machine, but without an MMU we all know it was virtually impossible to do stable multitasking.

    1/2 GB of memory is basically the standard these days with XP.

    A lot of people are buying home computers with 1 GB or more.

    Dell in Japan (where I live) has a special offer these days on a Latitude D600 with 1 GB of RAM. That is, they expect to sell this thing in quantity.

    I think a fair amount of PC users will hit the 4GB limit within a few years. Personally, I already swear about having just 1GB in my desktop at times when I have a handful of images from my slide scanner open in photoshop + the obvious browsers/mail programs and maybe an office program or 2 open.

    Introducing 64-bit doesn't make today's hardware any faster than its counterparts, but it will make it possible to continue making machines better, faster, and capable of handling increasingly complex tasks.
  • Ah well (Score:2, Insightful)

    by fullofangst ( 724732 ) on Saturday January 24, 2004 @01:58AM (#8073323)
    I figured I would post a comment about AMD and their 64 bit chip benchmarks. Then I realised I was already beaten to it by about eleventy billion other people. Guess I should at least do a FIND through the comments before posting in future!
  • by Bored Huge Krill ( 687363 ) on Saturday January 24, 2004 @02:04AM (#8073358)
    the tests were all run on a 64-bit machine. The argument is not so much about whether 32-bit or 64-bit binaries run faster, but which is the faster architecture. I'm pretty sure we don't have any apples-to-apples test platforms for that one, though
  • by sirsnork ( 530512 ) on Saturday January 24, 2004 @02:10AM (#8073376)
    You have also fallen into the "clock speed is the measure of speed" myth. AMD could easily have boosted the clock speed of the AMD64s simply by extending the pipeline, just as Intel did with the P4 and is rumoured to do with Prescott. That lets you clock the CPU higher, but it does less per clock cycle.
  • Re: OSNews (Score:2, Insightful)

    by Endive4Ever ( 742304 ) on Saturday January 24, 2004 @02:13AM (#8073394)
    Solaris isn't any harder. It's just closed source and there isn't anywhere near as much free software available for it. There certainly aren't as many 'guide for the clueless' websites as there are for Linux, needless to say. That can sometimes be a positive thing. To run free software packages, you can try to coerce the Zoularis thing and build software from the NetBSD pkgsrc tree on it, I guess. The interface between 'free software' and Solaris just has a lot more rough edges, in my experience, than running a Free OS from the start. I run Solaris on my SS10SX because there's no free-software X server that supports 24-bit color on its dual cgfourteen framebuffer, but other than the ability to 'boast' about running Solaris at home, there's not much other reason to run it. I guess that's a status thing, or something.

  • by iamacat ( 583406 ) on Saturday January 24, 2004 @03:45AM (#8073679)
    I guess you didn't have the "pleasure" of using near, far and huge pointers in DOS compilers. In your model, every library function would have to have two versions - one that takes 32 bit pointers and one that takes 64 bit.

    Uniform and simple is good...
  • Useless tests (Score:2, Insightful)

    by msobkow ( 48369 ) on Saturday January 24, 2004 @04:20AM (#8073769) Homepage Journal

    No, it's not a test of whether 32- or 64-bit is faster. It's a test of an obsolete architecture whose fastest younger siblings are still outperformed by IBM, Intel, and AMD parts.

    The results tell you nothing about whether you should seriously consider 64 bit, nor where you should actually be using a 64 bit setup.

    Maybe someone can post the performance results for Doom running on a new AMD 64 bit box with a top-end ATI or NVidia card. It'd be about as relevant as the performance of a SPARC5 is to making a purchase decision.

  • by Anonymous Coward on Saturday January 24, 2004 @04:33AM (#8073803)
    The other factor is that developers who are targeting a 32-bit machine will generally avoid things like "long long int" in their C programs because of the horrible performance of doing 64-bit ops on a 32-bit processor. Often programs are even optimized as written for 32-bit platforms, and moving to 64-bit will only hurt performance until those design decisions are changed.
  • Re:gcc? (Score:3, Insightful)

    by orbitalia ( 470425 ) on Saturday January 24, 2004 @05:39AM (#8073944) Homepage
    You mean like this portland compiler [amd.com]

    Actually I wouldn't say that gcc produces particularly bad code on all computers; it's sorta average, but not bad. Certainly the 3.3.x series is a lot better than the 2.x series. Pretty good at number crunching [randombit.net] and it is more standards-compliant than most.

  • by Anonymous Coward on Saturday January 24, 2004 @06:23AM (#8074046)
    Well, a 64-bit ALU is a bit slower than a 32-bit one; some operations take more clock cycles.

    A CPU with a lot of slow transistors is worse than a CPU with a few quick ones, so short paths are better.

    The page translation in long mode is very slow! On the Opteron it's a 4-level walk for 64-bit versus 2-level for 32-bit (512*512*512*512 4-KiB pages vs 1024*1024), so legacy mode is quicker than long mode.

    And the cache penalty is a little high: with 1 MiB of L2 cache, an array of 10,000,000 longs is a bit slower than an array of 10,000,000 ints.

    And for building LEGO-like machines, the AthlonXP is a better deal than the expensive Opteron.

    open4free
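    The cache point can be made concrete with a rough sketch: the same element count costs twice the memory (and thus twice the cache footprint and bandwidth) when stored as 8-byte longs on an LP64 system. The array size is taken from the comment above; the exact megabyte figures depend on the platform's type sizes:

    ```c
    #include <stdio.h>

    #define N 10000000

    int main(void)
    {
        /* Same logical data, different footprint: on LP64, sizeof(long)
         * is 8, so the long array occupies twice the memory of the int
         * array and evicts twice as much L2 cache while being scanned. */
        printf("int  array: %zu MB\n", N * sizeof(int)  / (1024 * 1024));
        printf("long array: %zu MB\n", N * sizeof(long) / (1024 * 1024));
        return 0;
    }
    ```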

  • Re: OSNews (Score:3, Insightful)

    by BigFootApe ( 264256 ) on Saturday January 24, 2004 @07:20AM (#8074192)
    You are correct, although the issues are more subtle than your examples (not hard).

    A benchmark is useless without interpretation. The people at OSNews have failed to give us any technical background information on the SparcV chip (penalties running in 64-bit as well as benefits), a proper breakdown of the type of math done by the example programs, as well as analyses of bottlenecks in the benchmarks (MySQL, for instance, is possibly I/O limited).

    They've given us raw numbers, with no thought behind them. This is what makes a bad article.
  • sizeof(int) (Score:3, Insightful)

    by wowbagger ( 69688 ) on Saturday January 24, 2004 @10:33AM (#8074677) Homepage Journal
    The biggest fault I can see with this test depends upon sizeof(int) -

    I don't know about Sun, but in some other environments in which a 32 bit and a 64 bit model exist, the compiler will always treat an int as 32 bits, so as not to cause structures to change size. Hell, even on the Alpha, which was NEVER a 32 bit platform, gcc would normally have:

    sizeof(short) = 2
    sizeof(int) = 4
    sizeof(long) = 8

    Now, consider the following code:

    for (int i = 0; i < 100; ++i)
    {
        frobnicate(i);
    }

    IF the compiler treats an int as 4 bytes, and IF the compiler has also been informed that the CPU is a 64 bit CPU, then the compiler may be doing dumb stuff like trying to force the size of "i" to be 4 bytes, by masking it or other foolish things.

    So, the question I would have is: did the author run a test to ensure whether the compiler was really making ints and unsigneds 64 bits or not?
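    One way to answer that question at build time is a pre-C11 static-assert trick (a sketch: the negative array size triggers a compile error if the assumption about `sizeof(int)` is wrong):

    ```c
    #include <stdio.h>

    /* Pre-C11 compile-time check: the typedef'd array gets a negative
     * size (a compile error) if the assumption about sizeof(int) fails. */
    typedef char assert_int_is_4_bytes[sizeof(int) == 4 ? 1 : -1];

    int main(void)
    {
        /* If this compiled, int is 4 bytes even on a 64-bit target, so
         * the loop counter below is genuinely a 32-bit quantity. */
        int total = 0;
        for (int i = 0; i < 100; ++i)
            total += i;
        printf("total = %d, sizeof(int) = %zu\n", total, sizeof(int));
        /* prints total = 4950, sizeof(int) = 4 */
        return 0;
    }
    ```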
  • Re:Benchmarks (Score:5, Insightful)

    by Shanep ( 68243 ) on Saturday January 24, 2004 @10:45AM (#8074726) Homepage
    Benchmarks are meant to ideally test minimal pairs

    And they often show disparity in their results due to being interrupted. That would be a badly carried-out benchmark under less-than-ideal conditions. This is human error. Of course there are slight variations in subsequent runs, but these should be able to be explained and compensated for. It is most certainly not a benchmark lie, though. If it took that long, then it took that long; now find out why!

    But in benchmarking scientific rigor is always lost

    Failing to retain a scientific approach is a human failing. It does not always happen and is not the benchmark telling lies, but due to poor procedure.

    But the benchmark choice is frequently meaningless or misleading.

    [poor] "Choice", "Meaningless" and "misleading" [results] each require an incompetent person. Don't blame the benchmark. Even if they wrote the benchmark, they might not understand the results.

    Benchmarks do not elucidate any fact.

    Yes they do. Very very specific facts which can later be used to make considerations for future decisions. It could be a specific application, algorithm, overall CPU ALU, FPU or single CPU instruction, it could be bus type, etc. Specific facts leading to educated decisions.

    You will always see LAME encoding in CPU tests. The P4 will always win against an Athlon.

    If this is the case, then LAME as it stands is specifically faster on a P4 than an Athlon. That would be a coarse benchmark though. Some would call it "real world". And it is. It is specific to LAME, but not specific at a lower level where it could be found why this might be the case and how to improve LAME on P4s and Athlons separately (with an end result that might have the Athlon out-perform the P4, due to new insight gained from benchmarking specific areas).

    The reviewer will not explain why this is the case and that LAME encoding is simply clock cycle dependent.

    So the reviewers fault becomes the benchmarks fault?

    Benchmarkers need to be able to explain all the dependent variables, to tell why the results happen.

    Thus my original statements?

    In graphics cards Q3 benchmarks above a certain magnitude are meaningless.

    Bad choice of benchmark is the fault of the benchmark?

    Benchmarks need to be interpreted by someone competent enough to do so. Just because someone carried out a poor benchmark procedure or could not understand the results does not mean the benchmark lied.

    The reviewer with meaningless variables creates an inauthentic conditioned desire in the consumer that leads to bad and lax software and hardware engineering.

    Incompetent reviewer, ignorant consumer, deceitful engineering.

    Morrowind and other games have horrible problems with their graphics engine that can not be saved by faster GPUs and dx9.

    So they are CPU bound? Memory? Sounds like maybe they don't know how to profile their code too well. When profiling, it helps to know how to benchmark and make meaning out of the results.

    You cannot improve that which you do not understand, through anything other than luck. Benchmarks provide specific facts which, when correctly interpreted, can bring about improvements. People who can't interpret them, say they are meaningless.
  • Re: OSNews (Score:3, Insightful)

    by cruelworld ( 21187 ) on Saturday January 24, 2004 @11:54AM (#8074988)
    And what if the compiler sucks/has no optimizations for 64-bit binaries?
  • Re:retarded. (Score:5, Insightful)

    by JacobO ( 41895 ) on Saturday January 24, 2004 @11:55AM (#8074997)
    I just wonder why some are so offended by the article. I have to believe that some people feel that he has "disagreed" with them or something to have such violent reactions. It's just some benchmarks; as he implies, it's better than people just supposing the answer to the things they are wondering about.
  • bits (Score:2, Insightful)

    by Saville ( 734690 ) on Saturday January 24, 2004 @02:57PM (#8076022)

    and merely counting bits is no way to estimate performance.

    If you only have room for 16k of data in your L1 cache, and all your size_t values, pointers, and (in most cases) longs take twice as much memory, then at worst it is as if you have only 8k of cache compared to the 32-bit version!
    At best it makes no difference, but at worst it is as if your system has only half the cache and half the memory bandwidth. It seems to me that by counting bits you can estimate your performance will be between 100% and 50% of the 32-bit version, all other things being equal.
    A notable exception would be when you need a 64-bit value and are forced to emulate it.
