
Linux 2.6 And Hyper-Threading

David Peters writes "2CPU.com has posted an article on Hyper-Threading performance in Linux. They use Gentoo 1.4 and kernel 2.6.2 and run through several server-oriented benchmarks like Apache, MySQL and even Java server performance with Blackdown 1.4. The hardware they use in the tests is border-line ridiculous (3.2GHz Xeons, 3.2GHz P4 and P4 Prescott) and the results are actually quite interesting. It's a good read as he even takes the time to detail his system configuration all the way down to the CFLAGS used while compiling the software."

  • Has anybody run into a problem with Hyper-Threading and per-CPU licensing?
  • Says who? (Score:5, Interesting)

    by Anonymous Coward on Monday February 23, 2004 @07:51PM (#8368586)
    The hardware they use in the tests is border-line ridiculous

    I'm typing this on a 3.0 GHz Pentium 4 that has hyperthreading. The entire system cost me $1200 to build just before Christmas - including 1GB of RAM, a Radeon 9800 Pro video card and a 120GB SATA hard drive. Dell and IBM sell 3GHz notebooks now for a similar price.

    My point is that a 3.2GHz CPU is not ridiculous in an age where 2.66GHz processors are considered entry-level (FYI, Dell is currently selling a 2.66GHz desktop for $499).

    What are you still running on? A 486?
  • Redo. (Score:5, Funny)

    by BrookHarty ( 9119 ) on Monday February 23, 2004 @07:54PM (#8368615) Journal
    Ok, Time to redo the benchmarks, Kernel 2.6.3 is out.
    [joking]

    Be nice when we see some nice Opteron benchmarks vs the new Xeons.

    -
    "But Calvin is no kind and loving god! He's one of the _old_ gods! He demands sacrifice!"
    • Why compare a 64 bit CPU to a 32 bit one....

      Why not compare a Chevy Sprint to a Top Fuel Dragster while you're at it... Those would be just as interesting :)
  • by MerlynEmrys67 ( 583469 ) on Monday February 23, 2004 @07:55PM (#8368617)
    My opinion, of course, is: why not use as large a -j as you can and distribute the problem? Take a server farm and run your compile through ccache and distcc (look up the projects on samba.org: ccache [samba.org], distcc [samba.org]).

    The first one performs semi-miracles on repetitive build times where you aren't doing "incremental" builds. The second lets you distribute your compile to multiple build servers on the network (beware - there be daemons here)

    Build times went from hours to minutes - it was great
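
A rough sketch of the kind of setup described above, written in Python purely for illustration. It only assembles the environment and `make` command line for a ccache-plus-distcc build and runs nothing. The host names, the two-jobs-per-host heuristic, and the helper name `build_command` are assumptions made for this example, not details from the post. `DISTCC_HOSTS` and `CCACHE_PREFIX` are the environment variables distcc and ccache read for the server list and the compiler prefix.

```python
import shlex


def build_command(hosts, jobs_per_host=2, compiler="gcc"):
    """Return (env, argv) for a ccache + distcc build; nothing is executed."""
    if not hosts:
        raise ValueError("need at least one build host")
    env = {
        # distcc reads its list of build servers from DISTCC_HOSTS.
        "DISTCC_HOSTS": " ".join(hosts),
        # On a cache miss, ccache invokes the real compiler through this prefix.
        "CCACHE_PREFIX": "distcc",
    }
    # Assumed heuristic: a couple of parallel jobs per build host.
    jobs = jobs_per_host * len(hosts)
    argv = ["make", f"-j{jobs}", f"CC=ccache {compiler}"]
    return env, argv


# Hypothetical host names, for illustration only.
env, argv = build_command(["localhost", "build1", "build2"])
print(" ".join(f"{k}={shlex.quote(v)}" for k, v in env.items()), shlex.join(argv))
```

Putting ccache in front means repeated compiles of unchanged files never reach the network; only cache misses are handed to distcc.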

  • Tantalizing . . . (Score:5, Interesting)

    by Mysteray ( 713473 ) on Monday February 23, 2004 @08:09PM (#8368760)

    Those sure are some interesting numbers. On the order of a 49% increase or 35% decrease in performance depending on the application. I always figured those high-GHz CPUs would be completely IO-bound. I guess this sometimes allows threads to run with what they've got in the on-chip cache.

    Makes you wonder if a kernel could detect if it was helping or not and selectively enable it.

    I did some informal testing between VC++ native and C# to .Net bytecode. I had a little loop calculating primes. The native C++ kept everything in registers, while the CLR made everything relative memory accesses to BP. I figured that would devastate performance, but on the Pentium 4, it was only 5% slower! It seems to have an L1 cache that's as fast as the registers. That will certainly make it easier on the compiler writers.

    Sort of off topic, did anyone else see that article in MSDN about using .Net for serious number crunching? The author seemed to write the whole article as if he thought it was a good idea. Not that there wouldn't be some advantages to doing that (such as the possibility of tuning for the processor at runtime), but the one graph he showed comparing with native code had .Net running 50% to 33% slower!

    • by metalix ( 259636 ) on Monday February 23, 2004 @08:26PM (#8368956)
      I did some informal testing between VC++ native and C# to .Net bytecode. I had a little loop calculating primes. The native C++ kept everything in registers, while the CLR made everything relative memory accesses to BP. I figured that would devastate performance, but on the Pentium 4, it was only 5% slower! It seems to have an L1 cache that's as fast as the registers. That will certainly make it easier on the compiler writers.

      oops you just violated the VS.NET EULA by posting a performance benchmark. shame on you!
    • Re:Tantalizing . . . (Score:3, Interesting)

      by be-fan ( 61476 )
      The P4 seems to handle indirect accesses extremely well. They did a benchmark of bcc a while ago. Bcc is a version of GCC that does bounds checking. Now, bounds-checking in C sucks because you have arbitrary pointer arithmetic. So a pointer balloons from a 4-byte word that fits in a register, to a 12-byte structure that must be accessed indirectly. On a P3 and an Itanium, the penalty was huge, reaching 117% for the P3. However, the penalty on the P4 was only 34%.
  • Money? (Score:1, Redundant)

    by dJCL ( 183345 )
    OK, he cannot afford to buy a benchmark, but he has a trio of top of the line Intel systems to play with! WTF? Either he has a weird idea of money well spent, or someone has a lucrative agreement with the hardware vendors. I'm guessing the latter, and really wish I could write well enough to sucker them into sending me cool hardware to play with.

    I'll live with my 2800+ (2.133GHz) AMD MP (only one for now; I'll upgrade when I need it). I'm running SETI, playing music, encoding DVDs and sometimes messing with
    • Re:Money? (Score:3, Insightful)

      by Anonymous Coward
      Well the hardware is provided by the manufacturers for review (it is a hardware site after all). SPEC doesn't just go around handing out copies of their (very expensive) benchmarking applications.
  • by Anonymous Coward
    My entire lab at school is filled with dual 3.2GHz Xeons with Quadro FX 1000 cards. People have those types of machines... or 100 of them.
  • What if they discovered they could shrink down an entire 8086 processor to Truly Ridiculous Proportions (that's a technical term) and pile like a thousand or a million of them into the space of a single modern day chip? Ok, since we're a 32-bit world now maybe we'd need to go to bunches of 386's instead. But the point remains--I wonder what kind of modifications to current software would have to be made to exploit this, or if it could all be done in hardware.

    It'd be massively parallel computing. Like a h

    • An 8086 topped out at about 10MHz and contained 29,000 transistors. How many of them were you planning to put on a die? 10x10? That's 2.9 million transistors which is about the same number as a Pentium MMX. Assuming perfect scaling and no overhead for interconnecting those mini 8086 cores, 100x10MHz is 1GHz, but you won't get close to that in real life, so you probably won't be far off from the 233MHz top speed of the P-MMX. I'm not even going to guess what it takes to interconnect those 8086 cores and bui
    • Interesting point...

      I wonder how hard it would be to cram 64 200MHz 486 class CPUs onto a single die. It would give a theoretical max 'speed' of 12.8GHz. Maybe give it a nice wide 128bit planar bus and clock it at the same speed.

      Have to tune the OS to handle that many CPUs efficiently but it should still be a pretty nimble (and relatively low power) computer.

      Reminds me of an April Fools article that I think PCW magazine ran several years ago, where someone made a computer out of a couple of hundred Z80 class CPUs ea
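
The back-of-the-envelope figures in the many-small-cores replies above can be checked with a few lines of Python. This is only the ideal-scaling arithmetic the posters describe, assuming perfect parallelism and zero interconnect overhead; the function name `ideal_aggregate` is made up for this example.

```python
def ideal_aggregate(cores, mhz_per_core, transistors_per_core=None):
    """Sum clock rate (and optionally transistor count) across identical cores.

    This is an upper bound only: it ignores interconnect cost and assumes
    the workload splits perfectly across every core.
    """
    total_mhz = cores * mhz_per_core
    total_transistors = (
        None if transistors_per_core is None else cores * transistors_per_core
    )
    return total_mhz, total_transistors


# 10x10 grid of 8086 cores at about 10 MHz and 29,000 transistors each.
print(ideal_aggregate(10 * 10, 10, 29_000))  # (1000, 2900000)

# 64 cores of a 200 MHz 486-class design (no transistor count given).
print(ideal_aggregate(64, 200))  # (12800, None)
```

Real throughput would fall well short of these sums, as the first reply already points out.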
