Syscall Speed On Linux And Windows

1010011010 writes: "IBM has tested the syscall speed of Linux 2.2.16, 2.4.2 and Windows 2000. As it turns out, Linux is a little more than twice as fast. This may be interesting to people who have been reading the LKML recently, as a debate has been going on about syscall speed. Also, a method ("magic page") for further improving syscall speed is being developed by the kernel developers. The rate at which all aspects of Linux are improving -- kernel, GUIs, etc. -- is phenomenal. I think Linux is pretty cool now; I can't wait to see it in 18 months."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    A friend and I have been playing with this code tonight. I was using 2.4.2 SMP and he was using 2.2.x and FreeBSD. I have 2 PIII 500's and he has a Duron 800, and it would seem that SMP does nothing; my time was about .0975 usec. He tried it on FreeBSD and got like 6.xx usec, and in Linux got like .04xx usec. I think those are close anyways. So if there is a good reason that FreeBSD is soooo much slower, I would like to know. I'm really not trying to start a BSD vs. Linux flamewar, but I am curious as to why this happens, or whether this test is at all accurate.
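
    For reference, the loop being timed is roughly the following (a minimal sketch, assuming x86 Linux; the loop count and output format are illustrative, and some libc versions cache getpid(), which would turn the measured call into a plain library call):

    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    int main(void) {
        const long N = 1000000;     /* number of calls to average over */
        struct timeval start, end;
        double usec;
        long i;

        gettimeofday(&start, NULL);
        for (i = 0; i < N; i++)
            getpid();               /* the syscall being measured */
        gettimeofday(&end, NULL);

        usec = (end.tv_sec - start.tv_sec) * 1e6
             + (end.tv_usec - start.tv_usec);
        printf("%f usec per getpid() call\n", usec / N);
        return 0;
    }
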
  • by Stormie ( 708 ) on Thursday May 03, 2001 @03:03AM (#249392) Homepage

    Did you actually read the article? It does not by any means "test the syscall speed" of Linux vs. Windows! It introduces timing routines for Linux and Windows which will be used for future articles comparing various things between Linux and Windows. The point of the article is not to reveal that Windows QueryPerformanceCounter() takes 1.945 usec and is therefore less than half as fast as a Linux gettimeofday(), but rather to demonstrate that BOTH systems are capable of providing sub-2-microsecond timing resolution, and that therefore the benchmarks to be performed in future articles will be accurate!

    Feel free to interpret this as "Linux r0x, Windoze suxx!!", but really, it's about as significant as saying "gettimeofday() is only 14 characters long, and only lower-case, and can therefore be typed faster than the Windows equivalent, QueryPerformanceCounter(), which is 25 characters and mixed-case! Therefore programming under Linux is quicker and easier!".

    Anyway, both methods are a wank. They should just use some inline asm to query the performance counters directly. Same code for both OSes then... :-)
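
    For what it's worth, the two calls being compared amount to something like this in use (a sketch, not the article's actual code; the helper name is mine):

    #ifdef _WIN32
    #include <windows.h>

    /* Current value of the high-resolution counter, in microseconds. */
    static double now_usec(void) {
        LARGE_INTEGER count, freq;
        QueryPerformanceCounter(&count);
        QueryPerformanceFrequency(&freq);
        return (double)count.QuadPart * 1e6 / (double)freq.QuadPart;
    }
    #else
    #include <stddef.h>
    #include <sys/time.h>

    /* Same idea on Linux, built on gettimeofday(). */
    static double now_usec(void) {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec * 1e6 + tv.tv_usec;
    }
    #endif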

  • by Socializing Agent ( 262655 ) on Thursday May 03, 2001 @02:19PM (#249393)
    High-resolution timers in Linux are a joke. The early POSIX-RT patches simply multiplied gettimeofday() by 1000.

    The best way to get performance data on Linux or Windows is via the Intel chip's time-stamp counter; here's some example gcc code to do it:

    /* Read the CPU's 64-bit time-stamp counter.  The "=A" constraint binds
       the edx:eax register pair, which is where rdtsc leaves its result
       (32-bit x86 only). */
    static unsigned long long rdtsc(void) {
        register unsigned long long d;
        __asm__ __volatile__ ("rdtsc" : "=A"(d));
        return d;
    }

    The previous method takes about 13 cycles on an Athlon 750. (DO NOT try to make it inline -- or gcc might optimize your to-be-timed code out from between the rdtsc() calls.) It is straightforward to read the CPU clock speed from /proc/cpuinfo and use that to convert cycles into milli- (or micro-) seconds.
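
    For example, something along these lines (assuming an x86 Linux /proc/cpuinfo with a "cpu MHz" line; the parsing is illustrative):

    #include <stdio.h>

    /* Pull the CPU clock, in MHz, out of /proc/cpuinfo. */
    static double cpu_mhz(void) {
        FILE *f = fopen("/proc/cpuinfo", "r");
        char line[256];
        double mhz = 0.0;

        if (!f)
            return 0.0;
        while (fgets(line, sizeof line, f))
            if (sscanf(line, "cpu MHz : %lf", &mhz) == 1)
                break;
        fclose(f);
        return mhz;
    }

    /* One MHz is one cycle per microsecond, so usec = cycles / MHz. */
    static double cycles_to_usec(unsigned long long cycles, double mhz) {
        return (double)cycles / mhz;
    }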

    As with any timing method, take care to execute it a few times before you gather any information, to prime the i-cache.

    Apparently the lameness filter believes that this is a "junk character post", so I'll type some more. Intel has a useful whitepaper that describes how to do this in an M$ compiler, available here: http://developer.intel.com/software/idap/resources/technical_collateral/pentiumii/RDTSCPM1.HTM
  • If you'd like to disable out-of-order execution for your timing code (maybe necessary on PPro and later processors -- I've found that it doesn't make a lot of difference for most real benchmarking tasks), add a cpuid instruction before the rdtsc. Note that the cpuid instruction will clobber eax, ebx, ecx, and edx (you can list these registers in the clobber list of the GCC __asm__ directive).

    The CPUID instruction forces all instructions in the pipeline to complete. Using the serialized rdtsc takes about 40 cycles on an Athlon 750.
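
    A serialized version might look like this (a sketch, assuming 32-bit x86 and gcc; on PIC builds some gcc versions object to clobbering ebx):

    /* cpuid drains the pipeline before rdtsc samples the counter. */
    static unsigned long long rdtsc_serialized(void) {
        unsigned long long d;
        __asm__ __volatile__ ("xorl %%eax, %%eax\n\t"
                              "cpuid\n\t"
                              "rdtsc"
                              : "=A"(d)          /* edx:eax */
                              : /* no inputs */
                              : "ebx", "ecx");
        return d;
    }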

    OT: I really had to wrestle with the lameness filter to get this through -- even one line with the inline asm declaration was a "junk character post". Perhaps the lame"ness" filter should recognize that "C/asm code" is different from "ASCII goatse"...

  • I tried the test program on an Athlon/850 with PC100 RAM, running NetBSD. [netbsd.org] The measurement with the gettimeofday() calls was around 6 usec/call. Replacing the gettimeofday() call with a getpid() call resulted in a time of < 0.35 usec/call. The main reason for this is that the gettimeofday call (under *BSD at least) needs to copy the resulting time out to the user buffer. I get a similar drop in call time if I replace the arguments to gettimeofday with NULL (causing the syscall to do almost nothing [saao.ac.za]).
    Of course, since the purpose of the time-timers.cpp [ibm.com] program is to time the timing routine, we want to time the actual overhead, including the user/kernel space copy (and the overhead of the function call).

    I'm not sure why Linux is so much faster on the gettimeofday() call. I'm guessing it can perhaps retrieve the time directly into the final buffer? Or perhaps it has a more efficient way to copy out the data?
    Then again, maybe NetBSD uses a different way to get the time. When gettimeofday is called, NetBSD does a few I/O accesses to the timer chip (Intel 8253) and returns the result of that. What does Linux do? (I don't have a copy of the Linux source handy)
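
    A quick way to separate out the copyout cost (a sketch; the loop count is arbitrary, and the behavior of gettimeofday(NULL, NULL) is kernel-specific):

    #include <stdio.h>
    #include <stddef.h>
    #include <sys/time.h>

    /* Average per-call time of gettimeofday(), with or without the
       user-space copyout of the result. */
    static double time_gettimeofday(int use_buffer) {
        const long N = 1000000;
        struct timeval tv, start, end;
        long i;

        gettimeofday(&start, NULL);
        for (i = 0; i < N; i++)
            gettimeofday(use_buffer ? &tv : NULL, NULL);
        gettimeofday(&end, NULL);

        return ((end.tv_sec - start.tv_sec) * 1e6
                + (end.tv_usec - start.tv_usec)) / N;
    }

    int main(void) {
        printf("with copyout:    %f usec/call\n", time_gettimeofday(1));
        printf("without copyout: %f usec/call\n", time_gettimeofday(0));
        return 0;
    }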

    eric
