Forgot your password?
typodupeerror
Programming IT Technology

Faster Chips Are Leaving Programmers in Their Dust 573

Posted by CmdrTaco
from the or-maybe-they've-already-wrapped-around-to-zero dept.
mlimber writes "The New York Times is running a story about multicore computing and the efforts of Microsoft et al. to try to switch to the new paradigm: "The challenges [of parallel programming] have not dented the enthusiasm for the potential of the new parallel chips at Microsoft, where executives are betting that the arrival of manycore chips — processors with more than eight cores, possible as soon as 2010 — will transform the world of personal computing.... Engineers and computer scientists acknowledge that despite advances in recent decades, the computer industry is still lagging in its ability to write parallel programs." It mirrors what C++ guru and now Microsoft architect Herb Sutter has been saying in articles such as his "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software." Sutter is part of the C++ standards committee that is working hard to make multithreading standard in C++."
This discussion has been archived. No new comments can be posted.

Faster Chips Are Leaving Programmers in Their Dust

Comments Filter:
  • 2005 Called (Score:5, Funny)

    by brunes69 (86786) <slashdot AT keirstead DOT org> on Monday December 17, 2007 @01:47PM (#21727168) Homepage
    ....it wants it's article back.

    Seriously - any developer writing modern desktop or server applications that doesn't know how to do multi-threaded programming effectively deserves to be on EI anyway. It is not that difficult.
    • Re:2005 Called (Score:5, Insightful)

      by CastrTroy (595695) on Monday December 17, 2007 @01:55PM (#21727302) Homepage
      It's not just making your app multithreaded, it's completely changing your algorithms so they they take advantage of multiple processors. I took a parallel programming course in University, so I'm by no means an expert, but I'll give what insight I have. You can't just take a standard sort algorithm and run in multithreaded. You have to change the entire algorithm. In the end, you end up with something that sorts faster than n log (n). However, doing this type of programming where you break up the dataset, sort each set, and then gather the results can be very difficult. Many debuggers don't deal well with multiple threads, so that adds an extra layer of difficulty to the whole problem. Granted, I don't think that we really need this level of multithreadedness, but I think that's what the article is referring to. I think that 10+ core CPUs will only really help for those of us who like to do multiple things at the same time. I think it would even be beneficial to keep most apps tied to a single CPU so that a run-away app wouldn't take over the entire computer.
      • Re:2005 Called (Score:4, Interesting)

        by gazbo (517111) on Monday December 17, 2007 @02:01PM (#21727406)
        In the end, you end up with something that sorts faster than n log (n).

        Not without an infinite number of processors you don't.

      • by ByOhTek (1181381)
        sometimes it is as simple as adding multiple threading without changing the logic.

        It depends on where you are splitting your logic.

        Lets take a binary search example:
        Your bank accidentally left a back door in their database, and now the hackers/crackers want to grab their enemies credit and accoutn information, which will allow them to get it faster?

        The database is sorted:
        1) Perform a binary search on the data with each thread doing 1/Nth the data, where N is the number of threads per search
        2) Perform a bina
      • by ZeroFactorial (1025676) on Monday December 17, 2007 @02:21PM (#21727724)
        This sounds to me like a great example of passing the buck.

        EE Guy #1: We can't seem to build faster chips.
        EE Guy #2: No problem. We'll just put tons of processor cores in instead.
        EE Guy #1: But people have spent the past 30 years creating algorithms for single core machines. Almost none of the programmers have any experience writing multi-core algorithms!
        EE Guy #2: Exactly! We'll be able to blame the programmers for being lazy and not wanting to learn new complicated algorithms that require an additional 4 years of university.
        EE Guy #1: Brilliant! We should come up with a catchy headline like "The Free Lunch is Over" or something like that.
        EE Guy #2: Yeah, and we could get Slashdot to post a link to the article. Slashdot users are sure to sympathize with our devious plans...
        • Re: (Score:3, Funny)

          by Pollardito (781263)
          actually they divided the buck into 10 dimes and then passed them all in parallel.

          seriously though, it was only a few years ago that people were scoffing at the usefulness of dual processor desktop machines and arguing the value of being able to run multi-threaded apps and multiple apps faster at the expense of poorer performance on the vast majority of apps and games which people were running in isolation. it doesn't seem like applications or operating systems have seen a major overhaul since that time
          • Re: (Score:3, Insightful)

            by try_anything (880404)
            More and more cores? Consumer desktops and laptops have gone up to a whopping two cores -- four cores only if you blow a wad of dough for bragging rights. Two processors is definitely not overkill for the average user, especially since most users have a browser full of Ajax-ridden web pages open 24/7. I doubt that four cores will be overkill, either, once we start to realize all the various ways we've crippled applications to make them well-behaved citizens of the vanishing single-core desktop.

            The massiv
      • Re: (Score:3, Insightful)

        by Darinbob (1142669)
        There's a wide variance in what "parallel computing" means. For multicore, you've essentially just got a cheaper version of SMP (symmetric multiprocessing). This is worlds away from what occurs in a parallel computer and what most parallel programming algorithms deal with. With multicore and SMP you program mostly like you're doing multithreading on a single CPU.

        The algorithms programmers have to deal with here involve concurrency, and have been in use for decades by anyone writing an OS or device driver
      • High level language (Score:3, Interesting)

        by oliderid (710055)
        I guess it will a dumb question but:

        Why a Java virtual machine can't take the burden of the multi-core adaptation?

        They have promised "write once run anywhere"!

        Lazy coder :-)

    • Re: (Score:3, Insightful)

      by MrSteveSD (801820)
      A lot of multi-threading up until now has been about keeping applications responsive, rather than breaking up tasks. That makes sense since muti-core chips haven't been around that long in most peoples homes. Another issue is that once you have more than one processor, two threads really can run at the same time which can show up all kinds of bugs you would never notice on a single core system. The main problem I can see is with testing for errors. With multiple threads it's up to the OS on how it juggles t
    • Re:2005 Called (Score:4, Informative)

      by chaboud (231590) on Monday December 17, 2007 @02:37PM (#21727936) Homepage Journal
      Well, 2005 called...

      it wants its reply back.

      The parent is exactly how I would have replied a couple of years ago. I was doing lots of threading work, and I found it easy to the point of being frustrated with other programmers who weren't thinking about threading all of the time.

      I was wrong in two ways:

      1. It's not that easy to do threading in the most efficient way possible. There's almost always room for improvement in real-world software.

      2. There are plenty of programmers who don't write thread-safe/parallel code well (or at all) that are still quite useful in a product development context. Some haven't bothered to learn and some just don't have the head for it. Both types are still useful for getting your work finished, and, if you're responsible for the architecture, you need to think about presenting threading to them in a way that makes it obvious while protecting the ability to reach in and mess with the internals.

      The first point is probably the most important. There are several things that programmers will go through on their way to being decent at parallelization. This is in no strict order and this is definitely not a complete list:

      - OpenMP: "Okay, I've put a loop in OpenMP, and it's faster. I'm using multiple processors!!! Oh.. wait, there's more?"
      Now, to be fair, OpenMP is enough to catch the low-hanging fruit in a lot of software. It's also really easy to try out on your code (and can be controlled at run-time).

      - OpenMP 2: "Wait... why isn't it any faster? Wait.. is it slower?"
      Are you locking on some object? Did you kill an in-loop stateful optimization to break out into multiple threads? Are you memory bound? Blowing cache? It's time to crack out VTune/CodeAnalyst.

      - Traditional threading constructs (mutices, semaphores): "Hey, sweet. I just lock around this important data and we're threadsafe."
      This is also often enough in current software. A critical section (or mutex) protecting some critical data solves the crashing problem, but it injects the lock-contention problem. It can also add the cost of round-tripping to the kernel, thus making some code slower.

      - Transactional data structures: "Awesome. I've cracked the concurrency problem completely."
      Transactional mechanisms are great, and they solve the larger data problem with the skill and cleanliness of an interlocked pointer exchange. Still, there are some issues. Does the naive approach cleanly handle overlapping threads stomping on each-others' write-changes? If so, does it do it without making life hell for the code changing the data? Does the copy/allocation/write strategy save you enough time through parallelism to make back its overhead?

      Should you just go back to a critical section for this code? Should you just go back to OpenMP? Should you just go back to single-threading for this section of code? (not a joke)

      Perhaps as processors get faster by core-scaling instead of clock-scaling this will become less of a dilemma, but to say that "[to do multi-threaded programming effectively] is not that difficult" is akin to writing your first ray-tracer and saying that 3D is "not that difficult." Somtimes it is. At least at this point there are places where threading effectively is a delicate dance that not every developer need think about for a team to produce solid multi-threaded software.

      That doesn't mean that I object to threading being a more tightly-integrated part of the language, of course.
  • by scafuz (985517) <scafuz@scafuz.com> on Monday December 17, 2007 @01:48PM (#21727176)
    just start a multithread process: 1 core for the program itself, the remaining 7 for the bugs...
    • For the very first time, 8 and more cores CPU will enable exclusively windows users to run an extensive whole botnet spitting out spam... ..all this running on 1 single multicore CPU.

      thank you, Microsoft !
  • by Chordonblue (585047) on Monday December 17, 2007 @01:48PM (#21727190) Homepage Journal
    II hhaavvee aann XX22 pprrocceessssoor? Ii ccaann ggooeess TTWWIICCEE aass ffaasstt nnooww?

    • by ByOhTek (1181381) on Monday December 17, 2007 @01:54PM (#21727270) Journal
      my eyes, they bleed.
    • Re: (Score:3, Interesting)

      by Nova1313 (630547)
      When the first AMD x2 chips came out the linux kernel had issues with the clock on those chips. The clock would be several times (presumably 2 times?) faster then it should be, the cores clocks were not synchronized for some reason or the kernel would lose track... When you typed a letter it would repeat multiple times as you described. :)
      • by CastrTroy (595695)
        I've had the same problem on single processor machines running Linux for years. I don't know if it's a problem of the keyboard repeat rate being set too low, or something else to that effect, but I notice that a lot of the time on my Linux machines it seems to double/triple type a lot of letters.
  • OS/2? (Score:5, Interesting)

    by SCHecklerX (229973) <thecaptain@captaincodo.net> on Monday December 17, 2007 @01:50PM (#21727218) Homepage
    I remember learning to write software for OS/2 back in the early 90's. Multi-threaded programming was *the* model there, and had it been more popular, it would be pretty much standard practice today, making scaling to multiple cores pretty effortless, I'd think. It's a shame that the single-threaded model became so ingrained in everything, including linux. For an example that comes to mind, why do I need to wait for my mail program to download all headers from the IMAP server before I can compose a new message on initial startup? Same with a lot of things in firefox.

    Does anybody remember DeScribe?
    • Re: (Score:2, Interesting)

      by shoor (33382)
      I was working at a very small software shop when OS/2 came out. We would get a customer, who wanted something to work on an apollo workstation, another one wanted it for xenix, a third for Unix BSD 4.2 (my favorite), or Unix System V (ugh!), or Dos. So, we got a project to port something to OS/2 version 1.0, and I got it to work, and it used multi-threading which I thought was pretty cute and I was proud of myself for figuring it all out just from the manuals. Then the new revision of OS/2 came out and e
    • Re:OS/2? (Score:4, Insightful)

      by pthisis (27352) on Monday December 17, 2007 @05:01PM (#21730752) Homepage Journal
      It's a shame that the single-threaded model became so ingrained in everything, including linux. For an example that comes to mind, why do I need to wait for my mail program to download all headers from the IMAP server before I can compose a new message on initial startup?

      I'm of the opposite opinion; it's a shame that so many people equate parallel processing with threads. When there's not much shared data, using multiple processes keeps memory protection between your parallel "things", decreasing coupling, increasing isolation, and generally resulting in a more stable system (and for certain things where you can avoid some cache coherency problems, a faster system). Your example is perfect; there's really no good reason to use a thread for such lookups. Another process would do, or even better just use select() and avoid all the pain (and bugs) of a multithreaded solution.

      OS developers spent a lot of engineering time implementing protected memory. Threads throw out a huge portion of that; a good programmer won't do that without very good reasons. Some tasks, where there really are tons of complicated data structures to be shared, are good candidates for threading. More commonly, though, threads are used either because the programmer doesn't know any better or because they allow you to be a slacker about defining exactly what is shared and mediating access to it. The latter is especially dangerous; defining exactly what (and how) things are shared goes most of the way toward eliminating multiprocessing bugs, and threads make it easy to slack off on that and get a "mostly working" solution that occasionally deadlocks, fails to scale, etc.

      Use processes or state machines when you can, and threads when you must.
      • Re:OS/2? (Score:4, Interesting)

        by TheRaven64 (641858) on Monday December 17, 2007 @08:38PM (#21733286) Journal

        I'm of the opposite opinion; it's a shame that so many people equate parallel processing with threads.
        I read that and wished I had mod points. Anyone who has programmed with a language designed for concurrency, like Erlang, Termite, or a few Haskell dialects hates using threads. Threads are something that two kinds of people should use; operating system designers and compiler writers. Everyone else should be using a higher-level abstraction.

        The big problem is not the operating system designers, it's the CPU designers. They integrated two orthogonal concepts, protection and translation, into the same mechanism (page tables, segment tables, etc). The operating system wants to do translation so it can implement virtual memory. The userspace program wants to do protection so it can use parallel contexts efficiently. Mondrian memory protection would fix this, but no one has implemented it in a commercial microprocessor (to my knowledge).

  • Thank god (Score:5, Funny)

    by Fizzl (209397) <{ten.lzzif} {ta} {lzzif}> on Monday December 17, 2007 @01:50PM (#21727228) Homepage Journal
    Thank god that Java, C# and other piles of shit I hate do this quite intuitively and easily.
    Guess I had it coming.
    /me closes his eyes and embraces C++ for the last time before the inevitable doom
    • by ByOhTek (1181381)
      How about some SDL threading [libsdl.org]?
      Not doing all that other stuff? Maybe pthread can save your soul [llnl.gov]?

      • Re: (Score:3, Informative)

        by Fizzl (209397)
        Ugly, clumsy and complicated compared to Java's way.

        I know how to do threading in C++ on every platform I have used for development. It's just that the modern languages have elegant system with forethought given to threading while desinging the platform/language. Why would anyone new want to learn how to do clumsy non-standard threading in C++?
        I think the options are to adapt or continue riding the dinosaur untill they die out and be left behind. Sorry that I am sending mixed singnals. I have always worked
    • Re:Thank god (Score:5, Informative)

      by zifn4b (1040588) on Monday December 17, 2007 @02:11PM (#21727570)

      The only significant thing that managed languages make easier with regard to multithreading other than a more intuitive API is garbage collection so that you don't have to worry about using reference counting when passing pointers between multiple threads.

      All of the same challenges that exist in C/C++ such as deadly embrace and dining philosophers still exist in managed languages and require the developer to be trained in multi-threaded programming.

      Some things can be more difficult to implement like semaphores. You also have to be careful about what asynchronous methods and events you invoke because those get queued up on the thread pool and it has a max count.

      I would say managed languages are "easier" to use but to be used effectively you still have to understand the fundamental concepts of multithreaded programming and what's going on underneath the hood of your runtime environment.

      • Re: (Score:3, Insightful)

        by anwyn (266338)
        Yet another annoying attempt to force garbage collection on C++!

        Garbage collection is a one size fits all solution, that is not appropriate for all the applications in the C++ problem space. Further there is a lot of C++ code already out there that does its own memory management. It would be difficult to retrofit this code to garbage collection.

        Furthermore, many garbage collected languages lack proper destructors. At best they have a finalize method. This interfears with the C++ idiom "object creation is

    • Re: (Score:3, Interesting)

      by gbjbaanb (229885)
      Yeah, but they do it really slowly 'cos you're tied to the framework that has to do it safely no matter what - even if you have 2 threads that never interact with each other, the framework will slap synchronisation all over them anyway.

      (I know - I had a discussion with a chap about C# thread-safe singleton initialisation. A simple app to test performance on my little laptop had a static initialised singleton taking 1.5 seconds, lock-based initialisation in 6 seconds. No big deal, we expect that, but then I
    • Re: (Score:3, Insightful)

      by Richthofen80 (412488)
      Speaking of C#, MS just released a technology preview that adds extensions / namespaces to C# that make it pretty easy to write parallel-executing code:
      http://www.microsoft.com/downloads/details.aspx?FamilyID=e848dc1d-5be3-4941-8705-024bc7f180ba&displaylang=en [microsoft.com]

      Essentially, they turn
      for (int i = 0; i < 100; i++) {
      a[i] = a[i]*a[i];
      }

      into

      Parallel.For(0, 100, delegate(int i) {
      a[i] = a[i]*a[i];
      });

      and the hint tells the .NET runtime to execute the solution in parallel. No shared
      • Re: (Score:3, Interesting)

        Essentially, they turn
        for (int i = 0; i < 100; i++) {
        a[i] = a[i]*a[i];
        }

        into

        Parallel.For(0, 100, delegate(int i) {
        a[i] = a[i]*a[i];
        });

        and the hint tells the .NET runtime to execute the solution in parallel. No shared memory, no locks, all done for you. That's the way parallelism should work, IMHO


        So let me get this straight: the runtime is going to
        1. find one or more other threads to farm this work out to, either by creating new ones or taking them fr
  • The basic problem (Score:5, Insightful)

    by ucblockhead (63650) on Monday December 17, 2007 @01:55PM (#21727292) Homepage Journal
    Some algorithms are inherently not amenable to parallelization. If you have eight cores instead of one, then the performance boost you can get can be anywhere from eight times faster to none at all.

    So far, multiple cores have boosted performance mostly because the typical user has multiple applications running at a time. But as the number of cores increases, the beneficial effects diminish dramatically.

    In addition, most applications these days are not CPU bound. Having eight cores doesn't help you much when three are waiting on socket calls, four are waiting on disk access calls and the last is waiting for the graphics card.
    • Some algorithms are inherently not amenable to parallelization.
      Are you sure about that? If you put 9 women on the task of making a baby it only takes a month...
    • Re: (Score:2, Interesting)

      by Anonymous Coward

      In addition, most applications these days are not CPU bound. Having eight cores doesn't help you much when three are waiting on socket calls, four are waiting on disk access calls and the last is waiting for the graphics card.

      Processors don't "wait" on blocked IO calls. Your program waits while the processor switches to another task. When the processor switches back to your program, it checks to see if the blocked IO call has completed. If it has, it continues executing your program again. If not, your program continues to wait while the processor again switches to other tasks.
      So it is you (as the programmer) that determines if your program just sits and waits for blocked IO to complete. Or you could spawn a thread for blo

      • With more processors, your program and its blocked IO calls will be checked more frequently. So even blocked IO calls will see a performance increase.
        You fail [wikipedia.org] it.
      • Processors don't "wait" on blocked IO calls. Your program waits while the processor switches to another task.


        That's his *point*, nimrod. Processors don't wait on blocked I/O calls, but processes do. Therefore, having umpteen processors doesn't do you much good if there are no processes ready to run because they're all waiting on something.

        Chris Mattern
    • The bigger problem is that which the article mentioned: That programmers don't know how to take advantage of the parallel cores. There are two major parts to this:

      1) Just because a given algorithm can't be implemented multi-threaded, doesn't mean there isn't another algorithm that does the same thing that can. So part of it is learning new ways of doing old things, or inventing new ways of doing things (we haven't discovered every possible algorithm).

      2) Rethinking program design so that even though a given
    • by savuporo (658486) on Monday December 17, 2007 @04:35PM (#21730302)
      So far, multiple cores have boosted performance mostly because the typical user has multiple applications running at a time. But as the number of cores increases, the beneficial effects diminish dramatically.
      They diminish, but they never disappear. Even in algorithms where you completely have to wait the results of previous computation to go on, you can still get a speedup with branch prediction. In essence, while your one core is cracking the numbers, other cores do the what if work, and even if you mispredict in lots of cases, you can still get speedups with large datasets, because in some cases, when your first core comes up with a result, you will discover that the what if computation started out with a right guess.
      Hey, i hear they are doing essentially the same stuff with all those newfangled multiscalar processors and branch prediction anyway.
  • For now the biggest advantage of multiple cores is the ability to run multiple applications with each running at full speed. Within each application the problems get a lot more complex, using current algorithms many tasks are not easily subdivided. With data that is inherently paralizable it's pretty easy - each pixel on your display is relatively independent of the others and drawing on a common dataset. However the majority of other areas are not so easy. Generally, how do you take an algorithm and di
  • Most of the jobs being created are not for achieving maximum speed but standards compliance. Companies want software which is easy to maintain & portable, but not necessarily the fastest. If it still was 1997 there would probably be ubiquitous implementations for SMP & vectored assembly language, but that's not the focus anymore.

  • There is currently no working concurrency model for standard C++. You want to make an atomic access to an object? Hope and pray that you have bug free system libraries and a compiler that doesn't optimize away your locking wrappers and do inappropriate speculative stores. Apparently the next C++ standard will address it, but it seems rather foolish to start a transition to massively multithreaded code without an actual standard.
    • Wait a second! Have you ever coded in C++ ? Even if threads are not in the standard library, you have boost, you have Intel's TBB(threading building blocks), besides the native threading library. Do you trust you library in Java? What if the VM screws everything up. As for the compiler "optimizing" everything there is a little keyword : volatile that just tells the compiler not to optimize memory access for that varible. A think the real problem is working in a new programming paradigm : have a problem with
  • Personal computing? (Score:5, Interesting)

    by Dan East (318230) on Monday December 17, 2007 @02:10PM (#21727536) Homepage Journal
    "processors with more than eight cores, possible as soon as 2010 -- will transform the world of personal computing"

    Exactly what areas of "personal computing" are requiring this horsepower? The only two that come to mind are games and encoding video. The video encoding part is already covered - that scales nicely to multiple threads, and even free encoders will use the extra cores to their full potential. That leaves gaming, which is basically proprietary. The game engine must be designed so that AI, physics, and other CPU-bound algorithms can be executed in parallel. This has already been addressed.

    So this begs the question, exactly how will average consumer benefit from an OS and software that can make optimum use of multiple cores, when the performance issues users complain about are not even CPU-bound in the first place?

    Dan East
    • Not that it takes massive (by today's PC standards) compute power to do decent speech recognition, but it's definitely worth dedicating a core or two.

      And then with Vista, you might need one or two cores dedicated to handling UAC events ("The user tried to breath again: Cancel or Allow?").
    • Re: (Score:3, Insightful)

      by bogie (31020)
      "So this begs the question, exactly how will average consumer benefit from an OS and software that can make optimum use of multiple cores"

      AOL 10.0 will say "You got mail!" .25ms faster.
    • by eth1 (94901)
      The only way I can think of that the average consumer will benefit from an 8-core proc is that they'll be able to be infected by up to 7 botnet clients before their computer starts slowing down...
    • Re: (Score:3, Insightful)

      by kebes (861706)
      You point out that few desktop tasks require parallel processing... but think about the flip-side of this: if we could speed-up many tasks, how would that affect desktop computing?

      There are plenty of tasks that people do routinely on computers that are not "instantaneously" fast (spreadsheets, photo-editing, etc.). Furthermore there are many aspects of modern user interfaces that would be better if they were faster (generating thumbnail previews, sorting entries, rescanning music collections, searching,
    • Re: (Score:3, Insightful)

      Exactly what areas of "personal computing" are requiring this horsepower?

      Video, audio, gaming, emulators, and VMs are starters. But I think you're missing some of the picture. Most computer users have one or two programs open at a time and end up quitting everything when they want to run something processor intensive like a game or photoshop. With the move towards multi-core and with a little work from developers, people might be able to leave 90% of the apps they use running, all the time. Multiple cores also provides something of a buffer. When a thread goes rogue, their ma

    • Re: (Score:3, Informative)

      by ppanon (16583)
      Well, you could parallelize recalculation of large spreadsheets. Create dependency trees for cells and split the branch recalculations among different threads. Some accountants and executives with large "what-if?"-type spreadsheets could find that quite useful.

      Browsers could have separate threads for data transfer and rendering. If the web site is using

      tags and CSS, you could split the rendering work for each div to a separate thread. More rapid and frequent partial screen updating can provide today's gen

  • I agree with some of the previous posters that have faulted programmers for "the state of today." My feeling is that the divide between knowledge of hardware and knowledge of software is far too wide. In my experience, I have witnessed many programmers who spent more time organizing the readability of their code than analyzing the actual effectiveness of it: i.e. whitespace use vs algorithm optimization (be it processor method + instruction or i/o improvement). The end result: bloaty-pooh.

    I feel that by
    • Re:Diaspora (Score:4, Insightful)

      by Chirs (87576) on Monday December 17, 2007 @04:08PM (#21729790)
      For many large-scale software projects (I work in industry so I have some experience with this) it is far easier to find more cpu power than more programmers.

      Making code easy to read and maintain is critical to maximizing the efficiency of the programmer. The efficiency of the code is generally a secondary issue, and is only a factor if the code in question is found to be a bottleneck.

      Brian Kernighan once said,

      "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
  • HPC (Score:2, Interesting)

    by ShakaUVM (157947)
    As someone who got a master's in computer science with a focus in high performance computing / parallel processing, and have taught on the subject, *yes*, it does take a bit of work to wrap one's mind around the concept of parallel processing, and to correctly write code with concurrency. But *no*, it's not really that hard. Once you get used to the idea of having computation and communication cycles over a processor geometry, it becomes little more difficult to write parallel code than serial.

    It's like of
    • Re: (Score:3, Insightful)

      by curunir (98273) *
      I think the one thing that makes parallel computing more difficult, and quite a bit more so than recursion, is the fact that it makes your program non-deterministic. With a single-threaded application, it's pretty obvious when you've made your application non-deterministic...you reference the time or some resource external to your application. And those kinds of non-deterministic behaviors are much easier to understand...they're mostly just data. But if your application is running on multiple processors usi
  • by scorp1us (235526) on Monday December 17, 2007 @02:19PM (#21727696) Journal
    Full disclosure: I am a Qt Developer (user) I do not work for TrollTech

    The new Qt4.4 (due 1Q2008) has QtConcurrent [trolltech.com], a set of classes that make multi-core processing trivial.

    From the docs:

    The QtConcurrent namespace provides high-level APIs that make it possible to write multi-threaded programs without using low-level threading primitives such as mutexes, read-write locks, wait conditions, or semaphores. Programs written with QtConcurrent automaticallly adjust the number of threads used according to the number of processor cores available. This means that applications written today will continue to scale when deployed on multi-core systems in the future.

    QtConcurrent includes functional programming style APIs for parallel list prosessing, including a MapReduce and FilterReduce implementation for shared-memory (non-distributed) systems, and classes for managing asynchronous computations in GUI applications:

            * QtConcurrent::map() applies a function to every item in a container, modifying the items in-place.
            * QtConcurrent::mapped() is like map(), except that it returns a new container with the modifications.
            * QtConcurrent::mappedReduced() is like mapped(), except that the modified results are reduced or folded into a single result.
            * QtConcurrent::filter() removes all items from a container based on the result of a filter function.
            * QtConcurrent::filtered() is like filter(), except that it returns a new container with the filtered results.
            * QtConcurrent::filteredReduced() is like filtered(), except that the filtered results are reduced or folded into a single result.
            * QtConcurrent::run() runs a function in another thread.
            * QFuture represents the result of an asynchronous computation.
            * QFutureIterator allows iterating through results available via QFuture.
            * QFutureWatcher allows monitoring a QFuture using signals-and-slots.
            * QFutureSynchronizer is a convenience class that automatically synchronizes several QFutures.
            * QRunnable is an abstract class representing a runnable object.
            * QThreadPool manages a pool of threads that run QRunnable objects.

    This makes multi-core programming almost a no-brainer.
    • Re: (Score:3, Insightful)

      by Rodyland (947093)
      This makes multi-core programming almost a no-brainer.

      While you did say 'almost', I'm still going to take exception with that statement.

      That is a very dangerous thing to say without reams of qualifications.

      Programming (of any non-trivial nature) is not currently, nor is it likely to be any time soon, a 'no-brainer'. No library, no framework, no toolset, no abstraction takes away from the core fact that programming is hard. Sure, you can take away the boring/trivial stuff and give the programmers more

  • by MindPrison (864299) on Monday December 17, 2007 @02:23PM (#21727754) Journal
    It's not easy... especially since things sort of halted at 4 ghz, what on earth am I typing about? Well...picture this...limitations...yes they do exist..and sometimes it's important to think beyond what lies just straight ahead (such as the next cycle speed)...and think into a second...maybe even a 3rd dimmension to expand your communication speed. I have for over 6 years been thinking..of a 3d-dimmension processor that cross communicates over a diagonal matrix instead of the traditional serial and parallel communication model. Imagine this folks...if your code could "walk" across a matrix of 10 x 10 x 10 instead of just 8 x 8 or 64 x 64 if you want...get the picture, no? Imagine that your data could communicate on a 3 dimmensional axis - imagine that you had 10 stacks of cores on top of each other - and instead of just connecting they communication bus to a parallel or a serial model...they could in fact communicate on a diagonel basis... this would make it possible to send commands...data..etc....in a 3d-space rather than just a "queue". This of course...would demand a different "mindset" of coding... everything would have to be written from scratch....though...but the benefits would be tremendeous .....you could 10 fold existing computational speed by increasing the communication across processor-cores...maybe even more! Even by todays technology standards. Ok..ok...sounds far fetched for you doesnt it? Well..get this...this was my invention 6 years ago (maybe even 9 years ago...I am getting older so I dont really care...I do care for freedom of information and sharing...Not so much wealth so listen on)...The theory of what I just wrote here on Slashdot (which has more implication on your life in the future than you will ever be capable of comprehending...yes...I am full of myself aint i....Who cares? You dont know me) .. point is... There was once a missing brick to the idea of diagonal cross matrix computing....with yesteryears technology it just would not be feasible to do it... but ...if you have ANY understanding of what I write here (yes...I am not kidding...this may change history as we know it...and I am drunk right now...and I dont want to keep a lid on it anymore)...here we go... Please think about what I just wrote - and - look up frances hellman's lecture upon magnetic materials in semiconductors...and you WILL have your 4-th link in the 3-B-E-C (base, Emitter, Collector) construction...to make the Cross Matrix Processor possible....just understand this....JoOngle invented this...Frances made it possible - YOU read it from a drunk nobody of Slashdot.org....) now...go make it real!
  • by Nonillion (266505) on Monday December 17, 2007 @02:24PM (#21727770)
    processors with more than eight cores, possible as soon as 2010 -- will transform the world of personal computing....

    Translation:

    Code will get even more inefficient / bloated and require faster hardware to do the same thing you are doing now. While I'm all for better / faster computer hardware, most if not all Jane and Joe Sixpack users never need Super Computer power to surf the net, read e-mail and watch videos.
  • Erlang (Score:5, Informative)

    by Niten (201835) on Monday December 17, 2007 @02:24PM (#21727778)

    Oddly enough, I just watched a presentation about this very topic, with an emphasis on Erlang [erlang.org]'s model for concurrency. The slides are available here:

    http://www.algorithm.com.au/downloads/talks/Concurrency-and-Erlang-LCA2007-andrep.pdf [algorithm.com.au]

    The presentation itself (OGG Theora video available here [linux.org.au]) included an interesting quote from Tim Sweeney, creator of the Unreal Engine: "Shared state concurrency is hopelessly intractable."

    The point expounded upon in the presentation is that when you have thousands of mutable objects, say in a video game, that are updated many times per second, and each of which touches 5-10 other objects, manual synchronization is hopelessly useless. And if Tim Sweeney thinks it's an intractable problem, what hope is there for us mere mortals?

    The rest of this presentation served as an introduction to the Erlang model of concurrency, wherein lightweight threads have no shared state between them. Rather, thread communication is performed by an asynchronous, nothing-shared message passing system. Erlang was created by Ericsson and has been used to create a variety of highly scalable industrial applications, as well as more familiar programs such as the ejabberd Jabber daemon.

    This type of concurrency really looks to be the way forward to efficient utilization of multi-core systems, and I encourage everyone to at least play with Erlang a little to gain some perspective on this style of programming.

    For a stylish introduction to the language from our Swedish friends, be sure to check out Erlang: The Movie [google.com].

  • Real men think in parallel.
  • by richieb (3277) <richieb@gmail . c om> on Monday December 17, 2007 @02:25PM (#21727798) Homepage Journal
    Check out this article [oreilly.com] on O'Reilly's site. Threads are actually very low level construts (like pointers and manual memory management). Accordingly the future belongs to languages that eliminate threads as a basis for concurrency. See Erlang and Haskell.

  • by jskline (301574) on Monday December 17, 2007 @02:37PM (#21727940) Homepage
    The fact is that programming by and large has gotten lazy, shiftless and sloppy over time and not any better or faster. They really did rely on processing and memory architectures getting faster to overcome their coding bottlenecks. The words; "optimized code" have little or no significance in todays programming shops because of budgets. Because of the push to get stuff out the door as quickly as possible, corners are cut all over the place on many things.

    There once was time when debugging was part of your job. Now; someone else does that and at most, the better coders do some unit testing to ensure their code snippet does what it is supposed to. There generally isn't any "standard" with regard to processes except in some houses that follow *recommended coding guidelines* but these are few and far between. Old school coders had a process in mind to fit a project as a whole and could see the end running program. Many times now, you are to code an algorithm without any regard or concept as to how it might be used. A lot of strange stuff going on out there in the business world with this!

    If there is a fundamental change in the base for C++, et al., this is going to possibly have a detrimental effect on the employment market as there will be many who cannot conceptualize multi-threading methodologies much less modeling some existing processing in this paradigm; and leave the markets.

    I left the programming markets because of the clash of bean counters vs quality, and maybe this will have a telling change in that curve. I always did enjoy some coding over the years and maybe this would make an interesting re-introduction. I have personally not coded in a multi-threading project but have the concepts down. Might be fun!
  • by Animats (122034) on Monday December 17, 2007 @02:51PM (#21728148) Homepage

    I have little hope for the C++ standards committee. It's dominated by people who think really l33t templates are really cool. Everything has to be a template feature. They're fooling around with a proposal for declaring variables atomic through something like atomic<int> n; This allows really l33t programmers to write really l33t code using really l33t lockless programming. But without the proofs of correctness needed to make that actually work reliably.

    It's also long been Strostrup's position that concurrency is a library problem. As long as the OS provides threads and locking, it's not a language problem. This isn't good enough.

    The fundamental problem is that, as currently defined, a C++ compiler has no idea which variables are shared between threads, and which are never shared. The compiler has no notion of critical sections. Fixing this requires some fundamental changes to the language. It's known what to do; Modula, Ada, and Java all have synchronization and isolation built into the language. But there's nothing like that in C++, and the designers of C++ don't want to admit their mistakes.

    It's not just a C++ problem. Python has a similar issue. Python as a language doesn't deal with concurrency adequately. The main implementation, CPython, has a "global interpreter lock" that slows the thing down to single-CPU speed.

  • by steveha (103154) on Monday December 17, 2007 @02:54PM (#21728202) Homepage
    I know that languages like Erlang and Haskell are better for concurrent programming than more traditional languages. However, so far they have not been as popular as more traditional languages.

    Will the new world of concurrency cause a shift in language popularity? Or will traditional languages remain more popular, perhaps with some enhancements? C++ is gaining concurrency enhancements; C++, Python, and many other languages work well with map/reduce systems like Google MapReduce; and even with no enhancements to the language, you can decompose larger systems into multiple threads or multiple processes to better harness concurrency.

    If you know Haskell and Erlang, please comment: do those languages bring enough power or convenience for concurrency that they will rise in popularity? People grow very attached to their familiar languages and tools; to displace the entrenched languages, alternative languages need to not just be better, they need to be a lot better.

    steveha
  • by ClosedSource (238333) on Monday December 17, 2007 @03:07PM (#21728360)
    Instead of developing single-core chips with better performance, chip makers are now making multicore machines and expecting developers to provide the extra performance.

    Without the work of developers, multi-core chips will be like the extra transistors in transistor radios in the 1960s: good for marketing but functionally useless.
    • Even a child's toy like the Nintendo DS from 2004 has two cores. Developers need to remember it isn't the early 1990s anymore and that they will have to deal with multiprocessor machines.
  • by bjb_admin (1204494) on Monday December 17, 2007 @03:19PM (#21728598)
    No need for parallel computing all cores are already used.

    Core one: For the OS
    Core two: Anti-virus
    Core three: Anti-Spyware / Windows Defender
    Core four: Firewall
    Core five: Windows update notifications and installations
    Core six: Windows Genuine advantage checks
    Core seven: Eye Candy (Vista) with XP you get a bonus CPU
    Core eight: What ever the user wants to run, except when you get a virus, then
    you have to share it with the SPAM bot.

    Guess we will be waiting for 16 core CPU's.

    Oh and don't start me on memory requirements :-)

  • by athloi (1075845) on Monday December 17, 2007 @03:21PM (#21728658) Homepage Journal
    When I first started programming, in BASIC on an Apple ][ (not IIe), I remember being baffled by the fact that the computer did not operate with multiple concurrent streams [blogspot.com]. To me, this seemed the point of making something that was "more than a calculator," and the only way we would be able to do the really interesting stuff with it.

    When I first started writing object-oriented code, I was somewhat dismayed to find that OO was an extension to the same ol' linear programming. It seemed to me that objects should be able to exist as if alive and react freely, but really, they were just a fancy interface to the linear runtime. Color me disapointed yet again.

    It's an important paradigm shift [chrisblanc.org] to recognize parallel computing. Maybe when the world realizes the importance of parallel computing, and parallel thinking, we'll have that singularity that some writers talk about. People will no longer think in such basic terms and be so ignorant of context and timing. That in itself must be nice.

    Sutter's article hits home with all of this. His conclusion is that efficient programming, and elegant programming that takes advantage of, not conforms to, the parallel model is the future. Judging by the chips I see on the market today, he was right, 2.5 years ago. He will continue to be right. The question is whether programmers step up to this challenge, and see it as being as fun as I think it will be.
  • by Furry Ice (136126) on Monday December 17, 2007 @08:47PM (#21733364)
    I see a lot of comments indicating that all a programmer needs to do to scale to more cores is just multithread your algorithms. If only that were true! Unfortunately, memory access patterns become extremely important for getting good performance, and that requires some pretty sophisticated knowledge about the hardware and proper tuning is almost a black art. Once large numbers of cores are in use, scaling your software optimally is going to be very difficult. Don't delude yourself. Talented programmers are going to be very much in demand, and I suggest starting to learn everything you can about it now. For starters, Ulrich Drepper has written an incredibly detailed and helpful article available at http://people.redhat.com/drepper/cpumemory.pdf [redhat.com] which should really help dispel any notions that this change to computing is going to be easy!
  • by Zork the Almighty (599344) on Monday December 17, 2007 @10:10PM (#21733876) Journal
    Perhaps I am the only person who thinks this, but is seems to me that threads are not a very good low-level primitive for concurrent programming. They inherently assume that whatever is running on the different processors is independent. As a result, writing a tightly coupled parallel algorithm is "hard".

    I would much rather the operating system switch 4 or 16 synchronized cores completely over to me. Add prefixes to the assembly instructions so that I can explicitly execute instructions on processor 1, 2, 3, etc, in a shared memory model. Add logic similar to simultaneous multithreading to keep unused cores saturated with instructions from other threads when possible. This would help the programmer extract parallelism from tightly coupled algorithms. There seems to be no real multithreaded analogue to assembly language, and I think that is a big part of the problem. If we had such a thing it would be much easier to write tightly coupled parallel code, and higher level parallelization (from compilers) would follow inevitably.

    Of course I'm not saying this is some sort of magic bullet. We would still need to split up computations and use threads as best as possible, but I think this is an obvious tool that we are missing.

Help me, I'm a prisoner in a Fortune cookie file!

Working...