Introduction to Distributed Computing
dosten writes "ExtremeTech has a nice intro article on distributed and grid computing." Someday someone will successfully implement something like Progeny's NOW, and all of these assorted hacks at building distributed computing systems will be superseded.
We could spend millions to do this... (Score:3, Funny)
Time warp? (Score:1, Troll)
Um, maybe it's me, but how could it be sooner than now? If these guys have a working time machine, maybe they ought to try to capitalize on that instead of writing an OS.
Re:Time warp? (Score:1)
Very funny, but I'll bite. Anything that has already happened, happened sooner than now.
Re:Time warp? (Score:1)
NOW == Network Of Workstations
Re:Imagine.... (Score:2)
Interesting distributed computing (Score:1, Troll)
Of course, computation in a vacuum (ha ha) is useless. Information on the results of the computations is carried around via cosmic rays, neutrinos and the like.
The really exciting thing is that the conclusion of the paper calls for research into the general direction that cosmic rays are flowing which may lead us right to the location of God Himself!
Re:Interesting distributed computing (Score:1)
How very sad (Score:1)
Re:Like all honest scientists... (Score:1)
Is it not far more logical to say, "We exist in this universe because it is the one that has the correct conditions for our existence"?
And that bloody argument that the "vertebrate eye is too complex to have come about by evolution, therefore evolution is wrong." How do people persist in using this absurd statement, despite the fact that there are organisms possessing every gradation from a simple light-sensitive nerve in some worms on up to the vertebrate eye? If you study biology you can see all the stages of the evolution of biological optics. And yet just last week I saw the "vertebrate eye" argument quoted in a newspaper as proof of intelligent design.
Ye gods, what fools these mortals be.
Distributed... (Score:3, Interesting)
Re:Distributed... (Score:2)
Except for the "paying people" part, United Devices [ud.com] does just that.
The downside of distributed computing is figuring out how to split a given problem into pieces that can be processed separately. Not all problems can be split up, and for those that can be split, figuring out the best way to do so isn't always trivial.
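To make the splitting point concrete, here is a minimal sketch (Python; the work function and chunk count are made up for illustration) of the easy case, where the pieces are completely independent and the partial results just get combined at the end:

```python
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for the real work; here we just sum squares over the chunk.
    return sum(x * x for x in chunk)

def split(data, n_pieces):
    # Carve the input into roughly equal, independent slices.
    size = (len(data) + n_pieces - 1) // n_pieces
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(1000000))
    chunks = split(data, 8)
    with Pool(processes=8) as pool:
        partials = pool.map(process_chunk, chunks)  # each piece runs separately
    print(sum(partials))                            # combine the partial results
```

The hard cases are the ones where process_chunk would need data from the other chunks; then the split itself becomes the research problem.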
Re:Distributed... (Score:2, Insightful)
This stuff is just overhyped by some companies that think they can make big bucks off it.
The whole article at once (Score:5, Informative)
Rather than a popup ad per page.
Re:The whole article at once (Score:1)
Doug
Ya know ... (Score:2)
With a large-scale distributed system using the distributed translation project, things like this may in the future look like the following:
"My buddies and I are wimps so we pretend to be big shots online. So therefore we have created a small group called cs group. Online we are also seen as [tgk] to signify our uniqueness from you. We (being cs group) would like to point out the fact that we know a lot on the topic of distributed systems and would like to tell you our thoughts. We know all our posts will get 5's"
I can just see it now... (Score:1)
Nice introduction to DISTRIBUTED POPUP ADS (Score:1)
Condor wasn't mentioned (Score:1)
Re:Condor wasn't mentioned (Score:1)
The irony of these comments... (Score:1)
"That was all right at the time, because it was easy to raise money for ambitious development projects such as NOW that could take years to develop and, thus, that might not pay off for years."
and
"...Most new hires came in to work on projects that had the potential to bring in revenue sooner than NOW"
check this out (Score:4, Informative)
There is a short description of all the distributed computing projects, plus lots of other stuff.
My university just got a grant to do grid comp. (Score:2, Interesting)
Re:My university just got a grant to do grid comp. (Score:1)
Re:My university just got a grant to do grid comp. (Score:1)
(At least the NSF and NSERC are consistent.)
Try a non-Linux distributed protocol... (Score:3, Informative)
Fly in the ointment (Score:2)
Another networking subject that really interests me is wireless networking. I think that someday in the not too distant future we will see neighborhood networks forming and then a linking of various neighborhood networks to form a new kind of "internet." One that is absolutely not controlled by any group.
Re:Fly in the ointment (Score:2)
They really could accomplish nearly anything. The problem was that the 'details' of everyday life got missed.
Re:Fly in the ointment (Score:1)
That's the most "American" thing I've heard in a very long time...
Related links (Score:1)
There's a Distributed Computing Forum over at Anandtech [anandtech.com]
The problem with distributed computing... (Score:2, Insightful)
If a company/organization has an *actual* need for processor cycles (say, genome research), it's cheaper to buy 1000 boxes and admin the stuff in-house. Even ignoring issues such as sending valuable company data to thousands of internet users, most applications that require large computation also require large amounts of bandwidth, generally provided over a LAN.
This is why you'll never get to render a frame for Toy Story 5: Pixar will need to send you 5GB of data just to get back a 2k image.
Once you weigh the costs of admining a network and writing/distributing your code against the tangible financial benefit from the results, few companies will have a reason to turn to outsiders for a few minutes on their machines.
Re:The problem with distributed computing... (Score:1)
While it's true that sending valuable company data across the Internet is a problem, not all problems are going to require that. Also, while not every problem lends itself to distributed computing, a program that properly implements a problem that does lend itself to distribution won't require large amounts of bandwidth.
You know, you could simply check that statement against current distributed projects. For example, neither Seti@home nor the distributed.net client has large bandwidth demands, but both have high computational demands.
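As a rough back-of-the-envelope illustration of that ratio (the work-unit size and crunch time below are assumptions picked for the sketch, not measured figures for either project):

```python
# Bandwidth needed per client for a Seti@home-style workload.
# Both numbers are illustrative assumptions, not measurements.
work_unit_bytes = 350 * 1024      # assume a few hundred KB downloaded per work unit
crunch_time_hours = 10            # assume hours of CPU time to process it

bytes_per_second = work_unit_bytes / (crunch_time_hours * 3600)
print(f"~{bytes_per_second:.0f} bytes/s sustained")  # on the order of 10 bytes/s
```

A modem can keep up with that; the bottleneck is entirely the CPU.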
Face it, if Pixar had to pass around that much data to render individual frames, its own network would be overwhelmed.
I will have you know, though, that processor cycles are far from free. Building a good supercomputer that can do the work of a distributed system is very expensive no matter what route you take. (The purchase, infrastructure, development, and administration of even a few hundred machines is pricey. Ask Clemson's PARL.)
Re:The problem with distributed computing... (Score:2)
The companies looking to get into this are hoping to make money. I'm saying that's a bad business plan.
And yes, Pixar already passes about that much data. Large scenes/complicated renders can even go higher per-frame.
Re:The problem with distributed computing... (Score:2)
Sorry, that's a bad example. Pixar's existing compute farm doesn't need much networking.
Re:The problem with distributed computing... (Score:2)
But it sure needs confidentiality, both of the rendering code itself and the data it is working on. Otherwise we will all see random frames from every Pixar movie in advance.
Plus the rendering code is quite likely huge and has a lot of dependencies on proprietary codebases. I doubt the stuff would run well on DirectX.
The liquid-metal effect in Terminator cost a million or so to develop and sold for about that the first time, after which it was quickly copied, so now you can get it in a movie for a few tens of thousands of dollars.
The idea of using the internet to do distributed computing is as old as the net itself. We were building SETI type configurations back in the mid 80s, as soon as the price performance of the workstation rendered mainframes obsolete.
Believe it: if Pixar needs more compute cycles, they will go to Dell and buy a room full of cheapo machines. It will cost much less to manage than scraping up processing time from around the net.
Re:The problem with distributed computing... (Score:4, Informative)
Consider that a University of Wisconsin study showed that, on average, desktop computers are idle at least 60% of the time. And that doesn't count the cycles lost between keystrokes --- I'm talking about extended periods of time. For example, almost all desktop machines are idle overnight. That's 50% already. Now add lunch time, meetings, etc.
That's where systems like Condor [wisc.edu] come in. Researchers at Wisconsin got hundreds of years of CPU time on machines they already had, without impacting others.
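A quick back-of-the-envelope sketch of how that adds up (the machine count below is an assumption for illustration, not a figure from the Wisconsin study):

```python
# Cycle scavenging, roughly: idle time on machines you already own.
machines = 1000        # desktops already sitting on the network (assumed)
idle_fraction = 0.60   # idle at least 60% of the time, per the study above

cpu_years_per_year = machines * idle_fraction
print(f"~{cpu_years_per_year:.0f} CPU-years of otherwise wasted time per year")
# -> ~600 CPU-years, which makes "hundreds of years of CPU time" quite plausible.
```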
Coming back to your argument, the counterargument is that you may not even need to buy additional boxes --- just use the ones you already have more efficiently by running a distributed computing system on them.
As far as the "freeness" of processor cycles goes, let me tell you that optimization researchers can soak up as much CPU as you can possibly throw at them. Also, if you look up the Particle Physics Data Grid (PPDG) and GriPhyN, you'll find that many distributed computing problems are I/O driven.
++Rajesh
Re:The problem with distributed computing... (Score:2)
Many of us need that 60% idle time to keep our CPUs running at a reasonable temperature. I have my CPU and case cooling under control, but now I think I need to put multiple A/C zones in my house thanks to distributed.net.
Re:The problem with distributed computing... (Score:2)
However, there are half a dozen companies now that think they're going to make money off people using these programs for large projects.
The reality of the matter is, if d.net had to support itself financially, it'd get rid of the internet users and stick to in-house boxes.
I'm not dissing distributed computing: it has its benefits. But it will probably always be limited to research/educational projects.
My point is that if I'm a CGI guy who needs CPU cycles today, it's cheaper to buy them myself than to farm the work out to a third party. So long as Moore's law holds up, this will remain true. There's a study on this I can't find right now.
Notes and comments (Score:3, Informative)
One of the great benefits of Grid computing over distributed computing is access to resources, such as storage. This is what PPDG seeks to do: give physicists near-real-time access to the results of experiments. The problem is that the experiments may be performed at CERN while the researcher is at CalTech. That's normally fine for a telnet session or whatnot, but it is a problem when an experiment can produce petabytes of data. For more information, see http://www.ppdg.org [ppdg.org]. There is another project called NEESGrid [neesgrid.org] that will provide remote access to earthquake simulation equipment. Truly cool.
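To see why petabyte-scale data is the sticking point, a quick calculation (the link speed is an assumed figure for illustration only):

```python
# How long does it take to move a petabyte? The link speed is an assumption
# for illustration; real research links vary widely.
petabyte_bits = 1e15 * 8       # one petabyte, in bits
link_bits_per_second = 1e9     # assume a dedicated 1 Gbit/s path CERN -> CalTech

seconds = petabyte_bits / link_bits_per_second
print(f"~{seconds / 86400:.0f} days per petabyte")  # roughly three months at 1 Gbit/s
```

Moving the data around is as hard a problem as finding the CPU cycles.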
I also encourage you to check out Globus [globus.org]. Using a system like the Globus Toolkit along with MDS, I can locate a machine and execute my program on it transparently. This transparency is taken care of through a network of resource managers, proxies and gatekeepers. It's pretty cool and is pretty easy to install on your favorite Linux box.
Programming Grid enabled applications is pretty easy. There are software libraries called CoG Kits [cogkits.org] that provide simple APIs for Java, Python and a few other languages. In just a few lines of code you can have a program that looks up a server to run your executable on, connects, executes and returns the data to you.
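I can't paste the actual CoG Kit calls from memory, but here is a toy sketch of the same look-up / connect / execute / return pattern using nothing but Python's standard library; the "registry" and "gatekeeper" below are invented stand-ins for MDS and the real Globus services:

```python
# Toy sketch of "find a machine, run a job there, get the result back".
# NOT the CoG Kit API: the registry and job interface are invented for
# illustration, using only the Python standard library.
from xmlrpc.server import SimpleXMLRPCServer
import xmlrpc.client
import subprocess
import threading

def start_gatekeeper(port=8000):
    """Stand-in 'gatekeeper': runs a command locally and returns its output."""
    server = SimpleXMLRPCServer(("localhost", port), logRequests=False)

    def run_job(command):
        return subprocess.run(command, capture_output=True, text=True).stdout

    server.register_function(run_job)
    threading.Thread(target=server.serve_forever, daemon=True).start()

# Toy directory service mapping resource names to endpoints (the MDS role).
REGISTRY = {"compute-node-1": "http://localhost:8000/"}

if __name__ == "__main__":
    start_gatekeeper()
    endpoint = REGISTRY["compute-node-1"]                    # 1. look up a machine
    node = xmlrpc.client.ServerProxy(endpoint)               # 2. connect to it
    output = node.run_job(["echo", "hello from the grid"])   # 3. execute remotely
    print(output)                                            # 4. data comes back to us
```

The real toolkit adds the parts that matter in practice: the proxies and gatekeepers for authentication, resource brokering, and staging your data to and from the remote machine.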
The current push right now is towards OGSA [globus.org], the Open Grid Services Architecture, which will form the basis for Globus 3.0. OGSA will take ideas from web services, like WSDL and service advertisement, and apply them to create Grid services. This will be the next big thing, with services easily able to advertise themselves and clients easily able to find them.
ignore the speeds and feeds (Score:3, Informative)
Render farms are embarrassingly parallel, requiring no communication with neighbors while rendering a frame. They do require a large amount of data before starting on the next frame, but you can either pipeline that (which they usually don't do) or double up on the number of compute nodes (which is more common).
Suppose instead you want to solve a big mesh problem, like a 3D cube with 10^10 points on a side, and it's a fairly simple computation. You might need 10^5 or 10^6 nodes, and the data traffic between nodes would look like a DoS attack if it took place on the internet.
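A rough sketch of why the mesh case hammers the network (the mesh size, node count, and bytes per point below are assumptions chosen for the sketch, not the numbers from the comment above): with a domain decomposition, every node has to exchange the faces of its sub-cube with its neighbors on every timestep.

```python
# Per-step halo-exchange traffic for a domain-decomposed 3D mesh.
# All figures are illustrative assumptions.
points_per_side = 10_000    # global mesh: 10^4 points per side, 10^12 total
nodes = 100_000             # 10^5 compute nodes
bytes_per_point = 8         # one double per point

points_per_node = points_per_side ** 3 / nodes
local_side = points_per_node ** (1 / 3)               # side of each node's sub-cube
halo_bytes = 6 * local_side ** 2 * bytes_per_point    # six faces exchanged per step

print(f"~{halo_bytes / 1e6:.1f} MB per node, per timestep")
print(f"~{halo_bytes * nodes / 1e9:.0f} GB total across the machine, every step")
```

A couple of megabytes per node per step is nothing on a cluster interconnect, but aggregated across a hundred thousand hosts on every step it is exactly the kind of traffic the public internet cannot carry.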
And then there is the rich space of possibilities between these two extremes, and the cross-product with storage. It is a fascinating area to work in because there is much yet to learn, and the possibilities for new networks, processors and storage evolve all the time. Things that were impossible last year are within reach this year or next.
But.... just as 100 Volkswagen Beetles may have the same horsepower as a huge earthmoving machine, the Beetles cannot readily move mountains... and 100 or 1000 or 10000 PCs with a low-cost interconnect are not equal to a supercomputer or a supercluster that may support a communications-to-computation ratio 10^6 times greater - and thus a much greater range of useful distribution algorithms.
For a more technical introduction... (Score:2)
This one covers issues such as parasite attacks, spoiler attacks, etc.
Slashdot rejected my guide when I submitted it. Whine whine gripe gripe.
TNC (Score:2)
They weren't nearly the last ones to announce that they had done such a thing. For a while, in the mid-80's, it was somewhat of an inside joke. It seemed that everyone was making their own distributed unix system using the same design.
I built one myself, and so did a fellow down the hall from me (at Project Athena at MIT). We both spent about a month of our spare time on it, and both of ours worked. One of my demos consisted of a Makefile with source scattered across as many machines as I could get accounts on. I showed that, despite the fact that the clocks on some machines were off by hours or days, my code correctly adjusted for clock skews and compiled the right things. I didn't need to modify make or the compiler, I just linked them to my libcnet.a, which replaced all the system calls with my distributed routines, and they corrected for the clock problems.
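For anyone curious what the clock-skew trick looks like, here is a tiny sketch of the idea (this is not the original libcnet.a, just an illustration in Python; the hosts and offsets are made up): keep a per-host clock offset and normalize every timestamp to one reference clock before deciding whether a target is out of date.

```python
import os

# Hypothetical per-host clock offsets, in seconds, relative to the local clock.
# In a real system these would be measured, not hard-coded.
CLOCK_OFFSET = {
    "localhost": 0.0,
    "athena-3": -7200.0,    # this host's clock runs two hours behind
    "athena-7": 86400.0,    # this one is a full day ahead
}

def adjusted_mtime(host, path):
    """Return a file's modification time normalized to the local clock."""
    raw = os.stat(path).st_mtime   # stand-in for the remote stat() call
    return raw - CLOCK_OFFSET[host]

def needs_rebuild(target_host, target_path, sources):
    """make-style check: rebuild if any source is newer than the target,
    after correcting every timestamp for its host's clock skew."""
    target_time = adjusted_mtime(target_host, target_path)
    return any(adjusted_mtime(h, p) > target_time for h, p in sources)
```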
The problem isn't the difficulty in building a truly distributed system. Any competent software engineer should be able to do that. The problem is that the commercial world has no interest in selling such a thing, and the non-commercial world remains ignorant of things like this that were demoed several decades ago.
One of the true frustrations from having built such a system is having to work with things like NFS, that still can't get its clocks right (at least not without requiring super-user permissions on every subsystem). When I decided to solve this problem so that make would work, it took me a morning, and I didn't use super-user permissions anywhere.
BTW, the Newcastle system was used internally in a number of corporations. But the many attempts to make it more widespread just hit brick walls. So now we have the kludgery of HTTP and URLs rather than the simple, elegant schemes that the various distributed-system people have used.
We're all goin ta Hell... (Score:1)
What if CureTheCommonCold@Home is really help-pfizer-make-$10-a-dose-cold-medicine-that-tu