Optimizing distcc 201
IceFox writes "Having fallen in love with distcc and its ability to speed up compiling (as anyone who compiles a lot, like Gentoo users or Linux developers, will understand), I recently got the chance to dive deeper into distcc. By itself distcc will decrease your build times, but did you know that if you tweak a few things you can get far better compile times? Through a lot of trial and error, tips from others, profiling, testing and just playing around with distcc, I have put together a nice big article. It shows how developers can get a bigger bang for their buck out of their old computers and distcc with just a few changes."
strlen (Score:5, Funny)
Re:strlen (Score:2)
Wow... (Score:5, Funny)
This is so weird.
I must drink now.
"I do NOT suffer from a mental condition. I'm enjoying every second of it."
Re:Wow... (Score:2, Informative)
Website bit slow... (Score:5, Funny)
Re:Website bit slow... (Score:3, Insightful)
Re:Website bit slow... (Score:2)
Nice big article (Score:5, Funny)
/.-ed already? (Score:5, Funny)
Re:/.-ed already? (Score:3, Funny)
Re:/.-ed already? (Score:2, Informative)
Re:/.-ed already? (Score:2)
anal retentive admin (Score:3, Funny)
From the article:
I even found different colored cable for the different areas of my cube.
I wonder if he also sealed the empty packaging, waste paper, and dead hardware in neat little foil packets before disposing of them in the proper receptacle, which, of course, sits right next to the cozy for his server. ;)
Re:anal retentive admin (Score:2)
Big messes of cables and wires are a real pain in the ass.
Relief for the /. site (Score:2, Informative)
and how to compile kdelibs from scratch in six minutes
If you don't already know about distcc I recommend that you check it out. Distcc is a tool that sits between make and gcc, sending compile jobs to other computers when they are free, thus distributing compiles and dramatically decreasing build times. Best of all, it is very easy to set up.
This, of course, leads to the fantastic idea that anyone can create their own little cluster or farm (as it is often referred to) out of their extra old computers that they have sitting about.
Re:Relief for the /. site (Score:2)
I guess he doesn't mind a lot of noise...
Re:Relief for the /. site (Score:3, Informative)
Re:Relief for the /. site (Score:2)
Putting 12 older PCs in the cubicle and having the same level of noise could mean that either you put some work into making them quiet, or it's quite noisy already :D
Re:Relief for the /. site (Score:2)
There are quite a few Pentium II and Pentium III era PCs that only had one fan in the whole system (some Compaqs at least).
Server's last words (Score:2, Funny)
"Dieing Ben-ja-min" - Short Circuit 2
ccache (Score:5, Interesting)
Re:ccache (Score:3, Informative)
When to use distcc and ccache (Score:3, Informative)
On the same tack, the performance of distcc will (to an extent) depend on the nature of the compilation task used in the test (I am not familiar with
Copy of my article... (Score:4, Redundant)
distcc optimizations - March 30th 2004
and how to compile kdelibs from scratch in six minutes
If you don't already know about distcc I recommend that you check it out. Distcc is a tool that sits between make and gcc, sending compile jobs to other computers when they are free, thus distributing compiles and dramatically decreasing build times. Best of all, it is very easy to set up.
This, of course, leads to the fantastic idea that anyone can create their own little cluster or farm (as it is often referred to) out of their extra old computers that they have sitting about.
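For anyone who hasn't tried it, the basic invocation is only a couple of lines. As a sketch (the host names here are placeholders for your own farm):

    # tell distcc which machines may take jobs; localhost first is common
    export DISTCC_HOSTS='localhost rosalind viola portia'
    # run enough parallel jobs to keep the remote boxes busy
    make -j8 CC=distcc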
Before getting started: In conjunction with distcc there is another tool called ccache, a caching pre-processor for C/C++ compilers, which I won't be discussing here. For all of the tests it was turned off to properly determine distcc's performance, but developers should also know about this tool and use it in conjunction with distcc for the best results and shortest compile times. There is a link to the homepage at the end of this article.
Farm Groundwork and Setup
As is the normal circle of life for computers in a corporate environment, I was recently lucky enough to go through a whole stack of computers before they were recycled. From the initial lot of forty or so computers I ended up with twelve desktop computers that ranged from 500MHz to 866MHz. The main limit on my choosing was that I only had room in my cube for fifteen computers. With that in mind I chose the computers with the best CPUs. Much of the RAM was evened out so that almost all of the final twelve have 256MB. Fast computers with bad components had the bad parts swapped out for good components from the slower machines. Each computer was set up to boot from the CD-ROM and not output errors when booting if there wasn't a keyboard/mouse/monitor. They were also set to turn on when connected to power.
Having enough network administration experience to know better, I labeled each of the computers and the power and network cords attached to them. I even found different colored cable for the different areas of my cube. The first label specified the CPU speed and RAM size so that later, when I was given faster computers, finding the slowest machine would be easy. The second label on each machine was the name of the machine, which was one of the many female characters from Shakespeare's plays. On the server side a DHCP server was set up to match each computer with its name and IP for easy diagnosis of problems down the line.
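For reference, a host entry of the kind described might look like this in an ISC dhcpd.conf (the MAC address and IP here are, of course, made up):

    # pin each farm machine to a fixed name and address
    host ophelia {
        hardware ethernet 00:0a:0b:0c:0d:0e;
        fixed-address 192.168.1.101;
    }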
For the operating system I used distccKNOPPIX. distccKNOPPIX is a very small Linux distribution that is 40MB in size and resides on a CD. It does little more than boot, get the machine online, and start the distcc daemon. Because it didn't use the hard disk at all, preparation of the computers required little more than testing to make sure that they all booted off the CD and could get an IP.
Initially, all twelve computers (plus the build master) were plugged into a hub and switch that I had borrowed from a friend. The build master is a 2.7GHz Linux box with two network cards. The first network card pointed to the Internet and the second card pointed to the build network. This was done to reduce the network latency as much as possible by removing other network traffic. More on this later though.
A note on power and noise: the computers all have on-board components. Any unnecessary PCI cards that were found in the machines were removed. Because nothing is installed on the hard disks, they were set to spin down shortly after the machines are turned on. (I debated just unplugging the hard disks, but wanted to leave the option of installation open for later.) After boot-up, and after the first compile when gcc is read off the CD, the CD-ROM also spins down. With no extra components and no spinning CD-ROM or hard disk drives, the noise and heat level in my cube really didn't change any that I could notice.
Distccd for cygwin (Score:5, Informative)
Re:Distccd for cygwin (Score:2)
Re:Distccd for cygwin (Score:2)
Re:Distccd for cygwin (Score:2)
A similar technique to the distcc + cygwin install can be used to allow a distcc host to provide a GCC version other than its system GCC version. For example, my setup:
1.7 GHz P4-M (Gentoo box, always is the controlling node)
1.1 GHz Athlon (RedHat 7.3, sys GCC is 2.96, but I have a 3.3 tree in another location that won't interfere with the 2.96 tree)
WinXP box with an Athlon XP 1?00+
The XP box has 256M RAM, the other two 512M. Works great.
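The trick boils down to putting the alternate tree first in the daemon's PATH; roughly like this, assuming the 3.3 tree lives under /opt/gcc-3.3:

    # serve compile jobs with gcc 3.3 instead of the system 2.96
    PATH=/opt/gcc-3.3/bin:$PATH distccd --daemon --allow 192.168.1.0/24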
Martin Pool interview (Score:5, Informative)
http://web.zdnet.com.au/builder/program/work/st
Re:Martin Pool interview - clickable link (Score:4, Informative)
Re:Martin Pool interview (Score:3, Informative)
Mirror (Score:5, Informative)
http://hackish.org/~rufus/distcc.php.html [hackish.org]
Re:Mirror (Score:2)
I've wanted to mirror files for
Re:Mirror (Score:2)
Gentoo Impact(s) (Score:2)
Re:Gentoo Impact(s) (Score:5, Informative)
http://www.gentoo.org/doc/en/distcc.xml [gentoo.org]
Re:Gentoo Impact(s) (Score:2)
behind the XCode curtain (Score:5, Insightful)
Re:behind the XCode curtain (Score:5, Informative)
Anyway, you can see distcc running when you have Xcode's distributed builds enabled and running.
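If you're curious, something as simple as this in a terminal will show the daemons once a distributed build kicks off:

    ps ax | grep distccd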
--jim
Comment removed (Score:5, Interesting)
Mirror (Score:2, Informative)
Improving builds. (Score:2, Informative)
(2) Use --jobs=2 (or however many processors you have).
Build times will be greatly improved - and it's cross platform as well.
In my opinion - especially if you have a complicated project - distcc isn't worth it. The machine takes so long pre-processing everything (including header files) that you lose whatever advantage you might gain from offloading the actual compilation work. It's especially useless with MSVC once you start using precompiled headers.
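For point (2) above, the invocation is nothing fancier than:

    # run two compile jobs at once; scale the number to your CPU count
    make --jobs=2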
Or... You could do it properly. (Score:5, Informative)
[1] http://gridengine.sunsource.net/
Re:Or... You could do it properly. (Score:2)
Re:Or... You could do it properly. (Score:2)
Re:Or... You could do it properly. (Score:2)
s/a doddle to/a piece of cake to/g
Re:Or... You could do it properly. (Score:2)
http://suned.sun.com/US/catalog/courses/WE-1600 - 90
Our grid gets jobs out to an execution host and started in less than a second. All of our applications are distributed out over the execution nodes; Editors, word processors, spreadsheets, The Gimp, software builds, *everything*.
In fact, the less than 1 second latency incurred submitting a grid job is easily and by far overcome by the reduction in time given by starting a proces
Why wasn't a factorial experiment used? (Score:5, Informative)
http://www.itl.nist.gov/div898/handbook/pri/sec
It appears that in this case we have a variety of factors and a response of "elapsed time" for compilation, and it is a minimization problem. Instead of looking at factors individually, a factorial DOE would have allowed interactions to be analyzed and a search for a global optimum, rather than just optimizing individual factors and then tossing them all together; it doesn't work that way a lot, if not most, of the time.
If the author of this article is present: Why wasn't a factorial experiment used?
Re:Why wasn't a factorial experiment used? (Score:4, Interesting)
Factorial DOE is useful if you have multiple measurable, continuous or quasi-continuous [0] factors, and want to optimise, particularly when there is some trade-off. In this case, however, most of the variables that were altered were clearly discrete (this version of make or that version of make, for example), or it was clear that the optimum was at an extreme (more CPU speed is always good, for example).
So, the only factor I can see that would be suitable for a factorial DOE is the number of machines in the farm. Except each machine is different, so that's effectively an n-dimensional set, with 2 options on each dimension, for n machines. If you're going to do the stats, you'd want to do them properly, so no handwaving them all together there.
Plus, this is a deterministic situation. There is no real need for empirical analysis - you can do it all from first principles, which would be much more efficient, I think. And, indeed, that's what the author did - by looking at the theoretical background of it all and using different makes and so on to optimise.
Finally, if you think that a factorial DOE will get you a global optimum solution, then you're sadly mistaken. It's a good procedure for optimising, and it can avoid some local minima - but it's not guaranteed to find a global minimum. The only guaranteed method I'm aware of is simulated annealing - and if you've got a faster method, I, and a large number of people doing numerical calculations, would love to hear it.
Oh, and the aim here was _not_ to find a global minimum. It was to get something that was good enough. Trying for better than that is wasted effort.
[0] For example, the set of integers from 0 to 1000 is quasi-continuous. It's not really continuous, but it's close enough for real purposes.
Re:Why wasn't a factorial experiment used? (Score:3, Insightful)
DOE is widely implemented, especially in manufacturing processes, but with just basic knowledge of DOE it is easy to see the applications to non-manufacturing processes as well. DOE is readily available in just about any statistics software worth using - R, SAS, Minitab, S-Plus, etc. - so even if you don't have m
Electric Cloud (Score:3, Informative)
Re:Electric Cloud (Score:2)
run _everything_ in parallel
What, even things that shouldn't be parallel? Screw that.
Damned if I'm letting an electric cloud near my machine room.
automake - unsermake (Score:2)
PHP article? (Score:4, Insightful)
Missed the best point (Score:5, Informative)
There are some problems, though. Which do you do first, ccache or distcc? (The answer, on my benchmarks, is ccache: if it isn't in the cache, send it out on the network.) How fast is your "build" machine? This is critical: the build machine is responsible for preprocessing the file, checking whether it is in the cache, and then sending it out to be turned into an object. Especially when you combine the results of ccache (most of your builds are just the same files over and over, with very few "changed" files) and distcc, most of your time is spent in the first-pass compiler.
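In practice that ordering (ccache first, distcc only on a cache miss) is exactly what ccache's CCACHE_PREFIX hook gives you; a sketch, with made-up host names:

    # ccache checks its cache first; on a miss it hands the job to distcc
    export CCACHE_PREFIX=distcc
    export DISTCC_HOSTS='localhost xeon1 xeon2'
    make -j8 CC='ccache gcc'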
In our environment we had boatloads of dual Xeon machines around - they made wonderful build machines, and it didn't hurt that we connected them with Gig Ethernet either. Did wonders for our build times.
Overall, distcc and ccache are wonderful tools that should be in every large compile environment, making compiles that used to take days take mere minutes. But you want to make sure that the interplay between ccache and distcc works optimally in your environment.
Re:Missed the best point (Score:3, Informative)
Actually I mentioned it in the first paragraph...
Re:Missed the best point (Score:2)
Seems to me - he is ignoring the hard part of getting the best benefit out of the tool package... Kinda like talking about optimizing c
Re:Missed the best point (Score:4, Insightful)
-Benjamin Meyer
Re:Missed the best point (Score:2)
The point is that to a person unfamiliar with "compiler-intermediary" tools like distcc and ccache, the way to use them simultaneously is nonobvious.
Does the master host keep the cache, and farm out jobs on cache misses? Or does each box keep its own ccache, which is used to fulfill compilation jobs from the master? (Obviously, one of those options is drastically worse than the other)
Since you alluded to the possibility of distcc+ccache in the introduct
Perfect timing! (Score:2)
Hell yeah!
Re:Perfect timing! (Score:2)
The motivation for my work in 1991 was not much different than this, although back then my problem was building the X11 distro, and all of the imake crap that was in there. Since the paper itsel
jobs/cpu? (Score:2, Interesting)
Re:jobs/cpu? (Score:2, Insightful)
"Weak" computers are usefull ... (Score:2)
whoa, your server's been compromised.. (Score:2)
Re:whoa, your server's been compromised.. (Score:2)
Can distcc model be used for other apps? (Score:2)
If it's generalized, it would be cool to see it used for other CPU intensive tasks.. Video processing comes to mind. I would love to have a cluster bring down the times needed to:
- Convert MiniDV home video to MPEG2 DVDs. There are professional tools to do this; a hobbyist tool that could do clustering would be excellent.
- Convert HDTV captures to MPEG2 for DVD archival. 1080i video processi
Re:Can distcc model be used for other apps? (Score:2, Informative)
What really happens is that you can use the so-called "masquerading" method of installation, which basically means you set up symlinks called gcc, g++ and whatever to the distcc binary. Prefix your PATH with this directory and calling `gcc` will work.
In my opinion this is easier (and better) than doing `make CC='distcc gcc'`
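A sketch of that masquerade setup (the directory is the one the distcc documentation suggests; adjust to taste):

    # make plain 'gcc' and 'g++' calls resolve to distcc first
    mkdir -p /usr/lib/distcc/bin
    ln -s /usr/bin/distcc /usr/lib/distcc/bin/gcc
    ln -s /usr/bin/distcc /usr/lib/distcc/bin/g++
    export PATH=/usr/lib/distcc/bin:$PATH
    make -j8    # no CC= juggling needed now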
Mostly... (Score:2)
First off the generalised methods you allude to are MPI, the older PVM, and there's Mosix too.
MPI and PVM are framework libraries that allow for code to be written to take parallelism into account. They tend to be used for numerics calculations (which was their birthplace), simply because numerics are CPU bound. There are others that are even more numerics centric (HPF - a Fortran variant, for example), but MPI should probably be the target of choice for new code, including non-numerics base
Re:Mostly... (Score:2)
MPI is not what he wants. Both of the applications tji asked for are video recompression tasks. Those fall deep into the "trivially parallelisable" category.
Just split up the input file into megabyte chunks, allow each helper computer to convert one chunk, then concatenate the results on the master. There is no need for the helper computers to communicate amongst themselves while the calculation is going on, which is the ability MPI provides.
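A rough shell sketch of that chunk-and-farm idea (the host names and the convert_chunk tool are hypothetical, and a real implementation would have to split on frame boundaries rather than raw bytes):

    # carve the input into chunks and round-robin them over the helpers
    split -b 64m input.dv chunk.
    set -- rosalind viola portia
    for c in chunk.*; do
        h=$1; shift; set -- "$@" "$h"    # rotate the host list
        ssh "$h" convert_chunk < "$c" > "$c.mpg" &
    done
    wait                                 # note: this launches every chunk at once
    cat chunk.*.mpg > output.mpg         # glob order reassembles the stream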
Recursive Make Considered Harmful (Score:5, Informative)
Unfortunately, the makefile creator most people use, automake, creates only recursive makefiles. Maybe a replacement like unsermake will get automake developers thinking about radical changes. I wouldn't mind seeing M4 go away, for one.
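For the curious, the non-recursive style the paper advocates boils down to one make instance that includes per-directory fragments instead of recursing into each directory; a minimal sketch with invented names:

    # top-level Makefile: one make sees the whole dependency graph
    include src/module.mk    # defines SRC_OBJS and their rules
    include lib/module.mk    # defines LIB_OBJS and their rules
    app: $(SRC_OBJS) $(LIB_OBJS)
    	$(CC) -o $@ $^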
Re:Recursive Make Considered Harmful (Score:4, Interesting)
Seconded.
When I was at Be, Inc. (RIP), one of our engineers, motivated largely by the above-referenced article, converted our entire build environment to a non-recursive structure using gmake. The result was a large speedup, as well as more effective use of multiple processors (which BeOS utilized very well). gmake would grovel over the build tree for a minute or two, then launch build commands in very quick succession. 'Twas great.
Schwab
You're a bit outdated. (Score:2)
And there's a damn good reason for it, too, but that's neither here nor there. Anyhow, this was fixed so you can do non-recursive stuff if you want to now.
Unfortunately, the very latest automake versions are trying to be way, way too clever, thereby breaking stuff in lots of projects. Time to throw it out and use something else.
Automake is a Perl script.
Re:Recursive Make Considered Harmful Considered Du (Score:2)
That paper makes a spectacularly bad case
It makes a fine case. The worst part is that it exaggerates the value of its own minor insight. The grandiose title harkens to the famous "Goto Considered Harmful", which in its time was a more insightful position.
Nobody should be surprised that globally correc
How do you do all of this? (Score:2)
Re:How do you do all of this? (Score:2)
Scaling (Score:2)
Re:Scaling (Score:2)
Re:Scaling (Score:2)
-Benjamin Meyer
Re:Scaling (Score:2)
Re:Scaling (Score:2)
Until he gets his electric bill, that is.
Re:Scaling (Score:2)
Using Teambuilder, you don't have to muck about with heaps of settings, trying to discover which one works best; it just works. Out of the box.
distcc isn't so great (Score:3, Informative)
Re:distcc isn't so great (Score:2, Informative)
If I had more time I would trace through things and try to figure out why they failed. But I don't have that much time.
I still like the idea behind distcc and hope that someday (soon) they'll get it working correctly.
Re:distcc isn't so great (Score:3, Informative)
My friend recently had the same thing happen, and the conclusion we came to was that the compiler versions were different on the distcc servers (3.2.2) versus the client (3.2.3), and the preprocessed code had syntax errors or something of the like when it was sent off (something to do with one of the new options in the latest gcc). I don't recall exactly what option it was or what package(s) were failing.
distributed codebase (Score:3, Interesting)
Plug for Xcode... (Score:3, Informative)
Question for ccache (Score:2)
But I thought that the 'make' program does exactly that: if a source code file is newer than the object file, then the source file is compiled; if not, the current object file is used.
What exactly is it that ccache does that make does not?
Re:Question for ccache (Score:2)
ccache will cache the previous compiles and, if the sources haven't changed at all, use the cached results. This allows the certainty of a clean build to be gained in significantly less time. Make won't do that, because the tree was just cleaned.
Additionally, I believe that ccache uses a global cache. So if, for example, you are compiling a couple of Linux kernels, each patched differently, some of the compilations will be the same between both trees. ccache will recognise this, and only compile each one once.
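A quick way to see the difference on any tree:

    make clean && make CC='ccache gcc'   # first pass: real compiles fill the cache
    make clean && make CC='ccache gcc'   # second pass: mostly cache hits, far faster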
If you want faster builds (Score:2)
build smaller things
the record for compiling a plan9 kernel is 15s
I built & installed the kernel and the whole distributed userland in 45 mins on an 800MHz Duron.
... room in my cube for fifteen computers ... (Score:2)
I wonder how much noise and heat is generated by 15 PCs running in a small cubeacular office environment....
Re:I wonder... (Score:5, Insightful)
Re:I wonder... (Score:5, Insightful)
Re:Article Text (Slashdotted Server) (Score:3, Funny)
Re:Article Text (Slashdotted Server) (Score:3, Informative)
Re:Article Text (Slashdotted Server) (Score:2)
Perhaps you're attributing motivations to this behavior (making a useful post) that don't apply.
Re:Article Text (Slashdotted Server) (Score:2)
Read about it on slashdot, oddly enough.
Re:Article Text (Slashdotted Server) (Score:3, Funny)
You mean you haven't been promoted yet? Ha! n00b...
Re:Article Text (Slashdotted Server) (Score:2, Interesting)
He couldn't. It's simply a risk you take when posting the article. The moderation system is intended to improve things for the reader, not to judge his (undoubtedly good) intentions. You have a point though, maybe Redundant moderations shouldn't decrease karma, just like Funny doesn't increase it.
btw, posting the article as non-AC is viewed by many as karma whoring, so it's not recommended anyway.
Re:I don't have long compile times (Score:2)