Talk To a Successful Free Software Project Leader
Nagios (formerly known as NetSaint) is a GPL network monitor software project that's been getting a lot of buzz lately among *Nix sysadmins. Nagios is unquestionably a free software success story even if it's not as high profile as Apache or Linux. Ethan Galstad leads the project. Perhaps he can tell us why Nagios has done so well, so that other free software projects can enjoy similar success. Usual Slashdot interview rules; post your question below, we'll email 10 of the highest-moderated questions to Ethan about 24 hours after this post appears, and publish his answers soon after he gets them back to us.
I'd like to know (Score:2, Interesting)
what it is (from the official site) (Score:3, Funny)
Does that mean it can predict when a Windows system tries to use my network before the enduser gets a bluescreen? Woah; that's impressive.
Re:what it is (from the official site) (Score:4, Insightful)
That's great for interactive use, but Nagios (along with Big Brother, and most other monitoring packages) doesn't seem to cater well to automating report generation from outside of a web browser. We need to generate weekly reports on the number of outages, etc., and would like to be able to schedule a cron job every Sunday night to say "get me the uptime stats for abc services, so I can put them into xyz reporting package". We need to take the raw data and calculate rolling averages, etc., to give to customers (we're contractually obliged to do so). I.e., the sort of reports we need are typically more complex than is reasonable to expect Nagios to do internally. Was the interactive bias a deliberate decision, or did it just evolve that way? More importantly, are there any plans to improve things in this area?
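As a rough illustration of the post-processing described here, the following sketch turns weekly outage totals into uptime percentages and a rolling average. It assumes the raw outage minutes have already been exported somehow; the function names and the sample numbers are hypothetical and not part of Nagios.

```python
from collections import deque

MINUTES_PER_WEEK = 7 * 24 * 60


def uptime_percent(outage_minutes, period_minutes=MINUTES_PER_WEEK):
    """Return availability for one reporting period as a percentage."""
    return 100.0 * (period_minutes - outage_minutes) / period_minutes


def rolling_average(values, window):
    """Yield the mean of the most recent `window` values seen so far."""
    recent = deque(maxlen=window)  # old values fall off automatically
    for value in values:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical weekly outage totals (in minutes) for one service.
weekly_outage_minutes = [0, 30, 0, 120]
weekly_uptime = [uptime_percent(m) for m in weekly_outage_minutes]
four_week_average = list(rolling_average(weekly_uptime, window=4))
```

A script like this could be run from the Sunday-night cron job once the raw numbers are available outside the web interface.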
I was working on something like that (Score:3, Interesting)
There are several free services that do that. As for writing a report, just modify one of the cgi scripts to include your company name and junk and add a wget command to the cron script.
use it like this:
%wget http://flame.dnsart.com/index.php -O report.html
--12:36:21-- http://flame.dnsart.com/index.php
=> `report.html'
Resolving flame... done.
Connecting to flame[192.168.1.1]:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: unspecified [text/html] 45.34M/s
12:36:22 (45.34 MB/s) - `report.html' saved [47540]
I have a proxy server, and downloaded the startpage for my site, but the usage will be similar for your script. I also had to remove 'junk characters'; damn you lameness filter! Be sure to stream output to null so your daemon doesn't email you weekly.
I might be writing some php scripts to monitor uptime; email me if you would like a copy when they are complete.
Re:what it is (from the official site) (Score:2)
Out of the box (binary distributions), you're right, it doesn't. However, Nagios has an extension to store its logging information in a relational database (MySQL or PostgreSQL). It requires you to run configure and build from the sources. However, once done, this should make it a heckuva lot easier to generate reports using Perl DBI or PHP or something to extract the data from the rows. Here's the skinny on how to do this [sourceforge.net] (from the "Advanced Topics" section of the Nagios Documentation [sourceforge.net]).
In your opinion.. (Score:2, Interesting)
Re:In your opinion.. (Score:2)
That should have been...
"In your opinion what's the WORST security practice/vulnerability/annoyance that's come out in the past year?"
sorry
Re:In your opinion.. (Score:3, Informative)
Re:In your opinion.. (Score:2)
Versus other commercial apps (Score:2, Interesting)
Re:Versus other commercial apps (Score:4, Interesting)
Other items of note for comparison are issues like XML Output, I see that XML status data is planned for Version 3, what depth of information will be able to be queried/reported with XML?
Re:Versus other commercial apps (Score:1)
What future features for products like MOM will be implemented in Nagios, do you see any specific roles currently covered by SMS/MOM/OpenView/etc. that will eventually be done in Nagios?
Marketing & Publicity (Score:5, Interesting)
lose cool names (Score:2)
Re:Marketing & Publicity (Score:1)
why the name change? (Score:4, Interesting)
Re:why the name change? (Score:1)
NetSaint is not affiliated with World Wide Digital Security, Inc. (WWDSI); Richard S. Carson and Associates, Inc; and the marks WEB SAINT, SAINT, SAINTWRITER, SAINTEXPRESS, and SAINTBASIC owned by Richard S. Carson and Associates, Inc.
Looks like SAINT is a little too close to some security-related trademarks, that probably threatened the group when they saw the name.
Re:why the name change? (Score:1)
Re:why the name change? (Score:2)
This is the stupidest thing I have ever heard.
Re:why the name change? (Score:2)
Oh yeah, people are always caving and changing things to satisfy the huge and influential atheist lobby! As an atheist myself, I'm proud that we're able to have such a huge influence on social policy!
You're being silly. The reasons for the change had nothing to do with offended atheists (it was a trademark issue), and besides, why would atheists be offended?
Re:why the name change? (Score:1)
I guess that rules out SATAN then...
Re:why the name change? (Score:1)
I suspect it is due to the fact that "NetSaint" is offensive to those of us that are atheists.
Mmmk... how old are you? Things are not changed because they are offensive to atheists. If anything they are made more conspicuous.
Re:Huh? (Score:1)
The cease and desist comes in the form. . . (Score:2)
Holy Lawyer may well, in a humorous fashion, be considered an oxymoron. In reality such things exist. Who do you think prosecuted the accused during the inquisition? The more socially acceptable "Unholy" lawyer is a real entity as well. The term "Devil's Advocate" is no metaphorical construct as most people seem to believe. This is the official term applied to the "defense counsel" of the church accused.
KFG
Re:Huh? (Score:1)
Re: (Score:1)
Re: (Score:1)
Re:Huh? (Score:2)
Re: (Score:1)
Re:Huh? (Score:2)
People began using *nix to replace Unix in public, because long ago AT&T owned the trademark Unix and would use legal violence to spank anyone who used that trademark without their permission. In the same way that f**k represents a certain dirty word, people began using *nix to say Unix.
Of course, idiots that came later saw it and assumed it must mean "any unix". Bzzt, wrong.
Direction (Score:5, Interesting)
My twofold question is, what has determined Nagios' direction thus far? Was it modeled after OpenView and TNG or something else? Also, where is Nagios going in the future? Will it continue to develop the features of OpenView and TNG, or is it going somewhere else?
How do you set success criteria? (Score:3, Interesting)
Predefined alerts vs dynamic events (Score:5, Interesting)
polls a pre-defined list of conditions. In other
words, if there are 28 things that could go
wrong, there are 28 pre-defined items that
change color from green to yellow, to red.
In my experience, an event based model, where
monitors determine the problem and severity,
works better. The central event manager would
just receive the events and handle display and
notification.
Can your product handle this sort of model ?
For example, could I write a monitor that watched
a database log file, and have it send events
like this ?
severity  category  host    message
high      database  myhost  database memory shortage
medium    os        myhost  fs
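To make the event-based model in this question concrete, here is a minimal sketch of a monitor that scans log lines and emits events with the four fields shown above (severity, category, host, message). The rule table, the `Event` class, and the log text are all hypothetical; how such events would be delivered to a central manager is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Event:
    severity: str
    category: str
    host: str
    message: str

    def to_line(self):
        """Format the event in the space-separated layout used above."""
        return f"{self.severity} {self.category} {self.host} {self.message}"


# Hypothetical rules: (substring to look for, severity, message to report).
RULES = [
    ("out of memory", "high", "database memory shortage"),
]


def scan_log(lines, host, category="database", rules=RULES):
    """Yield one Event for each log line that matches a rule."""
    for line in lines:
        lowered = line.lower()
        for needle, severity, message in rules:
            if needle in lowered:
                yield Event(severity, category, host, message)
                break  # at most one event per log line


sample_log = ["ERROR: Out of memory in buffer pool", "checkpoint complete"]
events = [event.to_line() for event in scan_log(sample_log, host="myhost")]
```

The point of the design is that the monitor, not the central server, decides what counts as a problem and how severe it is.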
Re:Predefined alerts vs dynamic events (Score:2)
This was clearly a design decision and if you prefer this style of monitoring, then I'd suggest Big Brother. For my environment, Nagios made the correct choice. If you are monitoring many applications (many > 100), then with a model that pushes events to the monitoring system, you will (probably) end up with a distributed configuration nightmare.
That said, I think you could probably hack a Nagios setup to do what you want with its distributed monitoring features. I.e., you could write your custom monitoring app to implement the interface that Nagios uses for satellite monitoring instances and then configure Nagios to use your custom monitoring app as a satellite. But I have not tried/done this, so I could be wrong, wrong, wrong.
Regards,
Stephen
Re:Predefined alerts vs dynamic events (Score:2)
The arguments are weaker if you are monitoring things above the network layer, but I think that they still hold a lot of water.
Nagios apparently uses the polling model, which is good, but seems to use TCP, which is bad. It also seems to have support for so-called mid-level managers (MLMs), that watch subsections of a network and aggregate the results for higher levels. This is a good thing. In order to scale, MLMs should not report a lot of detail unless directly queried. I don't know how well Nagios supports the MLM model. Can anyone tell me more?
mass-appeal software (Score:5, Insightful)
Re:mass-appeal software (Score:1)
Re:mass-appeal software (Score:2)
Did the brown stuff ever hit the cooling thing? (Score:2, Interesting)
my question (Score:5, Interesting)
Free Software (Score:5, Interesting)
Great product, silly new name (Score:2)
I have to agree with the others that have posted - why drop a perfectly good (and recognized) name like Netsaint for something we can't even pronounce?
Not so bad (Score:3, Interesting)
My question for Ethan is this:
Network Monitoring is one of those projects that management considers "vitally important" but for which it allocates no human resources. So you end up with $100K Tivoli setups that sit dormant because nobody has time to pay attention to them or configure them properly.
What is your suggestion for getting past this problem, and how would you sell the PHB's on Nagios along the way?
propriety... (Score:5, Interesting)
A. give them a relicensed version that allows them to do whatever they want to it.
B. incorporate any changes they may want on your own and make sure the changes make their way to the GPL codebase.
C. tell them to get bent.
D. make proprietary changes that you leave out of the GPL codebase in order to sell those changes yourself or to other potential clients
E. Some combination of the above.
F. Some other direction I didn't think of
I feel that making proprietary changes to GPL code that you keep (at least temporarily) proprietary is a great business model for certain projects, possibly the best model for certain things. Some projects that come to mind are things like i-tree.org's Secure iXplorer, which has a GPL "lite" version which only supports ssh/scp and a "full" version that also supports sftp. OpenOffice.org and Star Office seem to be of the same ilk... If you need the extra functionality of Star Office, such as the better
I'm also curious if you have been approached by anyone for this sort of thing.
Re:propriety... (Score:1)
How did it start? (Score:5, Interesting)
I know there was a serious code revision between Netsaint 0.0.7 and Nagios 1.0, which was phenomenal, btw, great job. But after using Netsaint (I still call it that, old habits die hard) for almost 2 full years now, I've always been very impressed with how well everything runs and scales.
How is a project like this supported? (Score:5, Interesting)
I've asked on the two nagios mailing lists and received no answer. How do I, working for a major corporation, promote this software package if there's nobody that can help me fix it? Where do I look for support for a free product?
Re:How is a project like this supported? (Score:1)
Re:How is a project like this supported? (Score:1)
Re:How is a project like this supported? (Score:2)
You state that you are threatened with dumping Nagios because of the issue you have with the plugin. Assuming that your organization requires a network monitoring system, it seems only logical that you would have to replace Nagios with a commercial system, a system that will likely cost a great deal of money.
Could you not get some funds allocated to allow you to contact the writer of the plugin directly and hire them as a consultant in order to fix the bug or implement a feature that you need? I suspect that for a couple of thousand dollars you could have the actual writer of the plugin address your needs directly. Surely this would be far cheaper than the likely hundreds of thousands of dollars that would be necessary to completely replace Nagios with a commercial system. Further, releasing your fix/enhancement to the open source community would advance the entire project that much more.
Re:How is a project like this supported? (Score:2)
Funny, someone answered me quickly when I asked about it. If you didn't give any more details than the post I'm replying to, I can see why you didn't get an answer.
Re:How is a project like this supported? (Score:2)
Re:How is a project like this supported? (Score:3, Insightful)
Uh... riiiight.
I'm sure he has the authority to tell a programmer to shelve whatever they're working on and fix this bug... presuming it is a bug and not just a config error or something. Since the programmer has absolutely zero familiarity with the source, and probably none with the program at all, it's going to take some time to figure the bug out. Even given an above average coder who is familiar with all the necessary tools, it would take at least a couple weeks to figure out the code and fix it.
Presuming that said above-average-coder is being paid only $80k, two weeks of their time is worth $3k in salary... which means about $5k once you add in benefits. And you've just delayed some other project -- one that is actually related to your core business -- by 2 weeks or more (probably more - it takes time to gearshift). That delay could cost the company an unknown amount of money - anything from $0 to millions, depending on the importance of the project.
Oh, and let's not kid ourselves. Programmers in large corps (and most small corps) don't work in a vacuum. Most have teams that interact with one another as well as other groups. Pull this senior programmer out of that and you're going to delay all of them too.
Now, how exactly do you justify this to management? Versus just buying an off-the-shelf solution, which -- even at $50-100k -- may be cheaper than tasking a coder to something that's tertiary to your core business.
To some extent this is a worst-case scenario. To some extent it's not. But having the code available doesn't mean jack shit in the real world, because it still costs huge amounts of money to get it fixed. Most successful (as in adopted by businesses) open source projects realize this and provide paid-for support -- because most companies know it's worth the time to pay for support rather than spend their own resources fixing it when something goes wrong.
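The salary arithmetic in this post can be checked with a short sketch. The $80k salary and the two-week estimate come from the post; the overhead factor of 1.65 is an assumption chosen only to approximate the poster's "about $5k once you add in benefits", and the cost of delaying other projects is not modeled.

```python
WEEKS_PER_YEAR = 52


def task_cost(annual_salary, weeks, overhead_factor):
    """Return (salary cost, loaded cost) for pulling one person off for `weeks`."""
    salary_cost = annual_salary * weeks / WEEKS_PER_YEAR
    loaded_cost = salary_cost * overhead_factor  # benefits etc., assumed multiplier
    return salary_cost, loaded_cost


# Figures from the post; 1.65 is an assumed benefits/overhead multiplier.
salary_cost, loaded_cost = task_cost(80_000, weeks=2, overhead_factor=1.65)
print(f"salary: ${salary_cost:,.0f}, with overhead: ${loaded_cost:,.0f}")
```

Under these assumptions the direct cost lands close to the post's figures, so the comparison against a $50-100k commercial package hinges on the unquantified delay costs rather than on salary alone.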
Re:How is a project like this supported? (Score:1)
But here's the rub: WITHOUT the source, that whole choice is not an option. You can't ask XYZ, Inc. to fix it for you, you have to coordinate with the vendor to get something fixed and are completely at their mercy (read schedule and resource limits) to get a fix. And they can always claim non-issue, must be a problem specific with your setup.
Of course, companies with the foresight to keep inhouse talent strong will easily make that fix from source with a minimum of fuss.
So, what to do about support:
1. Pay for it.
2. Participate in the community and realize free support is worth what you put in (read that carefully).
3. Develop inhouse talent.
Re:How is a project like this supported? (Score:1)
The FS/OSS world will do a lot better commercially when it finally comes to wholeheartedly accept that not All The World's A Programmer.
My company's sysadmin is very good at maintaining the network, but he's not a developer, nor should he be. Why do people insist on parroting the "you have the source, fix it yourself!" mentality?
Re:How is a project like this supported? (Score:2)
Prioritization (Score:5, Interesting)
Nagios event handling. (Score:5, Interesting)
Will Nagios be implementing similar event handling functionality or will using utilities such as Swatch remain necessary? And if Nagios will not gain this flexibility, why would you feel that this functionality is unnecessary?
Funding (Score:3, Interesting)
1) License product under GPL
2) ???
3) Profit!
What is #2 for you, or more generally, how do you support your project financially? What do you see as the most sustainable model for supporting Free Software?
Does this scale (Score:2)
Does this software scale to monitoring thousands of servers? The only other reasonably mature open monitoring solution I investigated is mon, and it wasn't close to scaling to an environment of any size.
Re:Does this scale (Score:1)
Monitoring at that high a rate is also good if you have a SLA that's pretty tight.
Another good thing to have is good built-in forensic diagnostics so you don't get paged by operations at 3 am to explain that spurious down event.
Re:Does this scale (Score:2)
Re:Does this scale (Score:2)
people on the nagios mailing list are doing it though, it just takes tuning.
Ewan
Why Nagios? (Score:1, Interesting)
What makes Nagios unique? Thanks.
Re:Why Nagios? (Score:2)
It reacts to things professionally:
It keeps track of downtimes. It lets you SCHEDULE downtime (for specific time windows). It has access controls by user. It has limited views by user. It has notification windows per user.
Stuff like that. Big Brother doesn't come close, and MRTG has a completely different design goal, as far as I understand it.
Nagios is designed to be a cheap man's replacement for full-on HP OpenView, in a true 24x7 NOC.
Raking in the coders... (Score:4, Interesting)
New Features (Score:1)
Daniel
Similarities and differences (Score:1)
Open Source projects are driven by people who enjoy coding in their spare time, people who want to contribute something to the community or by people who have a need for a particular piece of software or functionality.
Commercial projects are driven by the need to produce a product on-time and under-budget in order to sell it to make profit.
In your experience, how similar is managing an Open Source project to a commercial one? What sort of challenges would you face in an Open Source project that you wouldn't come across in a commercial one? Where do the skill sets required for each differ?
Arm-chair project leads (Score:3, Interesting)
Finding developers that stick (Score:4, Interesting)
Web Application Interoperability (Score:2, Interesting)
Other handy web apps we love include Mantis (bug tracker), CVSWeb and Chora, phpMyAdmin, phpPgAdmin, SquirrelMail and so on. There are lots of great web apps out there these days that can provide web based access to some cool functionality.
One major hassle, though, is that every one of them handles authentication and authorization differently. Setting up one login, or hacking them together into some sort of common framework is a giant hassle. Do you have any thoughts on how to get web applications to work well together?
- H
Re:Web Application Interoperability (Score:2)
Standards are Good.
HTTP auth is a standard. Nagios uses it. This is Good.
I recently merged three web applications we have, one of them being Nagios, to use a single htpasswd file, and control access to the different areas by htgroup.
Bug all free web software writers to support HTTP auth as an option, at minimum.
Re:Web Application Interoperability (Score:1)
Also, if you figure out the authentication and authorization, what about making web apps fit into the rest of the site? Not simply the "look" of things, but the navigational scheme, the general arrangement of elements on your site that make things consistent and navigable.
It's an inordinate amount of work. Ever try to fit someone else's forum app into your site? Oy vey. Faster to write your own.
Web apps are easy, cross-platform on the server (to an extent) and wonderfully cross platform on the client side. They have so much potential. But I think interoperability is the major failing right now.
- H
Plug-in vs. monolithic work? (Score:4, Interesting)
People issues? (Score:5, Interesting)
If so, how did you deal with those people? Did you ever find yourself forced to burn any bridges as a result of dealing with such people?
Re:People issues? (Score:2)
Normally they just go on to start OpenBSD...
Lack of Dynamically Sizable Containers (Score:1)
I can dream. One thing I must say is that netsaint is a wonderful wonderful piece of software!
Thanks so much.
"stealth" installations (Score:2)
Re:"stealth" installations (Score:1)
What the admins need to do is let the system fail, fix it, then include a "Nagios, a *free*, *no cost* software product, began detecting/predicting/correcting/whatever this in version foo, which was released on bar" in their status reports.
Research (Score:2)
Let me focus the rest of my response on GUI development...
The problem I see with many projects like these is that they fail to innovate as much as they copy. If this world was 100% open source, we'd probably see more GUI fragmentation than we could stand. Going from one platform to another would be a very irritating process (more than it already is anyway).
So honestly, without companies like Apple and Microsoft spending millions a year on user interface research, we wouldn't have seen the tremendous WIMP evolution that we have over the past ten years.
In short, without closed source companies spending their own time and money to advance their products, the open-source competition wouldn't be near as advanced.
porting tools (Score:1)
We have a network of over fifty servers all monitored by Nagios and it has served us well.
My question is this:
Your software came with the option of the new "Object" model which you are switching to. When you have over fifty servers each with multiple services this creates a *huge* object file that sysadmins have to create.
I wrote a PHP application just to manage all of these issues and generate the object files from a database I created. The main Nagios server connects to the central DB-server PHP page and wgets a fresh object file for its child servers once each hour or so to facilitate changes. But I digress; my question is, "How come no tools like this were released 'with' Nagios?" And would you be interested in my publishing the source for these programs, or are you going to change this object file format at a later date?
This is a project I worked on a while ago and honestly have not looked recently but I remember sitting and scratching my head for a while wondering why this had not been implemented with the release.
Thanks for your time.
-Joel De Gan
-directnic.com
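The generator described in this post was written in PHP and is not shown; as a rough Python sketch of the same idea, the code below renders host and service records into Nagios-style `define ... { ... }` object blocks. The record contents are hypothetical stand-ins for database rows, and a real configuration would need more directives (or a `use` template) than the few shown here.

```python
def render_object(kind, directives):
    """Render one object definition block from a dict of directives."""
    lines = [f"define {kind} {{"]
    for name, value in directives.items():
        lines.append(f"    {name:<24}{value}")  # pad names for readability
    lines.append("}")
    return "\n".join(lines)


def render_config(hosts, services):
    """Render all host blocks followed by all service blocks."""
    blocks = [render_object("host", host) for host in hosts]
    blocks += [render_object("service", service) for service in services]
    return "\n\n".join(blocks) + "\n"


# Hypothetical rows, standing in for the results of a database query.
hosts = [
    {"host_name": "web01", "alias": "Web server 1", "address": "192.0.2.10"},
]
services = [
    {"host_name": "web01", "service_description": "HTTP",
     "check_command": "check_http"},
]

config_text = render_config(hosts, services)
```

Writing `config_text` to a file that the monitoring server fetches periodically would mirror the hourly wget arrangement described above.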
How did you find the time? (Score:1)
Maybe it is that your living arrangements were fertile soil for NetSaint, or perhaps you were in a position to put all of your out-of-work hours into it? Did an early embrace from the community help give it momentum?
I'm sorry - I don't even know if you're the original author or inherited it.
Ah well - back to work
Better Research (Score:2)
Let me focus the rest of my response on GUI development...
The problem I see with many closed-source projects like these is that they fail to innovate as much as they copy. If this world was 100% open source, we'd probably see more code re-use. Going from one platform to another would be a very easy process (more than it already is).
So honestly, without companies like Apple and Microsoft stealing innovations from open-source authors, we wouldn't have seen the tremendous WIMP evolution that we have over the past ten years.
In short, without open-source projects innovating to advance their products, the closed-source competition wouldn't be near as advanced.
Re:New paradigms for success metrics in an OSS worl (Score:1)
Re:Open Source for the rest of us... (Score:2)
FREEdraft [freeengineer.org] Free GPL
LinuxCAD [linuxcad.com] $99
ARCAD [arcad.de] $900 ($80 Student)
OCTree [octree.de] Free for non-commercial use.
VariCAD [varicad.com] $400