Undocumented Open Source Code On the Rise

Posted by Soulskill on Sunday June 15, 2008 @12:40PM from the exercise-for-the-reader dept.

ruphus13 writes "According to security company Palamida, the use of open source code is growing rapidly within businesses. However, the lack of documentation and understanding of how the code works can increase the vulnerability and security risks the companies face. OStatic quotes Theresa Bui-Friday saying, 'In 2007, Palamida's Services team audited between 300M to 500M lines of code for F500 to venture-backed companies, across multiple industries. Of the code we reviewed, Palamida found that applications written within the last five years contain 50% or more open source code, by a line of code count. Of that 50% of open source code, 70% was undocumented. This is up from 30% in 2006.' How can businesses protect themselves and still draw on open source code effectively?"

Source code is its own documentation (Score:5, Insightful)

by mangu ( 126918 ) writes: on Sunday June 15, 2008 @12:47PM (#23801083)

I'd rather have the source code which I can read and try to understand than an executable file alone.

The only reason why we don't see an article "Undocumented Commercial Software On the Rise" is because the public cannot see how badly documented the commercial software is.

Avoid projects with one developer (Score:3, Insightful)

by Animats ( 122034 ) writes: on Sunday June 15, 2008 @12:48PM (#23801089) Homepage

The original article is an ad for a service that looks at code for you. But it's a real problem.
A basic problem with open source is that once you get beyond the top 50 or so projects, the quality is usually crap. Look at the source from a few random projects on SourceForge. There aren't that many real "community" projects, where multiple programmers are working on the same code. The long tail isn't very good.

70% Undocumented, huh? (Score:5, Insightful)

by Devin Jeanpierre ( 1243322 ) writes: on Sunday June 15, 2008 @12:49PM (#23801107)

How do you measure something like how well things are documented with a percentage? Some code simply doesn't need documentation. Other code needs plenty. Is 0% a 1:1 relationship between lines of code and lines of comments? That whole thing seems a bit strange. They could certainly back it up if they wanted to, but that'd be too much effort.
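To make the question concrete, here is a minimal sketch of one naive way a "percent documented" figure could be computed: the share of non-blank lines that start with a `//` comment. This is purely an assumption for illustration, not the method the audit used, and the function name `comment_line_ratio` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: fraction of non-blank lines in `src` that begin
 * with a "//" comment. Block comments and trailing comments are ignored,
 * which is one reason a number like this says little about quality. */
double comment_line_ratio(const char *src)
{
    size_t non_blank = 0;  /* non-blank lines seen so far */
    size_t comment = 0;    /* of those, lines starting with "//" */
    const char *p = src;

    while (*p != '\0') {
        /* Skip leading spaces and tabs on this line. */
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p != '\n' && *p != '\0') {
            non_blank++;
            if (p[0] == '/' && p[1] == '/')
                comment++;
        }
        /* Advance to the start of the next line. */
        while (*p != '\0' && *p != '\n')
            p++;
        if (*p == '\n')
            p++;
    }
    return non_blank == 0 ? 0.0 : (double)comment / (double)non_blank;
}
```

A file of trivial getters scores 0 here and a file of stale comments scores high, so a line-count ratio arguably measures neither need nor quality.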

Statistics ... (Score:3, Insightful)

by tomhudson ( 43916 ) writes: <(moc.nosduh-arab ... (nosduh.arabrab)> on Sunday June 15, 2008 @12:50PM (#23801117) Journal

Of that 50% of open source code, 70% was undocumented.

They talked about looking at 300m LOC. I'd hope 70% was "undocumented". 70% of most code is just common-everyday stuff that doesn't NEED to be documented in the sense that comments are completely wasteful. It's the "glue code" that needs to be documented, and the non-intuitive stuff, and stuff that is done for a reason that, on first glance, looks like the writer had a brain fart, but, in this special case, makes sense, or "corner case" situations.
Do *NOT* "insert comments like "for (i=0; i
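As a rough sketch of the distinction drawn above (not the poster's own example; the function name `find_sorted` is hypothetical), the loop itself gets no comment, while the one line that looks odd at first glance gets a comment explaining why it is written that way:

```c
#include <stddef.h>

/* Returns the index of `key` in the sorted array `a` of length `n`,
 * or -1 if it is absent. */
long find_sorted(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;  /* search the half-open range [lo, hi) */

    while (lo < hi) {
        /* Not (lo + hi) / 2: that sum can wrap around for very large
         * ranges, so the midpoint is computed from the range width. */
        size_t mid = lo + (hi - lo) / 2;

        if (a[mid] == key)
            return (long)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```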

Same old, same old. (Score:5, Insightful)

by khasim ( 1285 ) writes: <brandioch.conner@gmail.com> on Sunday June 15, 2008 @12:52PM (#23801139)

In today's world of 24/7 and persistent network access, developers dispersed across multi-national sites can include open source, freeware, public domain, evalware (demos of commercial software), etc, into the code they are writing without triggering the usual checkpoints in the procurement process.
I've seen that same issue YEARS ago. And I'm not talking code snippets. I'm talking systems that had "evalware" tools in them.

This has NOTHING to do with "multi-national sites" or any of that.

This has EVERYTHING to do with clearly stating the rules and ENFORCING those rules.

The rules do not enforce themselves. Someone, somewhere has to approve the code that goes in.

The problem is that management does NOT understand code and will happily farm out the work to anyone who says that they can produce X lines for $Y. Without oversight. The less oversight, the less expensive the project is. Which means bigger bonuses for those same executives.

I notice an omission (Score:4, Insightful)

by Todd Knarr ( 15451 ) writes: on Sunday June 15, 2008 @12:55PM (#23801167) Homepage

They talk about how much of the open-source code is undocumented. I notice that they don't bother to mention how much of the in-house code is also undocumented. My experience as a software engineer is that their in-house code's probably at least as poorly documented as the open-source stuff. And if the business finds this state of affairs acceptable for their in-house code, why's it any more of a problem for the open-source parts?

I've also found that when the business does get a consultant in who demands documentation, they usually demand something that's completely useless for the actual developers. E.g., they demand UML models for all the software. Well, that's nice and all, but most of what's in the UML you can see by glancing at the class definitions. The things a developer needs, like what the methods are supposed to do and what gotcha caused a particular way of doing it to be picked and what assumptions the code's making about its inputs and outputs, have no place in a UML model.

Comments are overrated (Score:2, Insightful)

by Anonymous Coward writes: on Sunday June 15, 2008 @12:56PM (#23801169)

How much of that "documented" code in the article was documented correctly? Good code is easy to understand by good programmers. Documenting things is just another dependency to fall out of sync. Would you rather fire up a few neurons to grok some code yer working on or spend hours pulling your hair out only to eventually figure out your documentation was wrong?

Documentation should be used sparingly and as tightly woven into the development process as possible. The programmer should document their code when necessary, as soon as they think of it, not at some later pass. Provisions for inline documentation should be used. When a programmer modifies some code they are more likely to also modify the documentation when it is immediately adjacent. The probability that documentation will remain in sync is the inverse of the square of that distance.

Documenting is kind of hard. (Score:3, Insightful)

by Aphoxema ( 1088507 ) writes: on Sunday June 15, 2008 @01:00PM (#23801197) Journal

I tried to get into documenting software for Ubuntu. I wanted to help but I didn't really have any programming skills, I think the most complicated stuff I've ever done is scripts for mIRC and some HTML.

After really sitting down with some programs, I realized I just had no idea where to start. There was certainly more to be said than who made the program and what license it was under like many programs have in their 'help' and 'about' menu, but it really does get to be an enormous task and it's a certain amount of responsibility because the few people that will read the documentation first will take everything it says to heart.

I might try again, but I'm going to be sure I really have time to do it and the patience to read through source code. mangu is right: even though I don't know how to program, it's not hard to figure some things out, and sometimes there are vital comments 'between the lines'.

I have noticed that more programs (included in Ubuntu) now have the information I need when I care to look for it. I generally check documentation for command line arguments and stuff in case --help won't tell me everything or anything at all. At least someone's getting the job done.

Re:Source code is its own documentation (Score:5, Insightful)

by jps25 ( 1286898 ) writes: on Sunday June 15, 2008 @01:40PM (#23801465)

I disagree. This isn't about closed vs open source, this is about decent programming. Comments in code are necessary and a minimal requirement for any project. At least add one line to any function explaining what the function does, what its input is and what it returns. This isn't so hard and it won't kill you, but it'll make life easier for you and anyone else who will have to deal with the code later. It also makes finding errors easier, as your code may not be doing what your specifications say it should do. I don't understand this hatred for comments and the "code-is-its-own-documentation"-philosophy. I really don't.

<code>
#include <iostream>
#include <algorithm>
#include <iterator>
#define ch_ty(ty) std::istream_iterator<ty>::char_type
#define tr_ty(ty) std::istream_iterator<ty>::traits_type
#define cin_iter(ty) std::istream_iterator<ty, ch_ty(ty), tr_ty(ty)>( std::cin )
#define void_iter(ty) std::istream_iterator<ty, ch_ty(ty), tr_ty(ty)>()
int main( int argc, char *argv[] )
{
    while ( (cin_iter(size_t)) != void_iter(size_t) ? ( std::cin.unget(), argc += *cin_iter(size_t) ) : ( printf( "\nsum: %d\n", --argc ), system("exit") ) );
}
</code>

Perhaps easy to understand, but one comment-line would save you minutes wasted understanding and reading it.

or

<code>
#include <stdio.h>
int v,i,j,k,l,s,a[99];main(){for(scanf("%d",&s);*a-s;v=a[j*=v]-a[i],k=i<
s,j+=(v=j<s&&(!k&&!!printf(2+"\n\n%c"-(!l<<!j)," #Q"[l^v?(l^j)&1:2])&&++
l||a[i]<s&&v&&v-i+j&&v+i-j))&&!(l%=s),v||(i==j?a[i+=k]=0:++a[i])>=s*k&&
++a[--i]);printf("\n\n");}
</code>

Well, obviously obfuscated, but one comment and it's immediately clear what it does.
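For contrast, a minimal sketch of the kind of header comment the post asks for: what the function does, what its input is, and what it returns. The function `sum_values` is hypothetical and is not a rewrite of either snippet above; it only borrows the idea of summing numbers.

```c
#include <stddef.h>

/* Sums the first `n` entries of `values`.
 * Input:   `values` points to at least `n` unsigned integers.
 * Returns: their total (0 when `n` is 0); wraps around on overflow. */
unsigned long sum_values(const unsigned long *values, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

The comment is three lines against a five-line body, yet a caller can use the function without reading the loop, and a reviewer can check the body against what the comment promises.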

Re:Source code is its own documentation (Score:2, Insightful)

by Anonymous Coward writes: on Sunday June 15, 2008 @02:19PM (#23801825)

Ha! You better believe it.

Recently, we bought a $middleware-thingy$ needed for a specific client installation..

Cost: 2000$.

Documentation and support: zero (apparently the original coder had left [without documenting anything], but the company keeps on selling licenses for something they have no idea how it works).

Gotta love Slashdot (Score:5, Insightful)

by Anonymous Coward writes: on Sunday June 15, 2008 @03:35PM (#23802547)

Gotta love this place. At the time of this posting, there are 11 comments modded 3 or higher. Of those, only ONE makes any reference to the act of documenting where the code is coming from (which is what the article is about). All the rest are talking about writing documentation for code, or commenting code as it's written. Way to miss the ball, guys! This article is addressing you specifically, yet you have no idea what they're even saying because you can't be bothered to try to listen. Nice.

Re:Source code is its own documentation (Score:5, Insightful)

by Splab ( 574204 ) writes: on Sunday June 15, 2008 @04:00PM (#23802767)

I keep hearing people pro open source code say "I can check it!" Well, can you? Have you done so - in a project spanning more than a few thousand lines of code? Just because the code is there to see doesn't actually mean it's doable to wade through it.

I'm not for either open source or proprietary code, my employer pays me money to produce code, what he does with it is his business, but what I do have, is experience using both proprietary code and open source code - both models have pros and cons.

With proprietary code there is someone I can call, and they are obliged by contract to fix problems within a certain time frame. One particular instance is a database we are paying license fees for. I will not name it, but to date I have found more than 10 vectors that cause crashes. Those problems have been addressed by the vendor in a timely manner (I have yet to find bugs that would be show stoppers, but some did require annoying workarounds). With OSS we don't have this possibility. Yes, we can log a bug in whatever bug tracker they use and hope someone will address our issue, but we have no guarantee. Also, in my experience, logging a bug with OSS developers can be quite a daunting process; people can have some serious egocentric issues. While this of course also applies to proprietary software, there is someone higher in the food chain who can be called.
With OSS we of course got the good fortune of being able to go through the source code and try to fix the code ourselves... right?
Have you ever even considered just how bloody huge the code base is for something like a database? Tracking down a bug, well yes, the gdb can tell you where the program stopped working, but unless you have some really really good code reading skills and are up to date on everything that happens algorithm wise you have close to zero chance of fixing anything without causing major problems.

Also, as a developer I've got enough to do creating my own applications; I simply do not have the time to dig through thousands of lines of code every time something new breaks. Yes, open source is nice, and small projects are easy to help along by fixing small bugs, but at some point the project grows so big that anyone using it needs to have someone they can call at 4 am to help them.

Oh and just because some software is proprietary it doesn't mean you don't have access to the source code, even at Microsoft you can buy access to the source.

We got builds with debug flags from the database vendor because we cannot share our database with them, therefore stack traces etc. have to be generated locally and shipped to them. (Yes, this is a bit annoying, but having sensitive records out in the wild is a tad more problematic.)

I don't pick OSS over proprietary or vice versa, I pick whatever tool fits my needs.

Re:Source code is its own documentation (Score:3, Insightful)

by Bloater ( 12932 ) writes: on Sunday June 15, 2008 @06:35PM (#23804023) Homepage Journal

I keep hearing people pro open source code say "I can check it!" Well can you? Have you done so - in a project spanning more than a few thousand lines of code?
I've checked a few lines here and there that interest me, other people check the lines that interest them. An awful lot of stuff gets checked.
With proprietary code there is someone I can call and they are by contract obliged to fix problems within a certain time frame.
But that doesn't mean they will and it doesn't mean you get to sue them, and if you do win in court, it doesn't mean that your business is still viable. This is the big proprietary fallacy: "There is someone to blame." You can blame all you want; meanwhile you're on the dole because you /can't/ get stuck in, fix it, and carry on your business. With Open Source you can be sure that the worst case isn't financially devastating.
With OSS we don't have [the vendor fixing things on demand], yes, we can log a bug in whatever bug tracker they use and hope someone will address our issue, but we have no guarantee.
I know vendors that just tell you it's a feature and that crippling your business is what you paid for. They still make money because not everybody hits the problem, but you can't switch because your budget is already spent on this "solution", and hiring more employees is a different budget and never gets accounted as the TCO of the solution in question. A small business will notice, but a small business will already be dead by this point.
With OSS we of course got the good fortune of being able to go through the source code and try to fix the code ourselves... right?
Have you ever even considered just how bloody huge the code base is for something like a database?
There are no serious OSS databases where you will have a problem with bug filing or serious bug fixing. Since there is only one serious OSS database (PostgreSQL), this is quite easy to determine. PostgreSQL has a great reputation and the chance of you having to fork out on demand for a bug to be fixed is as slim as with the proprietary vendor. You can also buy a support contract from many of the core developers. OSS and money are not mutually exclusive.
but at some point the project grows so big that anyone using it needs to have someone they can call at 4 am in the morning to help them.
So get a support contract on your OSS software and call them.
Oh and just because some software is proprietary it doesn't mean you don't have access to the source code, even at Microsoft you can buy access to the source.
Buy access? Even after paying for the support contract? Surely you should be able to pay for the support contract and have the source for nothing extra, along with the right to extend the functionality and a vendor receptive to the idea of including your work to take the support burden on themselves under your support contract?
