If the Comments Are Ugly, the Code Is Ugly
itwbennett writes "What do your comments say about your code? Do grammatical errors in comments point to even bigger errors in code? That's what Esther Schindler contends in a recent blog post. 'Programming, whether you're doing it as an open source enthusiast or because you're workin' for The Man, is an exercise in attention to detail,' says Schindler. 'Someone who writes software must be a nit-picker, or the code won't work ... Long-winded 'explanations' of the code in the application's comments (that is, the ones that read like excuses) indicate that the developer probably didn't understand what he was doing.'"
Seems reasonable (Score:5, Interesting)
Re:The comment may also be complex.. (Score:4, Interesting)
I've always subscribed to the theory that the code explains WHAT a program should do clearly enough that even a computer understands it.
Comments should fill in the gaps and answer the questions of WHY and HOW. For example, if I'm using a common pattern or idiom, I like to highlight that. I like to use the Delegation Pattern when doing SAX parsing of XML in Java. Rather than explain what the Delegation Pattern is about, I'll just cite the pattern name, add a link, explain the nuances of that particular implementation, and move on.
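For illustration only, here is a rough sketch of that commenting style, written in Python with the standard `xml.sax` module rather than Java. The class names (`TitleCollector`, `DelegatingHandler`), the helper `collect_titles` and the `<title>` element are all made up for the example; none of them come from the post.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TitleCollector(ContentHandler):
    """Delegate that only knows how to collect the text of one element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = []

    def startElement(self, name, attrs):
        self._buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so buffer until the end tag.
        self._buffer.append(content)

    def endElement(self, name):
        self.titles.append("".join(self._buffer).strip())


class DelegatingHandler(ContentHandler):
    """Delegation Pattern: events for registered elements are forwarded
    to a per-element delegate instead of one large if/else chain.

    Nuance of this particular sketch: registered elements must not be
    nested inside each other, because only one delegate is active at a time.
    """

    def __init__(self, delegates):
        super().__init__()
        self._delegates = delegates  # element name -> delegate handler
        self._active = None

    def startElement(self, name, attrs):
        if name in self._delegates:
            self._active = self._delegates[name]
            self._active.startElement(name, attrs)

    def characters(self, content):
        if self._active is not None:
            self._active.characters(content)

    def endElement(self, name):
        if self._active is not None and name in self._delegates:
            self._active.endElement(name)
            self._active = None


def collect_titles(xml_text):
    collector = TitleCollector()
    handler = DelegatingHandler({"title": collector})
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return collector.titles
```

The docstring names the pattern and spends its words on the one thing a reader could not guess from the code, which is the nesting restriction.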
Re:Long-winded comments can be very useful (Score:4, Interesting)
Often better than including the full explanation inline is: "Proof that this works can be found ...". Yes, it's one more reference to look up, but some of the algorithms in, say, Knuth, are long and complex proofs that really will interrupt your code reading if included inline.
A lot of good comments answer one of these questions:
- Why couldn't this be simplified?
- What special case is this trying to handle?
- This looks weird. Why is it right?
- What expectations are we demanding from elsewhere in the code?
A situation where I find myself asking any of those questions and no comment is present can be nightmarish.
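As a small, hypothetical illustration of comments that answer two of those questions (the function `round_price` and the invoicing scenario are invented for the example):

```python
from decimal import Decimal, ROUND_HALF_UP


def round_price(amount: str) -> Decimal:
    # Why couldn't this be simplified to round(float(amount), 2)?
    # 2.675 has no exact binary float representation, so
    # round(2.675, 2) gives 2.67. Prices here are assumed to need
    # half-up rounding on the exact decimal value.
    #
    # What expectation do we demand from elsewhere in the code?
    # Callers must pass a decimal string such as "2.675", never a float:
    # a float has already lost the exact value before it gets here.
    return Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Without the comment, the next maintainer would be tempted to "clean up" the function into the one-liner that the comment warns against.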
I agree, with reservations (Score:5, Interesting)
From 30 years of developing software, I've found time and time again that it actually does seem that people who don't know or care about the difference between "their" and "they're" are also too sloppy, unintelligent or just not anal enough to write clean, supportable and robust code.
However, I feel we do need to make more allowance than the article's author did for people who did not learn English as a first language.
How much code is written by outsourcing? (Score:2, Interesting)
Re:Job Security (Score:2, Interesting)
Yeah, good luck with that. Have a second career ready?
I once consulted on a large program that had been written in as convoluted a manner as possible, with few or no comments. The guy who wrote it used to brag about "job security". Well, when the company finally allocated a budget to replace the program, they not only fired the guy, but made sure that as many people in that industry as possible knew about it. He was unable to find a new job in programming, and last I heard, he was trying to sell cars somewhere down south.
You might have job security for a while, but it's gonna catch up with you.
Or a management failure (Score:4, Interesting)
Or it could be just an indication of a management failure.
A couple of years ago I was brought in to save a project that was hopelessly behind schedule and getting nowhere. Pretty quickly I noticed that whenever I checked something into CVS, it got re-checked by a really helpful girl there, richly decorated with comments. (Now I do comment classes and methods extensively, as well as places where higher elven magic was used, but I do _not_ write stuff like "now I'm iterating through a node's children". If you need a comment to understand that "for" loop, then there's something deeper wrong with my code.)
But, anyway, stuff like a line that said "if (currentNode.isRootNode())" had been decorated with the obviously helpful comment "// when the current node is the root node". I'm still at a loss as to what extra info is conveyed by that comment, since just reading the code out loud gets you almost the same sentence and definitely the same meaning.
And it went like that for every single line. Every single assignment, trivial loop, etc, was dutifully duplicated in that line's comment.
Turns out, they were asked to comment their code extensively, and judged basically by quantity. So she was just abiding by the rules.
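To make the contrast concrete, here is a sketch in Python (the project in the story was not Python, and `Node`, `path_redundant` and `path_to` are hypothetical names). Both functions do the same thing; the first is commented "by quantity", the second only where the code cannot speak for itself.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def is_root_node(self):
        return self.parent is None


def path_redundant(node):
    # create an empty list called parts
    parts = []
    # loop forever
    while True:
        # append the name of the node to parts
        parts.append(node.name)
        # when the current node is the root node
        if node.is_root_node():
            # break out of the loop
            break
        # set node to the parent of node
        node = node.parent
    # join the reversed parts with a slash and return the result
    return "/".join(reversed(parts))


def path_to(node):
    # Iterative on purpose: the trees are assumed (hypothetically) to be
    # deep enough that a recursive walk could hit the recursion limit.
    parts = []
    while True:
        parts.append(node.name)
        if node.is_root_node():
            break
        node = node.parent
    # Names were collected leaf-first, but callers expect root-first paths.
    return "/".join(reversed(parts))
```

The first version has seven comment lines and tells the reader nothing; the second has three and records two decisions that are not visible in the code.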
OT on long comments (Score:5, Interesting)
To maintain some sense of topicality: I don't particularly agree with the blog post. As someone with good English skills, I've read a lot of code where the English language skills (and thus spelling and grammar in the comments) of the coder are below mine, but their skills in the computer language at issue are superior to mine. Frankly, there's a far greater relationship between accuracy of the comments (do they actually describe what the code does) and the quality of the code, than there is between spelling, subject-verb agreement, and number of spaces after a period and the quality of the code. This relationship does follow the blog author's contention about coders needing to be nit-pickers.
Occasionally in my coding, I write a novel in the function header. Generally, this isn't because I don't understand the problem so much as it's because I do understand the problem. I've spent hours or days understanding the problem, and the particular necessary function that implements the solution, and I don't relish spending hours or days 6 months in the future remembering what I know today. The interesting thing is that, most of the time, the novel is multiple times larger than the function - 50 lines of comment for a 20 NCLOC function isn't unheard of.
In my specialty (embedded systems, with especially tight hardware integration), there are functions that need to be written that deal with extraordinarily complex situations. Many times, the bare code tells a misleadingly simple tale - "do this, that, and the other thing", rather than (as Russ Nelson pointed out above) "but to explain all the other code that could have been written, but wasn't". Oftentimes, the novel is there to explain all the ways to trip up in this 20-line function - e.g. unspecified hardware dependencies, subtle system dependencies, unobvious race conditions. Sometimes it's there to explain why, no matter how wrong the function appears, it is actually correct.
Re:The comment may also be complex.. (Score:3, Interesting)
True. Furthermore, it's arguably proper, when implementing a particularly clever and streamlined hack (i.e. impossible to decipher, when debugging the code), to set it off from the code visually and then follow it with a commented-out block of code which does exactly the same thing, albeit slower or less efficiently, and which will be much easier to understand.
As a bonus, if the code must later be changed and the original hack would no longer be possible in the new functionality of the code, the procedure has already been written in a more logical and modifiable fashion, so you can start from that rather than having to re-code it entirely on the spot.
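A toy sketch of that layout, using a well-known bit trick as the "clever hack" (the function name and the banner comments are just one possible convention, not something from the post):

```python
def is_power_of_two(n: int) -> bool:
    # ---- clever version ------------------------------------------
    # A power of two has exactly one bit set. Subtracting 1 clears that
    # bit and sets every lower bit, so the AND is zero only in that case.
    return n > 0 and (n & (n - 1)) == 0
    # ---- equivalent, slower, easier to follow and to modify -------
    # if n <= 0:
    #     return False
    # while n % 2 == 0:
    #     n //= 2
    # return n == 1
```

The commented-out block is dead weight until the day the requirements change; then it is the version to start from.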
Re:The comment may also be complex.. (Score:1, Interesting)
...back in my university days, we used to scoff at the morons in the labs who would, quite literally, randomly hack their projects until they worked.
Apparently there *is* a dark side to a high-quality unit test suite... it gives idiots a false sense of security and justifies their idiotic development practices.
You sir, are a snob. A programming snob at that, congrats.
Re:The comment may also be complex.. (Score:4, Interesting)
I've gotten into the habit of commenting first, coding second. It is especially useful in complex systems.
I find that if I write the high level logic in comments of all the functions I'm going to write, it helps me find where I should break out repeated logic and solidifies the design. Once everything has been commented, I can go back and write the code.
This lets me know what the variables are going to be and name them appropriately (why name it `i` when I can call it `block_index`). I don't lose the big picture of what needs to get done. It gives me targets to meet and stopping points at the end of the day (a sense of accomplishment and goals for tomorrow is *really* nice). Future coders can read my comments and see what I was thinking (this has actually happened to me). As a bonus, I can worry more about corner cases as I'm writing the code instead of creating corner cases to worry about later.
I've done it with a Flash memory interface, O/S memory manager, and a kernel module. Each time I've finished, even if it isn't the best code in the world, I know that if the next person reads the comments, they'll know what I was thinking.
Better code all around.
Comments are for future maintainers (Score:4, Interesting)
I feel that comments can be broken into four types:
Coding Drunk (Score:3, Interesting)
Roman Bridges (Score:5, Interesting)
Oh please. Inexactitude is *not* the same thing as not understanding why something works at all. We can build miles-long bridges *specifically* because we understand the underlying physics, and anyone who built a bridge without understanding the physics of why it stood under load would be drummed out of the industry.
I am assuming you refer to the modern physics that we are all so proud of. Let me tell you that in Europe, whenever you get a real serious flooding on a major river, only one kind of bridge survives with no bruises at all: Roman bridges. They are 2000 years old, but they're still up. The crap we're building today won't be up in 2000 years, I can bet on it. Look at the mess with the bay bridge, down twice in 50 years!!!! Ah ah ahah! Kudos to modern engineering.
That would be because the Romans had some engineering, but not the equations we have today, so they over-engineered their bridges for safety because they knew they couldn't calculate the exact, optimal configuration for the expected loads and stresses. Over-engineering is a good thing if you don't have to account to the bean-counters. The George Washington Bridge across the Hudson River was also over-engineered because they didn't know the exact tolerances, and it has held up rather well.
Re:Coding Drunk (Score:3, Interesting)
Such a cool teacher... and the Computer Science head now...
Re:Real Programmers... (Score:3, Interesting)
I don't fully agree... however, I do think it's sometimes true.
Often, one of the ways I know that my code is pretty good and well organized is that I find few places where comments would even be helpful. Often the comments just end up delineating sections so I can skip to them easily: "// check parameters ... // setup the connection... // submit queries ... // check return values"
Of course, are we counting the comments that document what a function does? As I have been mostly playing with Java lately, the javadoc comments for documenting methods seem to be another beast entirely.
I was always a fan of the Linux Kernel coding style statement that if your code is too complex to be understood by a less than gifted high school student, then you should consider rewriting it.
-Steve
The reality... (Score:3, Interesting)
I tend to write comments of varying lengths. Sometimes I write longer comments not for my own benefit but for the benefit of the next coder: someone who may or may not have my understanding of the system or code and who, more likely than not, either won't have the time to learn the code 100% in-and-out or isn't up to the same par. So sometimes I will write a large comment block on an important thing in the code, so that they (whoever they are) will be able to understand it quite quickly. However, those blocks are (as they should be) few and far between; sometimes they document complex logic from elsewhere in the system that I have no control over but that the programmer needs to be able to understand.
Of course, there are also the typical PHB managers who evaluate their programmers based on LOC, SLOC, and CLOC (LOC = SLOC + CLOC), and look for an even distribution of SLOC/CLOC - e.g. SLOC/CLOC = 1. In some cases that is good; but in most it is not.
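To show how little that ratio measures, here is a deliberately crude counter in Python. The counting rule (a line is a comment line only if its first non-blank character is `#`) and the name `count_lines` are assumptions made for the sketch; real metric tools use their own rules.

```python
def count_lines(source: str):
    """Return (sloc, cloc) for Python-style source text.

    Crude rule, assumed for this sketch: a non-blank line that starts
    with '#' is a comment line; any other non-blank line is a code line.
    Trailing comments and docstrings therefore count as code.
    """
    sloc = cloc = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines count toward neither total
        if stripped.startswith("#"):
            cloc += 1
        else:
            sloc += 1
    return sloc, cloc


sample = """\
# when the current node is the root node
if node.is_root_node():
    # return the node
    return node
"""
sloc, cloc = count_lines(sample)
ratio = sloc / cloc
```

The sample hits the "ideal" SLOC/CLOC of 1 with comments that merely restate the code, which is exactly the behavior such a metric rewards.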
All in all, you can't tell how good the program is by just general comments or comment analysis.
BUT if you are sloppy in your comments (grammar, spelling, etc.), you have probably been sloppy in your code. To that end, I do very well agree with the article.
P.S. BTW, I've been in 3 positions where I have replaced people. One fired; one died; and one quit. In two of those three, the original developer was no longer available for inquiry; in the third, it was possible but not easy and only for a short time after he left.
Re:Comments are good (Score:2, Interesting)
Comments are a form of redundancy, usually only figuratively, but sometimes for real.
I once was hired to rewrite some old code from the late 60's or early 70's from OCR'd screen dumps. The mainframe system it ran on had been taken off-line, and wasn't being brought back since the company gave early retirement to anyone who knew anything about it. There was a mix of COBOL, FORTRAN, SAS, JCL, etc. I was rewriting in C (mostly just the numerical stuff that had been written in FORTRAN). No one at the company understood what the code actually "did", but they wanted to duplicate the reports that it produced, exactly. I eventually did enough research to completely understand everything except for a single routine. It was all based on table lookups; tables that were generated based on mathematics derived by a researcher in Canada that were "unpublished." I could find several Bell Systems Journal articles that referenced this paper, but could not find the paper or the math anywhere. My sister in law, a research librarian, even located the author for me and I wrote to him, but he never replied. I knew there were problems with the data in the tables, from the obvious OCR errors like ones replaced with L's, zeros with D's, etc. I wanted to regenerate the tables myself (tables were being used for speed) in order to ensure they were accurate. Eventually, I had to bite the bullet and just use what I had. Fortunately, besides referencing the journal articles containing the original math the tables were generated from, the comment contained a complete commented out copy of a prior version of the function. Before it was moved to IBM hardware in the mid-70s, the original code ran on a CYBER something, and the FORTRAN compiler indexed and initialized multi-dimensional arrays in a different order. I wrote a Perl script to flip the entries in these arrays around to the "new" order, and compare table entries, marking any discrepancies. From the list of discrepancies, it was easy to determine what the OCR error patterns were, allowing me to derive the original table. 
I still felt uncomfortable, and eventually got the customer to get me a hard copy of the original screen dump used for the OCR process. I was able to verify my results from that.
The ultimate test was ensuring that input from the same data produced output that exactly matched the original output for the same data. This led to finding and having to work around a bug in AIX's math libraries, but I eventually got there.
In doing that project, the original author's copious comments were *indispensable*.
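The reorder-and-compare step described above can be sketched roughly as follows. This is Python rather than the Perl actually used, and it assumes a two-dimensional table whose old copy was written column by column while the new copy is row by row; the post does not say what the real dimensions or orderings were. The function names and sample values are invented.

```python
def flip_order(flat, rows, cols):
    """Reorder a flat table written column by column into row-by-row order."""
    if len(flat) != rows * cols:
        raise ValueError("table size does not match the given dimensions")
    return [flat[c * rows + r] for r in range(rows) for c in range(cols)]


def discrepancies(old, new):
    """Return (index, old_value, new_value) for every entry that differs."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Old copy of a 2x3 table, stored column by column (hypothetical data).
old_table = ["1.0", "4.0", "2.0", "5.0", "3.0", "6.0"]
# OCR'd newer copy, stored row by row, with two typical misreads.
ocr_table = ["L.0", "2.0", "3.D", "4.0", "5.0", "6.0"]

report = discrepancies(flip_order(old_table, 2, 3), ocr_table)
```

Entries are compared as text rather than parsed as numbers, since the point is to see the OCR substitutions themselves (here `1` read as `L` and `0` read as `D`).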
Re:You are not expected to understand this (Score:1, Interesting)
The hero worship is getting rather tiresome. As someone who has dabbled in OS development, I understand what that is doing, though not for lack of obfuscation.
The biggest problem with that code is the lame function name; it's an abbreviation of something, though I'm not sure what exactly [might be absolute return]. Here is how I would have written that:
* then wipe the stack call frames for all our parent callers
* and return immediately all the way to the last function
* to execute a stack state save.
*/
if (rp->flags & PROCESS_STATE_SWAPPED) {
    rp->flags &= ~PROCESS_STATE_SWAPPED;
    restore_stack_pointer(u.saved_stack_pointer);
}
It's hard to be sure this is completely equivalent given the lame variable names but I am pretty sure this is what is happening; it's essentially a primitive setjmp/longjmp process. As to why, the comment is even more unhelpful in that regard; it looks like it's being used to throw an exception [before exception throwing existed] by returning all the way to the last function that registered a handler to deal with the swap out.
Re:Co-workers (Score:3, Interesting)
One of the things I prefer doing, in fact, is to write out a series of "steps" that I intend to do in comment form, one per line (perhaps multi-line), e.g.:
# 1. Initiate communication with database
# 2. Does the schema exist? If so, goto 6
# 3. Configure the schema
# 4. Upload initial template
# 5. Configure initial node
# 6. Add the node we're working with
# (etc)
Then, in the code that I'm working on, I copy each of the above comment lines, and then fill in the code in between.
I find that this gives a great overview, with an easy-to-parse view of how it happens. And, if someone changes one set of comments without another, it's a great clue to the future maintainer to review the source control history because I would never initially write mis-matched comments.
Or maybe I would, but then, I rarely code maliciously. :)
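For what it's worth, here is one way the filled-in result might look, sketched in Python with the standard `sqlite3` module standing in for whatever database was meant. The table name, columns, and the "template" and "root" rows are invented for the example, and the "goto 6" becomes an ordinary `if`.

```python
import sqlite3


def add_node(conn, name):
    # 1. Initiate communication with database
    cur = conn.cursor()

    # 2. Does the schema exist? If so, goto 6
    cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'nodes'"
    )
    if cur.fetchone() is None:
        # 3. Configure the schema
        cur.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, kind TEXT)")

        # 4. Upload initial template
        cur.execute("INSERT INTO nodes VALUES ('template', 'template')")

        # 5. Configure initial node
        cur.execute("INSERT INTO nodes VALUES ('root', 'node')")

    # 6. Add the node we're working with
    cur.execute("INSERT INTO nodes VALUES (?, 'node')", (name,))
    conn.commit()
```

Because the step comments are copied verbatim from the outline, a later diff that touches one and not the other stands out immediately.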