
The Law of Leaky Abstractions 524

Joel Spolsky has a nice essay on the leaky abstractions which underlie all high-level programming. Good reading, even for non-programmers.
This discussion has been archived. No new comments can be posted.

  • Informative (Score:5, Insightful)

    by Omkar ( 618823 ) on Thursday November 14, 2002 @11:26AM (#4668375) Homepage Journal
    Although I used to program as a hobby, my eyes bugged out when I saw this article. It's actually quite interesting; I finally realize why the hell people program in lower level languages.

    One point that I think could be addressed is backward compatibility. I really know nothing about this, but don't the versions of the abstractions have to be fairly compatible with each other, especially on a large, distributed system? This extra abstraction of an abstraction has to be orders of magnitude more leaky. The best example I can think of is Windows.
  • by Jack Wagner ( 444727 ) on Thursday November 14, 2002 @11:27AM (#4668384) Homepage Journal
    I'm of the opinion that high-level tools and high-level abstraction coupled with encapsulation are the biggest bane of the software industry. We have these high-level tools which most programmers really don't understand, and are taught that they don't need to understand, in order to build these sophisticated products.

    Yet, when something goes wrong with the underlying technology, they are unable to properly fix their product, because all they know is some basic Java or VB and they don't understand anything about sockets or big-endian/little-endian byte-alignment issues. It's no wonder today's software is huge and slow and doesn't work as advertised.

    The one shining example of this is FreeBSD, which is based totally on low-level C programs, and they stress using legacy programming methodologies in place of the fancy-schmancy new ones which are faulty. The proof is in the pudding, as they say, when you look at the speed and quality of FreeBSD as opposed to slow, ponderous OSes like Windows XP or Mac OS X.

    Warmest regards,
    --Jack
  • Re:timeout (Score:3, Insightful)

    by binaryDigit ( 557647 ) on Thursday November 14, 2002 @11:31AM (#4668438)
    Well, I wouldn't say that it's reliable "because there are timeouts". As a matter of fact, timeouts just complicate things. So you time out waiting for packet N, you request a resend of it, and in the interim, guess what, packet N shows up; now you have two N's. Your code is now more complex in having to deal with this situation. Timeouts are just another parameter used to adjust the behaviour of the algorithms that control the protocol. Getting deterministic results from a nondeterministic foundation involves making observations, accepting some compromises, making some simplifying assumptions, and then writing code that takes all those things into account to come up with something that usually works.
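    The duplicate-N situation described here can be sketched in a few lines. This is a toy illustration (Python, with invented names), not real TCP code: the receiver tracks sequence numbers, so a retransmitted packet that races its own timeout is simply absorbed.

```python
class Receiver:
    """Toy receiver that discards duplicates caused by retransmits."""

    def __init__(self):
        self.next_seq = 0    # next sequence number we expect
        self.delivered = []  # in-order payloads handed to the application

    def on_packet(self, seq, payload):
        # Deliver packet N exactly once, in order.
        if seq == self.next_seq:
            self.delivered.append(payload)
            self.next_seq += 1
        # seq < next_seq: a duplicate from a timed-out resend -- drop it.
        # (A fuller sketch would also buffer out-of-order seq > next_seq.)

r = Receiver()
r.on_packet(0, "a")
r.on_packet(1, "b")
r.on_packet(1, "b")  # the "second N" arrives after the resend
print(r.delivered)   # ['a', 'b']
```

    The extra complexity the parent describes lives entirely in that discard branch; the caller never sees the duplicate.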
  • by Anonymous Coward on Thursday November 14, 2002 @11:34AM (#4668454)
    This is old news to bloggers, everyone and their mother has been linking to this article for days, although I don't find it terribly insightful.
    An abstraction is by definition not 100% faithful to the thing it abstracts. All abstractions, like any technology, simultaneously extend and constrain certain aspects of experience. A car extends our range of transportation, but simultaneously constrains our other abilities while driving; hence so many accidents when we're not completely focused on the road. The only question is: is the abstraction worth the price of its leaks? Most of the time the answer is yes.
  • Good Points.... (Score:3, Insightful)

    by Cap'n Canuck ( 622106 ) on Thursday November 14, 2002 @11:35AM (#4668461)
    This is a well written article that explains many things!

    The more layers, the slower things get.
    As computers speed up, more abstraction layers and next-generation languages become feasible.

    Optimizing routines is still important.
    This holds true for ALL layers, as users are always expecting faster and faster apps.

    It's still important to know all the layers
    This allows old-timers to look down their noses at those whippersnappers.
  • by cjustus ( 601772 ) on Thursday November 14, 2002 @11:36AM (#4668475) Homepage
    This article does a great job of describing problems in development environments today... But how do we solve it?

    Hire VB programmers with assembly language experience who are also network admins? No - the solution is not to hire people just with skill XYZ, but to hire people who are capable of thinking for themselves, doing research, problem solving, and RTFM...

    It's a shame that so many companies hiring today are looking for skill X, Y, and Z... so some moron with X, Y, and Z on his resume gets hired, while someone that knows X - could learn Y and Z, and could outperform the moron, gets overlooked...

    Yet I see it happen at my IT Staffing company almost every day...

  • by smd4985 ( 203677 ) on Thursday November 14, 2002 @11:37AM (#4668479) Homepage
    Though Joel sometimes thinks he is cooler than he is, this article he wrote was great. I think the points he makes are valid.

    I think the solution is that we need to have CERTIFIED software components. To really evolve the art of software engineering, we need to stop having to deal with low-level ideas and increasingly interact with higher-level abstractions. But Joel makes the point that any abstraction we may use is leaky, forcing us to understand how it works! So we need to start having certification processes for abstractions - I should be able to pick up a component/abstraction and believe in its specifications 100%. That way I can stop worrying about low-level details and start using components to build even more interesting systems and applications.

  • by binaryDigit ( 557647 ) on Thursday November 14, 2002 @11:37AM (#4668482)
    Well, I'd agree up to a point. The fact is that FreeBSD is trying to solve a different problem/attract a different audience than XP/OSX. If FreeBSD were forced to add all the "features" of the other two in an attempt to compete in that space, then it would suffer mightily. You also have to take into account the level/type of programmers working on these projects. While FreeBSD might have a core group of seasoned programmers working on it, the other two have a great range of programming experience working on them. A few guys who know what they're doing, working on a smaller feature set, will almost always produce better stuff than a large group of loosely coupled and widely differing talents working on a monstrous feature set.
  • Re:What? (Score:4, Insightful)

    by Minna Kirai ( 624281 ) on Thursday November 14, 2002 @11:44AM (#4668533)
    The author wanted to make the point that TCP (fairly reliable) is built on IP (absolutely unreliable).

    UDP is also built on IP, but adds almost nothing to it (aside from the ability to target individual port numbers instead of an entire computer), so it's not any more reliable than IP.

    The internet's developers could've built TCP on top of UDP, but for unknown reasons they didn't. Maybe they felt that a whole extra layer of abstraction would've been useless overhead.
  • by nounderscores ( 246517 ) on Thursday November 14, 2002 @11:45AM (#4668542)
    Is our own bodies.

    I'm studying to be a bioinformatics guy at the University of Melbourne and have just had the misfortune of looking into the enzymatic reactions that control oxygen-based metabolism in the human body.

    I tried to do a worst-case complexity analysis and gave up about halfway through the Krebs cycle. [virginia.edu]

    When you think about it, most of basic science, some religion, and all of medicine has been about removing layers of abstraction to try and fix things when they go wrong.
  • by venomkid ( 624425 ) on Thursday November 14, 2002 @11:46AM (#4668548)
    ...to start with, or at least be competent with, the basics.

    Any good programmer I've ever known started with the lower level stuff and was successful for this reason. Or at least plowed hard into the lower level stuff and learned it well when the time came, but the first scenario is preferable.

    Throwing Dreamweaver in some HTML kiddie's lap, as much as I love Dreamweaver, is not going to get you a reliable Internet DB app.
  • by Anonymous Coward on Thursday November 14, 2002 @11:49AM (#4668573)
    Hiding ugliness has its penalties. Over time processor performance buries these penalties. What Joel doesn't tell you is that abstraction can buy you productivity and simply put, make programming easier and open it up to larger audiences.

    Maybe someone out there prefers to program without any abstraction layers at all, but they inherit so much complexity that it will be impossible for them to deliver a meaningful product in a reasonable time.

  • by jorleif ( 447241 ) on Thursday November 14, 2002 @11:50AM (#4668579)
    The real problem is not the existence of high-level abstractions, but the fact that many programmers are unwilling or unable to understand the abstraction.

    So you say "let's get rid of encapsulation". But that doesn't solve this problem, because this problem is one of laziness or incompetence rather than of not being allowed to touch what's inside the box. Encapsulation solves an entirely different problem, namely modularity. If we abolish encapsulation, the same clueless programmers will just produce code that is totally dependent on some obscure property in a specific version of a library. They still won't understand what the library does, so we're in a worse position than when we started.
  • TCP for the bored (Score:5, Insightful)

    by mekkab ( 133181 ) on Thursday November 14, 2002 @11:54AM (#4668606) Homepage Journal
    Fine, it's reliable because of ACKs, timeouts, adaptive retransmit timeouts that take statistical averages of RTTs, exponential back-off and slow start, window ACKs which keep track of which bytes are received, etc.

    So in your case of timing out on N, re-transmitting N, and then getting the response to the first N back after sending the second N, you do several things:
    1) Good! You got your packet!
    2) Keep track of how many bytes you have received thus far (TCP is not sending messages, it is sending a stream).
    3) When you get the response from your second request, discard it, because you already received those bytes from the stream.
    4) Since you timed out, DON'T use the round-trip time for that response: slow down your expected RTT, and THEN start measuring.

    And guess what? If I unplug the NIC of the other machine, there is no reliable way of transmitting that data (assuming your destination machine isn't dual-homed) - so I keep streaming bytes to a TCP socket and I don't find out my peer is gone for approx. 2 minutes.
    WOW. There's nothing reliable about that boundary condition!

    My point is that TCP is reliable ENOUGH. But I wouldn't equate it with a Maytag warranty. It is not a panacea. In fact, for a closed homogeneous network I wouldn't even consider it the best option. But if the boundary conditions fall within the acceptable fudge range (remember, real-time human-grade systems are not 100% reliable, only 99.99999%, and much of that is achieved through redundancy), your leaks are OK.
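    The adaptive-retransmit behaviour described above can be sketched numerically. The sketch below is a simplified, illustrative take on a Jacobson-style smoothed RTT estimator plus Karn's rule (ignore RTT samples from retransmitted segments); the constants follow the classic alpha = 1/8, beta = 1/4, but nothing here is production TCP.

```python
class RetransmitTimer:
    """Toy adaptive retransmission timer: timeout = SRTT + 4 * RTTVAR."""

    def __init__(self, srtt=1.0):
        self.srtt = srtt        # smoothed round-trip time (seconds)
        self.rttvar = srtt / 2  # smoothed RTT deviation

    def sample(self, rtt, retransmitted=False):
        # Karn's rule: an RTT measured on a retransmitted segment is
        # ambiguous (which send did the ACK answer?), so skip it.
        if retransmitted:
            return
        alpha, beta = 1 / 8, 1 / 4
        self.rttvar = (1 - beta) * self.rttvar + beta * abs(self.srtt - rtt)
        self.srtt = (1 - alpha) * self.srtt + alpha * rtt

    def timeout(self):
        return self.srtt + 4 * self.rttvar

t = RetransmitTimer(srtt=1.0)
t.sample(1.0)        # a clean measurement tightens the timer
print(t.timeout())   # 2.5
t.sample(9.9, retransmitted=True)  # ignored, per Karn's rule
print(t.timeout())   # still 2.5
```

    The timer never trusts a round trip that followed a timeout, which is exactly point 4 in the list above.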
  • by Ars-Fartsica ( 166957 ) on Thursday November 14, 2002 @11:58AM (#4668647)
    This argument is so tired. The downfall of programming is now due to people who can't/don't write C. Twenty years before that the downfall of programming was C programmers who couldn't/wouldn't write assembler.

    The market rewards abstractions because they help create high level tools that get products on the market faster. Classic case in point is WordPerfect. They couldn't get their early assembler-based product out on a competitive schedule with Word or other C based programs.

  • by Minna Kirai ( 624281 ) on Thursday November 14, 2002 @11:59AM (#4668663)
    so we need to start having certification processes for abstractions

    Sure, that'd be nice I guess. And it'd be impossible, except for simple cases. Joel's example of the motorcar in the rain can't be fixed 100% unless you've got a weather-genie flying along above you.

    Leaky abstractions have 2 main causes:
    • Laws of Physics ("acts of god"). These you can never fix, because someday the implementation of the abstraction will encounter a situation beyond its ability. When that happens, the users can either give up and go home, or learn about the mechanisms underlying the abstraction and resolve it directly (or call in a specialist to do it).
    • Backwards compatibility ("acts of man"). Things like the ASP example of hyperlinks to submit forms, or the C++ string class (mostly). These we could possibly fix, but to do so is often prohibitively expensive, or just won't pay off fast enough. The goal of 100% specification confidence is nice, but today people aren't usually willing to make the sacrifices.


  • by thom2000 ( 529108 ) on Thursday November 14, 2002 @12:05PM (#4668710)
    Sure, the author points out a few examples of leaky abstractions. But his conclusion seems to be that you always will have to know what is behind the abstraction.

    I don't think that's true. It depends on how the abstraction is defined, what it claims to be.

    You can use TCP without knowing how the internals work, and assume that all data will be reliably delivered, _unless_ the connection is broken. That is a better abstraction.

    And the virtual memory abstraction doesn't say that all memory accesses are guaranteed to take the same amount of time, so I don't consider it to be leaky.

    So I don't entirely agree with the author's conclusions.
  • by PureFiction ( 10256 ) on Thursday November 14, 2002 @12:05PM (#4668719)
    Proper abstractions avoid unintended side-effects by presenting a clean view of the intent and function of a given interface, and not just a collection of methods or structures.

    When I read what Joel wrote about "leaky abstractions" I saw a piece complaining about "unintended side-effects". I don't think the problem is with abstractions themselves, but rather with the implementation.

    He lists some examples:

    1. TCP - This is a common one. Not only does TCP itself have peculiar behavior in less than ideal conditions, but it is also interfaced with via sockets, which compound the problem with an overly complex API.

    If you were to improve on this and present a clean, reliable stream transport abstraction, it would likely have a simple connection establishment interface and some simple read/write functionality. Errors would be propagated up to the user via exceptions or event handlers. But the point I want to make is that this problem can be solved with a cleaner abstraction.

    2. SQL - This example is a straw man. The problem with SQL is not the abstraction it provides, but the complexity of dealing with unknown table sizes when you are trying to write fast generic queries. There is no way to ensure that a query runs fastest on all systems. Every system and environment is going to have different amounts and types of data. The amount of data in a table, the way it is indexed, and the relationship between records is what determines a query's speed. There will always be manual performance tweaking of truly complex SQL, simply because every scenario is different and the best solution will vary.

    3. C++ string classes. I think this is another straw man. Templates and pointers in C++ are hard. That is all there is to it. Most Visual Basic-only coders will not be able to wrap their minds around the logic that is required to write complex C++ template code. No matter how good the abstractions get in C++, you will always have pointers, templates, and complexity. Sorry Joel, your VB coders are going to have to avoid C++ forever. There is simply no way around it. This abstraction was never meant to make things simple enough for Joe Programmer, but rather to provide an extensible, flexible tool for the programmer to use when dealing with string data. Most of the time this is simpler, sometimes it is more complex (try writing your own derived string class - there are a number of required constructors you must implement which are far from obvious), but the end result is that you have a flexible tool, not a leaky abstraction.

    There are some other examples, but you see the point. I think Joel has a good idea brewing regarding abstractions, complexity, and managing dependencies and unintended side-effects, but I do not think the problem is anywhere near as clear-cut as he presents. As a discipline, software engineering has a horrible track record of implementing arcane and overly complex abstractions for network programming (sockets and XTI), generic programming (templates, ref counting, custom allocators), and even operating system APIs (POSIX).

    Until we leave behind all of the cruft and failed experiments of the past, and start anew with complete and simple abstractions that do not mask behavior, but rather recognize it and provide a mechanism to handle it gracefully, we will keep running into these problems.

    Luckily, such problems are fixable - just write the code. If Joel were right and complex abstractions were fundamentally flawed, that would be a dark picture indeed for the future of software engineering (it is only going to grow ever more complex from here, kids - make no mistake about it).
  • by Junks Jerzey ( 54586 ) on Thursday November 14, 2002 @12:06PM (#4668727)
    I'm of the idea that the whole premise that high-level tools and high level abstraction coupled with encasulation are the biggest bane of the software industry.

    Now that simply isn't true. Imagine you need to reformat the data in a text file. In Perl, this is trivial, because you don't have to worry about buffer sizes, maximum line lengths, and so on. Plus you have a nice string type that lets you concatenate strings in a clean and efficient way.

    If you wrote the same program in C, you'd have to be careful to avoid buffer overruns, you'd have to work without regular expressions (and if you use a library, then that's a high-level abstraction, right?), and you'd have to suffer with awful functions like strcat (or write your own).

    Is this really a win? What have you gained? Similarly, what will you have gained if you write a GUI-centric database querying application in C using raw Win32 calls instead of using Visual Basic? In the latter case, you'll write the same program in maybe 1/4 the time and it will have fewer bugs.
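    The Perl-versus-C point is easy to make concrete. Here is a hedged sketch in Python (standing in for Perl; the record format is invented for illustration): no buffer sizes, no strcat, and the regular expressions come with the language.

```python
import re

# Turn "last, first" records into "First Last" -- the kind of one-off
# text reformatting the parent describes. No fixed-size buffers anywhere.
raw = "smith, john\ndoe, jane\n"

out = []
for line in raw.splitlines():
    m = re.match(r"(\w+),\s*(\w+)", line)
    if m:
        # String concatenation is a clean, built-in operation.
        out.append(m.group(2).title() + " " + m.group(1).title())

print(out)  # ['John Smith', 'Jane Doe']
```

    The C version of this must pick a maximum line length, guard every copy against overrun, and either hand-roll the pattern matching or pull in a regex library (itself a high-level abstraction).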
  • by Yokaze ( 70883 ) on Thursday November 14, 2002 @12:12PM (#4668787)
    Don't blame the tools.

    High-level languages and abstractions aren't the problem, neither are pointers in low-level languages. It's the people who can't use them.

    Abstraction does mean that you should not have to care about the underlying mechanisms, not that you should not understand them.
  • by jneemidge ( 183665 ) on Thursday November 14, 2002 @12:17PM (#4668823)
    This article reminds me of what I hated most about Jurassic Park (the novel - the movie blessedly omits the worst of it) - Ian Malcolm's runaway pessimism. The arguments boil down to something very similar. Ian Malcolm says that complex systems are so complex we can't ever understand them all, so they're doomed to fail. Joel Spolsky says that our high-level abstractions will fail, and because of that we're doomed to need to understand the lower-level stuff. I have problems with both - they're a sort of technopessimism that I find particularly offensive, because they make the future sound bleak and hopeless despite volumes of evidence that, in fact, we've been dealing successfully with these issues for decades and they're just not all that bad.

    We have examples of massively complex systems that work very reliably day-in and day-out. Jet airplanes, for one; the national communications infrastructure, for another. Airplanes are, on the whole, amazingly reliable. The communications infrastructure, on the other hand, suffers numerous small faults, but they're quickly corrected and we go on. Both have some obvious leaky abstractions.

    The argument works out to be pessimism, pure and simple - and unwarranted pessimism to boot. If it were true that things were all that bad, programmers would all _need_ to understand, in gruesome detail, the microarchitectures they're coding to, how instructions are executed, the full intricacies of the compiler, etc. All of these are leaky abstractions from time to time. They'd also need to understand every line of libc, the entire design of X11 top to bottom, and how their disk device driver works. For almost everyone, this simply isn't true. How many web designers, or even communications applications writers, know - to the specification level - how TCP/IP works? How many non-commo programmers?

    The point is that sometimes you need to know a _little bit_ about the place where the abstraction can leak. You don't need to know the lower layer exhaustively. A truly competent C programmer may need to know a bit about the architecture of their platform (or not - it's better to write portable code), but they surely do not need to be a competent assembly programmer. A competent web designer may need to know something about HTML, but not the full intricacies of it. And so forth.

    Yes, the abstractions leak. Sometimes you get around this by having one person who knows the lower layer inside and out. Sometimes you delve down into the abstraction yourself. And sometimes, you say that, if the form fails because it needs JavaScript and the user turned off JavaScript, it's the user's fault and mandate JavaScript be turned on -- in fact, a _good_ high-level tool would generate defensive code to put a message on the user's screen telling them that, in the absence of JavaScript, things will fail (i.e. the tool itself can save the programmer from the leaky abstraction).

    What Ian Malcolm says, when you boil it all down, is that complex systems simply can't work in a sustained fashion. We have numerous examples which disprove the theory. That doesn't mean that we don't need to worry about failure cases, it means we overengineer and build in failsafes and error-correcting logic and so forth. What Joel Spolsky says is that you can't abstract away complexity because the abstractions leak. Again, there are numerous examples where we've done exactly that, and the abstraction has performed perfectly adequately for the vast majority of users. Someone needs to understand the complex part and maintain the abstraction -- the rest of us can get on with what we're doing, which may be just as complex, one layer up. We can, and do, stand on the shoulders of giants all the time -- we don't need to fully understand the giants to make use of their work.

  • by radish ( 98371 ) on Thursday November 14, 2002 @12:18PM (#4668839) Homepage
    And inevitably, at some point in a programmer's career, they'll come across a system in which the only available development tool is an assembler

    Do you REALLY believe that? Are you mad? I can be pretty sure that in my career I will never be required to develop in assembler. And even if I do, I just have to brush up on my asm - big deal. To be honest, if I was asked to do that I'd probably quit anyway, it's not something I enjoy.

    Sure, it's important to understand what's going on under the hood, but you have to use the right tools for the right job. No one would cut a lawn with scissors, or someone's hair with a mower. Likewise I wouldn't write an FPS game in Prolog or a web application in asm.

    The real point is that people have to get out of the "one language to code them all" mentality - you need to pick the right language and environment for the task at hand. From a personal point of view, that means having a solid enough grasp of the fundamentals AT ALL LEVELS (i.e. including high- and low-level languages) to be able to learn the skills you inevitably won't have when you need them.

    Oh, and asm is just an abstraction of machine code. If you're coding in anything except 1's and 0's you're using a high(er) level language. Get over it.
  • by Junks Jerzey ( 54586 ) on Thursday November 14, 2002 @12:18PM (#4668843)
    After 5 years of programming, my favorite language has become assembler - not because I hate HLL's, but rather, because you get exactly what you code in assembler. There are no "Leaky Abstractions" in assembly.

    Ah, but you are wrong, and I'm speaking as someone who has written over 100,000 lines of assembly code. The great majority of the time, when you're faced with a programming problem, you don't want to think about that problem in terms of bits and bytes and machine instructions and so on. You want to think about the problem in a more abstract way. After all, programming can be extremely difficult, and if you focus on the minute then you may never come up with a solution. And many high-level abstractions simply do not exist in assembly language.

    What does a closure look like in assembly? It doesn't exist as a concept. Even if you write code using closures in Lisp, compile to assembly language, and then look at the assembly language, the concept of a closure will not exist in the assembly listing. Period. Because it's a higher level concept. It's like talking about a piece of lumber when you're working on a molecular level. There's no such thing when you're viewing things in such a primitive way. "Lumber" only becomes a concept when you have a macroscopic view. Would you want to build a house using individual molecules or would you build a house out of lumber or brick?
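    For readers who haven't met the concept: here is the closure in question in a high-level language (Python rather than Lisp, purely for illustration). The inner function carries its enclosing environment with it - something a raw assembly listing has no name for.

```python
def make_adder(amount):
    # `add` closes over `amount`: the binding survives after
    # make_adder returns, even though its stack frame is gone.
    def add(x):
        return x + amount
    return add

add5 = make_adder(5)
print(add5(10))  # 15
```

    Once compiled down, `amount` dissolves into loads, stores, and some hidden storage for the captured value; the word "closure" applies only at the source level, which is exactly the parent's point.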
  • by jaredcoleman ( 616268 ) on Thursday November 14, 2002 @12:19PM (#4668845)
    Very funny! I agree that the average Joe is still going to be lost with the technical aspects of this article, but the author does generalize...

    And you can't drive as fast when it's raining, even though your car has windshield wipers and headlights and a roof and a heater, all of which protect you from caring about the fact that it's raining (they abstract away the weather), but lo, you have to worry about hydroplaning (or aquaplaning in England) and sometimes the rain is so strong you can't see very far ahead so you go slower in the rain, because the weather can never be completely abstracted away, because of the law of leaky abstractions


    I've heard a lot of people say that they can't believe how many homes, schools, and other buildings were destroyed by the huge thunderstorms that hit the States this past weekend, or that so many people died. Hello, we haven't yet figured out how to control everything! American (middle- to upper-class) life is a leaky abstraction. We find this out when we have a hard time coping with natural things that shake up our perfect (abstracted) world. That is what we all need to understand.

  • Re:Informative (Score:5, Insightful)

    by CynicTheHedgehog ( 261139 ) on Thursday November 14, 2002 @12:22PM (#4668865) Homepage
    Exactly. The only way to do something more easily or more efficiently is to restrict your scope. If you know something about a particular operation, or if you can make a few assumptions about it, your life becomes much easier. Take sorting, for example. Comparison sorts run (at best) in Omega(n log n) time. However, if you know the maximum range of numbers k in a set of length n, and k is much smaller than n, you can use a counting sort and do it in Theta(n) time. But what happens if you put a number greater than k in there? Well, all hell breaks loose.
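    The counting-sort trade-off can be shown in miniature. A sketch (Python, illustrative only): linear time while every key honours the promised bound k, and a visible failure the moment one doesn't - the leak in this abstraction.

```python
def counting_sort(values, k):
    """Sort non-negative ints in O(n + k); valid only while max(values) <= k."""
    counts = [0] * (k + 1)
    for v in values:
        counts[v] += 1  # IndexError here if v > k: the assumption leaks
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out

print(counting_sort([3, 1, 4, 1, 5], k=5))  # [1, 1, 3, 4, 5]
# counting_sort([6], k=5) raises IndexError -- "all hell breaks loose"
```

    A comparison sort pays more time but makes no assumption about the keys; the counting sort buys its Theta(n) by narrowing its scope, exactly as described above.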

    Another example: Java provides a pretty nifty mail API that you can use to create any kind of E-mail you can dream up in 20 lines of code or so. But you only ever want to send E-mail with a text/plain bodypart and a few attachments. So you make a class that does just that, and save yourself 15 lines of code every time you send mail. But suppose you want to send HTML E-mail, or you want to do something crazy with embedded bodyparts? Well it's not in the scope, so it's back to the old way.

    In order to abstract you have to reduce your scope somehow, and you have to ensure that certain parameters are within your scope (which adds overhead). And sometimes there's just nothing you can do about that overhead (like in TCP). And occasionally (if you abstract too much) you limit your scope to the point where your code can't be re-used.

    And as you abstract you tend to pile up a list of dependencies. Every library you abstract from needs to be included in addition to your library (assuming you use DLLs). So yes, there are maintenance and versioning headaches involved.

    Bottom line: non-trivial abstraction saves time up front, but costs later, mostly in the maintenance phase. There's probably some fixed karmic limit to how much can be simplified, beyond which any effort spent simply displaces the problem.
  • Re:Informative (Score:3, Insightful)

    by oconnorcjo ( 242077 ) on Thursday November 14, 2002 @12:31PM (#4668941) Journal
    I think it's a mistake to simply say that "high level languages make for buggier/bloated code". After all, many abstractions are created to solve common problems. If you don't have a string class then you'll either roll your own or have code that is complex and bug prone from calling 6 different functions to append a string. -by binaryDigit.

    You said my own thoughts so well that I decided to quote you instead! Actually, I thought the article just "stated the obvious", but that didn't really matter. When I want to "just get things done", abstractions let me do it an order of magnitude faster than hand-coding the machine language [even assembler is an abstraction]. Abstractions allow people to forget the BS and just get stuff done. Are abstractions slower, bloated, and buggy? To some degree, yes! But the reason why they are so widely accepted and appreciated is that they make life SIGNIFICANTLY easier, faster, and better for programmers. My uncle, who was a programmer in the 1960s, had a manager who said "an assembler took too many cycles on the mainframe and was a waste of time". Now in the 1960s that may have been true, but today that would be a joke. Today, I won't even go near a programming language lower than C, and I like Python much better.

  • by Tom7 ( 102298 ) on Thursday November 14, 2002 @12:40PM (#4669006) Homepage Journal

    OK, fine: All programming languages have an implementation, and a host operating system. But switching from Java to C++ certainly won't save you from these kinds of problems. (In fact, there is only ONE C++ compiler that I know of that actually claims to be compliant with the C++ language definition; i.e., every C++ compiler that people use to build programs is filled with bugs concerning the language's many insane idiosyncrasies!)

    I only mean to point out Java as a *language* that has better abstraction properties than C++. (Personally, I prefer other less popular languages like SML, but Java serves the point as well. Just be careful not to take Java as the best example of a high-level language, because high-level languages can have better features and be more efficient than Java is.) Software written in a correct implementation of Java on a correct OS can not have buffer overflows. Programs written in C, even in a correct compiler (few exist) on a correct OS, can and frequently do have buffer overflows. I am reluctant to call this a programmer problem, because such bugs are so common, even among extremely good programmers. (Are the authors of Quake III Arena, Apache, MySQL, the Linux Kernel, ssh, BIND, Wu_ftpd just all bad programmers for having buffer overflows in their software? I personally don't think so...)

    Some people are reading this article and using it as evidence to support low-level languages like C. ("Abstractions are leaky, so programmers need to have access to low-level details in order to work around leaky abstractions." or "Abstractions are leaky, so there's no point in using abstraction.") I think that's exactly backwards! Essentially, what I'm claiming is that C++ is a poor language for large software precisely because it does not allow programmers to create "tight" abstractions. Some languages do! These languages are much more pleasant to program in, and to build large software in! And in those languages, we can indeed make tight abstractions without the kinds of leaks he's described.

  • by Chris Mattern ( 191822 ) on Thursday November 14, 2002 @12:44PM (#4669045)
    > There are no "Leaky Abstractions" in assembly.

    At this point, may I whisper the word "microcode" in your ear?

    Chris Mattern
  • by Anonymous Coward on Thursday November 14, 2002 @01:05PM (#4669245)
    Is everyone as scientifically and technically literate and educated as they should be? Of course not. Are those involved in liberal arts education the worst culprits? Of course not. A liberal arts education most certainly includes a healthy dose of mathematics and science, particularly of the kind you highlight, namely higher-level abstractions.

    What is terrifying about your post is the final paragraph where you imply that an engineering background is better for those who want to practice ``politics, humanities, governance, management''. Scientists and engineers are like any other group. They can have narrowly-focused educations, minds, and perspectives. The fields that you mention are complex, and require study, background, consideration, and debate of their own.

    While you are slamming ``liberal arts'' -- a term you seem not to understand -- you highlight the need for it. Liberal arts does not imply a non-scientific, non-technological education. It implies a broad education, including science, mathematics, and engineering along with the ``traditional'' topics of history, literature, languages, politics, economics, and arts. For politics, governance, and management, I want people who are conversant in all of those topics.
  • by Anonymous Coward on Thursday November 14, 2002 @01:07PM (#4669258)
    Don't fix what ain't broke. You would have saved even more time (And bugs!) if you just let the Fortran be. Seriously.
  • Argh. (Score:4, Insightful)

    by be-fan ( 61476 ) on Thursday November 14, 2002 @01:09PM (#4669286)
    While I usually like Joel's work, I'm pissed about the random jab at C++. For those who didn't read the article, he says something along the lines of

    "A lot of the stuff the C++ committe added to the language was to support a string class. Why didn't they just add a built-in string type?"

    It's good that a string class wasn't added, because that led to templates being added! And templates are the greatest thing, ever!

    The comment shows a total lack of understanding of post-template, modern C++. People are free not to like C++ (or aspects of it) and to disagree with me about templates, of course, and in that case I'm fine with them taking stabs at it. But I get peeved when people who have just given the language a cursory glance try to fault it. If you haven't used stuff like Loki or Boost, or taken a look at some of the fascinating new design techniques that C++ has enabled, then you're in no place to comment about the language. At least read something like the newer editions of D&E or "The C++ Programming Language", then read "Modern C++ Design", before spouting off.

    PS> Of course, I'm not accusing the author of being unknowledgeable about C++ or anything of the sort. I'm just saying that this particular comment sounded rather n00b'ish, so to speak.
  • by mdritchi ( 612425 ) on Thursday November 14, 2002 @01:12PM (#4669322)
    The problem that Joel talks about is not really a problem with abstraction; it is a problem with teamwork. When I program I simply cannot do all of it myself for all but trivial projects. By all of it I mean write the compilers, write the OS, etc. Instead I must rely on other programmers to write large portions of the code that I run. Whether it is the guy across the hall who wrote the search contact Stored Procedure in SQL or a programmer at Microsoft writing a Windows Disk IO function, I am reliant on their code working as I think it should. This is a problem with teamwork, but there is no other solution for programming modern applications. Martin
  • by dsaxena42 ( 623107 ) on Thursday November 14, 2002 @01:13PM (#4669339)
    Maybe I'm an old-fashioned has-been, but people doing software development should understand the fundamentals of how computers work. That means that they should understand things like memory management, they should understand what a pointer is, they should understand how tight loops versus unrolled loops might affect the performance of the caches on their system. I meet so many "programmers" who have no understanding that there are architectural constraints on what they can and can't do. Software runs on hardware. If you're going to write software and treat the hardware as a black box, you're not going to write it as well, or as efficiently, as you could be doing it.
  • by daoine ( 123140 ) <moruadh1013@yahoo . c om> on Thursday November 14, 2002 @01:17PM (#4669384)
    The market rewards abstractions because they help create high level tools that get products on the market faster.

    Agreed, but I think it's important to note that without the understanding of where the abstraction came from, the high-level tools can be a bane rather than a help.

    I write C++ every day. Most of the time, I get to think in C++ abstraction land, which works fine. However, on days where the memory leaks, the buffer overflows, or the seg faults show up, it's not my abstraction knowledge of C++ that solves the problem. It's the lower level, assembly based, page swapping, memory layout understanding that does the debugging.

    I'm glad I don't have to write Assembly. It's fun as a novelty, but a pain in the butt for me to get something done. However, I'm not sure I could code as well without the underlying knowledge of what was happening under the abstraction. It's just too useful when something goes wrong...

  • by irix ( 22687 ) on Thursday November 14, 2002 @01:24PM (#4669462) Journal

    Knowing the specific tool is important. More important really than being well rounded.

    Riiight. That is why you always want to hire someone who graduated from Computer College X rather than a CS or Engineering program. I mean they know Visual Basic so well they can't program without it, but they couldn't solve another problem if their life depended on it. Just who I want to hire!

    Look in any office and their best programmer is typically not the one with the grad degree, but the one who is willing to geek out and learn every detail about the particular language they use

    So wrong. Where do you work, so I can avoid ever working there?

    The best programmers I work with are the smartest all-around people that also love their craft. Sure, they might know tool X or Y really well because they use it all of the time, but they also know a little bit about everything, and can learn tool Z at the drop of a hat. They also know that there are many tools/languages to solve a problem, and tend to use the best one.

    The language/tool geek who knows every nook and cranny about the language they use but can't think outside of that box is the last person I want to work with. They create code that is unmaintainable because they make heavy use of some obscure language features, but when it comes time to work with another language or tool they are useless. And no matter how much a particular problem cries out for another language to be used, they try and cram their square language into my round problem hole. No thanks.

  • by Eric Savage ( 28245 ) on Thursday November 14, 2002 @01:27PM (#4669487) Homepage
    Now I always consider performance when designing/writing code, but programmers are WAY more expensive than hardware, so eking out performance can often be a wasted effort. Everyone knows that C will smoke Java in most operations, but it's so hard to manage at the enterprise level that you are much better off taking the 50%+ performance hit and writing in a "leaky" language.
  • by Dr. Awktagon ( 233360 ) on Thursday November 14, 2002 @01:29PM (#4669513) Homepage
    Looks like he just discovered and renamed the basic idea that "all models are incomplete". Any scientist could tell you that one! I remember a quote that goes something like this: The greatest scientific accomplishment of the 19th century was the discovery that everything could be described by equations. The greatest scientific accomplishment of the 20th century is that nothing can be described by equations.

    That's all an abstraction is: a model. Just like Newtonian physics, supply and demand under perfect competition, and every other hard or soft scientific model. Supply and demand breaks down at the low end (you can't be a market participant if you haven't eaten in a month) and the high end (if you are very wealthy, you can change the very rules of the game). Actually, supply and demand breaks down in many ways, all the time. Physics breaks down at the very large or very small scales. Planetary orbits have wobbles that can only be explained by more complex theories. Etc.

    No one should pretend that the models are complete. Or even pretend that complete models are possible. However, the models help you understand. They help you find better solutions (patterns) to problems. They help you discuss and comprehend and write about a problem. They allow you to focus on invariants (and even invariants break down).

    All models are imperfect. It's good that computer science folks can understand this, however, I don't think Joel should use a term like "leaky abstraction". Calling it that implies the existence of "unleaky abstraction", which is impossible. These are all just "abstractions" and the leaks are unavoidable.

    Example: if I unplug the computer and drop it out of a window, the software will fail. That's a leak, isn't it? Think of how you would address that in your model: maybe another computer watches this one so it can take over if it dies... etc. More complexity, more abstractions, more leaks....

    He also points out that, basically, computer science isn't exempt from the complexity, specialization, and growing body of understanding that accompanies every scientific field. Yeah, these days you have to know quite a bit of stuff about every part of a computer system in order to write truly reliable programs and understand what they are doing. And it will only get more complex as time goes on.

    But what else can we do, go back to the Apple II? (actually that's not a bad idea. That was the most reliable machine I've ever owned!)
  • by arkanes ( 521690 ) <arkanes@NoSPam.gmail.com> on Thursday November 14, 2002 @01:32PM (#4669544) Homepage
    You don't, and in fact can't, deal with page faults in your Java program. Nonetheless, your Java program will suffer a performance hit when it page faults. That's a leaky abstraction.
  • by ChaosDiscord ( 4913 ) on Thursday November 14, 2002 @01:34PM (#4669562) Homepage Journal
    The market rewards...

    I'd suggest steering clear of that phrase if your intention is to indicate that something is "good". It's also consistent with things like "The market rewards skilled con men who disappear before you realize you've been rooked" and "The market rewards CEOs who destroy a company's long term future to boost short term stock value so he can cash out and retire."

    I'm all in favor of good abstractions; good abstractions will help make us more efficient. But even the best abstractions occasionally fail, and when they fail a programmer needs to be able to look beneath the abstraction. If you're unable to work below and without the abstraction, you'll be forced to call in external help, which may cost you time, money, showing your proprietary code to people you don't entirely trust, and being at the mercy of an external source. Sometimes this trade off is acceptable (I don't really have the foggiest idea how my car works; when it breaks I put myself at the mercy of my auto shop). Perhaps we're even moving to a world where you have high level programmers that occasionally call in low level programmers for help. But you can't say that it's always best to live at the highest level of abstraction possible. You need to evaluate the benefits for each case individually.

    You point out that many people complain that some new programmers can't program C, while twenty years ago the complaint was that some new programmers couldn't program assembly. Interestingly, both are right. If you're going to be a skilled programmer you should have at least a general understanding of how a processor works and assembly. Without this knowledge you're going to be hard pressed to understand certain optimizations and cope with catastrophic failure. If you're going to write in Java or Python, knowing how the layer below (almost always C) works will help you appreciate the benefits of your higher level abstraction. You can't really judge the benefits of one language over another if you don't understand the improvements each tries to make over a lower level language. To be a skilled generalist programmer, you really need at least familiarity with every layer below the one you're using (this is why many Computer Science degrees include at least one simple assembly class and one introductory electronics class).

  • by Ungrounded Lightning ( 62228 ) on Thursday November 14, 2002 @01:41PM (#4669618) Journal
    While you are slamming ``liberal arts'' -- a term you seem not to understand -- you highlight the need for it. Liberal arts does not imply a non-scientific, non-technological education. It implies a broad education, including science, mathematics, and engineering along with the ``traditional'' topics of history, literature, languages, politics, economics, and arts. For politics, governance, and management, I want people who are conversant in all of those topics.

    Unfortunately, the subjects you list have all grown to the point that no human can obtain even a BASIC understanding of all of them before he's too old to have a useful career left.

    It was once possible to be a "Renaissance Man" - a master of ALL the sciences and arts reduced to teachability. No more. It's just too bloody large. (I say this as someone who attended a university that claims to try to produce such people - centuries after the last of them is dead. B-) )

    Unfortunately, "Liberal Arts" schools have, over much of the last century, been filled with the mathematically and technically illiterate - both because the students without the necessary skills gravitated there, and because the faculties themselves were so disabled, and in turn disparaged the skills they were incompetent to teach.

    The engineering/scientific/biologic/technical curriculum had constant feedback from the real world about what was true and what was false. But the "Arts Schools" taught classes where what was "right" was ONLY a matter of opinion - and grades solely a measure of how well you could regurgitate your Prof's pet bonnet-bees. (This DESPITE the fact that SOME of these theories could be TESTED - if only the academics understood, and/or believed in, things like the scientific method, statistics, and sampling methods.)

    Yes, the "Social 'Sciences'" are hard. But the bulk of their credentialed practitioners used this as an excuse to drop "science" from their methodologies. (This despite the fact that mathematics departments were generally part of the arts, rather than the engineering, side of the school organization.)

    I've been out of academia for a while now. I can hope that things have improved, as you seem to claim. But I have not personally seen any sign of such from the outside (other than your claim).

    In my school days, too, many students on the Arts side of the wall knew tech, math, and the like. (Students are generally young, and still hunting for their muse.) But they would generally transfer out to some field more conducive to clear thought, drop out to use it in the real world, or (if they stayed in LS&A) suppress it or flunk out.
  • by kawika ( 87069 ) on Thursday November 14, 2002 @02:01PM (#4669841)
    Even at the machine code level, IEEE floating point is the mother of all leaky abstractions for real numbers.
  • Re:C++ strings (Score:2, Insightful)

    by cifey ( 583942 ) on Thursday November 14, 2002 @02:38PM (#4670266) Journal
    I don't see how you can get a String class function to be called using only char*'s as arguments. Thus the abstraction fails.
  • by Zaphod-AVA ( 471116 ) on Thursday November 14, 2002 @03:04PM (#4670567)
    The subject of leaky abstractions applies to novice users as well.

    I've felt for a long time that people are taught about computers the wrong way, and this article clarifies why this is true.

    People are taught less and less about what the computer actually does, and instead focus on things like the desktop analogy, and task-oriented training. The user must then remember all these seemingly strange things computers do that don't follow the abstraction they were taught. This makes them seem difficult and incomprehensible.

    The problems created by abstractions intended for users can simply be solved with more complicated software that better models the analogy that the users are taught. Unfortunately, the opposite is probably true for programmers.

    -Zaphod
  • by vsync64 ( 155958 ) <vsync@quadium.net> on Thursday November 14, 2002 @03:07PM (#4670614) Homepage
    This comment was originally posted
    [http://www.kuro5hin.org/comments/2002/11/12/18388/916/18#18]
    on Kuro5hin's discussion of the same topic. I
    used lists because Rusty feels a need to turn <p>
    into <br><br>. It is displayed in plain text
    because CmdrTaco's lameness filter doesn't like
    nesting more than 3 levels.

    * This article serves well to highlight some of the problems I have
      with Joel Spolsky.
      + Joel is a bright guy, it seems.
        o He certainly has a lot of nuts-and-bolts knowledge of
          various APIs.
        o I do agree completely that it is important to understand
          the underlying layers, but I think the problems he
          experiences are mostly due to the fact that he picks bad
          abstractions.
      + One problem is that Joel seems to think that skill is in
        direct proportion to the number of random incongruities that
        one is able to keep in one's head at a time.
        o In other essays, he has defended complex techniques that
          end up in the [32]Interface Hall of Shame, and castigated
          those who fail to join the bizarre cargo cult of MS
          cut-and-paste workarounds.
        o I don't want to delve into ad hominem attacks here, but
          I'm willing to bet this is why he and Microsoft got along
          so well.
          # Microsoft recommends that people not allow the word
            "begin" to start a line in email, to avoid triggering
            their MUA bugs.
          # Code for Microsoft OSes seems to require all sorts of
            workarounds and kludges depending on which version of
            Win32 -- the supposed lingua franca of Windows -- happens
            to be in use.
          # Almost every bit of documentation I have seen for
            Windows, or written by Windows users, tends to suggest
            the use of Win32-only code, without even mentioning the
            idea that the programmer is throwing compatibility out
            the window.
            @ I'm not sure the authors even know.
          # The most basic and commonly used variables in Windows
            programming, wParam and lParam, are incorrectly named.
      + Joel complains that HTML doesn't offer a way to submit forms
        from hyperlinks.

        Indeed ASP.NET abstracts away the difference between writing
        the HTML code to handle clicking on a hyperlink (<a>) and the
        code to handle clicking on a button. Problem: the ASP.NET
        designers needed to hide the fact that in HTML, there's no
        way to submit a form from a hyperlink. They do this by
        generating a few lines of JavaScript and attaching an onclick
        handler to the hyperlink.

        o Joel neglects to mention that the proper way to perform
          this kludge would be by having the JavaScript append the
          form values to the URL of the link in question; "get"
          methods are just specially formatted URLs, after all.
        o Requiring JavaScript to submit a form at all, however,
          loses.
        o Joel also ignores the fact that form submission buttons
          are visually distinct from hyperlinks for a very specific
          reason.
          # It is important that the user know that its browser is
            not merely transmitting a predefined string, but a set
            of data defined by the user's actions.
          # Strangely, while Joel is talking about the importance of
            correct abstractions for the programmer's convenience
            and correctness, he seems to ignore the effect that
            inconsistent abstractions may have on the user.
            @ Cheap shot, but refer back to the bit about him and
              Microsoft being a good fit for each other.
        o If Joel is talking about destructive form-based actions,
          he should be further ashamed of himself.
          # It is especially important that the user know when its
            actions are about to change the state of the server.
          # The [33]HTML spec makes this very distinction:

            The "get" method should be used when the form is
            idempotent (i.e., causes no side-effects). Many database
            searches have no visible side-effects and make ideal
            applications for the "get" method.

            If the service associated with the processing of a form
            causes side effects (for example, if the form modifies a
            database or subscription to a service), the "post"
            method should be used.

          # Neither a user clicking on a link, a browser or proxy
            performing read-ahead, nor a search engine spidering a
            site, should ever be able to modify state on the server.
            @ This is common sense to anyone who has figured out why
              using only JavaScript form verification by itself is a
              bad idea.

    31. http://quadium.net/
    32. http://www.iarchitect.com/mshame.htm
    33. http://www.w3.org/TR/html401/
    34. http://www.kuro5hin.org/story/2002/11/12/18388/916
  • Re:Here goes.... (Score:4, Insightful)

    by Junks Jerzey ( 54586 ) on Thursday November 14, 2002 @03:46PM (#4671044)
    Please tell me what you think of this - I would honestly like to know.

    I've worked in a way similar to you, and I might still if it were as mindlessly simple to write assembly language programs under Windows as it was back in the day of smaller machines (i.e. no linker, no ugly DLL calling conventions, smaller instruction set, etc.). In addition to being fun, I agree in that assembly language is very useful when you need to develop your own abstractions that are very different from other languages, but it's a fine line. First, you have to really gain something substantial, not just a few microseconds of execution time and an executable that's ten kilobytes smaller. And second, sometimes you *think* you're developing a simpler abstraction, but by the time you're done you really haven't gained anything. It's like the classic newbie mistake of thinking that it's trivial to write a faster memcpy.

    These days, I prefer to work the opposite way in these situations. Rather than writing directly in assembly, I try to come up with a workable abstraction. Then I write a simple interpreter for that abstraction in as high a level language as I can (e.g. Lisp, Prolog). Then I work on ways of mechanically optimizing that symbolic representation, and eventually generate code (whether for a virtual machine or an existing assembly language). This is the best of both worlds: You get your own abstraction, you can work with assembly language, but you can mechanically handle the niggling details. If I come up with an optimization, then I can implement it, re-convert my symbolic code, and there it is. This assumes you're comfortable with the kind of programming promoted in books like _Structure and Interpretation of Computer Programs_ (maybe the best programming book ever written). To some extent, this is what you are doing with your macros, but you're working on a much lower level.
  • by J. Random Software ( 11097 ) on Thursday November 14, 2002 @04:01PM (#4671231)
    The abstraction is a reliable byte stream, which of course isn't really possible due to phenomena that can only be affected by interfaces beneath TCP. A leak that's documented is still a leak.

"Engineering without management is art." -- Jeff Johnson
