Programming IT Technology

Null References, the Billion Dollar Mistake 612

jonr writes "'I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be non-null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965.' This is from the abstract of Tony Hoare's presentation at QCon. I was raised on C-style programming languages and have always used null pointers/references, but I am having trouble grokking a null-reference-free language. Is there any good reading out there that explains this?"
  • by alain94040 ( 785132 ) * on Tuesday March 03, 2009 @11:22AM (#27051389) Homepage

    It's hard to imagine life without the null pointer! That being said, the author is not really responsible for billions of dollars of mistakes, the programmers are.

    If there is one thing I'll complain about, it's the choice of the value 0. It's almost impossible to trace. When we do hardware debug of chips, we prefer to use a much more visible value such as 0xdeadbeef, for instance. Otherwise a bad pointer will blend in too much with all the uninitialized values out there.

    In assembly, null has no particular meaning. If you dereference an address, you can do it in any range you like. It's just that 0 on most machines was not a good place to store anything, since it would typically be used to boot the OS or for some other critical I/O function that you don't want to mess with. Thus null was born.

    • by CTalkobt ( 81900 ) on Tuesday March 03, 2009 @11:30AM (#27051471) Homepage

      When debugging at the hardware level it's fairly common to fill uninitialized memory (or newly allocated memory, in a debug version of the malloc libraries) with a value that will either cause the computer to execute a system-level break (e.g. TRAP / BRK, etc.) or something fairly obvious such as ($BA).

      If you don't like the 0's, then replace your memory allocation library.

      • Re: (Score:2, Informative)

        It's not the memory allocation library that is at fault.
        It's the expectation of the app developer to instinctively do

        if(!ptr){ ... }

        You have to change the fundamental way the compiler works and alter boolean logic to account for existing code which works like this, so that it accepts 0xdeadbeef under some conditions and not others.

        • Re: (Score:2, Interesting)

          by fishbowl ( 7759 )

          The C specification already requires the compiler to deal with that, and it's been the case since K&R. No matter what the implementation defines as NULL, comparing or assigning 0 in a pointer context always works.

          http://c-faq.com/null/ptrtest.html [c-faq.com]

        • Wrong. A NULL pointer is implementation-defined in C and !p would work just as well if the bit value of p were 0xdeadbeef for a NULL pointer. The compiler is responsible for that.

          0 is used because it's convenient for compilers and architectures, not for programmers. Programmers don't care, they never see the bit pattern of a NULL pointer unless they're doing things wrong (casting to integers) or working on lower level architecture-specific code. Most think they do, though. See the C-faq section on NULL pointers [c-faq.com].

        • Actually, in C the null pointer constant is a distinct value from integer zero. The standard requires the following (see section 6.3.2.3 of ISO C99):

          • That the integer value 0, when cast to any pointer type, yield a null pointer
          • That a null pointer, when cast to any other pointer type, yield another null pointer
          • That any two null pointers will compare as equal, regardless of type

          As for constructions like if (!ptr), the standard requires that the if statement execute if its value is non-zero, and it would be entirely legal for the null pointer to have a non-zero in-memory representation, but convert to the integer zero. See, for example, the comp.lang.c FAQ [c-faq.com].

      • by cant_get_a_good_nick ( 172131 ) on Tuesday March 03, 2009 @12:14PM (#27052067)

        RE: malloc pattern initializer

        What's a good one for x86 and AMD64 chips? While spelunking flags for valgrind, I remembered the thought process for 68k chips: use an A-line trap, unimplemented, so execution would stop. Also, make it odd, so a dereference would trigger a bus error.

        What are the best values for x86 debugging?

        • Re: (Score:3, Interesting)

          by clone53421 ( 1310749 )

          How about 0xCC (INT 3), which is typically used as a debug breakpoint? It will halt the execution (as long as you're running the code in a debugger, which is assumed), and it's a one-byte opcode which is good since that means if you somehow jump into unallocated space, you can't jump into the middle of the instruction.

      • The first OS I encountered was tape-based. And it prefilled user memory with a "core constant".

        This was a subroutine jump to an abort routine which printed the return location - which in turn told you where you had improperly jumped to and dumped all your registers, followed by the memory itself if that was authorized. (That was all the info that was left by the time the OS got control.)

        The walls of the computing center contained posters giving this value as it would appear if printed as various types of

    • by gr8_phk ( 621180 )
      I'd like it if there were a "prefetch" instruction to fill cache, but one that ignored references to address zero. This way you could prefetch all pointers unconditionally to increase performance. Compilers could then insert these prefetches automatically.
    • by jeremyp ( 130771 ) on Tuesday March 03, 2009 @11:53AM (#27051755) Homepage Journal

      That's all very well, but in a production environment when dereferencing a NULL pointer you'd probably rather have the program crash than carry on merrily with bad data. With a zero null value, you can easily arrange for this to happen by protecting the bottom page of memory from reads and writes. That way, even an assembly language program can't dereference a null pointer.

    • Re: (Score:2, Informative)

      In assembly, 0 is much easier to check for.
    • Re: (Score:2, Funny)

      by Anonymous Coward

      When we do hardware debug of chips, we prefer to use a much more visible value such as 0xdeadbeef for instance.

      I've recently seen that one of our developers is using 0xfeedface 0xb00bf00d, which is nice and inventive.

    • by johny42 ( 1087173 ) on Tuesday March 03, 2009 @12:42PM (#27052489)

      That being said, the author is not really responsible for billions of dollars of mistakes, the programmers are.

      Who am I to argue with someone who is taking responsibility for my mistakes?

    • Re: (Score:3, Interesting)

      by rickb928 ( 945187 )

      The first time I saw an ethernet MAC address of 02DEADBEEF20 I went on a 20-minute snipe hunt through the switches.

      It was the /dev/net0 adapter in the standby member of a Sun cluster.

      A month later, I got the inevitable frantic voicemail from the telecom guy, asking what the '^&(*ing 02DEADBEEF20 address' was, and would I pay more attention to these things and secure our network, please and thank you. I told him it belonged to the Teleradiology project, not to worry. He accused me of being an imbecile,

  • by AKAImBatman ( 238306 ) * <akaimbatman@gmaYEATSil.com minus poet> on Tuesday March 03, 2009 @11:24AM (#27051411) Homepage Journal

    I am having trouble grokking a null-reference-free language.

    If you're familiar with SQL, then a simple "MyColumn NOT NULL" definition should explain it. Basically, the value can never be set to a null value. Attempting to do so is an error condition itself.

    In fact, DB design is a pretty good analogy for the concept as databases often are forced to wrestle with this issue.

    Consider for a moment how you would design a database that has absolutely NO null references. Not a one. Zip, zero, nada. Obviously the best way of accomplishing such a database is to denormalize any value that might be null. So if Address2 is optional, you would want to split Address into its own table with a parent key pointing back to the user entry. If the user has an Address2 value, there will be a row. If the user does NOT have an Address2, the row will be missing. In that way, empty result sets take the place of null values.

    In terms of programming languages, there are a variety of ways to map such a concept. Collections are a 1:1 mapping to result sets that can work. If you don't have any values in your collection, then you know that you don't have a value. Very easy. Similarly, you can be sure that none of the values passed to a function or method will ever contain a null value. Cases where you might want to pass some of the values but not all can be handled either by method overloading (e.g. Java) or by allowing a variable number of parameters (e.g. C).

    Some pieces of programming would become slightly more difficult. For example, 'if(hashmap.get("myvalue") != null)' would not be a valid construct. You'd need to perform a check like this: 'if(hashmap.exists("myvalue")'

    Of course, the latter is the "correct" check anyway, so the theory goes that the software will be more robust and reliable.
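    For illustration, here is a minimal Haskell sketch of the same idea (Haskell chosen only because its standard Data.Map has no null; the map and key names are made up): the lookup itself returns a Maybe, so "absent" is a distinct result you are forced to handle, not a null you might forget to test.

      import qualified Data.Map as Map

      config :: Map.Map String Int
      config = Map.fromList [("timeout", 30)]

      -- The absent case is part of the result type, not a null check.
      timeout :: Int
      timeout = case Map.lookup "timeout" config of
                  Just t  -> t     -- the key exists
                  Nothing -> 60    -- must be handled explicitly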

    • Re: (Score:3, Insightful)

      by Anonymous Coward

      doesn't NULL in SQL represent "unknown", which is something entirely different than a NULL reference, which in the context of programming languages is a discrete value?

      • doesn't NULL in SQL represent "unknown", which is something entirely different than a NULL reference

        No. NULL in SQL represents an absence of data. Which is occasionally used to cover for unknown values. However, NULL is a piece of data that says there is an absence of data. Which is incorrect. Absence of data means that it doesn't exist. Therefore, nothing should exist in its place.

        Normalizing the database can create a situation where the NULL is unnecessary. Therefore, the concept is not needed by computer science. The problem is that real-world considerations often override the ivory tower of comp-sci. And one of those considerations was the fact that RDBMSes have traditionally been organized according to a fixed column model. The inflexibility of the model is driven by the on-disk data structures which are optimized for fast access. OODBMSes (which are really fancy RDBMSes with many "pure" relational features that work around the traditional weaknesses of RDBMSes) attempt to solve this issue by introducing concepts like table-less storage, columns that may or may not exist on a per-row basis, and a dynamic typing system that potentially allows for any data type to show up in a particular column. (Note that columns are often handled more as key-value pairs than what we normally think of as columns. This does not undo the theoretical foundation of the Relational model; it only results in a different view of it.)

        • Re: (Score:3, Interesting)

          Ok, I'm far from an expert on SQL, but if NULL doesn't represent "unknown" in SQL, then why does

          select 1 from dual where 1 not in (2,3,NULL);

          return an empty set?

          • Re: (Score:3, Informative)

            by AKAImBatman ( 238306 ) *

            That's a misunderstanding of the spec. NULL has no type, so evaluating NULL = 1 results in an unknown. That does not imply that NULL is an unknown value. I believe this reply [postgresql.org] on the PostgreSQL mailing list explained it best:

            0 <> NULL (Indeed nothing equals NULL, other then sometimes NULL itself)

            0 <> 1

            Therefore, the statement: 0 NOT IN (NULL, 1)
            Should always equate to false.

            Therefore No rows returned. Ever.

            It's a bit weird, but it makes sense when you actually follow the logic.

            • by jadavis ( 473492 ) on Tuesday March 03, 2009 @01:16PM (#27052985)

              It's a bit weird, but it makes sense when you actually follow the logic.

              Not really.

              The expression "0 <> 1" is true, but the poster you referenced also says "0 <> NULL", which is NOT true, it is NULL.

              Additionally, NULL is not always treated as false-like. For instance, if you added the constraint "CHECK (0 NOT IN (NULL, 1))", that would always succeed, as though it was "CHECK(true)".

              And if you think "it makes sense", consider this: ... WHERE x > 0 OR x <= 0
              If x is NULL, that statement will evaluate to NULL, and then be treated as false-like, and the row will not be returned. However, there is no possible value of x such that the statement will be false.

              I'm not a big fan of NULL, but I think the most obvious sign that it's a problem is that so many people think they understand it, when they do not.

              • And if you think "it makes sense", consider this: ... WHERE x > 0 OR x <= 0 If x is NULL, that statement will evaluate to NULL, and then be treated as false-like, and the row will not be returned. However, there is no possible value of x such that the statement will be false.

                If x is NULL, the statement evaluates to false. This isn't "false-like"; NULL is the state of not having a value. Comparing a non-value to /any/ value or range of values is logically false: X is neither LTE 0 nor is it GT 0; a non-value has no relation to the value 0.

                While you can use it to derive a true/false value, NULL is not a (in the RDBMS context) value at all. Would you say in mathematics "empty set" makes no logical sense?

        • Re: (Score:3, Informative)

          by vux984 ( 928602 )

          Normalizing the database can create a situation where the NULL is unnecessary.

          Not really. Suppose I'm going to do a mail out to my customers... so I need a table of addresses:

          select *
          from addresses inner join addressline2s on addresses.pkey = addressline2s.fkey

          And what happens? I'm now missing all the addresses that don't have a line 2. Well that's worthless.
          how about:

          select *
          from addresses left outer join addressline2s on addresses.pkey = addressline2s.fkey

          Yay, all my addresses. And I can cursor through th


      • doesn't NULL in SQL represent "unknown",

        Sorta. From an operational perspective it represents an uninitialized state. If you don't write anything to a particular column, it's null. From a set-theory perspective it represents "nothing".

        which is something entirely different than a NULL reference, which in the context of programming languages is a discrete value?
        No. I'd say that NULL in a programming language is largely the same concept. Doesn't exist, nothing, etc. It's perhaps slightly more broad, si

    • by MattRog ( 527508 ) on Tuesday March 03, 2009 @11:33AM (#27051511)

      "Obviously the best way of accomplishing such a database is to denormalize any value that might be null"

      That's normalizing -- the table in this example is de-normalized.

    • You lost me at "simple". Sorry. I'm afraid I don't grok what a null reference is to begin with, which may be an issue.
    • My problem is that null references are typically used to signal the ends of lists or the place where the tree ends.

      I could see using a variant type for this. Instead of pointing to null, the next to the last list element would point to a value that had the type 'last list element' and no pointer inside it. And there would be four varieties of tree node, leaf, left filled, right filled and both filled.

      Can you think of any better ways than that to handle the lack of a null reference when building data struc

      • Oh, you have a special 'null instance' of any data type. That's just dumb. As someone else pointed out, it's just as easy to forget to check for it as it is to forget to check for null. And then your program ends up in some strange unpredictable behavior instead of generating a nice obvious segmentation fault when the reference is de-referenced.

      • Re: (Score:3, Informative)

        Variant types (or, put more generally, algebraic data types [wikipedia.org]) are indeed a general solution for this problem, that can be reused for countless others.

        The simplest example here is the way you define linked list types in a functional language like Haskell. In pseudo-code (yes, I know this might not be valid Haskell code):

        data List a = EmptyList | Node a (List a)

        This is a data type declaration that says that the type "List of a" is either the singleton EmptyList value, or a 'Node a' value, which contains (
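        For what it's worth, here is a slightly fuller sketch of that declaration in plain Haskell (the function name is made up), showing how pattern matching forces you to handle the empty case -- there is no way to hand this code a "null list":

          data List a = EmptyList | Node a (List a)

          len :: List a -> Int
          len EmptyList     = 0
          len (Node _ rest) = 1 + len rest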

    Please don't try to explain the behavior of an actual language with SQL. It's demeaning.
    • Re: (Score:3, Funny)

      by clone53421 ( 1310749 )

      ...this: 'if(hashmap.exists("myvalue")'

      ...is the "correct" check anyway...

      Well, it'd be "correct" if it had the right number of parentheses, anyway! ;p

  • Null-terminated strings. The bane of modern computing.

    • Re: (Score:3, Informative)

      by RetroGeek ( 206522 )

      A null-terminated String is a misnomer. It is actually an array of chars which uses a special character to signify its upper boundary, so that a second variable is not needed to hold that boundary. Zero was chosen by K&R.

      In some languages, a String is an object, and the object holds the upper boundary, so a terminator flag is not required.

    • by Rik Sweeney ( 471717 ) on Tuesday March 03, 2009 @11:39AM (#27051579) Homepage

      Null-terminated strings. The bane of modern computing.

      Yeah! Let's abolish them, life would be much simplerasdjkaRGfl$!jaekrbFt6634i2u23Q0CCA;DMF ASDJFERR

      • by Anonymous Coward on Tuesday March 03, 2009 @11:48AM (#27051681)

        I agree.ï½ï½ï½ï½ï½ï½ï½cï½ï½A
        5ï½)ï½"ï½ï½ï½lï½3åï½ï½ï½SLï½4ï½54Vï½iï½ï½ï½D.O%N|ï½ï½ï½Tï½2nï½ì'iï½ï½ï½;ï½
                                                          ï½,ï½ï½(85ï½Iï½{ï½ï½ï½ï½)ï½Oï½Æ¼ï½%Cï½iwï½ï½ï½ï½ï½ï½I!,.ï½Õ'ï½ï½ï½ï½!ï½òfsQï½ï½zï½ï½Gï½ï½ï½aï½zï½-@ï½ yï½Ë+ï½ï½ï½Xï½ï½ï½ï½"ï½cï½âï½ï½ï½ï½ï½ï½ï½ï½ï½ï½dï½nbÕoeï½ï½ï½ï½lï½ï½ï½ï½ï½;hmï½ï½

    • Re: (Score:2, Troll)

      Null-terminated strings. The bane of modern computing.

      Maybe I'm feeding a troll, but what else would you terminate it with without using something the string may contain? Keep in mind that null-terminated strings were, err, "invented" around the time ASCII was really the only fully widespread character standard, and something was needed to mark the end of a string for detection by software.

      The mistakes you speak of are made by programmers that don't know how to securely utilize this in certain environments. Mainly in buffers, but recall the lkml thread [kerneltrap.org] abou

    • for Pascal type strings in C. The fact that null-terminated strings existed wasn't the problem, they make some sense in some respects, such as when you want to pass text of arbitrary length. But the real problem, the real bug was not having a standard way of doing real strings in C. Everybody had to do it himself, poorly. Had there been a standard, no matter how poor, it would have been a starting point to do something better if needed, and would have been better anyway for many uses than C strings. It would have avoided MANY vulnerabilities from common software.

      • Re: (Score:3, Interesting)

        by Vanders ( 110092 )
        The problem with Pascal strings is that it's easy for a short-sighted implementer to paint themselves into a corner. It's all very well and good to say "The first two bytes in a string are used to indicate the length of the string", but then what do you do a decade from now when a 16-bit string is laughably small? The benefit of NUL-terminated strings is that their length is only limited by the memory available to you, and yet they are forward and backward compatible by decades.
    • In a low-level language like C or assembly, anyway? The only workable alternative I ever saw was to store the length in (or with) the string, which can be very wasteful of memory.

    • PEDANT ALERT.

      NULL is a special pointer value, which is 0 in source code but may or may not be 0 in object code. The compiler sets it to whatever the ABI defines the special flag pointer to be, and its size is whatever a pointer's size is on your platform.

      NUL is a single byte of 0x00 in both source and object code. In C-style strings, it's a marker that terminates the string.

      Not the same thing.

  • Yeah, but wouldn't the first thing you'd do in the system API design of any non-null language be, the creation of a singleton object instance of the superclass of all objects, named 'null' ?

    Also, apart from 'null' there are loads of parameters that can have illegal ranges and must be checked to be proper.

    Thirdly, a similar rant can be had against non-range checking of enums in C (but then warning against it in switches (WTF?)).

    • wouldn't the first thing you'd do in the system API design of any non-null language be, the creation of a singleton object instance of the superclass of all objects, named 'null' ?

      Umm... no? The first thing done is usually a superclass called "Object". If you don't extend anything else, you extend Object. Depending on the language, the superclass of Object would either be self-referential or the option to obtain a superclass wouldn't exist. (The latter being the "correct" solution. See my next statement for

    • Re: (Score:3, Insightful)

      by Sneftel ( 15416 )

      Actually, if you were defining a "null" value, you'd make it a bottom type, meaning it would be a subtype of all other types. Otherwise you couldn't set an arbitrary reference to point to null, because null would be insufficiently derived.

    • Yeah, but wouldn't the first thing you'd do in the system API design of any non-null language be, the creation of a singleton object instance of the superclass of all objects, named 'null' ?

      No. That doesn't really make sense even in a lot of OO languages, anyway -- if my class Foo extends Object, and my function expects a Foo, then in a strongly-typed language you can't pass me an Object.

      In languages where this would be possible, it would nonetheless be very evil to start with a language that is designed

  • Wouldn't help (Score:5, Insightful)

    by corporate zombie ( 218482 ) on Tuesday March 03, 2009 @11:33AM (#27051503)

    Fine. No null references. So I create the same thing by having a reference to some unique structure (probably named Null) and I still *fail to check for it*.

    Null references don't kill programs. Programmers do.

        -CZ

    • by Tridus ( 79566 )

      When the same mistake is repeated over, and over, and over, and over, and over again for decades, it's only natural to wonder if maybe letting it happen was itself a mistake.

      I mean, if I design a road and one car crashes, it's probably the driver. If there are crashes every day for 15 years? Either every driver is bad, or something is wrong with the road design.

    • Re:Wouldn't help (Score:5, Interesting)

      by nuttycom ( 1016165 ) on Tuesday March 03, 2009 @12:12PM (#27052019)

      If you use a sane class for references that could possibly be null (like Option [scala-lang.org], aka Maybe in Haskell), then your compiler will *force* you to handle the null case.

      This is where null went wrong, at least in statically typed languages: it's a hole in the type system that errors fall through into your program. When coding in Java, I make an explicit point to never return null from a method; if I have a situation where no reasonable return value might exist, I use the Option class from functionaljava.org [functionaljava.org] and thus force the client to handle the possibility of the method not returning sensible data. Since Option obeys the monad laws [blogspot.com], it's easy to chain together multiple things that might fail (with the bind or flatMap operations.)

      • maybe type (Score:5, Informative)

        by j1m+5n0w ( 749199 ) on Tuesday March 03, 2009 @03:03PM (#27054631) Homepage Journal
        Maybe types are wonderful. I first thought they were inconvenient, since you have to pattern match against them any time you want to extract the value, but then I realized that that was something I ought to be doing anyways, and the advantages of never accidentally dereferencing a null pointer vastly outweigh a little extra typing. And then, more recently, I figured out how to use the maybe monad to string together a bunch of things that might fail without having to manually pattern match every time.
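        As a concrete illustration of that chaining, here is a minimal, self-contained Haskell sketch (the functions and values are invented for the example): two steps that may each fail are glued together with bind (>>=), and a Nothing at either step short-circuits the whole computation without an explicit check in between.

          parsePort :: String -> Maybe Int
          parsePort "http"  = Just 80
          parsePort "https" = Just 443
          parsePort _       = Nothing

          checkRange :: Int -> Maybe Int
          checkRange p = if p > 0 && p < 65536 then Just p else Nothing

          -- A Nothing anywhere propagates to the final result.
          port :: String -> Maybe Int
          port s = parsePort s >>= checkRange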
  • Algebraic data types (Score:5, Informative)

    by Sneftel ( 15416 ) on Tuesday March 03, 2009 @11:42AM (#27051603)

    The concept of "no null references" would be very limiting in a language without algebraic datatypes [wikipedia.org]. You can think of null references as a sort of teeny limited braindead algebraic data type, actually. I get the feeling that much of the incredulity here stems from the posters not being familiar with languages that support them. If this describes you, check out Haskell and OCaml! They're the sort of languages that make you a better programmer no matter what language you're using.

    • by Chrisq ( 894406 )

      The concept of "no null references" would be very limiting in a language without algebraic datatypes [wikipedia.org].

      Not necessarily. You could mandate default constructors that would be invoked every time that an unreferenced object occurred, so Strings unless explicitly initialised would refer to "", user types to whatever the default constructor produced, and so on.

  • Pass by reference (Score:4, Informative)

    by hobbit ( 5915 ) on Tuesday March 03, 2009 @12:00PM (#27051851)

    I was raised on C-style programming languages, and have always used null pointers/references, but I am having trouble grokking a null-reference-free language.

    Take a look at C++, in which you can declare method parameters to be "pass by reference" rather than "pass by pointer". Although the former is really just passing a pointer under the hood, the semantics of the construct make it impossible to pass NULL.

  • They should be shot for that one :-) This has led to so many costly buffer-overflow virus attacks. Early languages like FORTRAN and COBOL had safer strings, though not as elegant as C's. You had to pre-declare string storage size in early compilers.
  • It's uninitialized pointers (and, for that matter, other variables) that are the problem. At least in assembly and C/C++. I don't think I ever had cause to use pointers in Perl or Python. Or C#. Null pointers or zero values in other variables are easy to test for anyway. It's the uninitialized variables that bite you in the ass.

    • I don't think I ever had cause to use pointers in Perl or Python. Or C#.

      Umm... what? Every single one of those languages has the concept of a pointer/reference that is virtually inescapable, and every one has a concept of undef/nil/null. Or have you never used a class in Perl (which is just a blessed reference), or a non-value-type in C# (which is stored and passed as a reference to the actual object)?

      Honestly, do you even know what a pointer is, conceptually??

  • You'll just have developers replace it like:

    $foo = NULL;
    getRef( $foo );
    if ( $foo != NULL ) {
    doSomething( $foo );
    }

    with

    $foo = "dummy";
    getRef( $foo );
    if ( $foo != "dummy" ) {
    doSomething( $foo );
    }

    Basically, you can write any null code as non-null code, just like you can hammer a square peg into a round hole. All you'd have is that instead of missed null checks you'd have missed dummy checks, and it'd be even less sane and understandable. Compared to every othe

    • Re: (Score:3, Informative)

      by Laxitive ( 10360 )

      We're not talking about not having null references at all. Nullable references are in fact very useful in many situations, as you point out.

      The problem is that in many languages, it is not possible to describe a non-nullable type. I.e. a type that guarantees that the value it annotates is not null.

      This is useful because the vast majority of actual code doesn't really deal with 'null' references, and in fact will break if 'null' references are passed in. Right now, there are two ways to ensure your code i
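      To illustrate the "non-nullable by default" idea in a language that works that way (a Haskell sketch with made-up function names): an ordinary Int parameter simply cannot be "missing", and where absence is genuinely possible you opt into it with Maybe, which callers are then forced to unwrap.

        -- An Int argument can never be null; the type has no such value.
        double :: Int -> Int
        double x = x * 2

        -- Optionality is explicit, and the Nothing case must be handled
        -- by whoever consumes the result.
        half :: Int -> Maybe Int
        half x = if even x then Just (x `div` 2) else Nothing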

  • by Wargames ( 91725 ) on Tuesday March 03, 2009 @12:12PM (#27052017) Journal

    Zero. The bane of all. It was the gateway math to all modern problems. It would be so much simpler with just countables. Surely the current crisis, measured in trillions, would look so much better without all those zeros.
    Whoever it was who invented zero should take responsibility for all the world's problems, ex nihilo.

    • Re: (Score:3, Informative)

      by AKAImBatman ( 238306 ) *

      Null predates zero in the western world. The Romans had no number for zero, but they did represent the concept of nothing with the word 'nulla'. Thus if I had IIII denarii and spent all IIII, I would have nulla remaining. i.e. "nothing".

      As an aside, the numbering is correct. The subtractive form of IV for four is a more modern construct that was not in common use during the Roman empire.

      If you're still hell-bent on finding who defined zero as a legitimate numerical value, you'd need to look to 9th century I

    • Re: (Score:3, Insightful)

      by jc42 ( 318812 )

      Zero. The bane of all. It was the gateway math to all modern problems. It would be so much simpler with just countables. ... Whoever it was who invented zero should take responsibility for all the world's problems, ex nihilo.

      Heh. I'm glad someone managed to bring up what should be obvious to anyone competent in basic math. While reading the posts here, I kept thinking "Yeah, and you have the same sort of problems if you allow your numbers to include zero." But I figured that the folks making the sil

  • Null as a concept (Score:5, Interesting)

    by JustNiz ( 692889 ) on Tuesday March 03, 2009 @12:22PM (#27052193)

    Stroustrup's "C++ Programming Language" book introduces a concept called "resource acquisition is initialisation" that was eye-opening enough to me that it forever changed the way I think about code, and also seems relevant to your point.

    The basic idea is that an object is always meant to represent something tangible. As an example, consider the design of a file object that abstracts file I/O operations. As a developer, I've come across this one several times: it is normal for such objects to have open and close methods, but that puts the design in contradiction with Stroustrup's concept, because providing open/close as methods, rather than opening only in the constructor and closing only in the destructor, means the object may exist while not being associated with an open file. You basically have to grok that having a file object around that doesn't directly map to an open file just adds overhead to the system and is bad OO design, in the sense that such an object is meaningless.

    Apply the same concept to a reference and you have your answer. If a reference is pointing at nothing, then what is its purpose? The only thing a NULL reference is good for is when the software design ascribes a special meaning to the value NULL. Instead of just meaning address location 0, it gets subverted to mean "variable unassigned" or the "tail node of list" or somesuch. Ascribing multiple meanings to a variable value (especially pointers/references that are only ever meant to hold memory addresses) is one example of bad programming practice known as programming by side-effect which most people agree should be avoided.

    Another point is that in most OO languages, references have the extra benefit of being more strongly typed than pointers, meaning that a reference is guaranteed to only ever be pointing at an instantiated object of its specific type. That guarantee also gets broken when a reference can be NULL.
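    Incidentally, the same "exists only while it's valid" idea shows up outside C++. As a rough Haskell analogy (the function name here is made up; System.IO.withFile is the real library function): withFile hands your code a Handle only inside a scope where the file is guaranteed to be open, so there is never a file object that refers to nothing.

      import System.IO

      -- The Handle exists only inside the callback, and is open there.
      firstLine :: FilePath -> IO String
      firstLine path = withFile path ReadMode hGetLine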

  • The reason it's hard to grok null-reference-free languages is because "a reference to nothing" is a natural concept. For instance, you want to find an object in a list. What's the result when the object you want isn't in the list? A language that can't express that concept leaves the programmer scratching their head.

    The problem I run into is usually two-fold. First, programmers who don't really think about the failure case. They go looking for something, and skip the check for whether they found it. Sometime
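    In a language without null references, that "found nothing" result is expressed in the return type instead. A minimal Haskell sketch (the helper names are made up; Data.List.find is the real library function):

      import Data.List (find)

      firstEven :: [Int] -> Maybe Int
      firstEven = find even

      -- The caller cannot skip the failure case: to get at the Int,
      -- it has to say what happens when nothing was found.
      report :: [Int] -> String
      report xs = case firstEven xs of
                    Just n  -> "found " ++ show n
                    Nothing -> "no even number in the list"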

  • by Animats ( 122034 ) on Tuesday March 03, 2009 @01:29PM (#27053157) Homepage

    A useful way to think about troubles in language design is to ask the question "When do you have to lie to the language?" Most of the major languages have some situations in which you have to lie to the language, and that's usually a cause of bugs.

    The classic example is C's "array = pointer" ambiguity. Consider

    int read(int fd, char* buf, size_t len);

    Think hard about "char* buf". That's not a pointer to a character. It's a pass of an array by reference. The programmer had to lie to the language because the language doesn't have a way to talk about what the programmer needed to say. That should have been

    int read(int fd, byte& buf[len], size_t len);

    Now the interface is correctly defined. The caller is passing an array of known size by reference. Notice also the distinction between "byte" and "char". C and C++ lack a "byte" type, one that indicates binary data with no interpretation attached to it. Python used to be that way too, but the problem was eventually fixed; Python 3K has "unicode", "str" (ASCII text only, 0..127, no "upper code pages"), and "bytes" (uninterpreted binary data). C and C++ are still stuck with a 1970s approach to the problem.

    The problem with NULL is related. Some functions accept NULL pointers, some don't, and many languages don't have syntax for the distinction. C doesn't; C++ has references, but due to backwards compatibility problems with C, they're not well handled. ("this", for example, should have been a reference; Stroustrup admits he botched that one.) C++ supposedly disallows null references (as opposed to null pointers), but doesn't check. C++ ought to raise an exception when a null pointer is converted to a reference.

    SQL does this right. A field may or may not allow NULL, and you have to specify.

    Look for holes like this in language design. Where are you unable to say what you really meant? Those are language design faults and sources of bugs.

  • Too pervasive (Score:3, Informative)

    by shutdown -p now ( 807394 ) on Tuesday March 03, 2009 @03:15PM (#27054791) Journal

    The problem with NULL/null/None as implemented in C++/Java/C#/Python/whatever is that it's pervasive - it always "adds itself" to the list of valid values of any reference type (= pointer type in C++, = any type in Python), in all contexts. At the same time, it isn't truly a valid value, because you can't do with it what you can normally do with any other value of the type. It's actually a lot like signalling NaN [wikipedia.org] for object references, and is an equally bad idea for the same reasons.

    How to handle that? Why, with explicit "nullability markers", and languages which track nullability propagation and require you to check for null everywhere you try to perform an operation that won't work for a null value, whenever you have a value that can potentially be null. In FP languages, this is naturally done with ADTs; for example:

    (* Standard library *)
      type 'a option = None | Some of 'a;;
     
      (* User code *)
      let foo (xo : int option) =
        match xo with
        | Some x -> ...
        | None -> ...

    Note that the OCaml compiler, in the example above, won't let you omit the "None" branch. You have to handle it (well, you can just pass on the "int option" value, but only to another function that is declared as taking one, and not just "int"). Also note how the other branch is guaranteed to get some specific, "non-null" int value for x.

    These enforced checks prevent silent null propagation, which is the bane of Java, C#, and other languages in the same league. All too often some code somewhere gets a null value where it shouldn't, stores it somewhere without checking for null, and then another piece of code down the line extracts that value (which is not supposed to be null!), passes it around to methods (which pass it to more methods, etc), and eventually crashes with a NullReferenceException - good luck trying to track down the original point of error!
