
Old-School Coding Techniques You May Not Miss

Posted by samzenpus
from the good-riddance dept.
CWmike writes "Despite its complexity, the software development process has gotten better over the years. 'Mature' programmers remember manual intervention and hand-tuning. Today's dev tools automatically perform complex functions that once had to be written explicitly. And most developers are glad of it. Yet, young whippersnappers may not even be aware that we old fogies had to do these things manually. Esther Schindler asked several longtime developers for their top old-school programming headaches and added many of her own to boot. Working with punch cards? Hungarian notation?"

  • by belmolis (702863) <billposer@alum.m[ ]edu ['it.' in gap]> on Thursday April 30, 2009 @12:37AM (#27768353) Homepage

    For some reason the article says that only variables beginning with I, J, and K were implicitly integers in Fortran. Actually, it was I through N.

  • by smellotron (1039250) on Thursday April 30, 2009 @12:37AM (#27768355)

    And the language type is rarely interesting; I want to know that the variable outsideTemp holds degrees Fahrenheit, not that it's an integer. But Hungarian doesn't tell me that.

    Good Hungarian notation does exactly that, actually. Check out Apps Hungarian [wikipedia.org], which encodes the semantic type of the data, rather than the language-level data type.

    Of course stupid Hungarian notation is stupid. Stupid anything is stupid. Problem is, most people don't hear about the right approach.
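
    A minimal sketch of the Apps Hungarian idea in C, reusing the outsideTemp example from the quoted comment. The degF and degC prefixes and both function names are illustrative assumptions, not taken from any particular codebase.

```c
#include <assert.h>

/* Apps Hungarian: the prefix names the unit (the semantic type), not the
 * C type. degF = degrees Fahrenheit, degC = degrees Celsius. Both are plain
 * doubles, so the compiler cannot tell them apart; the prefixes are what
 * make a mix-up visible to a human reader. */

/* Hypothetical helper. The name reads "degC from degF", so the prefix of the
 * variable being assigned should match the prefix at the start of the name. */
double degC_from_degF(double degF)
{
    assert(degF >= -459.67);   /* nothing is colder than absolute zero */
    return (degF - 32.0) * 5.0 / 9.0;
}

/* Returns 1 if the outside temperature is below freezing, 0 otherwise. */
int is_freezing(double degF_outside)
{
    double degC_outside = degC_from_degF(degF_outside);  /* prefixes line up */
    /* double degC_wrong = degF_outside;    <- prefixes disagree: looks wrong */
    return degC_outside < 0.0;
}
```

    Calling is_freezing(14.0) converts through the helper and reports freezing; the commented-out line shows the kind of assignment the naming is meant to make stand out.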

  • by AuMatar (183847) on Thursday April 30, 2009 @12:42AM (#27768379)

    First off, most of the things on the list haven't gone away; they've just moved to libraries. It's not that we don't need to understand them, it's just that not everyone needs to implement them (especially the data structures one: having a pre-written one is good, but if you don't understand them thoroughly you're going to have really bad code).

    On top of that, some of their items:

    *Memory management- still needs to be considered in C and C++, which are still top 5 languages. You can't even totally ignore it in Java- you get far better results from the garbage collector if you null out your references properly, which does matter if your app needs to scale.

    I'd even go so far as to say ignoring memory management is not a good thing. When you think about memory management, you end up with better designs. If you see that memory ownership isn't clearcut, it's usually the first sign that your architecture isn't correct. And it really doesn't cause that many errors with decent programmers (if any- memory errors are pretty damn rare even in C code). As for those coders who just don't get it- I really don't want them on my project even if the language doesn't need it. If you can't understand the request/use/release paradigm you aren't fit to program.

    *C style strings

    While I won't argue that it would be a good choice for a language today (heck, even in C, if it weren't for compatibility I'd use a library with a separate pointer and length), it's used in hundreds of thousands of existing C and C++ libraries and programs. The need to understand it isn't going to go away anytime soon. And anyone doing file parsing or network IO needs to understand the idea of terminated data fields.
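
    One way the "separate pointer and length" idea might look in C, together with splitting a delimiter-terminated field as you would when parsing a file or a network message. The strview type and the sv_ function names are hypothetical, not from any existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A counted string: a pointer plus an explicit length, no terminator needed. */
typedef struct {
    const char *data;
    size_t      len;
} strview;

/* Wrap an existing NUL-terminated C string (pays for strlen exactly once). */
strview sv_from_cstr(const char *s)
{
    assert(s != NULL);
    strview v = { s, strlen(s) };
    return v;
}

/* Split off the field that ends at the first `delim` byte. `rest` is
 * advanced past the delimiter; if no delimiter is found, the field is
 * everything that was left and `rest` becomes empty. */
strview sv_next_field(strview *rest, char delim)
{
    const char *hit = memchr(rest->data, delim, rest->len);
    strview field = { rest->data, hit ? (size_t)(hit - rest->data) : rest->len };
    size_t used = hit ? field.len + 1 : field.len;  /* +1 skips the delimiter */
    rest->data += used;
    rest->len  -= used;
    return field;
}
```

    Because the length travels with the pointer, sv_next_field never scans past the end of the buffer looking for a terminator, and fields can be handed out without copying or modifying the input.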

  • The Story of Mel (Score:5, Informative)

    by NixieBunny (859050) on Thursday April 30, 2009 @12:58AM (#27768457) Homepage
    If you're going to talk about old school, you gotta mention Mel [pbm.com].
  • by Animats (122034) on Thursday April 30, 2009 @12:59AM (#27768465) Homepage

    Self-modifying code
    Yup, I actually write asm code.. plus he mentions "modifying the code while it's running".. if you can't do that, you shouldn't be wielding a debugger.

    Code that generates code is occasionally necessary, but code that actually modifies itself locally, to "improve performance", has been obsolete for a decade.

    IA-32 CPUs, including superscalar ones, still support self-modifying code for backwards compatibility. (On most RISC machines, it's disallowed, and code is read-only, to simplify cache operations.) But the performance is awful. Here's what self-modifying code looks like on a modern CPU:

    Execution is going along, with maybe 10-20 instructions pre-fetched and a few operations running concurrently in the integer, floating point, and jump units. Alternate execution paths may be executing simultaneously, until the jump unit decides which path is being taken and cancels the speculative execution. The retirement unit looks at what's coming out of the various execution pipelines and commits the results back to memory, checking for conflicts.

    Then the code stores into an instruction in the neighborhood of execution. The retirement unit detects a memory modification at the same address as a pre-fetched instruction. This triggers an event which looks much like an interrupt and has comparable overhead. The CPU stops loading new instructions. The pipelines are allowed to finish what they're doing, but the results are discarded. The execution units all go idle. The prefetched code registers are cleared. Only then is the store into the code allowed to take place.

    Then the CPU starts up, as if returning from an interrupt. Code is re-fetched. The pipelines refill. The execution units become busy again. Normal execution resumes.

    Self-modifying code hasn't been a win for performance since the Intel 286 (PC-AT era, 1985) or so. It might not have hurt on a 386. Anything later, it's a lose.

  • by roc97007 (608802) on Thursday April 30, 2009 @01:01AM (#27768479) Journal

    "Top-down" coding produced readable but horribly inefficient code. Doesn't do any good for the code to work if it doesn't fit in the e-prom.

    "Bottom up" code produced reasonably efficient spaghetti. Good luck remembering how it worked in 6 months.

    "Inside-out" coding was the way to go.

    You wrote your inside loops first, then the loop around that, then the loop around that. Assuming the problem was small enough that you could hold the whole thing in your head at one time, the "inside-out" technique guaranteed the most efficient code, and was moderately readable.

    At least, that's the way I remember it. 'S been a long time...

    Now, these new-fangled tools do all the optimizing for you. 'S taken all the fun outta coding.

  • by techno-vampire (666512) on Thursday April 30, 2009 @01:17AM (#27768573) Homepage
    Yes, and the reason for that was that I and N were the first two characters of the word integer.
  • Re:Yes, I'm old (Score:3, Informative)

    by QuantumG (50515) * <qg@biodome.org> on Thursday April 30, 2009 @01:41AM (#27768695) Homepage Journal

    However, "making code run faster" is what you do after the code runs, and does what it's supposed to do, and is modular, flexible, and maintainable.

    Yeah, and you say this like you've never experienced it. Honestly, if you're writing new code you're in the vast minority of programmers.. or you're just playing around. Most of us are working on code that was written years ago and has to keep doing what it does or the company will lose money.

    I'm a web developer

    Ahh, I see.

    I'm.. not.

  • by Geirzinho (1068316) on Thursday April 30, 2009 @01:42AM (#27768705)

    Nonsense, it's simply because i through n are commonly used to denote integer variables (sum x_i from 1 to n) in mathematical notation. This is a practice dating back at least to Gauss.

  • by drawfour (791912) on Thursday April 30, 2009 @02:11AM (#27768851)

    Hungarian notation is bad because you are encoding type and scope information into the name, which makes it harder to change things later.

    Actually, this is exactly the reason I think Hungarian is useful. If you change a variable from, say, an unsigned int to a signed int, you had better check every place you use that variable to make sure that you didn't assume something about the type that now requires a different check. For example, underflow/overflow, indexing into an array, etc. By making you do a search/replace to rename the variable, it forces you to go over every place. The person reviewing the code will also see each line that had to change as a result, and can easily check that the assumptions are still valid.

  • by Darinbob (1142669) on Thursday April 30, 2009 @02:32AM (#27768981)

    A lot of RISC machines do support it; it's required, if only to load new exception handler routines, boot loaders, program loaders, etc. Even for debuggers you've got to drop a trap instruction at the breakpoints. Of course, if you never leave user mode then you may never have to worry about it.

    I don't expect the average programmer to have to deal with this, cache flushing, pipeline synchronization, etc. But definitely a lot of embedded programmers and operating system writers need to know it.

    Even in a debugger I've modified assembler code in place. It's a handy trick if your re-link and re-download can take 15-30 minutes.

  • by IntentionalStance (1197099) on Thursday April 30, 2009 @03:29AM (#27769343)
    This is one of my favourite quotes:

    "The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." - Michael A. Jackson

    That being said, when I hit the experts only situation I can usually get 2 orders of magnitude improvement in speed. I just then have to spend the time to document the hell out of it so that the next poor bastard who maintains the code can understand what on earth I've done. Especially given that all too often I am this poor bastard.
  • Re:Some, not all... (Score:2, Informative)

    by Darinbob (1142669) on Thursday April 30, 2009 @03:30AM (#27769349)

    How is that faster than: "int table[MAXINDEX+1];" ?

  • Re:True story (Score:4, Informative)

    by noidentity (188756) on Thursday April 30, 2009 @03:36AM (#27769391)

    Turns out that neither of them had the faintest fucking clue what a hash table is, or for that matter what a linked list is. They looked at its hash bucket and expected nothing deeper than that. And, I'm told, at least one of them had been in a project where they actually coded workarounds (that can't possibly make any difference, too!) for its normal operation.

    All too often, people get the wrong view of a program using the debugger, either because it's not showing what's really there, or they're not interpreting it right. If you think something's wrong based on what you see in the debugger, write a test program first. More often than not, the test program will pass. After all, the compiler's job is to output code which meets the language specification regarding side-effects, not to make things look right in the debugger. In this case, the developer should have written a simple test which inserted two different values that had the same hash code, and verified that he really could only access one of them in the container. He would have found that they were both still there.
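
    A sketch of the kind of test described above, written against a toy chained hash table in C rather than the container from the original story (which isn't shown). Everything here, including the deliberately weak hash_key function, is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NBUCKETS 8

/* One entry in a bucket's chain (a singly linked list). */
struct node {
    int key;
    int value;
    struct node *next;
};

struct table {
    struct node *bucket[NBUCKETS];
};

/* Toy hash: keys that differ by a multiple of NBUCKETS collide on purpose. */
static size_t hash_key(int key)
{
    return (size_t)key % NBUCKETS;
}

/* Insert or update; returns 0 on success, -1 if out of memory. */
int table_put(struct table *t, int key, int value)
{
    size_t h = hash_key(key);
    for (struct node *n = t->bucket[h]; n != NULL; n = n->next) {
        if (n->key == key) { n->value = value; return 0; }
    }
    struct node *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    n->key = key;
    n->value = value;
    n->next = t->bucket[h];   /* push onto the front of the chain */
    t->bucket[h] = n;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
const int *table_get(const struct table *t, int key)
{
    for (const struct node *n = t->bucket[hash_key(key)]; n != NULL; n = n->next)
        if (n->key == key) return &n->value;
    return NULL;
}

/* Free every node and leave the table empty. */
void table_clear(struct table *t)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        while (t->bucket[i] != NULL) {
            struct node *dead = t->bucket[i];
            t->bucket[i] = dead->next;
            free(dead);
        }
    }
}

/* Two keys with the same hash must both remain retrievable.
 * Returns 1 if they do, 0 otherwise. */
int collision_test(void)
{
    struct table t = {{0}};
    assert(hash_key(3) == hash_key(11));   /* both land in the same bucket */
    int ok = table_put(&t, 3, 30) == 0 && table_put(&t, 11, 110) == 0;
    const int *a = table_get(&t, 3);
    const int *b = table_get(&t, 11);
    ok = ok && a != NULL && b != NULL && *a == 30 && *b == 110;
    table_clear(&t);
    return ok;
}
```

    Looking only at the head of the bucket in a debugger would show a single node; the second entry is one next pointer further down the chain, which is exactly what the test confirms.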

  • by Two9A (866100) on Thursday April 30, 2009 @03:42AM (#27769417) Homepage

    I had a similar problem when I was writing an "extended text-mode" (80x25) software driver for the C64, recently. Since each character is encoded into 8 bytes, and there are 256 possible characters, the character definitions span over a wider space than the 8-bit index register can fetch.

    Simple to fix: just self-modify the instructions that handle the font buffer, changing the base pointer as you enter a new page. Since the C64 has a 6510 chip, you'll probably understand the code quite well.

    I wrote an article on the code [imrannazar.com] a few months back, might be an interesting read.
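
    The arithmetic behind the problem, sketched in C rather than 6510 assembly: 256 characters at 8 bytes each is 2048 bytes, or eight 256-byte pages, so a glyph's position needs a page number as well as an 8-bit index. The names below are hypothetical and this is not the code from the linked article.

```c
#include <assert.h>

enum { FONT_GLYPH_BYTES = 8, FONT_PAGE_BYTES = 256 };

/* Byte offset of a glyph's first row within the font buffer (0..2040). */
unsigned glyph_offset(unsigned char code)
{
    return (unsigned)code * FONT_GLYPH_BYTES;
}

/* Which 256-byte page the glyph lives in (0..7). This is the part an
 * 8-bit index register cannot express, hence the patched base pointer. */
unsigned char glyph_page(unsigned char code)
{
    return (unsigned char)(glyph_offset(code) / FONT_PAGE_BYTES);
}

/* Offset within that page; this part does fit in an 8-bit index. */
unsigned char glyph_index(unsigned char code)
{
    return (unsigned char)(glyph_offset(code) % FONT_PAGE_BYTES);
}

/* Address of one row of a glyph, given the font buffer's base address. */
unsigned glyph_row_address(unsigned base, unsigned char code, unsigned row)
{
    assert(row < FONT_GLYPH_BYTES);
    return base + glyph_page(code) * FONT_PAGE_BYTES + glyph_index(code) + row;
}
```

    Characters 0 through 31 sit in page 0, 32 through 63 in page 1, and so on, so the page changes every 32 characters.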

  • Re:Yes, I'm old (Score:3, Informative)

    by smash (1351) on Thursday April 30, 2009 @03:52AM (#27769467) Homepage Journal

    No. You should never write code that runs like a dog in the first place.

    Being initially correct is far more important for nailing down what you are trying to do than being FAST.

    Premature optimisation is a sure fire way to shoot yourself in the foot. Who cares if a function that is called on a rather minimal basis is slow but understandable and easily verifiable as CORRECT?

    Write a prototype, *profile it*, THEN put on your optimisation hat.

    I don't care how fast your code gives me incorrect output.

  • by fractoid (1076465) on Thursday April 30, 2009 @04:05AM (#27769537) Homepage
    This thread requires a reference to The Story of Mel [pbm.com].
  • Re:Some, not all... (Score:3, Informative)

    by Fulcrum of Evil (560260) on Thursday April 30, 2009 @04:16AM (#27769593)
    It isn't faster to use the map, but a map (a,b) is clearer than an array of ints with no particular typing on the contents.
  • by fractoid (1076465) on Thursday April 30, 2009 @04:22AM (#27769631) Homepage

    Even mundane programmers would not dream of using a generic library that includes sorts they'll never refer to in, say, an e-mail client or a game. They'll write their own.

    Erm, why the hell not? Good programmers, even the best programmers (in fact especially the best programmers), will just use qsort() (or the equivalent for the language they're using). Then, IF performance on their lowest-spec target hardware is unacceptable, they will profile their code and find out what's taking the time. And then, IF it's the sorting algorithm that's the bottleneck, only THEN will they implement a more specific version. Anything else is a waste of time and an additional risk of introducing unnecessary bugs.

    Unless we're really pushing the boundaries (and those boundaries are so far away with modern computers that 99% of applications can't even SEE them from their cosy seat in the middle of userland) the stock sorting algorithm your language provides will be plenty fast enough. If you're using a high-level interpreted language, you'll never beat it in efficiency.

    What you're saying may have been true 15, or even 10 years ago, but it's certainly not true now.
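
    For reference, using the stock sort in C takes only a comparator. A minimal sketch; compare_ints and sort_ints are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator for qsort: negative, zero, or positive, like strcmp.
 * (a > b) - (a < b) avoids the overflow that "return a - b" can cause. */
static int compare_ints(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}

/* Sort an int array in ascending order using the standard library. */
void sort_ints(int *items, size_t count)
{
    assert(items != NULL || count == 0);
    if (count > 1)   /* nothing to do for 0 or 1 elements */
        qsort(items, count, sizeof items[0], compare_ints);
}
```

    If profiling later shows the sort itself is the bottleneck, only sort_ints needs to change; callers are unaffected.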

  • by UnknownSoldier (67820) on Thursday April 30, 2009 @04:53AM (#27769811)
    > the most common piece of self modifying code was to implement a 16 bit index read/write/goto instruction in the Apple ]['s (and Atari and C64) 6502 processor.

    Yeah, there were 2 common paradigms...

    a) self modifying code...
    300: A2 00      LDX #$00
    302: BD rr ss   LDA $ssrr,X
    305: 9D tt uu   STA $uutt,X
    308: E8         INX
    309: D0 F7      BNE $302
    30B: EE 04 03   INC $304
    30E: EE 07 03   INC $307

    b) using the Zero-Page
    300:B1 00  LDA ($00),Y
    302:91 00  STA ($00),Y
  • by AuMatar (183847) on Thursday April 30, 2009 @05:20AM (#27769989)

    The reason people say C++ is slower and uses more memory is because it is. Not due to the language itself (except for one case), but due to how people use it and the mistakes they make.

    1)RTTI and exceptions- very slow. If you use them you will be slower than C. Of course most embedded systems avoid them like the plague. (This is the one case where it's a language fault)

    2)Passing objects. It's a frequent mistake to pass the object itself rather than a const object&, causing extra constructors and destructors to be called. Honest but costly mistake.

    3)The object oriented model and memory. In an object oriented model you tend to do a lot more memory copying. In C, if you have an OS function that returns a string (a char*), you'll generally save that pointer somewhere, use it directly a few times, then free it. In C++ you'll take it, insert it into a string object (which will cause a copy), pass that object around (and even by reference that's less efficient than using a char* directly), probably call c_str() on it if you need to pass it back to the OS, then finally let the destructor free it. More time.

    4)The object oriented model and hiding complexity- it can be very easy in an object oriented system to forget the true cost of an operation. Programmers think of x=y as a cheap operation, like it is with ints. With objects, it may be very expensive. Same with other operations that happen "automatically" like string concatenation using +. It can be easy to write code that doesn't look too bad, but really takes thousands of cycles.

    5)Constructors, copy constructors, and operator =- some of these can be called in very unusual places, especially when objects are being passed to and/or returned from a function. Read Scott Meyers for a list of all of them. If you had a function that was passed in two Foo objects, manipulated them, created a new Foo object, set it equal to one of the two passed in, and returned that Foo object, I doubt 1 in 10 programmers would correctly guess all of the times these would be called (and I'm not that 1- it's been way too long since I studied the issue). In C these would be at worst 4 memcpys (two for pass in, 1 for assignment, 1 for return). So C++ object quirks can eat up a lot of time in these situations.

    All that doesn't mean you shouldn't use C++. But due to it you won't get the sheer execution speed you would in C.

  • by DamageLabs (980310) on Thursday April 30, 2009 @07:25AM (#27770675) Homepage

    x = x + y
    y = x - y
    x = x - y

    Much less CPU intensive.
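
    The add/subtract swap above, sketched as a C function. Two caveats are built in as assumptions of this sketch: it uses unsigned values so the intermediate sum wraps instead of overflowing, and it guards against both pointers naming the same object. No claim is made here about whether it beats a temporary on any given compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two values without a temporary, using the add/subtract trick.
 * Unsigned arithmetic wraps modulo 2^N, so the intermediate sum cannot
 * run into undefined behaviour the way signed int overflow would. */
void swap_no_temp(unsigned *x, unsigned *y)
{
    assert(x != NULL && y != NULL);
    if (x == y)       /* same object: the trick would zero it, so do nothing */
        return;
    *x = *x + *y;     /* x now holds the (possibly wrapped) sum */
    *y = *x - *y;     /* sum - old y = old x */
    *x = *x - *y;     /* sum - old x = old y */
}
```

    The aliasing check matters: without it, swap_no_temp(&a, &a) would leave a equal to zero.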

  • b) using the Zero-Page

    I program for a 6502-based platform [pineight.com], and I use that method too. But on the Apple II family, assembly language was often used to write subroutines for programs written in Applesoft BASIC to call, and Applesoft BASIC used about 90 percent of zero page for itself, so it was more difficult to find "holes" in BASIC's usage for you to stick your addresses. But then that was no problem if you're writing pure-assembly programs; in that case, you only had to stay out of the way of the Monitor and DOS. Nor is it a problem on some 6502-based platforms such as the NES, which has no built-in BASIC.

  • Re:Some, not all... (Score:5, Informative)

    by MadKeithV (102058) on Thursday April 30, 2009 @08:13AM (#27770987)

    or that a for loop should be processed with >= 0 where possible (or split into a while loop) to reduce computation time.

    This is an obfuscating micro-optimization with pitfalls (e.g. unsigned is always >= 0) and should not be a general rule. In many cases the compiler will do any optimization here automatically, and in other cases you need to profile first to make sure this is the bottleneck before obfuscating the code.
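
    The unsigned pitfall mentioned above, and one common way around it when you do need to count down, sketched in C. The function name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array from the last element down to the first.
 *
 * The tempting form
 *     for (size_t i = count - 1; i >= 0; i--)
 * never terminates, because an unsigned i is always >= 0 and wraps to
 * SIZE_MAX after reaching zero. Testing before decrementing avoids that. */
long sum_backwards(const int *items, size_t count)
{
    assert(items != NULL || count == 0);
    long total = 0;
    for (size_t i = count; i-- > 0; )   /* i runs count-1, ..., 1, 0 */
        total += items[i];
    return total;
}
```

    The loop also behaves correctly for an empty array, where count - 1 would already have wrapped.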

  • by ClosedSource (238333) on Thursday April 30, 2009 @12:22PM (#27774321)

    Just because you're trying to optimize the performance of your system doesn't make it a real-time one. If it were a real-time system, missing a timing window would result in your data being incorrect, not just tardy.

  • Re:Yes, I'm old (Score:2, Informative)

    by gwjgwj (727408) on Thursday April 30, 2009 @04:41PM (#27778463)
    Every day. How about we throw in some reverse polish notation too.. get a Polka going.
    Actually, Polka is a Czech dance.
  • Re:Some, not all... (Score:3, Informative)

    by Chabo (880571) on Thursday April 30, 2009 @05:38PM (#27779391) Homepage Journal

    In any reasonable OO language, it's simple to let the language compare your new Data Types.

    Using Java as an example, just implement the Comparable interface, then write the compareTo() method to order the elements by whatever metric you want. If you have the elements inside a standard List, then Collections.sort() will sort the elements by using the compareTo() method.
