Eric Raymond Shares 'Code Archaeology' Tips, Urges Bug-Hunts in Ancient Code (itprotoday.com)
Open source guru Eric Raymond warned about the possibility of security bugs in critical code which can now date back more than two decades -- in a talk titled "Rescuing Ancient Code" at last week's SouthEast Linux Fest in North Carolina. In a new interview with ITPro Today, Raymond offered this advice on the increasingly important art of "code archaeology".
"Apply code validators as much as you can," he said. "Static analysis, dynamic analysis, if you're working in Python use Pylint, because every bug you find with those tools is a bug that you're not going to have to bleed through your own eyeballs to find... It's a good thing when you have a legacy code base to occasionally unleash somebody on it with a decent sense of architecture and say, 'Here's some money and some time; refactor it until it's clean.' Looks like a waste of money until you run into major systemic problems later because the code base got too crufty. You want to head that off...."
"Documentation is important," he added, "applying all the validators you can is important, paying attention to architecture, paying attention to what's clean is important, because dirty code attracts defects. Code that's difficult to read, difficult to understand, that's where the bugs are going to come out of apparent nowhere and mug you."
For a final word of advice, Raymond suggested that it might be time to consider moving away from some legacy programming languages as well. "I've been a C programmer for 35 years and have written C++, though I don't like it very much," he said. "One of the things I think is happening right now is the dominance of that pair of languages is coming to an end. It's time to start looking beyond those languages for systems programming. The reason is we've reached a project scale, we've reached a typical volume of code, at which the defect rates from the kind of manual memory management that you have to do in those languages are simply unacceptable anymore... think it's time for working programmers and project managers to start thinking about, how about if we not do this in C and not incur those crazy downstream error rates."
Raymond says he prefers Go for his alternative to C, complaining that Rust has a high entry barrier, partly because "the Rust people have not gotten their act together about a standard library."
Rust: act together about a standard lib, (Score:1)
Re:What an idiot (Score:5, Informative)
Personally I think that C++ contains a lot of the bad parts from C and Java while not really offering any major advantage.
In any case - Valgrind and Splint are great for C programs, but for kernel work it's a bit hard to use Valgrind.
When coding Java I have had great experience using Findbugs [sourceforge.net]. For C# I haven't seen any tool as good as that tool.
As a rule: never ignore compiler warnings; they may be the tip of the iceberg. I have found a lot of nasty bugs and bad code that way.
Also beware of re-using variables. Something I have seen that is very easy to do in VB: a variable is re-used and suddenly contains a new data type. That's really nasty. And some scripting languages allow that as well.
Re: (Score:2)
I used C++ for a long time, starting with Cfront, though the last decade has been C/assembler. C++ seems to have lost focus and the later standards seem somewhat strange like they're adding new features that aren't needed except to pad out the new standard. Now it's not so great for a low level systems language unless you use a lot of self discipline to avoid features, and it's completely bloated for big applications if you use the fashionable styles, and scripting languages do so much better and rapid pr
Re: (Score:2)
C++ seems to have lost focus and the later standards seem somewhat strange like they're adding new features that aren't needed except to pad out the new standard.
Like what? I mean, technically no feature is "needed" in that the language is Turing complete and was pretty functional circa 1998, but I'd say the latest standards have added a fair bit of good. C++14 and 17 have both been fairly minor additions, but have given some nice refinements to the rules and some much needed additions to the library.
A shame concepts
Re: (Score:2)
Personally I think that C++ contains a lot of the bad parts from C and Java while not really offering any major advantage.
I disagree for a few reasons. Firstly, Java? What? C++ predates Java.
I've also not had a memory leak in C++ in probably 15 years. Either I'm the awesomest programmer ever or C++ offers some pretty big advantages. RAII is fantastic for resource management. Generics make equivalent code faster and simpler than the C equivalent.
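For anyone who hasn't seen RAII in practice, here is a minimal sketch of the idea. The `ScopedBuffer` class and the `live_buffers` counter are hypothetical names made up for this illustration, not anything from the parent's code; the point is only that the destructor releases the memory on every exit path, including an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical counter, used only to observe how many buffers are alive.
static int live_buffers = 0;

// Hypothetical RAII wrapper: acquires memory in the constructor and
// releases it in the destructor.
class ScopedBuffer {
public:
    explicit ScopedBuffer(std::size_t n) : data_(new int[n]()), size_(n) {
        ++live_buffers;
    }
    ~ScopedBuffer() {
        delete[] data_;
        --live_buffers;
    }

    // Copying is disabled so two objects never own the same memory.
    ScopedBuffer(const ScopedBuffer&) = delete;
    ScopedBuffer& operator=(const ScopedBuffer&) = delete;

    // Bounds-checked element access.
    int& at(std::size_t i) {
        if (i >= size_) throw std::out_of_range("ScopedBuffer::at");
        return data_[i];
    }

private:
    int* data_;
    std::size_t size_;
};

// Returns the number of live buffers after an early exit via exception.
int live_after_exception() {
    try {
        ScopedBuffer buf(8);
        buf.at(100) = 1;  // throws; the destructor still runs during unwinding
    } catch (const std::out_of_range&) {
        // nothing to clean up by hand
    }
    return live_buffers;
}
```

Calling `live_after_exception()` should leave the counter at zero, because stack unwinding destroys `buf` before the handler runs. In real code you would normally reach for `std::vector` or `std::unique_ptr` rather than writing a wrapper like this.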
Re: (Score:2)
Why in the hell are you using a typeless variable in VB?
Re: (Score:2)
No I don't, but I have seen code where it existed. Facepalm time there.
Unfortunately we can't have Ada everywhere. But for VB, I blame Microsoft for creating a shitty language that didn't force users to be strict from the beginning.
Re: (Score:2)
Not really garbage collection. It's a "good enough" collector. A real garbage collecting language has zero allocation routines and zero freeing routines, and will reclaim every single object eventually, and do this faster than manual alloc/free. Not a lot of common languages do this these days, because it requires deep hooks into the OS and CPU. Whereas modern scripting languages often find it faster to get going by building the framework on top of device independent languages like C or C++.
Re: (Score:2)
Smalltalk, Lisp, ML, etc.
Re: (Score:2)
Being able to write memory safe code in a memory unsafe language is not enough, you need machine validation that only the memory safe constructs are used.
C and C++ programmers given the freedom to do so will inevitably create exploitable bugs because of memory unsafe programming.
Re: (Score:2)
Was that really necessary?
Re: What an idiot (Score:1)
Eric Raymond is the most overrated "hacker" in history. He has no actual technical achievements to speak of; he's famous for promoting open source, sending a death threat to Bruce Perens, and for attempting (and massively failing) to write a new build system for the Linux kernel.
Re: (Score:3, Informative)
When I coded C for MS-DOS I had to make sure that I did malloc/free in the right order just to avoid memory leaks. So if I did one malloc for A then one for B the result was that I had to free B before A or I would have trouble coming.
Re: (Score:2)
C++ also provides easy reference counting and insurance of pointer uniqueness. I agree that one can do the same with very well ordered C, but C++ automates the process for you, and that's one of the things that programming languages are about, automating simple thought processes for the programmer.
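A minimal sketch of the two facilities mentioned here, using the standard `std::shared_ptr` (reference counting) and `std::unique_ptr` (unique ownership). The `Node` type and the two function names are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node { int value; };  // hypothetical payload type

// Reference counting: the count rises and falls as owners come and go.
long shared_owner_count() {
    auto first = std::make_shared<Node>(Node{42});
    long count = 0;
    {
        std::shared_ptr<Node> second = first;  // second owner
        count = first.use_count();             // 2 while both owners exist
    }
    assert(first.use_count() == 1);            // back to a single owner
    return count;
}

// Uniqueness: ownership can be moved but not copied.
bool unique_ownership_moves() {
    auto a = std::make_unique<Node>(Node{7});
    // std::unique_ptr<Node> b = a;            // would not compile: copy is deleted
    std::unique_ptr<Node> b = std::move(a);    // a is now null, b owns the Node
    return a == nullptr && b != nullptr && b->value == 7;
}
```

The compiler rejects the commented-out copy, which is the "insurance" part: in C the same discipline has to be kept by convention and code review.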
Re: (Score:2)
Anyone who considers such fundamental concepts of C++ as useful to "way less than 1% of the code in typical projects" "has not understood the language" either.
Crystal - Slick as Ruby, Fast as C (Score:5, Interesting)
Eric suggests it's time to move on from C, and there are indeed better languages today that can help eliminate many classes of error.
Crystal [crystal-lang.org] is a rising programming language with the slogan "Fast as C, Slick as Ruby". It has some compelling features that make it more attractive than other modern language attempts like Go. You really can program in a Ruby-like language and achieve software that performs with the speed of a compiled language. And you can do systems programming in Crystal, too, because while it doesn't encourage you to use them for anything but systems programming and inter-language interfaces, it has pointers, and it can format structs as required to work on hardware registers.
But the greatest advantage of Crystal, that I have experienced so far, is that it provides type-safety without excessive declarations as you would see in Java. It does this through program-wide type inference. So, if you write a function like this:
def add(a, b)
  a + b
end
add(1, 2) # => 3, and the returned type is Int32
add(1.0, 2) # => 3.0, and the returned type is Float64
You get type-safe duck-typing at compile-time. If a method isn't available in a type, you'll find out at compile-time. Similarly, the type of a variable can be inferred from what you assign to it, and does not have to be declared.
Now, let's say you never want to see nil as a variable value. If you declare the type of a variable, the compiler will complain at compile-time if anything tries to assign another type to it. So, this catches all of those problems you might have in Ruby or Javascript with nil popping up unexpectedly as a value and your code breaking in production because nil doesn't have the methods you expect.
There are union types. So, if you want to see nil, you can declare your variable this way:
a : String | Nil
a : String? # Shorthand for the above.
Crystal handles metaprogramming in several ways. Type inference and duck typing give functions and class methods parameterized types for free, without any declaration overhead. Then there are generics, which allow you to declare a class with parameterized types. And there is an extremely powerful macro system. The macro system gives access to AST nodes in the compiler, type inference, and a very rich set of operators. You can call shell commands at compile-time and incorporate their output into macros. Most of the methods of String are duplicated for macros, so you can do arbitrary textual transformations.
There is an excellent interface to cross-language calls, so you can incorporate C code, etc. There are pointers and structs, so systems programming (like device drivers) is possible. Pointers and cross-language calls are "unsafe" (can cause segmentation faults, buffer overflows, etc.) but most programmers would never go there.
What have I missed so far? Run-time debugging is at a very primitive state. The developers complain that LLVM and LLDB have changed their debugging data format several times recently. There's no const and no frozen objects. The developers correctly point out that const is propagated through all of your code and doesn't often result in code optimization. I actually like it from an error-catching perspective, and to store some constant data in a way that's easily shareable across multiple threads. But Crystal already stores strings and some other data this way. And these are small issues compared to the benefits of the language.
Lucky
Paul Smith of Thoughtbot (a company well-known for their Ruby on Rails expertise) is creating the Lucky web framework [luckyframework.org], written in Crystal and inspired by Rails, which has pervasive type-safety - and without the declaration overhead as in Java.
The point of all of this is that you can create a web application as you might using Ruby on Rails, but you won't have to spend as much time writing tests, because some of the
Re: (Score:2)
I find it very difficult to find really good C programmers who can also have good understanding of low level systems with some domain knowledge. Lots of self taught C programmers though, they can write stuff that's good enough for their own simple tools, but who lack experience working on complex software in a team, mostly they're coming from EE or science backgrounds. People coming from a CS background don't seem to know C, only vaguely know C++ (and can't make the tiny step from there to C), and who com
Re: Crystal - Slick as Ruby, Fast as C (Score:1)
I find C programmers in embedded work who don't understand that they are responsible for everything coming up out of the reset vector. They assume timers are initiated, that interrupt vectors are pre-defined, and that they can just coast along at an application level. That's just the consequence of too much programming on top of an operating system, I guess.
Re: (Score:3)
I find it very difficult to find really good C programmers ...
If you really need C programmers, why not hire a competent programmer and teach him C or simply ask him to learn it over a weekend?
Re: (Score:2)
C takes a huge amount of experience before you stop shooting yourself in the foot and start being productive. Languages like Go are trivial to learn. Rust is less trivial.
No one is expecting C to disappear any time soon.
Re: (Score:2)
You should also check out Nim.
https://nim-lang.org/ [nim-lang.org]
It is the Python version of Crystal. It transpiles to C/C++/Node. Has type inference, integrates Boehm GC, good FFI, meta-programming etc. Similar performance to C (CPU, RAM, static binary size), but with the productivity of Python.
Crystal seems to have more modules - both seem to have the essential libraries covered.
Re: (Score:2)
Like it or leave it, you don't know what you're talking about.
From the official ruby faq [ruby-doc.org]:
"Ruby is a modern object-oriented language, combining elements of Perl, Smalltalk, and Scheme
Influenced by Perl, Matz wanted to use a jewel name for his new language, so he named Ruby after a colleague's birthstone. Later, he realized that Ruby comes right after Perl in several situations. In birthstones, pearl is June, ruby is July. When measuring font size
Re: (Score:2)
lol... do you even know what that means? Are you saying the Ruby interpreter isn't coded in Perl? That's meaningless.
Are you saying that you can't copy/paste code directly from a Perl program into a Ruby program? Again, meaningless. Do you even know what derivative means?
Derivative = not original, derived. derived = received or obtained from a source or origin. Hmmm, where have we heard something like that before? Wait, I
du jour article (Score:2)
Another language du jour article. I remember in the early days of UNIX one of the sayings about C was something like "it expects the programmer to know what they are doing, it is not a hold-your-hand language".
Well I guess we should go back to COBOL or FORTRAN then
What does he actually know? (Score:2, Insightful)
Eric Raymond is not renowned for his programming/engineering achievements, but for being a public speaker. What is the value of his advice?
Re: (Score:2)
A memory leak in Java code might get you a DoS attack; a buffer overflow in C might get you completely owned.
Apples and rotten oranges.
Re: What does he actually know? (Score:1)
He is visionary in the same sense that a guy assigned to give the keynote speech at an IBM Symposium in 1966 was visionary.
Re: (Score:2)
I always find it valuable to evaluate the advice first; then see whom it came from.
Let's look at his advice:
Frankly I think static code analysis is always worth doing. It's a way of doing automated "apply learned lessons from someone else". Yes, there may be false positives - but you'll quickly uncover code smells.
I think this one is also a no-brainer. I used to do Fortran...switched to C/C++
Re: (Score:2)
Of course.
Hell, everything that can be done in Go can be done in assembler, and it's faster (execution).
But it won't be faster to write/debug. (Maybe you like to roll your own unit test and benchmarks, but I don't want to)
unmitigated bullshit (Score:1)
"we've reached a typical volume of code, at which the defect rates from the kind of manual memory management that you have to do in those languages are simply unacceptable anymore"
Translation: "We kept telling programmers they're too dumb to do their own memory management and rely on high level languages like java which handle memory management automatically and now the programmers coming out of college don't even understand memory management because it's too complex so it's time to move onto something like
Re: (Score:2)
It's not a matter of dumb or smart, it's a matter of being human. There are only so many things you can pay attention to at once, and at a certain level of machine capability it's just not worth doing anymore for many use cases. And the choice isn't C++ or silly over-the-top OO; there are a lot of options for memory-safe languages.
Sanitizers (Score:2)
It's odd that he talks about using tools to validate your code on one hand and then recommends moving away from C or C++ on the other.
There's actually some pretty fantastic work on sanitizers being done right now in Clang (and other tooling chains) that can enforce memory and type safety at run time.
You can do all your development with the sanitizers turned on, and then when you want speed when you're ready to release, turn them off.
There's still nothing faster than C or C++ other than assembly, and even then you
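A rough sketch of the workflow described above, assuming Clang or GCC with the `-fsanitize=address,undefined` flags. The `sum_first` helper, the `checked_demo` function and the `example.cpp` file name are hypothetical, made up only to show the kind of unchecked access a sanitizer build is meant to catch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums the first `count` elements without checking bounds.
// With count > v.size() this reads past the end of the vector's heap buffer,
// the kind of mistake AddressSanitizer is designed to report at run time.
int sum_first(const std::vector<int>& v, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += v[i];  // operator[] does no bounds checking
    }
    return total;
}

// Development build: clang++ -g -fsanitize=address,undefined example.cpp
// Release build:     clang++ -O2 example.cpp
int checked_demo() {
    std::vector<int> v{1, 2, 3, 4};
    return sum_first(v, v.size());  // in bounds; sum_first(v, 5) would be the bug
}
```

The call shown stays in bounds, so it behaves the same with or without the sanitizers. A sanitizer only reports bugs on the paths your tests actually execute, which is one reason this approach complements rather than replaces a memory-safe language.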
Re: (Score:2)
I'm not a C++ programmer, but I'm genuinely curious. What idioms would that be?
"Real programmers don't comment their code..." (Score:2)
Documentation is important.....Code that's difficult to read, difficult to understand
"Real programmers don't comment their code. If it was hard to write, it should be hard to understand." - some coder who doesn't work here anymore.
Re:remember what a fucking verb is (Score:2)
We should be thankful he didn't suicide anyone else.
Re:remember (Score:5, Informative)
Ian was and continues to be very admired for his achievements, and his death was unnecessary and completely undignified, and is a continuing source of disquiet for me personally. Ian is a victim of mental illness. This is acknowledged by his family and by those who knew him more closely, rather than simply admiring him from afar. Rather than dishonor Ian by discussing this in detail, I would prefer to simply state that he was a victim of mental illness, not the police.
Re: remember (Score:1)
Do you want us to say the same things about you when the police decide it's your turn to pay the piper?
When the police pump you with 30 holes, we will say "it wasn't the police officers' fault for shooting Bruce 30 times, it was mental illness."
You are a very smart man, but sometimes you make me sick. It's like you only stand up for things that won't get you in trouble. You tiptoe around, when you should be bringing these issues to light. You use your fame to hide issues, choosing to instead derail conversa
Re: (Score:2, Insightful)
They could have protected and served them by whacking him across the arse with a truncheon, handcuffing him and dumping him in a cell to sleep it off.
Like they do in civilised countries.
Drunks are shit fighters, even if they think the opposite. If you can't subdue one without artillery you shouldn't be working as a cop.
Re: (Score:1)
Who are you talking about? Ian Murdock was not shot!
Re: (Score:2)
They serve themselves and protect each other. Was it ever supposed to mean anything else?
Re: (Score:3)
It is heart-breaking that he died without a friend left in the world, but that was a consequence of his illness.
I am 60 and my death isn't all that far away any longer. I am fortunate to have friends and a wonderful family, and hope to die in peace, with them around me.
Re: remember (Score:1)
One of the consequences of a bridge falling down can be debris in the river that blocks barge traffic. That does not imply that the bridge voluntarily fell down.