Secure, Efficient and Easy C programming
cras writes "Feeling a bit of a masochist today.. First thing in the morning I wrote Secure, Efficient and Easy C Programming Mini-HOWTO. And since I already spent a few hours with it, I figured I might just as well see what Slashdot people would think about it."
Secure, Efficient and Easy (Score:5, Funny)
As a lib and/or in general programming tutorials (Score:2, Insightful)
The other problem is that security issues usually aren't mentioned in general programming tutorials (and books).
If beginners were pointed to techniques like this (with explanations why), lots of typical mistakes would not happen.
--
Stefan
Looking for Developers, new project members, testers or help? Want to provide your abilities ?
DevCounter ( http://devcounter.berlios.de/ [berlios.de] )
An open, free & independent developer pool.
future plans? (Score:5, Funny)
Damn. What are your plans for the rest of the day?
Re:future plans? (Score:5, Funny)
Maybe make it a HOWTO rather than a Mini-HOWTO? Hell, I could write a mini-HOWTO right here...
SECURE
1) Don't use strcpy.
2) Don't assume data coming in from the world is within valid limits
EFFICIENT
1) Avoid moving/copying large amounts of data whenever possible. Work in place.
EASY
1) Don't redefine the language using macros (e.g., #define BEGIN {, #define END })
2) Comment your source
3) Use The One True Brace Style. All others are heretical crap.
Damn, now what do I do with the rest of my day?
Re:future plans? (Score:3, Funny)
Use The One True Brace Style. All others are heretical crap.
As long as that's my brace style (K&R), you are correct.
Re:future plans? (Score:3, Informative)
Re:future plans? (Score:2)
Re:future plans? (Score:2)
Don't use strncpy with a third argument of strlen(sourcevar) either.
Seriously
Re:future plans? (Score:4, Funny)
Damn. What are your plans for the rest of the day?
"If you've done six impossible things this morning, why not round it off with breakfast at Milliways, the Restaurant at the End of the Universe?"
-- Douglas Adams
Hmm.. Question (Score:4, Funny)
So did you wake up early this morning, or are you still up from the night before, like me?
Re:Hmm.. Question (Score:2, Funny)
a little short?? (Score:4, Interesting)
It does look like a good start, add a few more chapters and you will be halfway there...
Re:a little short?? (Score:5, Informative)
Sorry, but I think this is about all I have to say. Secure Programming HOWTO [dwheeler.com] should take care of the rest.
Re:a little short?? (Score:2)
BTW the "s" in Sunday should be capitalized (the part where you mentioned "Sunday morning").
Sorry about that (former English major)
Re:a little short?? (Score:2, Funny)
Re:a little short?? (Score:2)
Damn true, using C for anything other than low-level stuff really is a bad habit.
Re:a little short?? (Score:2)
Re:a little short?? (Score:2)
Re:a little short?? (Score:3, Funny)
Damn true, using C for anything other than low-level stuff really is a bad habit.
Oh, God, another Visual Basic user who writes code with a mouse. Spare me.
Re:a little short?? (Score:5, Funny)
Oh, God, another Visual Basic user who writes code with a mouse. Spare me.
Yes, because it's better to spend weeks and months carefully constructing a GUI by hand than to put it together in a couple days with a mouse. Especially if it's going to be used by three or four people; by God, it's more than worth it to the company for me to spend two or three months on the project (@ $60,000 a year) so those people can get their results back in a couple seconds rather than a couple minutes.
It's also better to spend weeks and months writing an efficient text processing program in C and worrying about buffer overflows and memory leaks, rather than writing it in a couple days in Perl or Snobol. Who cares that the results will inevitably be piped to less and studied for a few minutes; the fact that we shaved off 40% of 2 seconds (and added an obscure error case) is more than worth it!
Actually: Oh, God, another C programmer that will make me suffer through anonymous core dumps because his programming language is so much more macho, and so much more efficient (really wish he understood how to use Big-O notation and switch algorithms, but he spent so much time programming this one and debugging it that he can't afford to switch. Too bad he doesn't use a language with efficient control structures predebugged and optimized.)
Re:a little short?? (Score:3, Funny)
But my infinite loops run *so* fast!
And what would I do without my precious core dumps?
I can't trust the computer to manage his memory!
Who cares about algorithms if the language is fast?
I could not live without '\0' delimited strings!
Strong typing sucks. Dynamic typing sucks. I like my types to have no purpose other than sizing the fields in memory.
Error control and safety are for wimps.
Macros should be dumb text substitution tools.
Re:a little short?? (Score:4, Interesting)
Mirror of HOW-TO in case it gets slashdotted (Score:5, Funny)
Not to belabor the point (Score:3, Informative)
The advantages of Python for almost every other operation are really too numerous to list.
Your point about "right tool for the right job" is well taken. _Good_ Python programmers learn the C extension API, and use it when appropriate. Guido van Rossum, the creator of Python, even states in one of his papers "If you feel the need for speed, go for built-in functions - you can't beat a loop written in C."
Voluntary slashdotting (Score:3, Funny)
Re:Voluntary slashdotting (Score:2)
I think he already planned for it. Seems to be holding up.
Re:Voluntary slashdotting (Score:2, Interesting)
[cras@foo] ~$ ps ax|grep apache|wc -l
60
[cras@foo] ~$ uptime
20:32:54 up 127 days, 10:58, 56 users, load average: 0.23, 0.41, 0.37
Those loads were pretty much the same before slashdotting.
Re:Voluntary slashdotting (Score:3, Funny)
Re:Voluntary slashdotting (Score:2)
Unportable? (Score:5, Interesting)
On a more serious note, why in Bob's name don't these two functions exist, standard, in Linux? IMO, they should be added, and gcc should give deprecation warnings about the use of non-safe buffer handling functions - sprintf, strcat, strcpy, etc. No offense to purists, but screw the standard. I'll sacrifice some portability of software and such for security.
Oh, and on a side note, you may take my malloc() when you pry it from my cold dead fingers.
Re:Unportable? (Score:2)
Re:Unportable? (Score:3, Informative)
char buf1[4];
char buf2[4] = "Hi!";
strncpy(buf1, "Bonk", 4);  /* fills all 4 bytes, so no '\0' is written */
printf("%s", buf1);        /* keeps reading past the end of buf1 */
You will usually get either "BonkOk?" or "BonkHi!"
With strlcpy you will get "Bon"
strlcpy will always put the null in the last position; strncpy won't if the last spot has a value in it.
That can result in bad code later with:
char other_buf[4];
strcpy(other_buf,buf1);
Or even:
strncpy(other_buf,buf1,strlen(buf1));
strncpy isn't much safer than strcpy, and gets and scanf are right out for the same reasons.
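To make the difference concrete, here is a minimal sketch of a copy routine that always terminates its output. The name bounded_copy is made up for this illustration; it is modeled on the behavior described above for strlcpy, but it is not the OpenBSD implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration: copies at most size-1 bytes of src
 * into dst, always NUL-terminates dst when size > 0, and returns strlen(src)
 * so the caller can detect truncation (return value >= size). */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

With a 4-byte buffer, copying "Bonk" through this helper leaves "Bon" in the buffer and returns 4, which tells the caller that one character was dropped.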
Definitely useful (Score:5, Interesting)
Do it in a higher-level language first. Make sure your algorithms are clean and efficient. If and only if you see a performance or resource problem do you rework portions(!!!) in C. As a bonus, the higher level language acts as a code template for faster C development.
Once you are at that point, this Mini-HOWTO will definitely be a great resource to use.
Re:Definitely useful (Score:5, Insightful)
Prototyping in a higher-level language (c# is easy, java everyone knows) is a superb idea, provided you
- can release the final product as interpreted, with slow execution speed
- can afford the time to port all to C, in which case DO, this is an excellent way to make a watertight C program
- are happy to learn how to make managed code/vm code call to native and vice-versa (this is far from a trivial problem)
There are apps that fit into all 3 categories, and if your end-result should be a watertight C program, it may even be faster to prototype.
Fight the conventional wisdom! Make good code by doing it right, not by being a genius who can hold 4000 variables in his mind over a month-long project (because you aren't one anyway).
Re:Definitely useful (Score:2)
As a RL example, a project I worked on used TCL as the glue for C TUXEDO services and Oracle. So to address your points above WRT this project:
1) There were no performance issues in the delivered product; if a part of the code is underperforming in the glue language, you rewrite it in C and bind to it instead.
2) We did not need to port all the code to C; we identified the small portions of code that benefit greatly from a port and ported only them. The time saved here is enormous compared to implementing the entire project in C.
3) Calling native code from TCL was trivial; if it's "far from trivial" for you, perhaps you chose the wrong language for the glue. A popular language for doing this today is Python, for which it is also straightforward to create bindings.
Re:Definitely useful (Score:2)
Yes, glue languages are designed to interoperate easily, but I defy you to create a full-blown app prototype in TCL.
Go on, let's see it.
Re:Definitely useful (Score:3, Funny)
I'm tired of this constant discrimination against the citrus fruits. One of these days the people will get up and say "I'm tired of people thinking that oranges aren't good enough for comparison." They'll say "I can compare apples and oranges". They'll run to the windows and say "This orange is much yellower than this apple". People will be running through the streets screaming "This apple is much more smooth than this orange."
And then my group (The People for the Ethical Treatment of Citrus (PETC)) will be happy.
Re:Definitely useful (Score:2)
If you are dealing with highly dynamic data, lots of strings, etc., start in Python. Then locate bottlenecks and convert those sections to C.
If you are dealing with a large application that you need to control precisely, start with Ada. You probably won't need C at all, but be prepared for a lot of work. If you need to call libraries written in C, you can do so relatively directly.
If you are dealing with a relatively complex application that needs garbage collection, Eiffel is a good choice. Links to C are easy, once you master them, but links back are... dubious.
If you are dealing with something simple, or if everyone else is using C, then your best choice may be to just use C. (Eiffel can generate C code, but don't expect to be able to read it.)
I don't like C for anything that isn't REALLY simple. But it's a lot better than the language I've ended up using most of the time recently (Visual Basic). (Currently I think I may have managed an escape from MSAccess into Java. [Wouldn't you know it! I need to learn a new language to escape from one of the worst environments on the planet. But Java is the only choice that mgmt. will accept.])
Re:Definitely useful (Score:2)
However, as for GC etc using Eiffel - you say links to C are easy once you master them - same with every other language, including c# and java.
Especially if you're trying to get away from VB, why not use
Re:Definitely useful (Score:5, Interesting)
Do it in a higher-level language first. Make sure your algorithms are clean and efficient. If and only if you see a performance or resource problem do you rework portions(!!!) in C. As a bonus, the higher level language acts as a code template for faster C development.
Amen.
Kragg wrote (in his reply to ttfkam):
Prototyping in a higher-level language (c# is easy, java everyone knows) is a superb idea, provided you
- can release the final product as interpreted, with slow execution speed
Most programs spend 90% of their CPU time executing 10% of their code. If that 10% is optimized in a low-level language such as C, a large-scale interpreted program can boast performance that's virtually indistinguishable from an equivalent program written entirely in a low-level language. However, there's likely to be a huge difference in programmer productivity.
As a reference, see this Dr. Dobbs article [ercb.com], which states:
"""
"""
- can afford the time to port all to C, in which case DO, this is an excellent way to make a watertight C program
Why port 90% of the application's code to a low-level and less productive programming language, when that 90% will inevitably evolve and require maintenance as the program is utilized in unforeseen ways? I've never written a large program that didn't end up having features added incrementally over a long period of time after the initial release.
- are happy to learn how to make managed code/vm code call to native and vice-versa (this is far from a trivial problem)
If it's "far from a trivial problem", you're using the wrong tool.
Take Python, for example: it's simple to interface between Python and C using Python's C API [python.org]. Recently, a tool named Pyrex [canterbury.ac.nz] has appeared that makes it almost trivial. Pyrex is amazing.
Kragg suggested prototyping in C# or Java, but Python surpasses both of those as a prototyping tool. Python is higher-level than C# or Java (and thus better suited to prototyping and/or malleable fusion with C) because it features:
- dynamic typing ("dynamic", not "weak" like Perl)
- no obsession with a particular programming paradigm; use procedural, functional, or OO as appropriate
- high-level data structures built into the language
- more convenient dynamic code loading
- interactive development at a "Python prompt" (the value of this cannot be overestimated)
- no separate compilation step in the edit-test-debug cycle
- more concise syntax
- excellent interface capabilities to C (or C++ via Boost.Python [boost.org], or Java via Jython [jython.org])
I suggest that the fusion of a truly high-level (higher than Java-level) language with C is far more broadly applicable than Kragg claims.
I send you this post to have your advice (Score:5, Informative)
ps. As modern coding is more about the manipulation of very complex structures than about how to, say, walk a linked list, a higher level language, with native support for more complex constructs, has the potential for creating much faster applications than something on the level of C. The reason is that the h/l compiler can reason about, and thus optimize over, larger components than the C compiler.
Wha??? (Score:5, Funny)
Wait a second. It's not April 1st is it?
Lacking some focus (Score:5, Interesting)
I skimmed over it, and it is a little rough. My main impression is that I'm not sure what I'm supposed to get out of it (wait! keep reading!).
Who's this meant for? Beginning C coders? Intermediate C coders? Where do C++ junkies fit in? My university taught me C++. I really don't like plain C (feels like coding with some fingers cut off), but I know I need to know it. I was wondering if this would point out some things about C that I ought to know. While I'm sure that it might, I think I might need a beginner's guide first, as a lot of the classic C commands aren't second-nature to me.
Also, you have a bunch of "The old way" sections. What are these sections telling us? Is "the old way" bad? Do they have their uses? You don't really say. I'm not sure what I'm supposed to get from them.
It's pretty ambitious for a Sunday morning, and it's pretty good. But you should probably clarify who your target audience is. Also, look back at each passage; what is the message? Does it put that message across effectively?
-Grant
Re:Lacking some focus (Score:2, Insightful)
It isn't really aimed at C++ programmers; they seem to have better ways to handle things. It's mostly about alternative (IMHO much better) ways to handle the traditional C buffer overflows. I'm assuming the reader already knows C quite well.
They're just the old ways. Not necessarily good or bad. Possibly good sometimes, possibly not. I'm just mentioning another way which I find better in most cases.
My poor, poor self-esteem.... (Score:5, Funny)
From the mini-howto:
From the post:
Whoa, man.
data stacks (Score:5, Interesting)
The way it works is simply letting the programmer define the stack frames. All memory allocated within the frame are freed at once when the frame ends. This works best with programs running in some event loop so you don't have to worry about the stack frames too much. Here's an example program:
That sounds a little like the NSAutoReleasePool in Cocoa/OpenStep. Objects use reference counting, when the count reaches 0, they deallocate themselves. When an object is created, it can get added to the most recent pool. When the pool is deleted, it decrements the reference count of all the objects within it, causing deallocation unless it needs to be kept around longer.
Re:data stacks (NSAutoReleasePool vs. NSZone) (Score:5, Insightful)
Well, not quite. An NSAutoReleasePool does not allocate a large region of memory and suballocate objects out of that. What an NSAutoReleasePool does is make it possible to avoid explicitly sending the release message for temporary objects.
For example, from Foo() I allocate an NSObject with [[NSObject alloc] init] and pass that as an argument to Bar() which takes ownership of it. However, I must then ensure that I release the object because Bar() is following good coding practices and retains it, so thus with alloc+retain its reference count is now 2. So instead what I do is Bar([[[NSObject alloc] init] autorelease]) which allocates NSObject (with ref count one), initializes it, marks it for autorelease, and passes it to Bar() which retains it (ref count 2) and keeps a pointer to it (presumably it is a method of a class). Coming out of Bar() the ref count is now 2, and perhaps Foo() proceeds to do some other things. Presumably at some point higher up the call stack (or perhaps at the beginning of Foo()) an NSAutoReleasePool was allocated. At the corresponding exit point (either at the end of Foo() or the end of whatever higher up function) [whateverpool release] will be called. When the pool is released, it will call release on any objects it has been asked to take ownership of. At this point one of two things is true. Either the class that Bar() belongs to has already released the object and thus its reference count went back down to one, and now is going to zero (so bye-bye), or the class that Bar() belongs to has not released the object and doing this release merely brings the refcount back to one such that when the other owner releases the object, its refcount will be zero and it will be freed.
Sorry if that was confusing, but in reality it's really not. It also really helps out when you are coding functions that allocate ObjectA, then allocate ObjectB, then ObjectC, and then find out something is wrong and need to "roll back" to the beginning. If you allocate an NSAutoReleasePool at the beginning, and autorelease everything you alloc, then if you error out you can free the release pool and everything gets released. If you don't, you can simply retain what you need and then free the autorelease pool.
Anyway.. what this guy is REALLY talking about is NSZone. NSZone allocates a chunk of memory which other objects will be allocated from. The caveat being that while the memory will be freed, the objects will not be properly destroyed. Now this guy was talking about holding C strings and the like, so this is not a problem. However, had he been holding some C++ or objective-C objects this would be a problem as none of the destructors/deallocators would ever be called.
I think what it all boils down to is that programmers need to read more code than they write and that we should really be getting Masters of Fine Arts in Programming [slashdot.org]. I completely agreed with what Dr. Gabriel said. Programming is about as much like building a bridge as writing poetry is. That is to say.. not much.
Going along with that thought, I think it should be pointed out that /EVERYONE/ here who programs in any language (but specifically C programmers, and ESPECIALLY C++ coders) needs to learn Cocoa and Objective-C. I imagine some of the C++ whiny bitches are going to continue to whine about how much easier and better C++ is, but for those of us who actually prefer to wrangle pointers, Objective-C is where it's at. It's like C with JUST enough object orientation, but not overdone in some committee like C++. Also, one should note that I do like C++ quite a bit, but sometimes there's too many provided ways to do things. With Objective-C, the provided ways are almost all good. In addition, like C or C++ you are not limited to doing it that way, it's just that Objective-C only makes it easy to do good things.
Think for example of wxWindows [wxwindows.org] vs. Microsoft MFC. wxWindows is surprisingly similar to Cocoa (although wxWindows does not do ref counting so making sure that one and only one class ever owns an object can be problematic at times). MFC, on the other hand, is rather a bear to work with as Microsoft has written it such that an MFC programmer /can/ do things multiple ways, none of which work very well. Obviously this is a generalization, but I think the average MFC programmer will understand where I'm coming from here. That is, again, except for the whiny C++ and MFC bitches who can't figure pointers out. Go home!
reference counting (Score:4, Informative)
If two (or more) objects have a reference to one another, the count can never reach zero even if nothing in the main logic points to those objects anymore.
Also, every time an object gains or loses a reference, a check for a count of zero is made. In fuller garbage collection setups, periodic checks are made to all of the objects in a low-priority thread. In some cases, memory usage can be higher, but performance is also higher sometimes and it can handle circular references.
Both are better than repeated use of malloc/free and new/delete though.
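The circular reference problem can be shown with a small sketch in plain C. Everything here (struct node, node_new, node_set_peer, node_release, node_live_count) is hypothetical and written only for this illustration; it is not how Cocoa or any particular library implements counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted node; all names are illustrative. */
struct node {
    int refcount;
    struct node *peer;  /* counted reference to another node, or NULL */
};

static int live_nodes;  /* nodes allocated and not yet freed */

void node_release(struct node *n);

struct node *node_new(void)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->refcount = 1;  /* the caller holds the first reference */
    n->peer = NULL;
    live_nodes++;
    return n;
}

/* Make n hold a counted reference to peer (or to nothing if peer is NULL). */
void node_set_peer(struct node *n, struct node *peer)
{
    struct node *old = n->peer;
    if (peer != NULL)
        peer->refcount++;
    n->peer = peer;
    if (old != NULL)
        node_release(old);
}

/* Drop one reference; free the node when the count reaches zero. */
void node_release(struct node *n)
{
    assert(n->refcount > 0);
    if (--n->refcount == 0) {
        if (n->peer != NULL)
            node_release(n->peer);
        free(n);
        live_nodes--;
    }
}

int node_live_count(void)
{
    return live_nodes;
}
```

If a points to b and both outside references are dropped, both nodes are freed. If a and b point to each other, dropping the outside references leaves each count at one, so neither is ever freed unless something breaks the cycle first.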
--
C also muddies this concept because there are no objects in C.
Re:data stacks (Score:2, Interesting)
Cyclone has region-based memory protection, which means that you can't do stuff like return pointers to local variables etc. because it statically checks the pointer lifetime using region tags that are part of the pointer type. Example: you can have a pointer-to-memory-belonging-to-region-foo, where foo is a function or some other static scope, (written sometype * `foo, although the default region tag is usually correct, then it's just sometype *) which can point to heap memory because that's garbage-collected and guaranteed to live at least as long as function foo's memory, but you can't have a pointer-to-memory-belonging-to-region-Heap pointing to a variable on the local stack: if you have a local variable x and take the address, the type of that is pointer-to-memory-in-region-foo, and that type is not allowed to be cast to pointer-to-memory-in-region-Heap because foo's memory doesn't necessarily live as long as Heap does.
They combined this region-based mechanism with dynamic "stack frames": You can open a dynamic region to open up a new "stack frame" or separate heap of memory bound to a scope in the program, so when an exception is thrown or when you exit the scope the memory is automatically deallocated. The good thing is, you can't go wrong, because the region-based pointer lifetime checking will prohibit you from casting a pointer into that specific region to a pointer into a region with a longer lifetime, so you will never have dangling pointers into such a dynamic region: you will get a compile-time error when you attempt to do this.
It's a Sunday morning; So don't criticize. (Score:4, Funny)
I'm going to start putting that at the end of everything I write so that people can't criticize anything I do. As a matter of fact... I think I'll only write on Sunday mornings after not sleeping the night before. It seems like it's always Sunday morning anyways.
Maybe being in the business world (Score:3, Interesting)
Perhaps working until 4 in the morning on C code has drained my ability to understand.
Re:Maybe being in the business world (Score:2, Informative)
masochist (Score:2, Funny)
You must be to ask slashdot's opinion of your toils!
Is this news? (Score:5, Insightful)
stack allocation?? (Score:5, Informative)
it starts off with denouncing GC as old-fashioned, and then proceeds to tout stack-based allocation, which has been available for ages as the alloca() function (which also has portability problems.)
imho, you should use the Boehm Garbage collector [hp.com], unless you have code that must be guaranteed to be free of space leaks.
Re:stack allocation?? (Score:2)
No offence intended, but you definitely need to have some knowledge of conservative GC before you write a C memory management HOWTO.
Re:stack allocation?? (Score:2)
To me it looks very much like I'm saying that the GC would be the best way to manage memory, except if it wasn't so crappy to use with C. Boehm GC can't do much with C.
BGC recollects non-live objects in memory. It does so very efficiently: its efficiency is comparable with malloc(). To top that, you can add finalization procedures to objects (more or less analogous to C++ destructors) that are called when an object is cleaned up.
Reclaiming dead objects is what GC is for. If you claim BGC can't do much with C, then you have a different definition of GC than I have. Please be more specific: where is BGC lacking, eg. as opposed to stack based memory allocation?
Re:stack allocation?? (Score:2)
Not good enough? That is a blanket statement. Unless you have specific needs, and have actually measured that heap fragmentation is a significant cost, your home-brew non-thread-safe memory allocation mechanism will cost you more, since it restricts your programming style, and leaves you responsible for maintaining the memory allocation code.
But that's not happening with C since moving data around has too many side effects to handle.
Again, a blanket statement. There actually exists a conservative generational copying GC (Bartlett's mostly copying GC [cornell.edu]) algorithm that could be used for GC-ing C data. You would have to tell the GC the layout of your data structures. That is quite restrictive, but then again, the stack operations that you introduce are also restrictive, and only work for specific allocation patterns.
Anyway, if you have repeated stack-based allocation patterns that must really be efficient, I would suggest moving the allocation outside the loop, so it only has to be done once.
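A small sketch of that suggestion, using a made-up task (counting palindromes) purely as an example: the scratch buffer is sized once for the longest input and reused on every iteration, instead of being allocated and freed inside the loop. The function name count_palindromes is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts the palindromes among `count` strings. One scratch buffer is
 * allocated before the loop and reused for every item. */
size_t count_palindromes(const char *const *items, size_t count)
{
    size_t max_len = 0, hits = 0, i, j;
    char *scratch;

    assert(items != NULL || count == 0);
    for (i = 0; i < count; i++) {
        size_t len = strlen(items[i]);
        if (len > max_len)
            max_len = len;
    }

    scratch = malloc(max_len + 1);  /* the only allocation */
    if (scratch == NULL)
        abort();

    for (i = 0; i < count; i++) {
        size_t len = strlen(items[i]);
        for (j = 0; j < len; j++)
            scratch[j] = items[i][len - 1 - j];  /* reversed copy */
        scratch[len] = '\0';
        if (strcmp(scratch, items[i]) == 0)
            hits++;
    }

    free(scratch);
    return hits;
}
```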
Re:stack allocation?? (Score:2, Funny)
I don't care about speed that much. Portability, however, is something I really do care about, and I'd hate to depend on a GC implementation being available for whatever platform I intend to use.
Telling GC about all your structures sounds much more difficult and error-prone to me than my simple data stack.
Re:stack allocation?? (Score:3, Funny)
I find it somewhat ironic that you make claims like that, and that you write a memory management HOWTO at the same time.
But eh, keep hacking at that square wheel.
You Forgot: (Score:5, Funny)
"Secure, Efficient and Easy C programming in 24hrs"
strncat/strncpy are *NOT* intuitive (Score:4, Informative)
To both zero-terminate and check for truncation is arcane, that's why the OpenBSD ppl made strlcat [openbsd.org] and strlcpy [openbsd.org] in the first place.
There are already other secure programming faqs, though AFAIR, they suck too. If I were you, I'd put a HUGE disclaimer to take this page as work-in-progress.
(before flaming, write down the correct code to check for truncation for both funcs)
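For reference, one way to write those checks is sketched below. The wrapper names copy_checked and append_checked are invented for this example; only strncpy, strncat and strlen are standard. Each wrapper terminates the destination and reports whether the source was cut short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Copies src into dst (capacity dst_size bytes) using strncpy.
 * dst is always terminated; returns true if src did not fit. */
bool copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';  /* strncpy does not do this when src is long */
    return strlen(src) >= dst_size;
}

/* Appends src to the string already in dst using strncat.
 * strncat's third argument is the room left, not the buffer size.
 * Returns true if src did not fit completely. */
bool append_checked(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t room;

    assert(used < dst_size);
    room = dst_size - used - 1;  /* keep one byte for the terminator */
    strncat(dst, src, room);
    return strlen(src) > room;
}
```

The amount of bookkeeping needed just to get a yes/no truncation answer is the point being made above.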
NO string copy function is intuitive (Score:2)
What if the result would be bigger than the output buffer? strncat() does the "right" thing, and doesn't overflow the buffer. But your string just got truncated! That's probably bad. So, suppose you check for this problem, by examining the string lengths beforehand. You verify that the result will fit, and not be truncated.
But now that you've gone to that trouble, and you know that the result will fit, why bother with the strncat()? Since you already know there is no overflow, you can go ahead and use the (faster) function strcat().
Now, in order to avoid these problems, you might write your own string concatenation function, that first computes the total size needed for the result, allocates it, and then copies the strings into this new buffer. Now, the issue of buffer ownership comes up, and you introduce a new class of possible bugs: memory leaks.
The fact is, in any non-garbage-collected language like C, string handling is a pain in the ass. The problem runs deep, and can't be solved by any quick hack like strlcat().
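The "compute the size, allocate, then copy" approach described above might look like the following sketch. The name concat_alloc is hypothetical. Note how the ownership question shows up right in the comment: the caller has to remember to free the result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding a followed by b, or NULL if the
 * allocation fails or the combined length would overflow size_t.
 * The caller owns the result and must free() it. */
char *concat_alloc(const char *a, const char *b)
{
    size_t la, lb;
    char *out;

    assert(a != NULL && b != NULL);
    la = strlen(a);
    lb = strlen(b);
    if (lb > SIZE_MAX - la - 1)
        return NULL;

    out = malloc(la + lb + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);  /* also copies b's terminator */
    return out;
}
```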
Devils' Advocate (Score:4, Interesting)
Problem 1:
It's not easy, nor fast to write. Errors are severe if present and undetected. Code required to be reliable might not be a good place to test this allocation method.
Problem 2:
I'm not entirely sure these concepts are very portable outside of GCC. May not be a big deal to most, but uh, multiplatform code is required in some environments.
Problem 3:
Any speed increase without massive resource wasting is pure dumb luck during heavy usage, unless used in an application that takes little user input or has limits on the amount of input.
Just my $0.02.
These are common tricks (Score:5, Insightful)
Some of my personal favorites include:
DISCIPLINE, DISCIPLINE, DISCIPLINE. I fully expect to see the usual barrage of comments to the tune of: "C is outdated, insecure, brittle, yadda yadda..." No. Some PROGRAMMERS are "outdated, insecure, and brittle."
The C language doesn't write bugs. Programmers write bugs. If the programmer can't handle C, then take it away from him. But don't try to take it away from ME.
Re:These are common tricks (Score:2)
The moral is to always use the right tool for the right job. You wouldn't use a chainsaw to do heart surgery and you wouldn't fly a 747 to travel 30 miles.
C is really good at certain things, but writing secure code isn't really one of them: meaning that you have to go through some extra effort to ensure your C code is secure. If you can solve your problem in another language with less effort and that language meets all your requirements (speed, memory use, portability, whatever else), then why pick C in that case? Not every application requires maximum speed or easy bit twiddling.
The C language doesn't write bugs. Programmers write bugs. If the programmer can't handle C, then take it away from him. But don't try to take it away from ME.
Nobody's advocating taking C away. But you shouldn't be slavishly devoted to it as the One True Language. You should choose the language that will allow you to solve your problem with the minimum amount of effort, taking into account whatever constraints are relevant.
Re:These are common tricks (Score:2)
Not necessarily - depends how you implement it. For example, the SmartHeap library would use fixed size allocators for common small block sizes that worked in pretty much O(1) time (they just popped an item off the head of a list). This was used as a GP replacement for normal malloc/free libs. When you free a block, it just gets put on the head of the list.
Tim
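A minimal sketch of the kind of fixed-size allocator described in the comment above: free blocks are chained into a list, allocation pops the head, and freeing pushes the block back, both in constant time. This is not SmartHeap's code; the names and the sizes (POOL_BLOCK_SIZE, POOL_BLOCK_COUNT) are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  32  /* hypothetical fixed block size in bytes */
#define POOL_BLOCK_COUNT 64  /* hypothetical number of blocks */

/* While a block is free, its own storage holds the link to the next
 * free block, so the free list needs no extra memory. */
union pool_block {
    union pool_block *next;
    unsigned char bytes[POOL_BLOCK_SIZE];
};

struct pool {
    union pool_block blocks[POOL_BLOCK_COUNT];
    union pool_block *free_head;
};

/* Put every block on the free list. */
void pool_init(struct pool *p)
{
    size_t i;
    p->free_head = NULL;
    for (i = 0; i < POOL_BLOCK_COUNT; i++) {
        p->blocks[i].next = p->free_head;
        p->free_head = &p->blocks[i];
    }
}

/* O(1): pop the head of the free list; returns NULL when exhausted. */
void *pool_alloc(struct pool *p)
{
    union pool_block *b = p->free_head;
    if (b != NULL)
        p->free_head = b->next;
    return b;
}

/* O(1): push the block back onto the head of the free list. */
void pool_free(struct pool *p, void *ptr)
{
    union pool_block *b = ptr;
    assert(b >= p->blocks && b < p->blocks + POOL_BLOCK_COUNT);
    b->next = p->free_head;
    p->free_head = b;
}
```

Because freed blocks go to the head of the list, the block released most recently is the next one handed out.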
Re:These are common tricks (Score:3, Informative)
I'm not seeing the difference between your data stack and a memory pool, except that you can divide it into frames which you can collectively pop off, and free entire contexts at once. But by making the data stack independent of the call stack, you introduce the possibility of the two getting out of sync. A context frame should probably always map to a single function invocation. Or to put it another way, a data frame pushed in a particular function call should always be popped by that same function call. And that kind of defeats the purpose of being able to return stack-allocated data UP the call stack.
In contrast, alloca() is a simple manipulation of the hardware stack pointer, which will be automatically undone by the hardware itself at the end of the call frame (on any sane architecture, that is). There's no possibility for abuse.
And strlcat(), strlcpy(), etc. don't solve the underlying problem in all string operations, which is making sure you always have enough room. They prevent overflows, but they can still truncate data without you realizing it's happened. Unless you check first. See my other comment [slashdot.org].
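To make the "check first" point concrete, here is a minimal sketch using only standard C (strlcpy() itself isn't part of ISO C, so it's avoided). The helper name copy_checked is invented for this example; it refuses the copy instead of silently truncating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits completely.
   Returns 0 on success, -1 if src (plus its NUL) would not fit; in that
   case dst is left as an empty string rather than a truncated one. */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dstsize > 0);
    len = strlen(src);
    if (len >= dstsize) {
        dst[0] = '\0';
        return -1;              /* caller must handle this, not ignore it */
    }
    memcpy(dst, src, len + 1);  /* includes the terminating NUL */
    return 0;
}
```

The point is only that the caller finds out the data didn't fit, which the truncating functions won't tell you unless you inspect their return value.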
Re:These are common tricks (Score:2, Interesting)
You could think of it that way if you wanted. I actually called them "temporary memory pools" before learning it had an existing name.
Yes, but there's no need to create a new frame for each function call. You may not need to create more than one frame in the entire program if you know you're not allocating too much memory out of it. That's what makes it better than alloca(). You can do for example:
t_push(); ret = alloc_some_data_from_stack(); /* do stuff with ret */ t_pop();
All very simple. Sure, breakage is still possible, but it's not very common, and you know when you're doing it wrong. Simply forgetting a t_pop() call will be noticed at the bottom-level t_pop(), which would then kill the program: nothing overflowed, but it might have allocated memory excessively.
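For readers who haven't seen one, a data stack along these lines can be sketched in a few dozen lines. Only the names t_push() and t_pop() come from the post; t_malloc(), t_strdup(), the fixed 4 KB arena and the frame limit are assumptions made for this example, and it isn't meant to match the HOWTO's real implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STACK_SIZE 4096   /* assumed arena size */
#define MAX_FRAMES 32     /* assumed nesting limit */
#define ALIGNMENT  16     /* assumed to be enough for any object type */

/* The union members force suitable alignment of the byte array. */
static union {
    unsigned char bytes[STACK_SIZE];
    long double align_ld;
    void *align_ptr;
} stack_mem;

static size_t stack_used = 0;
static size_t frame_start[MAX_FRAMES];
static int frame_count = 0;

/* Start a new frame: remember how much of the arena is in use. */
void t_push(void)
{
    assert(frame_count < MAX_FRAMES);
    frame_start[frame_count++] = stack_used;
}

/* End the frame: everything allocated since the matching t_push() is gone. */
void t_pop(void)
{
    assert(frame_count > 0);
    stack_used = frame_start[--frame_count];
}

/* Hypothetical allocator: bump-allocate from the current frame. */
void *t_malloc(size_t size)
{
    void *p;

    assert(frame_count > 0);
    if (size > STACK_SIZE)
        return NULL;
    size = (size + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
    if (size > STACK_SIZE - stack_used)
        return NULL;
    p = stack_mem.bytes + stack_used;
    stack_used += size;
    return p;
}

/* Hypothetical helper: return a copy the caller never has to free. */
char *t_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = t_malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

A function can return t_strdup()'d data up the call stack, and whoever did the t_push() releases it all with a single t_pop().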
alloca() simply doesn't do what I want. I want to return dynamically allocated memory from functions without worrying about freeing it. Data stack and GC are the only possibilities for that.
I'm not proposing strlcpy() either. I only mentioned them as being much better than strncpy/strncat which they definitely are. I've never used them though.
Re:Exceptions in C? (Score:3, Informative)
Then, as you allocate memory, open files, lock mutexes, or whatever, you register cleanup functions with another macro. This pushes a cleanup frame onto the stack indicating a function to call along with a single argument.
Then, if an exception is thrown, you just pop cleanup frames off the stack and execute them, until you hit an environment frame. At that point you call longjmp() to transfer control to the user-code exception handler, inside the CATCH macro block. That user code can choose to do cleanup or error recovery, or it can THROW again and continue propagating the exception up the stack.
Figuring out how to implement the TRY, CATCH, ENDCATCH, and THROW macros is fun, so I won't give away exactly how to do it. It's simple, and involves a creative use of a do-while statement.
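For anyone who'd rather see one possible answer than work it out: the sketch below is a guess at the general shape, not the poster's actual macros. The frame_t layout, cleanup_push()/cleanup_pop() and the demo functions are all invented for this example. It assumes you never leave a TRY block with return, break or goto, and any local you modify inside TRY and read in CATCH should be volatile.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* One frame type for both kinds of entry (hypothetical layout):
   env != NULL is an environment frame pushed by TRY,
   env == NULL is a cleanup frame. */
typedef struct frame {
    struct frame *prev;
    jmp_buf *env;
    void (*cleanup)(void *);
    void *arg;
} frame_t;

frame_t *frame_top = NULL;
int cleanups_run = 0;            /* only here so the demo can be checked */

/* Register a cleanup function; the frame lives in the caller's stack. */
void cleanup_push(frame_t *f, void (*fn)(void *), void *arg)
{
    assert(f != NULL && fn != NULL);
    f->prev = frame_top;
    f->env = NULL;
    f->cleanup = fn;
    f->arg = arg;
    frame_top = f;
}

/* Pop the top cleanup frame, optionally running it. */
void cleanup_pop(int run)
{
    frame_t *f = frame_top;

    assert(f != NULL && f->env == NULL);
    frame_top = f->prev;
    if (run)
        f->cleanup(f->arg);
}

/* Run cleanup frames until an environment frame turns up, then jump to it. */
void throw_exception(void)
{
    jmp_buf *env;

    while (frame_top != NULL && frame_top->env == NULL)
        cleanup_pop(1);
    if (frame_top == NULL)
        abort();                 /* nobody left to catch it */
    env = frame_top->env;
    frame_top = frame_top->prev;
    longjmp(*env, 1);
}

#define TRY      do { jmp_buf env_; frame_t ef_;                 \
                      ef_.prev = frame_top; ef_.env = &env_;     \
                      ef_.cleanup = NULL; ef_.arg = NULL;        \
                      frame_top = &ef_;                          \
                      if (setjmp(env_) == 0) {
#define CATCH         frame_top = ef_.prev; } else {
#define ENDCATCH      } } while (0)
#define THROW()  throw_exception()

/* Demo: a cleanup handler that just counts how often it ran. */
void count_cleanup(void *arg)
{
    (void)arg;
    cleanups_run++;
}

void risky(void)
{
    frame_t cf;

    cleanup_push(&cf, count_cleanup, NULL);
    THROW();                     /* unwinds through cf, running count_cleanup */
    cleanup_pop(0);              /* not reached in this demo */
}

int demo(void)
{
    volatile int caught = 0;

    TRY {
        risky();
    } CATCH {
        caught = 1;
    } ENDCATCH;
    return caught;
}
```

Because the environment frame is popped before the longjmp(), a THROW() inside the CATCH block propagates to the next TRY further up, as described above.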
Bad implementation of a heap... (Score:4, Insightful)
At any rate, there are better ways to make sure you never run into memory problems:
1) always set a freed pointer to 0. Most architectures have a predictable behavior when dereferencing a 0 (they throw an exception).
2) Limit all malloc/free pairs to the same function. If a function just has to allocate and return some buffer, give it a meaningful name to that effect and add a corresponding free version. Then, you can follow the above rule.
3) assert()s are your friend. Use them religiously. They can always be shut off.
4) Use memory tracking software (purify) before ship.
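Rules 1 to 3 fit in a small sketch. The names FREE_AND_NULL, buffer_t, buffer_new() and buffer_free() are invented for illustration; nothing here comes from a particular library.

```c
#include <assert.h>
#include <stdlib.h>

/* Rule 1: free, then zero the pointer so a stale use fails fast. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

typedef struct {
    char *data;
    size_t len;
} buffer_t;

/* Rule 2: the name says it allocates, and there is a matching buffer_free(). */
buffer_t *buffer_new(size_t len)
{
    buffer_t *b;

    assert(len > 0);                 /* rule 3: state your assumptions */
    b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = calloc(len, 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->len = len;
    return b;
}

/* Takes the address of the caller's pointer so it can be zeroed as well. */
void buffer_free(buffer_t **bp)
{
    assert(bp != NULL);
    if (*bp == NULL)
        return;                      /* already freed: harmless */
    FREE_AND_NULL((*bp)->data);
    FREE_AND_NULL(*bp);
}
```

Building with -DNDEBUG is the usual way to shut the assert()s off for a release build.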
Yes, it's easier to shoot yourself in the foot with C, but you'll gain a huge performance increase. It's all about using the right tool for the right job.
Re:Bad implementation of a heap... (Score:2)
Probably not, but I'm sure the open source community can come up with their own version given a few years and a copy of the original.
Simon
(Oh, I'm sorry... was that bitter and cynical? My mistake)
visual basic (Score:3, Funny)
say what you want about it, you don't have to use stupid hacks to avoid buffer overflows.
C++ can do this, and it'll look cleaner (Score:4, Informative)
Second, destructors in C++ guarantee clean up of objects, regardless of how you leave scope (natural, return, exception, etc).
Finally, you couple destructors and reference counting auto-pointers, and you have yourself a very nice allocation API that's as easy as Java, but without the performance penalty or unnatural destruction logistics.
And C said (Score:2)
aggressive use of glib (Score:4, Insightful)
glib contains a lot of useful things: lists, trees, hash tables, memory pools, string handling functions and a lot more, everything thread safe.
gobject contains tools on top of glib like "classes" and "objects". It's not the same as in C++ or Java, but also very useful. Runtime classes or data types, generic object properties, reference counting, signal callbacks, runtime type checking, etc...
The code is now full of g_... and it took longer than usual because I had to read the documentation, but I think these libraries are great, and provide a solution for nearly everything that has to do with abstract data types and dynamic memory allocation.
And it's very lightweight, fast and efficient.
Re:aggressive use of glib (Score:3, Informative)
Sure. The libraries are basically standard C. The optional thread library has implementations for pthreads and win32 CriticalSections. And there are even some additional compatibility wrappers to make some things more portable (e.g. listing directories).
They are just utility libraries or foundation libraries to build more functionality on top of it. Especially the gobject library is great as a foundation to do modular programs.
Obstacks (Score:3, Informative)
I think the HOWTO should have a reference to obstacks, rather than claiming data stacks are a new invention. (Hint: data stacks have been used many, many times in many, many projects. GNU obstacks are the only one for which I can find a URL at the moment.)
Or save yourself the headache and do it in C++ (Score:2)
std::string MyString("Hello");
std::string MyCopy(MyString);
MyString +=(" World");
and so on.
And for memory management, head over to BOOST [boost.org] and grab the smart pointers library..
shared_ptr<MyStruct> pMyStruct(new MyStruct());
which will get garbage-collected when it's no longer needed. Really, C++ does take a lot of the pain out of C programming, without much in the way of downsides. One of the biggest advantages is that you can pick and choose exactly which constructs you want, and the existence of the others doesn't adversely affect you.
You just reinvented alloca() (Score:3, Interesting)
ALLOCA(3) FreeBSD Library Functions Manual ALLOCA(3)
NAME
alloca - memory allocator
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <stdlib.h>
void *
alloca(size_t size);
DESCRIPTION
The alloca() function allocates size bytes of space in the stack frame of
the caller. This temporary space is automatically freed on return.
RETURN VALUES
The alloca() function returns a pointer to the beginning of the allocated
space. If the allocation failed, a NULL pointer is returned.
SEE ALSO
brk(2), calloc(3), getpagesize(3), malloc(3), realloc(3)
BUGS
The alloca() function is machine dependent; its use is discouraged.
FreeBSD 5.0 June 4, 1993 FreeBSD 5.0
What's the gain? (Score:2, Informative)
You failed to describe what's wrong with strncat(), strncpy() etc. IMHO people who can't comprehend the man pages for those functions probably should avoid C altogether, but definitely must be kept from writing security-relevant software (as should sleep-deprived coders who try to do it on a Sunday morning).
That said, I can only appreciate your attempt to raise this issue (once more, maybe for a new generation of C coders).
Not complete? (Score:2)
Seems a bit on the "strings" side, so I assume the text is not complete.
What I wanted to read about was how to create modular programs with C, as in function pointer arrays and how to generally modularize the application. My attempts at building larger apps have resulted in instability, and I do not want to get into C++. Maybe some details on how to allocate memory less frequently in larger chunks would also be useful.
scanf and friends (Score:5, Informative)
Regarding scanf(3), many people don't realize this is Bad:
scanf("%s %s", cmd, arg);
This is Good:
scanf("%79s %79s", cmd, arg);
This prevents a buffer overrun if a word contains 80 or more consecutive non-white characters.
Ditto for sscanf(3) and fscanf(3). Never forget the N+1 when declaring the arrays (e.g. char s[80] vs %79s) to leave room for the terminating NUL.
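Spelled out with sscanf(3), the N+1 rule looks like the sketch below. The function name parse_command is made up for this example; the 80-byte buffers and %79s widths are the ones from the post.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: split a line into a command word and one argument.
   Both buffers hold 80 bytes, so the widths are 79 to leave room for the
   terminating NUL. Returns the number of fields converted (0, 1 or 2). */
int parse_command(const char *line, char cmd[80], char arg[80])
{
    assert(line != NULL && cmd != NULL && arg != NULL);
    return sscanf(line, "%79s %79s", cmd, arg);
}
```

A word longer than 79 characters is split rather than overflowing the buffer: the first 79 characters land in cmd and the following ones in arg, so check the result if that matters to you.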
Here's a good command to run on all your .c files to find such problems:
And in a document like this, *definitely* point out the whole gets(3) problem; the granddaddy of them all. Never use gets(3), period. Use fgets(3) instead.
The gets(3) interface is inherently insecure; a problem waiting to happen by its mere existence. Any code that uses it is broken.
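A minimal fgets(3)-based replacement might look like this. read_line is a made-up name and the return-value convention is just one reasonable choice, not a standard interface.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf and strip the newline.
   Returns  1 if a complete line was read,
            0 on end of file or read error,
           -1 if the line was too long (buf holds the first size-1
              characters, the rest of the line is discarded). */
int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    assert(fp != NULL && buf != NULL && size > 1 && size <= INT_MAX);
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    /* No newline stored: either the line just fit, the file ended
       without one, or the line was longer than the buffer. */
    c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;
    while (c != '\n' && c != EOF)
        c = getc(fp);
    return -1;
}
```

Unlike gets(3), the buffer size is part of the call, so an over-long line can be reported instead of overwriting whatever follows the buffer.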
There are probably some others (someone mentioned strcpy) I'll try to post more if I think of them.
Vector class (Score:2)
But hey, it's not C. Ohhhh for a program that is so power hungry I have to write it in pure C.
I'll believe this when I see... (Score:2)
Just another C string library (Score:3, Informative)
Some of the ideas aren't bad (and those have been done before), but mostly it's just another simple dynamic string library in C.
As for efficiency...
...this pretty much speaks for itself. Why is strconcat() so efficient compared to just doing strcat() multiple times? Because you've got a model for representing the data that has ZERO metadata, and a model for storing the data that requires you to reallocate bits of memory all the time.
Assuming you can just discount all this overhead by using memory pools is a simplistic outlook (for instance, even if you waste gobs of memory so you don't have to call malloc that much, you'll still need to do copies all the time).
There are more than a few much better string libraries out there for C. The best for an IMAP server is probably Vstr [and.org], as that was designed to work well in an I/O context (for instance, it doesn't need strconcat()-like calls in the API because doing repeated adds is just as fast).
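To show what a little metadata buys you, here is a bare-bones length-tracking buffer. It is not Vstr's API or the HOWTO's; strbuf_t and its functions are invented for this example, and overflow checks on the size arithmetic are left out to keep it short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string buffer that remembers its length and capacity. */
typedef struct {
    char *data;
    size_t len;   /* bytes in use, not counting the NUL */
    size_t cap;   /* bytes allocated */
} strbuf_t;

/* Append s. Because len is known, there is no rescan of the existing
   contents (which is what repeated strcat() calls do), and capacity
   doubles so reallocation happens rarely. Returns 0, or -1 on failure. */
int strbuf_append(strbuf_t *sb, const char *s)
{
    size_t n;

    assert(sb != NULL && s != NULL);
    n = strlen(s);
    if (sb->len + n + 1 > sb->cap) {
        size_t newcap = sb->cap ? sb->cap : 16;
        char *p;

        while (newcap < sb->len + n + 1)
            newcap *= 2;
        p = realloc(sb->data, newcap);
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->cap = newcap;
    }
    memcpy(sb->data + sb->len, s, n + 1);   /* copies the NUL too */
    sb->len += n;
    return 0;
}

void strbuf_free(strbuf_t *sb)
{
    assert(sb != NULL);
    free(sb->data);
    sb->data = NULL;
    sb->len = sb->cap = 0;
}
```

With this, appending in a loop costs time proportional to the data appended, so a special multi-argument concat call stops being the only efficient option.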
Henry Spencer said it all ... (Score:2)
Never forget that C is just a machine independent assembler, you need to have a good understanding of how machines really work to be able to write good C programs.
Also: plan the code and code the plan. C is a language that bites you if you are sloppy.
This is bollocks (Score:3)
To summarise the articles: a bunch of small libraries providing object-based memory allocation and string handling.
Kudos to the poster for enabling himself to write code in a way that's good for him. But that doesn't mean it's good for anyone else.
For example, I'm not going to go and learn 20 more function names and have more library dependencies and I wouldn't recommend anyone else does either.
Finally, suppose one wants a better string library or memory library. There are already plenty of good, with-much-work-done-on-them, open source libraries out there. Tried and tested. Not to mention the C++ STL.
Pick one that means many other people will also be able to read your code and be familiar with the libraries you use! There's nothing I'd hate more than working on a project written by someone using these libraries. Not only do I have to analyze the code, I have to analyze these libraries, and also manage to keep them and their quirks in my head while I am reading the program. Yuck.
Bjarne says "Use C++" :) (Score:3, Informative)
Stroustrup explains some nice details on especially this issue of memory constructs. He makes a convincing argument for why C++ is easier for C-style programming... Especially for those of you (One I saw below) who "Don't want to get into c++", realize that you can edge into it pretty easily, and accomplish your tasks more easily and quickly -- give it a try!
Jeez, just learn C++ already (Score:3, Interesting)
The arguments I've seen against C++ seem to fall into the following categories:
* It adds bloat and it's slow
No, not since optimizing compilers were perfected in the 90s. You can add a lot of overhead to your app by abusing the STL, but for non-trivial applications, you'll never notice it. GCC (at least for the pre-3.0 series) has a really unoptimized template implementation, where "Hello, world" using cout would make a multi-megabyte executable (and be forever compiling it), but more modern compilers, like VC++ and Intel's compiler, do a lot better. Either way, for a real-world app, any size increase will be unnoticeable. As for speed, with an optimizing compiler and judicious use of inlining, a C++ program will run just as fast as one written in C.
These complaints may have been true in the days of the Cfront preprocessor, but not today. I don't know about you, but I no longer write code for a 386 with 4 meg of memory.
* I don't like/need/want to learn OOP
You don't need OOP to use C++, but it helps. A class is just a struct where everything's private by default. If you know C, it takes about a day to learn the basics of constructors and destructors, references, and exceptions. Templates and STL will take a bit longer. One great thing about C++ is that you can just use small bits here and there if you don't want a full-blown OO program.
* It's not as good as Perl/Python/Ocaml/Eiffel/Java/whatever
That's not the point. It's not supposed to be. It's supposed to be as good or better than C. If you want a standalone-executable without linking in a complete interpreter and you don't need a lot of string parsing or regexps, you were using C anyway.
* It won't let me write libraries that work with other languages
Just declare all of your external APIs using extern "C" and make sure they only use C types in their signatures. Done.
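The usual shape of such a header is sketched below. The library name mylib and the function mylib_length() are placeholders. The same header compiles as C, and when included from C++ the guard switches on C linkage so the names aren't mangled.

```c
/* mylib.h (hypothetical): usable from both C and C++ callers. */
#ifndef MYLIB_H
#define MYLIB_H

#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Only C types in the signature, so any language with a C FFI can call it. */
size_t mylib_length(const char *s);

#ifdef __cplusplus
}
#endif

#endif /* MYLIB_H */

/* mylib.c (hypothetical): in a real library this body could just as well
   live in a C++ source file that includes the header above. */
#include <assert.h>
#include <string.h>

size_t mylib_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```

Exceptions shouldn't be allowed to escape across such a boundary; catch them inside and turn them into return codes.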
The main reason not to use C++ in new development seems to be "I don't want to learn it" or "I don't know anything about it". If you use either one, I don't ever want to work with you.
What C can do that Perl can't (Score:5, Insightful)
I still have yet to write a single useful C program that I couldn't have done in Perl.
Can you write a video driver with acceptable performance in Perl? Can you write programs that do things other than text manipulation, such as (say) a 3D engine and make them faster in Perl than in C? Remember that in the real world, time is money because a shorter execution time means lower system requirements and thus a larger market for mass-market desktop applications.
Re:What C can do that Perl can't (Score:3, Insightful)
Neither have I. Neither have most programmers. If you don't work for a company that makes graphics hardware or operating systems, I suspect you probably never will -- at least, not a "real world" one where performance really matters.
For the majority of programs, there are more appropriate languages. Some programs may require C, but only a small percentage.
Re:+1 Insightful (Score:2)
I still have yet to write a single useful C program that I couldn't have done in Perl.
Well, yeah, you can do the same thing as C code in Perl more slowly and less efficiently. Conversely, Perl is my language of choice for some tasks. The right tool for the right job. You obviously don't do any systems programming.
Re:+1 Insightful (Score:5, Funny)
s/idiots/wise souls/
s/think/know/
Problem solved.
Re:+1 Insightful (Score:2)
Re:+1 Insightful (Score:2)
flamebait" please.
(time passes...)
Moderation Totals: Flamebait=1, Insightful=1, Total=2.
Thank you.
Re:Overly simplistic (Score:2, Informative)
That sounds like a yet another solution for safe string handling. Like I said, I think they're too highly advertised as being the only way to overflow buffers.
Threads? Yeah .. I wouldn't really even bother thinking about security with threads.
There's now a link to Secure Programming HOWTO which talks about most of the other things just fine. Maybe I could write about a few other things that aren't too well discussed in that HOWTO, like integer overflows (although its next version will contain several of my examples about them).
Not necessarily. Or are you talking about complete systems here instead of individual applications? If your application doesn't have external dependencies other than libc and you write it fully up to ANSI-C specs (that's a bit difficult actually) and in general you're careful enough, it's theoretically possible your program is secure now and forever. libc, kernel, user, etc. bugs are different things then, although you could try to prevent some of them as well (don't give dangerous parameters, don't use dangerous functions).
Re:Or Otherwise Known as (Score:2, Interesting)
But then again, threads are useless for most applications, especially the ones I've written so far. Besides, it's easy to make it thread safe with per-thread data stacks and adding locks to other stuff.
Re:cras, why you choose C ? (Score:2)
Sorry, but I don't understand how this can really be true, unless you really meant developers instead of users.
When looking for a piece of software, I'd really only consider questions such as "Does it run on my target platform?", "Does it have the set of features I need?", "Is it stable?", and "How much will it cost me (in time and money)?". Performance is, of course, another issue I'd be interested in (I'd probably group this in with features).
While the choice of language can influence all of these in one way or another, I seriously doubt most users are interested in *how* their criteria are met as long as they are.
Re:Nice little remark, there, michael (Score:2, Insightful)
"Secure, efficient and easy" would at first seem to be impossible, but on closer inspection is not so... that makes it a paradox.
(admittedly, it has oxymoron-like properties, but I think the fact that there are three parts to it rather than two negates that)