Slashdot
C with Safety - Cyclone
Posted by
michael
on Fri Nov 16, 2001 01:13 PM
from the whirling-away-bugs dept.
Paul Smith writes: "New Scientist is carrying a story about a redesigned version of the programming language C called Cyclone from AT&T labs. "The Cyclone compiler identifies segments of code that could eventually cause such problems using a "type-checking engine". This does not just look for specific strings of code, but analyses the code's purpose and singles out conflicts known to be potentially dangerous.""
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
In related news.... (Score:2)
wasn't this supposed to be an NP-Complete problem?
Re:In related news.... (Score:2)
Dang, now I'm unsure if the Halting problem is NP-Complete.
Isn't that called "Java"? (Score:4, Informative)
For the rest of the world, secure C programming [google.com] is far from a secret.
Re:Isn't that called "Java"? (Score:5, Informative)
No, it's called "Pascal"! :-P (Score:2)
Seriously, modern Pascal compilers like Delphi/Kylix are capable of some compile-time checking...Pascal already has strict var type checking, and all you have to do is make sure it's turned on when you compile.
This also includes bounds checking for arrays. Pointers are handled better than most C compilers, too.
The key difference here is that it sounds as if Cyclone checks the code for *intent* rather than just checking the types and such. That IS a hard problem.
Re:Isn't that called "Java"? (Score:3, Insightful)
Java does not rely on a "run time stack" for its type checking, whatever that means. Java does plenty of checks at compile time (and load time, if you're using dynamic loading/linking).
Java, like Cyclone, Vault and every other language you'd ever want to use (and many you wouldn't), relies on a combination of static and dynamic checks to ensure safety. Cyclone does move more checks over to the static side than Java does, so it might get higher performance. But no compiler, and certainly not Cyclone's, will be able to eliminate all dynamic checks (for array bounds and null pointers, for example). Vault moves even more over than Cyclone.
There is a spectrum that describes the amount of dynamic checks that have to be performed for safe execution of a language. It looks a bit like this:
Vault
(C and C++ aren't on there because they don't have any concept of "safe execution".)
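To make the static/dynamic split above concrete, here is a rough C sketch. The function names (checked_get, sum_all) are made up for illustration and are not from Cyclone, Vault or Java; the first function does the kind of per-access test a safe language inserts at run time, the second shows a loop where the bound is known up front so no per-element test is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Dynamic check: every access tests the pointer and the index at run time.
 * Returns 0 on success, -1 if the access would be unsafe. */
int checked_get(const int *arr, size_t len, size_t i, int *out)
{
    if (arr == NULL || out == NULL || i >= len)
        return -1;
    *out = arr[i];
    return 0;
}

/* The loop index is bounded by len itself, so a single test before the
 * loop is enough and the per-element accesses need no further checks. */
long sum_all(const int *arr, size_t len)
{
    long total = 0;
    assert(arr != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        total += arr[i];
    return total;
}
```

A compiler that can prove the second pattern can drop the checks there and keep them only where the index comes from somewhere it cannot reason about.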
except Java doesn't have (Score:3, Informative)
(right on the web page detailing the language)
Actually Java does have (Score:2, Informative)
the 1.4 jdk (currently in beta) has pattern matching
parametric polymorphism (iow - templates) are in development and being called generics
how many times do I have to say it? (Score:3, Informative)
Please read what pattern matching means when Safe-C (and ML and Prolog and Erlang and...) says "pattern matching" before you post your irrelevant link anymore.
Lclint (Score:5, Informative)
A lot of the static checking made possible by Cyclone can be done for ordinary C with lclint [virginia.edu], which lets you add annotations to C source code to express things like 'this pointer may not be null', 'this is the only pointer to the object' and so on. You write these assertions as special comments, for example /*@notnull@*/. These are checked by lclint but (of course) ignored by a C compiler so you compile as normal.
(If you weaken the checking done, lclint can also act as a traditional 'lint' program.)
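For anyone who hasn't seen the annotations, a small sketch of what annotated C might look like. The functions copy_name and name_length are invented for this example, and the annotation spellings (notnull, null, only) are as I remember them from the lclint documentation, so check there before relying on them. A plain C compiler treats them as ordinary comments.

```c
#include <stdlib.h>
#include <string.h>

/* The caller receives the only reference to the result and must free it;
 * the result may be NULL if the allocation fails. */
/*@only@*/ /*@null@*/ char *copy_name(/*@notnull@*/ const char *name)
{
    size_t n = strlen(name) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, name, n);
    return copy;
}

/* The parameter is annotated as never NULL, so the body does not test it;
 * the checker is expected to complain about callers that might pass NULL. */
size_t name_length(/*@notnull@*/ const char *name)
{
    return strlen(name);
}
```

The idea is that passing the possibly-NULL result of copy_name straight into name_length without a test is the sort of thing the tool should flag.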
Also C++ provides a lot of the Cyclone features, not all of them, but it certainly has a stronger type system than C. I'd like to see something which combines all three: an lclint-type program that lets you annotate C++ code to provide the extra checks that Cyclone (and lclint) have over C++.
Re:Isn't that called "Java"? (Score:2, Interesting)
There are other safe programming languages, including Java, ML, and Scheme. Cyclone is novel because its syntax, types, and semantics are based closely on C. This makes it easier to interface Cyclone with legacy C code, or port C programs to Cyclone. And writing a new program in Cyclone ``feels'' like programming in C: Cyclone tries to give programmers the same control over data representations, memory management, and performance that C has.
Just what I need... (Score:5, Funny)
I am against this (Score:5, Funny)
Just add wrapper libraries (Score:2)
No No No (Score:5, Funny)
"C with safety," or C with trigger locks? (Score:5, Funny)
And isn't a cyclone an infinite loop? You have to like a scientist who uses the word humongous.
Re:"C with safety," or C with trigger locks? (Score:3, Informative)
Ever had an aggressive optimizer break code, such that you had to use a lower optimization setting? This can be a symptom of weakness in the compiler's ability to statically analyze the program. Not just a garden variety "bug", but rather the optimization is correct only for a subset of valid input source code! I.e. it can be difficult to impossible to prove that a given optimization is safe, aka "semantics preserving".
Many modern PL researcher/designers thus aim to give compiler writers a head start by ensuring that the language design permits increasingly powerful forms of static program analysis. Functional language work in particular has focused heavily on utilizing language and type system design to enable more powerful analysis support. (cf. the various published papers on the Haskell and OCaml languages as a starting point).
party like its (Simula) 1962 (Score:3, Funny)
The wrong starting point? (Score:4, Troll)
I'm a professional software developer, and all for anything that makes my code safer without unduly compromising it. But I can't help thinking that starting from C is probably a mistake.
C is a fundamentally unsafe language. It has some easy fixes (remove the always-unsafe gets() function from the library, for example). It has some fundamental "flaws" (pointer arithmetic and the use of void*, for example). I quoted "flaws" because, while these features make the language necessarily unsafe, they are also very helpful in the low-level programming that got C to where it is today.
The underlying problem here has never been with C, it's been with using C for the wrong jobs. Application code, and certainly high-level code where security is essential, just aren't C's strong suits. I can't see how even the geniuses we're talking about can start from such a broken language (in the context we're discussing) and successfully make a non-broken language out of it.
I would expect a much better solution to be that followed by later C-like languages. C++ retains the low-level control, but other languages (Java, C#, etc) are available to those willing to sacrifice some of that control in exchange for added safety, and consequently may be better tools for different types of project. The biggest problem at the moment is that none of these "safer" languages has yet developed the same raw expressive power of C++. As they evolve, and catch up on the 20-odd year head start, hopefully we'll see programmers given a genuine choice between "safe but somewhat limited" and "somewhat safe but unlimited".
Re:The wrong starting point? (Score:4, Insightful)
Take a look at Ada [adapower.com]. Extremely safe, extremely powerful, extremely unpopular. Go figure.
It's object-oriented, it supports generic classes ("packages", in Ada terminology), it has built-in support for multitasking and distributed programming, it lets you (optionally) specify even such details as numeric representations for the ultimate in portability, and it has a set of first-class and well-documented bindings for GTK+ [act-europe.fr].
There's a free compiler called GNAT, which is built on gcc and will actually be rolled in to gcc 3.1 or thereabouts. There's also a Linux-specific [gnuada.org] site for gathering and distributing component packages.
And pace ESR, it wasn't designed by a committee.
Cyclone Beta Testing (Score:2, Funny)
Either method is an enormous amount of overhead being generated by Cyclone. However, one can see that the amount of lines of code released in a release (by creating overflows) that actually goes to maintaining the Cyclone System spiraling bugs is a huge ratio of 400 to 1.
Stick with C++ I think.
Pre-processor better?? (Score:2, Interesting)
I don't mind suggestions, but I'm not sure I like the idea of having my code rewritten.
Couldn't the same error-checking be incorporated into a pre-processor rather than developing an entirely new compiler/language?
Re:Pre-processor better?? (Score:2, Insightful)
In the early 90's, we were using one of the C compilers of the time (don't remember which, sorry; we quickly dumped it when Borland came out). One of its error messages was "Need semicolon here" with a ^ to show where. My reaction, every time, was "Shit howdy, if you know that, put it in, and make it a warning!"
Re:Pre-processor better?? (Score:3, Insightful)
Vision of the future (Score:4, Funny)
Am I the only one to whom this sounds like potentially a really bad idea? I mean, think about it, coding along one day:
#include <stdio.h>
int main() {
printf("He
At this point, small, cute cartoon versions of Kernighan and Ritchie pop onto the screen and say "It looks like you're writing a Hello World program! Click here to check this program for bugs automatically..."
I'm just shuddering at the thought...
Re:Vision of the future (Score:2)
Run-time checking is slow (Score:2)
if (!infile) { perror("input file"); exit(1); }
The advantage of C is that you are allowed to not use it, if you think it's not recommended in that case.
Just one thing to say (Score:2)
I'm sorry, Dave, I can't compile that.
I know it's cliche, but really, do we expect it to be as smart as another competent programmer reviewing code?
Why.... (Score:2)
Other than "new" and "improved" sell products better than "useful".
New language? (Score:5, Interesting)
Put bluntly, Cyclone seems to be little more than C for lazy programmers. Fat pointers for those who can't follow the logic of pointer arithmetic and *`H for those intimidated by malloc() is not a beneficial service.
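For readers who haven't met the term, a "fat" pointer carries its bounds along with the address. Below is a rough sketch in plain C; the struct layout and the names fat_make, fat_add and fat_read are invented here and are not Cyclone's actual representation, which may well do its checks at different points.

```c
#include <assert.h>
#include <stddef.h>

/* A "fat" pointer: the buffer, its size, and the current position. */
struct fat_ptr {
    int *base;   /* first element of the buffer */
    size_t len;  /* number of elements in the buffer */
    size_t pos;  /* current index, anywhere in 0..len */
};

struct fat_ptr fat_make(int *buf, size_t len)
{
    struct fat_ptr p = { buf, len, 0 };
    assert(buf != NULL || len == 0);
    return p;
}

/* Pointer arithmetic: moves the index, or returns -1 instead of
 * stepping outside the buffer. */
int fat_add(struct fat_ptr *p, ptrdiff_t delta)
{
    if (delta < 0) {
        size_t back = (size_t)0 - (size_t)delta;  /* magnitude, no signed overflow */
        if (back > p->pos)
            return -1;
        p->pos -= back;
    } else {
        if ((size_t)delta > p->len - p->pos)
            return -1;
        p->pos += (size_t)delta;
    }
    return 0;
}

/* Dereference: pos == len is a legal one-past-the-end position,
 * but it cannot be read. */
int fat_read(const struct fat_ptr *p, int *out)
{
    if (p->base == NULL || p->pos >= p->len)
        return -1;
    *out = p->base[p->pos];
    return 0;
}
```

The cost is visible in the struct: three words instead of one, plus a comparison on each move or read.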
It's nothing more than built-in PC-LINT! (Score:2, Informative)
Safety in C and C++ (Score:4, Insightful)
I'd like to see Cyclone's kind of safety, but if you're going to require garbage collection and forbid pointer arithmetic, you may as well use Java.
I've proposed "Strict Mode" for C++ [animats.com], a compatible retrofit to C++ that uses reference counts like Perl, but with some optimizations to get the overhead down.
A basic decision is whether to have garbage collection. If you have garbage collection, C++ destructors don't fit well. (Java finalizers, called late, during garbage collection, can't be used for things like closing files and windows. Microsoft's C# has destructors, but the semantics are confusing and ugly, and we don't have much mileage yet on how well that will work.)
Reference counts work reasonably well. There's a problem with not releasing circular structures, but that doesn't keep Perl from being useful. Perl now has "weak" pointers (they won't keep something around, and turn to null when their target goes away), and if you use weak pointers for back pointers, most of the circularity problem goes away. True rings of peer objects are rare, and they're the main case where weak pointers won't solve the problem.
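A minimal sketch of the back-pointer arrangement in plain C, hand-rolled for illustration. It is not the "Strict Mode" proposal and not how Perl implements it; struct node, live_nodes and the node_* functions are hypothetical names. The parent holds a counted reference to the child, the child's pointer back to the parent is uncounted, and it is set to NULL when the parent goes away.

```c
#include <assert.h>
#include <stdlib.h>

int live_nodes = 0;  /* demonstration only: how many nodes are allocated */

struct node {
    int refs;             /* number of strong (counted) references */
    struct node *child;   /* strong: keeps the child alive */
    struct node *parent;  /* weak back pointer: not counted */
};

struct node *node_new(void)
{
    struct node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->refs = 1;      /* the caller holds the first reference */
        live_nodes++;
    }
    return n;
}

struct node *node_retain(struct node *n)
{
    n->refs++;
    return n;
}

/* The parent takes over the caller's reference to the child. */
void node_adopt(struct node *parent, struct node *child)
{
    parent->child = child;   /* strong reference transferred */
    child->parent = parent;  /* weak: parent->refs is not incremented */
}

void node_release(struct node *n)
{
    if (n == NULL)
        return;
    assert(n->refs > 0);
    if (--n->refs > 0)
        return;
    if (n->child != NULL) {
        n->child->parent = NULL;  /* weak pointer goes null with its target */
        node_release(n->child);
    }
    free(n);
    live_nodes--;
}
```

Because the back pointer is not counted, releasing the root frees the whole chain. Had the child counted its parent, neither count would ever reach zero.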
If you don't have garbage collection or reference counts, programs obsess on who owns what. A basic problem of C and C++ is that it's essential to track who owns which objects and when they're supposed to be released, yet the language offers no help whatsoever in doing so. This is the fundamental cause of most crashes in C and C++ programs. Almost every core dump, "bus error", or "general protection fault" comes from that problem. So it's worth fixing.
It's the right time to address this. We're in a period of consolidation, now that the dot-com boom has collapsed. Our task as programmers over the next few years is to make all the stuff that sort of works now work 100%.
Re:Safety in C and C++ (Score:2)
Moon patiently told the student the following story:
"One day a student came to Moon and said: `I understand how to make a better garbage collector...
-- Jargon File
Re:Safety in C and C++ (Score:3, Interesting)
It doesn't prevent perl from being useful, but no language which uses reference counts is ever going to replace C or C++. The problem with reference counts is that sometimes they cause more problems than they solve. A good example is in GUI programs, where a lot of objects might be mutually aware of each other. That's not to say that reference counts are not useful. Rather, forcing programmers to use reference counting to manage memory whether appropriate or not is problematic.
If you don't have garbage collection or reference counts, programs obsess on who owns what. A basic problem of C and C++ is that it's essential to track who owns which objects and when they're supposed to be released, yet the language offers no help whatsoever in doing so.
C++ gives the programmer the flexibility to choose a memory management strategy that suits the problem at hand. Sometimes pool allocation works. Sometimes reference counting works. Sometimes, parent/child management works. It's very simple to implement reference counted classes in C++. It's certainly not necessary to exclusively use an "exclusive ownership" model in C++.
Almost every core dump, "bus error", or "general protection fault" comes from that problem.
They come down to a lot of problems -- library incompatibilities, bounds errors, and other things can cause these problems. I think it's naive to assume that using reference counting for everything will just make the problem "go away". Writing reference counted code without memory leaks gets quite difficult when the data structures are more complex.
The URL you have is interesting, and I think for some types of problems, using an object system where you just reference count everything is probably a good idea. But I question its value as a cure-all.
Error 0 (Score:2, Funny)
test.c
C:\stuff\test.c(3) : 'int main(void) {' : Error 0. Program is in C. This section of code could cause problems.
Legacy Savior? A culture fix would be better... (Score:5, Insightful)
Then I got chewing on it and realized something: when I came on board and suggested running lint on our code, I was shot down by both the rank & file and by management (who each blamed the other). When I suggested a concerted effort to rewrite our code to eliminate or justify (in comments) every warning our compiler spewed on a build, I got a similar reaction.
Don't get me wrong. I think cyclone still sounds great, especially the pattern matching and polymorphism indicated on its home site [cornell.edu]. If it can gain some momentum, it stands to have a real place (niche?) in dealing with legacy systems. For my shop, though, I fear much of the value would be wasted. Until we change our motto from "There's never time to do it right, but always time to do it over" we're going to continue repeating our mistakes.
there has been tool similar in purpose (Score:2, Informative)
p.
anal compilers (Score:2)
What about PC-Lint? (Score:2)
I generally don't like internal type-checking within a language, because it results in slowness, and some loss of power. (Sometimes there are times you want to do things that you normally shouldn't be doing, in order to speed up routines.) A language which prevents "bad programming practice" ends up screwing itself over. However, having an external source-code checking utility that tests for bad programming, while still allowing complete power would be much more useful, to me, at least....
In defense of type systems (Score:3, Insightful)
I think you must have had bad experiences with safe languages (Java?). Static checking doesn't result in slowness (in fact, it can make compiled code faster in many cases, for instance by enabling alias analysis).
Static typing and safety also allow for *more* power than a "do anything you like" language. One kind of power I get when I write in a language like this is the ability to enforce invariants without runtime checks. So if I am writing a program with several other people (or by myself across several evenings, except I am drunk some of those evenings), I can arrange my code such that bugs in one part of the program can NEVER affect other parts of the program. Thus, it is easier to figure out who to blame and where the bug is. This is impossible in a language like C, where any code can write over another module's memory, free its data structures more than once, or cast, etc.
Speeding up routines with hacks is pretty overrated; there are very few places where this is necessary, and even fewer where it is desirable. In those cases, we can always fall back to C or assembly.
Cannot cast? (Score:2)
Probably about as effective as Grammar Check (Score:2)
Microsoft Word's grammar check has suggested to me in the past that "do it for the greater good" should probably be "do it for the greater well ".
It's sometimes helpful in helping me catch my grammar mistakes. But more often than not, it's a PITA, and the act of wading through its incorrect suggestions is more work than I think it's worth. And that's when it's SO easy to figure out if the suggestion is right or wrong...the sentence is on the screen, standing alone, and I can instantly decide if it's right or not.
Now, imagine wading through a bunch of suggestions and warnings on your code. Imagine having to figure out the context for the flagged code segments, and having to review the code and all code which references it to see if it's correct or not.
Sure, if you've got free time or resources to throw at it, using computer heuristics to attempt to help out humans is nice. But you have to realize that at this stage in the game, it often takes a lot of work to vet those results in order to glean any gain.
bad idea (Score:2)
i'd say more but i cut my right hand today and typing sucks.
Didn't Bill Joy announce C+++=-- in the 80's? (Score:2)
a) Computers would increase in speed, to the tune of 2^(year-1984) MIPS. [That would put us at 131,072 MIPS today, and 262,144 MIPS in a few months.]
b) He predicted the rise of a safe system programming language he called C+++=-- (pronounced "see plus plus, plus equals, minus minus"), which is a safe subset of a C++ superset.
Java hadn't been invented yet, but Gosling (who was busy inventing NeWS at the time) wrote Oak aka Java several years later, and it fit the description to a tee, but just had a different name or two.
[I'll never forgive Bill Joy for writing VI and CSH. Ewwww icky yucko!]
-Don
Static verification vs. type-safe languages (Score:5, Interesting)
In 1996, the Ariane 5 launcher exploded a few seconds after leaving the ground. The faulty program, written in type-safe Ada, was later submitted to a static program analyzer developed by Alain Deutsch at INRIA in France. The analyzer spotted the error right away!
It was a number going out of range after too many iterations and wrapping back to 0.
The verification technique used was based on abstract interpretation.
This is just to say that even a strongly type-checked language can fail and that type checks, whether static or dynamic, are not the only way to catch bugs.
Alain Deutsch has started a company called Polyspace that sells static verifiers for Ada and C (See www.polyspace.com). The idea is not to rewrite C or Ada but to spot potential bugs inside programs.
I have no special interest in this company (I know Alain Deutsch), but my point is that improving C does not imply removing the type-unsafe constructs.
C with safety. reminds me of a story... (Score:3, Funny)
One day, a city slicker with a spotless seersucker suit and a perfectly pointy moustache was reported travelling from station to station, selling his new technology suite. It included remote manipulators for making repairs from a higher level, without having to go under the trains. It also came equipped with "parking brakes" for trains, to prevent them accidentally moving while they were under repair.
This new "high level" technology was a hit in many towns, where the young repair technicians were unenthusiastic about life with missing limbs. In addition, the new technology came with many interlocking "safeguard" mechanisms to make sure that no fittings were left unsecured when the repair was completed. This saved many a "crash".
But there remained many towns with older engineers, who had grown up doing things the "fast" way, repairing the trains on the fly (because things went faster that way!), and of course having the scars and stumps to show for it. They were also unenthusiastic about the "safeguards", declaring that they were "smarter than any newfangled machine", and could remember to close the latches and fittings themselves.
In one of these Ancient Telegraph Towns, one of the older engineers, Cyclone Bob, came up with his answer to the newfangled "high-level machines" -- special steel braces to wear over arms and legs while repairing the moving trains. "In most every case, these braces will protect your precious limbs from the hazards of moving wheels!", enthused Cyclone Bob.
The older engineers, who, when all was said and done, actually enjoyed mucking about under trains, and who had already paid their dues in missing limbs, were rather proud of the new braces, and wore them proudly. "My trains hardly ever crash now", they would say, "and now I don't always have to lose a leg to prove it!".
The younger, smarter engineers continued using their "high-level" machines, and were happy that they still had arms so they could snigger up their sleeves.
disappointing speed, too complex (Score:3, Insightful)
Cyclone could be a winner if it gave you C-like performance with safety and minimal changes to your programs. But it doesn't match C performance as it is and I don't think large, existing C programs will port to it easily, despite superficial similarities.
The way it is, I think you are better off using O'CAML [ocaml.org] or MLton [clairv.com]. They are probably easier to learn and give you better performance. O'CAML, in particular, has already been used for a number of UNIX/Linux utilities. And Java is probably as C-like as Cyclone and runs faster (although programs have a bigger footprint).
Smarter than the compiler? (Score:2, Insightful)
Well, that's not too difficult. Compilers are just a bunch of algorithms.
Question is - are you smarter than the person that wrote the compiler?
Re:But, I like being unsafe! (Score:2)
There are also some nice tricks you can sometimes play with integer-based data by casting them into integers, and doing something with them. "Going through channels" can take too much time, if you know what you're doing.
Almost everything comes down to a C or C++ base, which takes care of the dirty bits. Somebody needs to take care of the dirty bits.
That said, some people underestimate the value of staying in the channels. Whether or not the person you replied to is one of them is not something we could determine without knowing what kind of programs he writes.
Re:But, I like being unsafe! (Score:2)
No no no no no!!!!!!! Do you know how many archivers I've had to rewrite because they just cast a struct over the top of a data stream?
The only fixed size in C is the BYTE (unsigned char). Everything else will change. Never use direct memory dumps of structs for on-disk or over-net structures! When reading a data stream, read _bytes_ and convert them at runtime to the structures you desire. Now your code is not only portable across platforms, but portable across compilers, too.
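A short sketch of what reading bytes and converting at runtime can look like. The record layout (a 4-byte little-endian id followed by a 2-byte little-endian length) and the names record_header and parse_header are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form. Its padding and byte order are the compiler's business
 * and never touch the disk or the wire. */
struct record_header {
    uint32_t id;
    uint16_t length;
};

enum { RECORD_HEADER_BYTES = 6 };  /* size of the serialized form */

static uint32_t read_u32_le(const unsigned char *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

static uint16_t read_u16_le(const unsigned char *b)
{
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Builds the struct field by field from the byte stream.
 * Returns 0 on success, -1 if fewer than RECORD_HEADER_BYTES are available. */
int parse_header(const unsigned char *buf, size_t n, struct record_header *out)
{
    assert(buf != NULL && out != NULL);
    if (n < RECORD_HEADER_BYTES)
        return -1;
    out->id = read_u32_le(buf);
    out->length = read_u16_le(buf + 4);
    return 0;
}
```

Note that sizeof(struct record_header) is typically 8 because of padding while the serialized form is 6 bytes, which is exactly the mismatch that casting a struct over the stream gets wrong.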