New System Auto-Converts C To Memory-Safe Rust, But There's a Catch
Researchers from Inria and Microsoft have developed a system to automatically convert specific types of C programming code into memory-safe Rust code, addressing growing cybersecurity concerns about memory vulnerabilities in software systems.
The technique, detailed in a new paper, requires programmers to use a restricted version of C called "Mini-C" that excludes features like pointer arithmetic. The researchers successfully tested their conversion system on two major code libraries, including the 80,000-line HACL* cryptographic library. Parts of the converted code have already been integrated into Mozilla's NSS and OpenSSH security systems, according to the researchers. Memory safety errors account for 76% of Android vulnerabilities in 2019.
Theoretically... (Score:1, Interesting)
Theoretically, if you can convert Mini-C directly to Rust, then the exact same safety guarantees could be applied to the Mini-C code by the compiler or a linter.
Re: (Score:3, Interesting)
I suspect most mission-critical code that is at risk for memory leak vulnerabilities already fails to meet the requirements, and so this converter won't help. At least not in its current form. It's still neat though.
I understand the need for a more memory-safe low-level language. I understand that professional, veteran, highly-skilled developers have left many memory leaks behind in their code. The argument that "if you know what you are doing and you use good coding practices you can avoid these mistak
Re: (Score:2)
The argument that "if you know what you are doing and you use good coding practices you can avoid these mistakes" falls flat because so many cases have arisen where that should have been true and yet the memory leaks were there.
Cheap. Fast. Good. Pick 2.
Developers who know what they are doing are not cheap. Good coding practices take time. Cheap + Fast have been the winning corporate strategy for decades. First to market at the lowest price wins the race. Quality is job None.
Re: (Score:2)
Also there's technical debt. I know the code is crap, but we can't waste time and money fixing it because there are schedules to meet. Thus the solution to spend even more time converting to Rust is a non-starter. If we are not allowed to rewrite problematic code from scratch, then why are we allowed to spend so much more time (years) to rewrite every single line into a new language? (it's not even memory problems, we don't have generic memory problems, we have pools and can track the leaks, so our bugs
Re: (Score:2)
1) They have no idea how the magic box works. Nor do they care. Until it explodes in their face (gets hacked) and then they care but only insofar as they can sue for $$$$$. As a result they fear the explosions (exploits), because they don't want to deal with the fallout.
2) Companies, as you pointed out, regularly ship incomplete slop as a "complete" product, despite promoting said products with "safety" and "security" selling points. End users
Re: (Score:2)
Needs a runtime too, bounds checking can't always be optimized away.
But with Rust you get at least some safe code you can interop with.
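To make the bounds-checking point concrete, here is a minimal Rust sketch. The function names `checked_read` and `sum_all` are made up for this example and come from neither the paper nor the thread. An index that is only known at run time has to be checked somewhere, while iterating over a slice involves no explicit index at all.

```rust
/// Read one element at an index that is only known at run time.
/// `get` performs the bounds check and returns `None` instead of panicking.
fn checked_read(data: &[u32], index: usize) -> Option<u32> {
    data.get(index).copied()
}

/// Sum every element through an iterator; no explicit indexing is involved.
fn sum_all(data: &[u32]) -> u32 {
    data.iter().sum()
}

fn main() {
    let data = [10, 20, 30];
    println!("{:?}", checked_read(&data, 1)); // index inside the slice
    println!("{:?}", checked_read(&data, 7)); // index outside the slice
    println!("{}", sum_all(&data));
}
```

Whether the compiler can drop a given check depends on what it can prove about the index, so this only shows where the checks sit, not what they cost.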
Any Cobol and Fortran examples? (Score:2)
Anyone have some history on past efforts, studies or tools on converting older languages to more modern languages using automated tools?
Excluding parallel cross-language ones like Java to C#, since they are of the same generation of programming languages.
I'm asking about past efforts and excluding any AI based ones. The past efforts would have metrics, findings and human guidance on such a conversion.
Re: (Score:2)
Automatic code generators usually end up with a great increase in code size. Hand crafted is always smaller. Thus, the automatic code generation tends to be on systems that are already bulked up, like Windows or back office servers. People occasionally try to get this into memory limited embedded systems and it's a problem. Designs look great on paper, and look so well organized in the powerpoints, but they don't fit and are too slow.
That's not even automatic code *converters* though these are much rar
Re: (Score:2)
Theoretically, if you can convert Mini-C directly to Rust, then the exact same safety guarantees could be applied to the Mini-C code by the compiler or a linter.
But why reinvent the wheel? It's much simpler to do a simple conversion to Rust source and let the existing Rust compiler enforce the rules than to dedicate countless man-years trying to recreate all those safety principles from scratch.
Re: (Score:1)
Re: (Score:2)
I didn't say it was a manual conversion to Rust. I was pointing out that this software converter is much simpler than what the OP suggested because it can leverage the existing Rust compiler.
Pointless (Score:3)
Restricted C missing "dangerous" things such as pointers is already a thing and has been for decades. Google MISRA-C which is used in the defense industry. Converting it to Rust will gain you nothing except reducing the number of devs who can work on it and hence raise your costs.
Re: Pointless (Score:2)
"pointer" does not mean the same thing as "pointer arithmetic"
Re: (Score:2)
A pointer is just an int in ANSI C. A bigint, to be sure, but still just an int.
If you can have pointers, then you can do pointer arithmetic.
I agree with the guy above- cheap, fast, good- and capitalism prioritizes cheap and fast, is why we can't have nice things like pointers or actual control of our own hardware.
Re: Pointless (Score:2)
A pointer is just an int in ANSI C.
That's nice, if you're using ANSI C. But this isn't ANSI C, is it?
A bigint, to be sure, but still just an int.
That depends entirely upon the architecture dude...
If you can have pointers, then you can do pointer arithmetic.
The language has to provide the means to turn a pointer into an integer. In safe Rust, you can't directly do pointer arithmetic, which is by design. Values that represent pointer offsets are usize. So if you index into an array, you'll pass a usize as opposed to, say, a u32 or u64. While you can convert a pointer into an integer, you can't dereference a raw pointer in safe Rust.
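A small sketch of those three points, with hypothetical helper names (`element_at`, `address_of`, `read_via_raw`) chosen only for illustration: indexing takes a `usize`, casting a pointer to an integer is allowed in safe code, and reading through the raw pointer needs an `unsafe` block.

```rust
/// Index a slice; the index type must be `usize`.
fn element_at(values: &[i32], index: usize) -> i32 {
    values[index]
}

/// Turn a reference into a raw pointer and then into an integer address.
/// Both casts are allowed in safe Rust.
fn address_of(value: &i32) -> usize {
    let ptr = value as *const i32;
    ptr as usize
}

/// Reading through a raw pointer is only possible inside `unsafe`.
fn read_via_raw(value: &i32) -> i32 {
    let ptr = value as *const i32;
    // SAFETY: `ptr` was just derived from a live reference, so it is
    // non-null, aligned, and points at an initialized `i32`.
    unsafe { *ptr }
}

fn main() {
    let values = [1, 2, 3];
    let small: u32 = 2;
    // A `u32` cannot index directly; it is converted to `usize` first.
    println!("{}", element_at(&values, small as usize));
    println!("{:#x}", address_of(&values[0]));
    println!("{}", read_via_raw(&values[0]));
}
```

Removing the `unsafe` block around `*ptr` makes the compiler reject the program, which is the design choice being described.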
I agree with the guy above- cheap, fast, good- and capitalism prioritizes cheap and fast, is why we can't have nice things like pointers or actual control of our own hardware.
Instead you, like other socialis
Re: Pointless (Score:2)
Once the compiler notices that your int is being dereferenced somewhere, then it's game over. No arithmetic for this int anymore.
Re: (Score:1)
What a pointer is, is actually implementation-defined.
A C compiler is free to implement a pointer however it wants.
Erm, however the compiler writer wants.
And to make a point: on an 8-bit machine a pointer is most likely 16 bits.
Perhaps even cut down to 14 bits.
On a modern 64-bit machine, a pointer might be 64 bits, but has only 48 significant bits ... and so on.
And before one gets angry: there is no such thing as a "big int", we have char <= short <= int <= long <= long long. (and then comes the signed versus
Re:Pointless (Score:4, Insightful)
MISRA-C [wikipedia.org] was originally for the automotive industry.
The rules range from "of course" to "wtf were you thinking when you made this rule", but ten or so years ago when I was using IAR C which lets you individually enable each rule, I enabled a bunch of them that were about things I sensibly shouldn't be doing.
These days I'm mostly using gcc or clang, and I try to "-Wall -Werror -Wextra" whenever I can. "-Wextra" has a lot of static analysis checks, and some of the more recent checks can be annoying, especially when printf functions are involved. But if you don't build with at least "-Wall -Werror" all the time, then you're probably part of the problem.
Re: (Score:1)
Re: (Score:2)
Is there evidence of this? It sounds like marketing to me. Where's the proof? There have been hundreds of magic bullets to magically make code better, and they all came with devout adherents.
Now Rust might be the real deal. I'm certain there's a lot of good things there. But it comes overly burdened down by religiosity and dogma. I'm a skeptic and I would rather see proof than scriptures.
Because I KNOW what the bugs are in the C code in the festering pile of technical debt I have to work on. I'm just n
Reducing cost of poor quality (Score:2)
Because I KNOW what the bugs are in the C code in the festering pile of technical debt I have to work on. I'm just not allowed to fix it because that doesn't generate revenue.
Fixing a bug generates revenue indirectly by reducing cost of poor quality [wikipedia.org]. In particular, it reduces the amount that your bug bounty program pays and the time (which is money) that your DevOps team needs to spend fighting fires when a user triggers the bug in a material way.
Re: (Score:2)
Also using a lot of your compiler's static checkers is a good thing. One thing modern compilers support is printf-string checking. You just need to decorate a function that takes a printf-style string (or scanf, or others), and the compiler will verify that the arguments match the format string.
It's amazing how often a faulty log message gets caught by this check and you can waste a good half hour debugging why your program is crashing differently because you added a printf log message.
Sure, they're no
-Werror breaks your program on new compiler (Score:2)
But if you don't build with at least "-Wall -Werror" all the time, then you're probably part of the problem.
Using -Werror will probably cause your code to fail to compile after users of your software have upgraded to a new version of the compiler. This happens when the developers of the new compiler version have added a warning to -Wall or expanded the scope of misbehavior detected by an existing warning.
Re: (Score:2)
Restricted C missing "dangerous" things such as pointers is already a thing and has been for decades. Google MISRA-C which is used in the defense industry. Converting it to Rust will gain you nothing except reducing the number of devs who can work on it and hence raise your costs.
It makes me wonder whether simply encouraging use of Mini-C would be enough and how it compares to Rust when used in a practical context?
So if your C code is already memory safe ... (Score:3)
... it can now automatically be converted to Rust.
So, why not just write Rust code to start with?
Re:So if your C code is already memory safe ... (Score:4, Interesting)
1. It's really hard to find anyone who's got actual dev experience. You can only hire so many juniors per year.
2. Writing code in Rust is really hard for many people who are perfectly fine programmers in C.
3. If they cannot write in C, they have zero chance of writing in Rust.
Re: So if your C code is already memory safe ... (Score:2)
I'm the opposite, rust was basically my first real programming language. Sure, I was writing some java and c# code before, but both languages, especially java with its shitty tooling, were fucking annoying to work with. Rust was the first language I used that actually made sense.
I could never pick up c or c++ despite making a few attempts. Particularly in the case of C++, it was too easy to write code that was broken without any apparent reason. You basically have to learn the fundamentals before you get an
Re: (Score:3)
As a C programmer (and often assembly), I looked into Rust. I battled Javascript in 1999, and declared it my enemy. I tried to learn Perl, but it makes my brain hurt. I learned and used PHP OO to develop a useful web database project, and for certain jobs, I like it. I found some Python code behind a dumpster and tried to understand it, but I hated the way the types could just change without my consent. Getting the actual facts of an object was not easy. Maybe good Python code is wonderful, but the language
Re:So if your C code is already memory safe ... (Score:4, Insightful)
So, why not just write Rust code to start with?
Because a lot of code is already in C and it is easier to rework the tricky parts than to redo the whole thing from scratch?
Re: (Score:2)
Real world example - the Linux kernel which is in its 4th decade and is integrating Rust.
The key phrase in the paper is 'existing formally verified C codebases'. Maybe that would be some benefit to a Rust-based environment such as Redox? Take a formally verified L4 microkernel, autotranslate that to Rust and the lower levels of your OS are worry free.
Re: (Score:2)
Re: So if your C code is already memory safe ... (Score:2)
Slashdot and old graybeards in Unix hate anything new, and it is strange how they attack Rust when everyone else, including Microsoft and Google, loves Rust and has no quarrels. SystemD was cool for a while too, and one comment on Slashdot freaked out the whole Linux community for 10 years. Smh.
Solaris and Apple went with event-driven init systems so stuff can move to the cloud and be flexible for a decade. Rust is superior in every way and while slashdotters love to blame Microsoft for writing insecure operating
Re: (Score:1)
SystemD was never cool.
The idea was okay.
And, that is it.
I have stupid ideas every morning, but I do not convert them into software that is supposed to configure a *nix computer.
It does not make any sense to have all the this and that in a single text file!
Re: (Score:2)
Because you already have megabytes of C code to start with, such as the NaCl library or OpenSSH. And most of this code is written well enough to be converted.
So, try it, find the conversion errors and fix them, and you'll get a working full-featured program much sooner than if you sit down and rewrite it in Rust from scratch (although I believe that rewriting a mature program, and especially a library, from scratch could significantly improve it and get rid of tons of outdated junk).
Re: (Score:2)
I suspect the idea is that this will make it easier to move existing codebases to Rust. They can start by refactoring the C bits which can't be automatically converted, and let the converter do the rest.
It seems they think this is a better approach than automatically converting the C codebase to unsafe Rust, then trying to refactor that into safe Rust.
In any case, I suspect that anyone looking to use this type of conversion is seeking to do new development in Rust. They won't write new applications or libra
Re: (Score:2)
Because some people don't want to throw the baby out with the bathwater. I.e. they may have C code that predates Rust, but would like to switch to Rust without doing a complete rewrite at exorbitant cost.
This should be obvious.
Some game consoles lack a Rust compiler (Score:2)
Writing Rust code fails in at least two edge cases I can think of. One is targeting a platform without an implementation of Rust. For example, I've read that some major video game consoles lack a Rust toolchain, and this is part of why Facepunch didn't write the survival shooter Rust in Rust. The other is if many of your users are building your program from source code, already have a C toolchain installed to build other programs, and lack gigabytes of SSD space and/or Internet download quota to run rustup
Re: (Score:2)
Writing Rust code fails in at least two edge cases I can think of. One is targeting a platform without an implementation of Rust. For example, I've read that some major video game consoles lack a Rust toolchain, and this is part of why Facepunch didn't write the survival shooter Rust in Rust. The other is if many of your users are building your program from source code, already have a C toolchain installed to build other programs, and lack gigabytes of SSD space and/or Internet download quota to run rustup and cargo for the first time.
If there is no Rust toolchain, you also can't automatically convert C code to Rust. But that's the whole point of the article.
undergraduate work makes /. news (Score:2, Insightful)
"Memory safety errors account for 76% of Android vulnerabilities in 2019."
Seems kinda surprising, but why quote a project from 2019? More importantly, how many "memory safety errors" were fixed in converting these "two major code libraries"?
And finally, who gives a shit about cherry picking a subset of C that does not use pointers? How many "memory safety errors" exist in C code that doesn't use pointers? Let's select safe code that maps directly to our target language and write a translator for that! Wi
Re: (Score:2)
"And finally, who gives a shit about cherry picking a subset of C that does not use pointers?"
I only read the abstract. Does the paper actually say it cannot convert code that uses pointers? I would be surprised if that were true. Are you confusing pointer arithmetic with pointers?
Re:undergraduate work makes /. news (Score:4, Informative)
It doesn't include pointer arithmetic. You can still use pointers for doing things like linked lists, pass-by-reference, etc. but you're limited in many ways.
Re: (Score:2)
Re: (Score:3)
If you read the paper, you'll find it can recognise those patterns in some cases and translate them to use array iterators and slices in Rust. It's nowhere near as useless as some of the commenters here seem to think.
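As a rough illustration of what such a translation can look like, here is a hand-written sketch. It is not output from the researchers' tool, and the names `sum` and `skip_header` are invented for this example. A C loop that walks a buffer with `p++` maps onto a slice iterator, and a `p + n` offset maps onto `split_at`.

```rust
// C original this sketch mirrors (pointer arithmetic walks the buffer):
//
//     uint32_t sum(const uint32_t *p, size_t len) {
//         uint32_t total = 0;
//         for (const uint32_t *end = p + len; p != end; p++) total += *p;
//         return total;
//     }

/// The pointer-plus-length pair becomes a slice, and the pointer walk
/// becomes an iterator. `wrapping_add` keeps C's unsigned wrap-around.
fn sum(buf: &[u32]) -> u32 {
    buf.iter().fold(0u32, |total, x| total.wrapping_add(*x))
}

/// In C, `p + n` would mean "the buffer from offset n onward".
/// With slices that is `split_at`, which panics unless `n <= buf.len()`.
fn skip_header(buf: &[u32], n: usize) -> (&[u32], &[u32]) {
    buf.split_at(n)
}

fn main() {
    let buf = [1, 2, 3, 4];
    let (header, body) = skip_header(&buf, 1);
    println!("{:?} {:?} {}", header, body, sum(body));
}
```

The slice carries its own length, so the out-of-bounds offset that C would silently accept becomes a checked operation.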
Re: undergraduate work makes /. news (Score:1)
No, he's doing the same crap rust haters typically do: skimming over shit, not bothering to understand it, and then making laughably ill-informed statements about the language.
In this case, he's confusing pointer arithmetic with pointers. It's a pretty big distinction and pretty hard to fuck up, but he managed to do it anyways. He is the kind of developer who goes around talking about why you don't need memory safety because people like himself never make mistakes, only the other "bad" developers do, and th
Re: (Score:3)
The article and the summary say that the conversion will not work for code that uses pointer arithmetic. I'm sure that general pointers (such as pointers to arrays) are probably workable.
The gist of it, though, seems to be, "first write safe C code using a minimal subset of the language, then our program will convert it to Rust." So the program is currently of minimal use.
Re: (Score:3)
I thought in C arrays are just syntactic sugar for pointer arithmetic. That is, when referenced a static array is converted to a pointer to the first element of the array, and "array[n]" is identical to "*(array + n)", which uses pointer arithmetic. And dynamically allocated arrays are always pointers to the first element. Does that mean you can't use arrays in Mini-C at all?
Re: (Score:3)
That's really just an implementation detail. Provided you only use array syntax and don't rely on decay to pointers, you can translate to a language that has first-class arrays. You can read the paper [arxiv.org]. It does support structs and arrays, and recognises at least some uses of malloc.
Re: (Score:1)
A subset of C that doesn't support even malloc()/alloca()'d arrays would be nearly useless; however, the Mini-C described in this article does allow for array indexing with the [] syntax. And the research paper [arxiv.org] explicitly says so on page 3.
And yes, to your point, array subscripting is identical to a specific pointer arithmetic expression; per the C23 standard: [open-std.org]
6.5.2.1 paragraph 2 makes this point clear:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
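The same identity can be spelled out with Rust raw pointers. This is a sketch only: `read_by_offset` is a hypothetical helper, and the explicit length check is something C's definition does not include.

```rust
/// Read `values[index]` the way the C definition spells it: `*(values + index)`.
fn read_by_offset(values: &[u8], index: usize) -> Option<u8> {
    if index >= values.len() {
        return None; // C would not check; this sketch refuses instead
    }
    let base: *const u8 = values.as_ptr();
    // SAFETY: `index < values.len()`, so `base.add(index)` stays inside the slice.
    Some(unsafe { *base.add(index) })
}

fn main() {
    let values = [7u8, 8, 9];
    for i in 0..values.len() {
        // Subscripting and base-plus-offset read the same element.
        assert_eq!(read_by_offset(&values, i), Some(values[i]));
    }
    println!("{:?}", read_by_offset(&values, 3)); // past the end
}
```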
But this particula
Re: (Score:2)
Probably since things have been in a gradual circle down the drain since 2019.
Android apps in C? (Score:2)
> "Memory safety errors account for 76% of Android vulnerabilities in 2019."
This statement seems out of place. Aren't the overwhelming majority of Android apps written in Java or Kotlin (which is based on Java)?
Or at least, wouldn't that have been the case 5 years ago (in 2019)?
So how's a C converter going to help with that?
Re: (Score:2)
They're probably talking about vulnerabilities in the Android OS itself and the native libraries it includes out-of-the-box.
A bit pointless.. (Score:3)
You probably could just convert C into C with a memory safe library instead and not have to change the entire toolchain and deal with the compatibility loss and all that.
Rust == C++ while Zig == C (Score:2)
Lack of pointer arithmetic limits the use of this to the simplest C programs, but I don't understand why Rust is even mentioned in the same breath as C. Rust is closer to C++ than C. If you want Rust for C, then Zig is a better option. Zig improves on C's performance and size characteristics while providing memory safety and maintaining C FFI compatibility. Once you wrap your head around Zig's pointer semantics, it feels very C like. Rust does not feel like C while being slower and heavier than Zig, why eve
Re: Rust == C++ while Zig == C (Score:2)
Zig doesn't advertise itself as memory safe, nor is it trying to be. Not only that but rust performance and memory characteristics are, for all intents and purposes, indistinguishable from C. The same cannot be said for C++. Among other things, rust doesn't stick a vtable on literally everything, and it doesn't do pointer aliasing. Those are generally why C++ is slower. Rust also doesn't do object oriented shit.
MICROS~1 defective by design (Score:2)
Because winTEL can't design an MMU that can successfully isolate process memory.
“Memory safety errors account for 76% of Android vulnerabilities in 2019.” “and are currently 24% in 2024, well below the 70% industry norm, and continuing to drop.” Googleblog [googleblog.com]
Re: MICROS~1 defective by design (Score:2)
“Memory safety errors account for 76% of Android vulnerabilities in 2019.” “and are currently 24% in 2024, well below the 70% industry norm, and continuing to drop.”
I think the point being made is that 2019 was before Google started using Rust in Android. Google themselves credit Rust as THE reason for that reduction. To date they've found no vulnerabilities in their new Rust code, whereas their vulnerability rate for new C++ code is just as high as ever.
Mr. Potato-Head Speaks (Score:2)
But faulty pointer arithmetic is the very foundation of hidden back-doors! Rust and mini-C are the mom and pop poo-poo heads who are going to suck the fun out of everything.
Wake me up... (Score:2)
Wake me up when they can convert pointer arithmetic into safe Rust. Not everything, but enough to be useful. Thxbai.
So not-C can be converted to not-C. Great! (Score:2)
A completely worthless stunt, nothing else. To use this, you have to rewrite existing C code, often heavily.