Writing Code for Spacecraft

Posted by michael
from the carried-the-bits-uphill-one-by-one-in-the-snow dept.
CowboyRobot writes "In an article subtitled 'And you think *your* operating system needs to be reliable,' Queue has an interview with the developer of the OS that runs on the Mars Rovers. Mike Deliman, chief engineer of operating systems at Wind River Systems, has quotes like, 'Writing the code for spacecraft is no harder than for any other realtime life- or mission-critical application. The thing that is hard is debugging a problem from another planet.' and, 'The operating system and kernel fit in less than 2 megabytes; the rest of the code, plus data space, eventually exceeded 30 megabytes.'"

  • Wait a minute? (Score:4, Insightful)

    by Billly Gates (198444) on Saturday November 20, 2004 @02:52PM (#10875669) Journal
    Wasn't the OS aboard the Rover loaded with problems? Go read the news from last February here on Slashdot.

    VxWorks does not even offer memory protection, and the RAM can get fragmented. Not to sound trollish, but I would pick something like QNX or NetBSD for any critical app or embedded device.

    It's amazing the engineers fixed it and got it to work reliably, but a more mission-critical operating system would have been a better choice.

  • by Dominic_Mazzoni (125164) on Saturday November 20, 2004 @03:08PM (#10875748) Homepage
    For those who are wondering, JPL is very aware of the shortcomings of VxWorks and has seriously considered other alternatives for every mission. Keep in mind that the choice of OS has to be made years before launch, so at the time the OS for the 2004 Mars Rovers was decided on, many options that are possibilities today were not contenders. Also keep in mind that in spite of many shortcomings, VxWorks is a known quantity. JPL has been working with it for years and has a lot of in-house expertise with it.

    There are a few groups at JPL that have been actively experimenting with other options, including RTLinux and a few different variants of hard-real-time Java (basically Java with explicit memory management and no garbage collection).
  • Re:Wait a minute? (Score:5, Insightful)

    by neonstz (79215) * on Saturday November 20, 2004 @03:20PM (#10875822) Homepage
    VxWorks does not even offer memory protection, and the RAM can get fragmented.

    Dynamically allocating memory is usually a big no-no in real time systems.

  • Re:hmm... (Score:4, Insightful)

    by Richthofen80 (412488) on Saturday November 20, 2004 @03:23PM (#10875837) Homepage
    Theirs might be more reliable, but at a huge cost.

    Probably not as big a cost as losing a Mars rover because your OS wasn't reliable enough.
  • Re:Wait a minute? (Score:4, Insightful)

    by RAMMS+EIN (578166) on Saturday November 20, 2004 @03:41PM (#10875916) Homepage Journal
    ``VxWorks does not even offer memory protection, and the RAM can get fragmented.''

    Why would you even want memory protection in a system like this? Memory protection is great to prevent crappy apps on your PC from doing too much damage, but in a system like the Rover it's pure overhead.

    As for RAM getting fragmented, it all depends on how you program it. Often, you don't even need dynamic memory allocation, so you won't have any problem with fragmentation.
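
One common way to avoid fragmentation is to create every buffer once at startup and only hand out and return fixed-size blocks afterwards. The sketch below is a minimal illustration of that idea in Python, not anything taken from VxWorks or the Rover code; the class name `FixedBlockPool` and its methods are hypothetical.

```python
from typing import List, Optional


class FixedBlockPool:
    """Pool of equally sized buffers, all created once at startup (hypothetical)."""

    def __init__(self, block_size: int, block_count: int) -> None:
        # Every buffer is created here, before normal operation begins.
        self._blocks: List[bytearray] = [
            bytearray(block_size) for _ in range(block_count)
        ]
        # Indices of blocks that are currently free.
        self._free: List[int] = list(range(block_count))

    def acquire(self) -> Optional[int]:
        """Return the index of a free block, or None if the pool is exhausted."""
        return self._free.pop() if self._free else None

    def release(self, index: int) -> None:
        """Give a block back to the pool; a double release is rejected."""
        if index in self._free:  # linear scan, fine for a small sketch
            raise ValueError(f"block {index} is already free")
        self._free.append(index)

    def buffer(self, index: int) -> bytearray:
        """Access the storage behind a block index."""
        return self._blocks[index]

    @property
    def free_count(self) -> int:
        return len(self._free)
```

Because all blocks have the same size and are never created or destroyed after startup, the pool cannot fragment; the worst case is that `acquire()` returns `None`, which the caller has to handle explicitly.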
  • Re:Efficiency (Score:3, Insightful)

    by CarlDenny (415322) on Saturday November 20, 2004 @03:46PM (#10875940)
    Just to clarify, VxWorks runs on a hell of a lot of hardware, dozens of CPUs across all the major families, thousands of device drivers.

    Now, any particular instance of the kernel gets compiled for a specific processor and only includes the drivers it needs, which does save some space. But a lot of that extra space comes from things like a dynamic linker/loader, graphics packages, local shells (usually in multiple flavors), and a host of other applications that are "standard."

    The thing that saves *that* space is the local WDB debugging agent. It lets you offload almost all of the bells and whistles to another machine, which does the object loading, provides your shell, does whatever debugging you need, and then sends simple instructions to the agent to carry out. That dramatically increases the interface capabilities without increasing the footprint.
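
The host/agent split described above can be sketched roughly as follows. This is not the WDB protocol: the opcodes, the 9-byte packet layout, the class names, and the `wheel_speed` symbol are all invented for illustration. The point is only that the target side understands raw addresses, while the symbol table and shell live on the host.

```python
import struct

OP_READ, OP_WRITE = 1, 2
PACKET = "<BII"  # opcode, address, value: a made-up 9-byte wire format


class TargetAgent:
    """Stand-in for the small agent on the target; it knows nothing about symbols."""

    def __init__(self, memory_size):
        self.memory = bytearray(memory_size)

    def handle(self, packet):
        op, addr, value = struct.unpack(PACKET, packet)
        if op == OP_WRITE:
            struct.pack_into("<I", self.memory, addr, value)
        elif op != OP_READ:
            raise ValueError(f"unknown opcode {op}")
        # Both operations reply with the 32-bit word now stored at addr.
        return bytes(self.memory[addr:addr + 4])


class HostShell:
    """Stand-in for the host tools; the symbol table never reaches the target."""

    def __init__(self, agent, symbols):
        self.agent = agent
        self.symbols = symbols  # name -> address, kept on the host only

    def _request(self, op, name, value=0):
        packet = struct.pack(PACKET, op, self.symbols[name], value)
        return struct.unpack("<I", self.agent.handle(packet))[0]

    def read(self, name):
        return self._request(OP_READ, name)

    def write(self, name, value):
        return self._request(OP_WRITE, name, value)
```

In a real setup the packet would travel over a serial line or network link rather than a direct method call, but the division of labor would be the same.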
  • Marketing crap (Score:4, Insightful)

    by jeffmock (188913) on Saturday November 20, 2004 @03:54PM (#10875994)
    Okay, I've got to call foul on this WindRiver marketing ploy. They're trading on the last days of being able to get away with saying that something mystical and special and super-high quality is going on behind the walls of trade secret and proprietary software.

    I used VxWorks on a reasonably large project several years ago. It's a fine piece of work, but nothing special; it's nowhere close to the quality of a recent Linux kernel.

    About halfway through our project we developed a need for a local filesystem on our box. We bought a FAT filesystem add-on from Wind River that was of annoyingly poor quality: lots of bizarre little problems, memory leaks, and of course no source to look at. In the end we didn't use it; we put together our own filesystem from freely available sources.

    When I read the articles about VxWorks filesystem problems nearly borking the entire Mars rover mission, I laughed and laughed. I'm sure that it was the same crappy code (although I don't really know for sure).

    For me it's a case study in why you shouldn't use closed source software: you can't evaluate the quality of the code on the other side of the trade-secret barrier, and you wind up trusting things like glossy brochures.

    jeff
  • by relaxrelax (820738) on Saturday November 20, 2004 @03:56PM (#10876018)

    If that was open source, there are so many space nerds who are programmers that flaws of that magnitude would never get by the army of testers.

    Many would help out simply because hey it's the *space program* and that's good enough for them. Others would want their name listed next to some obscure bug fix on a NASA site; it's good for the ego or your CV.

    Simply put, even a binary distribution of that code would allow unlimited free testing for crashes. Why wouldn't NASA do it?

    Because there are still people in Washington who think code mysteriously gets damaged by being public, even if such code isn't modifiable by the public who reads it.

    This is evidence of advanced cluelessness in Washington, and maybe independent anti-free-source advocates (spelled M-i-c-r-o-s-o-f-t) are the cause.

    But I've learned not to bash. Never explain by Microsoft malice what could be explained by stupidity. Such as using DOS on a space thing...
  • by Gogo Dodo (129808) on Saturday November 20, 2004 @05:25PM (#10876566)
    Uhhh... and exactly how are you going to allow people to test "spaceware"? Last I checked, nobody owns their own satellite system. You just don't dump some satellite code onto your PC and "test" it.

    Open Source is great and all, but it's hardly the answer to everything.

  • Hell yes! (Score:5, Insightful)

    by devphil (51341) on Saturday November 20, 2004 @05:38PM (#10876651) Homepage


    Why would you even want memory protection in a system like this? Memory protection is great to prevent crappy apps on your PC from doing too much damage, but in a system like the Rover it's pure overhead.

    Exactly!

    The problem is that most /.ers are used to thinking of an OS as something that needs to run any arbitrary program under any arbitrary conditions and survive any arbitrary crash in those programs.

    For a Rover, none of those are true. They know exactly what code is going to be run. They know exactly where it's going to sit in memory. And they test it. (This is the part that /.ers can't quite understand.) They test these programs far more rigorously than any bog-standard x86 Linux OSS program ever gets tested. Those programs have their problems, but they will be mistakes in logic (metric/imperial conversions, or thread priority inversions), not segfaults because of derefing a null pointer.

    I wonder how many undergrad CS degree programs still teach correctness proofs? Not "yeah, I ran it lots of times and it didn't crash," but "I ran it 100,000 times with 100,000 different inputs, all random, and it didn't crash, but while it was running I also sat down and mathematically proved the code is correct."

    Embedded programming is just plain different than "normal" programming. It's usually a mistake to try to generalize from one to the other.

    (All that said, the next version of VxWorks is advertised to optionally support a "traditional Unix" process model, and I think protected memory boundaries are one of the features. In case your embedded app needs to run arbitrary third-party software which probably doesn't get stress-tested at JPL :-), you can turn all that stuff on and live with the overhead.)
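
The "100,000 random inputs" half of that approach is easy to sketch; the proof half is not. The example below is a hypothetical harness, not JPL's test setup: it round-trips random distances through a metric/imperial conversion and counts how many fail to come back unchanged within a tolerance. The function names, the input range, and the tolerances are assumptions made for the illustration.

```python
import math
import random

METERS_PER_FOOT = 0.3048  # exact by definition of the international foot


def meters_to_feet(meters):
    return meters / METERS_PER_FOOT


def feet_to_meters(feet):
    return feet * METERS_PER_FOOT


def run_random_trials(trials=100_000, seed=2004):
    """Round-trip random distances and return the number of failed trials."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    failures = 0
    for _ in range(trials):
        meters = rng.uniform(-1e6, 1e6)
        back = feet_to_meters(meters_to_feet(meters))
        # Allow only floating-point rounding error, nothing larger.
        if not math.isclose(back, meters, rel_tol=1e-12, abs_tol=1e-9):
            failures += 1
    return failures
```

A run with zero failures only says that no counterexample was found among the sampled inputs, which is exactly why the comment above pairs random testing with a proof.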

  • by Anonymous Coward on Saturday November 20, 2004 @07:02PM (#10877106)
    In my experience, it is engineers who do not understand how to correctly employ mutexes and semaphores that always cause trouble.
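
A classic way to get mutexes wrong is the priority inversion mentioned earlier in the thread: a low-priority task holds a lock that a high-priority task needs, and a medium-priority task keeps the low one from ever finishing. The toy simulation below is a sketch under simplifying assumptions (one CPU, one lock, whole-tick scheduling, a made-up workload, and a task holding the lock for all of its work); it is not how VxWorks schedules tasks. It compares finish times with and without priority inheritance.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int      # larger number = more urgent (a convention of this sketch)
    arrival: int       # tick at which the task becomes ready
    work: int          # ticks of CPU time still needed
    needs_lock: bool   # if True, all of its work runs while holding the lock
    finished_at: int = -1


def make_tasks():
    """Hypothetical workload: low takes the lock, then high and medium arrive."""
    return [
        Task("low", priority=1, arrival=0, work=3, needs_lock=True),
        Task("medium", priority=2, arrival=1, work=5, needs_lock=False),
        Task("high", priority=3, arrival=1, work=1, needs_lock=True),
    ]


def simulate(tasks, inherit):
    """Run a one-CPU, one-lock schedule; return the finish tick of each task."""
    owner = None  # task currently holding the lock, if any
    tick = 0
    while any(t.work > 0 for t in tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.work > 0]

        def is_blocked(t):
            # Blocked means: needs the lock while another task holds it.
            return t.needs_lock and owner is not None and owner is not t

        blocked = [t for t in ready if is_blocked(t)]
        runnable = [t for t in ready if not is_blocked(t)]

        def effective_priority(t):
            # Priority inheritance: the lock owner borrows the priority of
            # the most urgent task waiting on the lock.
            if inherit and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if runnable:
            current = max(runnable, key=effective_priority)
            if current.needs_lock:
                owner = current          # acquire (or keep) the lock
            current.work -= 1
            if current.work == 0:
                current.finished_at = tick + 1
                if owner is current:
                    owner = None         # release the lock on completion
        tick += 1
    return {t.name: t.finished_at for t in tasks}


print(simulate(make_tasks(), inherit=False))  # {'low': 8, 'medium': 6, 'high': 9}
print(simulate(make_tasks(), inherit=True))   # {'low': 3, 'medium': 9, 'high': 4}
```

Without inheritance the high-priority task finishes last, behind the medium one; with inheritance it finishes at tick 4 because the lock holder is temporarily boosted.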
  • Ditto (Score:3, Insightful)

    by wowbagger (69688) on Saturday November 20, 2004 @11:17PM (#10878549) Homepage Journal
    I as well have had the misfortune to pick WindRiver as the core OS for my project, and have had no end of problems.

    Part of the problem in my case was that VxWorks is for smaller embedded systems, which my project is NOT. I need fast disk storage, I need graphics, I need networking, I need things that VxWorks just doesn't provide very well.

    Were I able to change one decision about the design of my project, I would have gone with Linux instead.

    WRS *used* to have something to offer, in that they provided a real-time OS and hardware driver bundles (board support packages in WRS-speak). However, they no longer provide great value in that area - Linux has far better hardware support, and for any reasonably complex project will scale down as well as VxWorks will scale up.
  • by DerekLyons (302214) <fairwater@gmail. c o m> on Sunday November 21, 2004 @12:30AM (#10878860) Homepage
    If that was open source, there are so many space nerds who are programmers that flaws of that magnitude would never get by the army of testers.
    Almost certainly not, as none of that army of geeks would have the specialized hardware that the Rovers use.
    Many would help out simply because hey it's the *space program* and that's good enough for them.
    Few would accomplish anything, as few would bother to study, and learn, and analyze the structure of the program.
  • Huh? (Score:4, Insightful)

    by devphil (51341) on Sunday November 21, 2004 @02:54AM (#10879426) Homepage


    I need fast disk storage, I need graphics, I need networking, I need things that VxWorks just doesn't provide very well.

    "...and even though I chose the wrong tool for the job, it's still the tool's fault for not doing everything I need."

  • by grozzie2 (698656) on Sunday November 21, 2004 @06:45AM (#10879982)
    This just illustrates why /. folks are typically not actually involved in spacecraft design and deployment. If you were, you would know the real reason for this, and wouldn't ask the question (which is not a dumb question btw).

    In the real world, once you get up in the vicinity of the Van Allen belt, you get into hard radiation. If you use typical modern high-density chips, with 0.15 micron die spacing, a single particle will short or damage half a dozen traces on the chip in a single impact. If you use really old stuff, with 5 micron die spacing (and higher), a particle will be too small to hit multiple traces in a single impact. You may still get a single bit flip, but ECC will catch that, and you can deal with it. In the former case of a high-density die, the failure would end up being catastrophic when a particle impacts the chip. There are practical limits to the size of die that can be mounted on a carrier, and the trace density defines the capacity of that die. Yes, it's possible to cram 32 meg of RAM into that space, but it won't last more than a few minutes in a hard radiation environment. Take that same silicon wafer, using 5 micron traces, and it'll last years exposed to the same environment, but it'll only have 1 meg of usable RAM locations due to the decrease in density. You can't just throw more of them on, because then power consumption becomes the issue. In overly simplified terms, the chip is going to use power relative to its surface area; it matters not if it's got 1 or 32 meg of addressable locations in that area. Clock frequency is the other major contributor to power consumption, hence it's not uncommon at all to see space hardware clocked in kHz rather than the MHz and GHz most folks are used to, and there are damn good reasons to leave it that way.

    An all-up spacecraft platform has hard limits on physical size (constrained by the physical limits of the launcher) and hard limits on total mass, determined by the launch vehicle's capability to the final trajectory required. The final design will budget a portion of its mass allowance to power generation, and that power is in turn budgeted to various systems. The folks doing the controllers will have a hard limit on power consumption, another on volume, and a third on mass. Working within those limits, they have to design and deploy a system that is expected to have 99.999999% reliability, operating in conditions more extreme than it's possible to actually simulate on Earth.

    It's a shame, but there is one thing they don't seem to teach in computer science courses anymore. Out here in the real world, reality gets in the way of all the theory. Moore's law may well say chips will get faster, and density higher, as time goes on, but it becomes irrelevant when other limiting factors get in the way. Until gamma particles start to shrink, or we come up with an effective way of making sure they don't hit the electronics, 10-year-old and older stuff is going to remain 'state of the art' for use in space. Die density and ability to shield are hard limitations. You can't get past them, and you won't see more modern equipment going into the reaches of space till those limitations are overcome. That's not likely to happen in the foreseeable future; the research in that area is all 'nuclear research' and that's all out of vogue these days. It's gonna take a couple more generations or a severely critical power shortage to change that.
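
The "single bit flip that ECC will catch" can be made concrete with the textbook Hamming(7,4) code, which stores 4 data bits in 7 and can locate and correct any one flipped bit. This is a sketch of the principle only; memory for flight hardware would normally use wider codes implemented in hardware, and nothing here is taken from the Rover design.

```python
def encode(nibble):
    """Encode 4 data bits (0..15) as a 7-bit Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                             # index 0 unused; positions 1..7
    code[3], code[5], code[6], code[7] = d
    # Each parity bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def decode(codeword):
    """Return (nibble, error_position); error_position is 0 if no flip was seen."""
    code = [0] + list(codeword)
    # XOR of the positions holding a 1 is zero for a valid codeword and
    # equals the position of the flipped bit after a single-bit error.
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    if syndrome:
        code[syndrome] ^= 1                    # correct the flipped bit
    data_bits = (code[3], code[5], code[6], code[7])
    nibble = sum(bit << i for i, bit in enumerate(data_bits))
    return nibble, syndrome


word = encode(0b1011)
word[4] ^= 1              # simulate a particle flipping the bit at position 5
print(decode(word))       # (11, 5): data recovered, error located
```

Two flips in the same codeword defeat this code, which fits the comment's point: a die where one impact damages several neighboring traces is a much worse problem than one where it flips a single bit.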

  • An important point (Score:2, Insightful)

    by m3talsling3r (624150) on Monday November 22, 2004 @01:54PM (#10889662) Homepage
    I hope no one overlooked the "radiation hardening" part of the article. This is something the average person, and even a lot of techs I talk to, don't realize is important. Speed is not the only variable in the equation. I'd much rather have a chip that doesn't fall to pieces on me while I'm flying through space. In fact, I think it's time for us normal people to get used to thinking about quality again. We are soon going to be forced into harsh elements where we must be able to depend, absolutely, on the hardware being reliable. It's time we start getting used to the performance loss that might come with it; or get ready to ditch thin again.
  • RTFA (Score:1, Insightful)

    by Anonymous Coward on Monday November 22, 2004 @05:17PM (#10891637)
    If you had read the article, you would have discovered that JPL had full source code to VxWorks. The article belabors the fact that the folks at WindRiver went out of their way to make sure that JPL could compile the entire system from scratch.

    I'm as fervent a WindRiver basher as the next guy. But at least bash them for things they are *guilty* of. Sheesh!
