MenuetOS, an Operating System Written Entirely In Assembly, Hits 1.0
angry tapir writes: MenuetOS, a GUI-toting, x86-based operating system written entirely in assembly language that's super-fast and can fit on a floppy disk, has hit version 1.0 — after almost a decade and a half of development. (And yes, it can run Doom). The developers say it's stable on all hardware with which they've tested it. In this article, they talk about what MenuetOS can do, and what they plan for the future. "For version 2.0 we'll mostly keep improving different application classes, which are already present in 1.00. For example, more options for configuring the GUI and improving the HTTP client. The kernel is already working well, so now we have more time to focus on driver and application side."
Wow. Still? (Score:5, Insightful)
I remember futzing around with this little project 15 years ago. I am pleased to see that, not only is it still going strong, it's pretty remarkably modern.
is this year (Score:5, Funny)
of assembly language on the desktop? will it run linux?
Really? (Score:5, Funny)
Question: MenuetOS is entirely written in assembly? There are no traces of other languages, such as C?
Reply: nop
Re: (Score:2, Informative)
Question: MenuetOS is entirely written in assembly?
Yes, that's the whole point of the OS.
Re: (Score:2)
You clearly don't speak assembly--or funny ;)
Entire OS in about 1/3 of i7 Cache (Score:5, Insightful)
I'm not really sure that I understand that point. To me, that would sound reasonable for educational or entertainment purposes, but are there any other meaningful reasons for writing an entire OS in assembler?
The entire OS would occupy about 1/3 of an Intel i7's cache. For ultra-high performance apps that might actually be useful.
Of course that includes user land apps and such so the footprint of the OS itself would probably be far smaller.
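To put a rough number on the "1/3 of cache" figure, here is a minimal Python sketch. It assumes the whole image is a standard 1.44 MB floppy (1,474,560 bytes) and that the last-level cache is 4 MiB or 8 MiB. Both cache sizes are assumptions for illustration, not figures from this thread, and `fraction_of_cache` is a hypothetical helper name.

```python
# Standard 3.5" HD floppy: 80 tracks * 2 sides * 18 sectors * 512 bytes.
FLOPPY_BYTES = 80 * 2 * 18 * 512  # 1,474,560 bytes

def fraction_of_cache(image_bytes: int, cache_mib: int) -> float:
    """Return the share of a cache of cache_mib MiB taken by image_bytes."""
    return image_bytes / (cache_mib * 1024 * 1024)

if __name__ == "__main__":
    # The cache sizes below are assumed values, not measured ones.
    for cache_mib in (4, 8):
        share = fraction_of_cache(FLOPPY_BYTES, cache_mib)
        print(f"{cache_mib} MiB cache: floppy image is {share:.1%} of it")
```

Under the 4 MiB assumption the image is about 35% of the cache, which is in line with the "about 1/3" estimate; with 8 MiB it drops to roughly 18%.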
Re: Entire OS in about 1/3 of i7 Cache (Score:3)
yeah, and think of the lower-powered chips. If I can get an HTTPS server into cache at 10W TDP, a whole new class of apps becomes possible. How small can a Go runtime be?
Re: Entire OS in about 1/3 of i7 Cache (Score:4, Informative)
Someone would have to be crazy enough to try to write an SSL & PKIX library fully in assembly to get that HTTPS server working.
Re:Entire OS in about 1/3 of i7 Cache (Score:5, Informative)
None of that follows from it being written in assembly.
My first 5 years as a dev were spent working on a team that maintained an all-assembly OS, database, print server, and so on. Very fast, very small. But you can get all that in C, and a modern C compiler on full optimization produces object much faster than any sane, maintainable assembly source.
It's a mindset issue, not a toolset issue. Actually using assembly is a great way to keep the right mindset throughout. Plus, learning to read assembly fluently, find bugs by glancing at core dumps, fix bugs with a sector editor to save an assembler pass when you're in a hurry, bugfix a running program in memory - all of that builds your coding muscles, even if it's all a bit silly for a modern prod environment.
Re:Entire OS in about 1/3 of i7 Cache (Score:5, Informative)
From experience I know that a well-trained, well-weathered assembly hacker can generate code faster than the compiler.
This hasn't been true since instruction pipelining. And it hasn't been true for maintainable assembly source for about 20 years.
Quick, what's the fastest way to multiply an integer by 8? Did you answer "depends on the previous 20 instructions, but probably 3 adds separated by 3-4 instructions each"? There's just too much state to keep track of with which silicon is doing what for how long following each instruction.
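For reference, the candidate sequences all compute the same value. A minimal Python sketch, assuming an unsigned 32-bit register emulated with a mask; the function names are made up for illustration. It only checks that the forms agree and says nothing about which is fastest on a given CPU, since that depends on the surrounding instructions as described above.

```python
MASK32 = 0xFFFFFFFF  # emulate wraparound in an unsigned 32-bit register

def mul8_multiply(x: int) -> int:
    """Multiply by 8 with a plain multiply."""
    return (x * 8) & MASK32

def mul8_shift(x: int) -> int:
    """Multiply by 8 with a left shift by 3."""
    return (x << 3) & MASK32

def mul8_adds(x: int) -> int:
    """Multiply by 8 with three dependent adds: x -> 2x -> 4x -> 8x."""
    x2 = (x + x) & MASK32
    x4 = (x2 + x2) & MASK32
    return (x4 + x4) & MASK32

if __name__ == "__main__":
    for value in (0, 1, 5, 12345, 0x20000000, MASK32):
        assert mul8_multiply(value) == mul8_shift(value) == mul8_adds(value)
    print("all three forms agree")
```

The interesting part is what the sketch cannot show: a compiler picks among these based on which execution units are busy, which is exactly the state a human has trouble tracking.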
Re: Entire OS in about 1/3 of i7 Cache (Score:5, Informative)
Not a left shift?
Depends - did you do a shift or multiply in the previous several instructions? When do you need the result? Do you need to add anything else soon, or is the adding silicon idle? Modern compilers actually keep track of all that. For sure, if you need to shift two values, doing each a different way is faster.
The important stuff the coder can provide to help with optimization comes down to: avoid conditional branches, and emphasize locality of reference (I miss the days when a 256-byte lookup table was the fast answer to most bitwise questions).
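As an illustration of the 256-byte lookup table idea, here is a minimal Python sketch that counts set bits in a 32-bit value one byte at a time. The table layout and the names `POPCOUNT` and `popcount32` are assumptions chosen for the example, not something from the code discussed in this thread.

```python
# Build the 256-entry table once: POPCOUNT[b] is the number of set bits in byte b.
POPCOUNT = bytes(bin(b).count("1") for b in range(256))

def popcount32(x: int) -> int:
    """Count set bits in a 32-bit value with four table lookups and no branches."""
    x &= 0xFFFFFFFF
    return (POPCOUNT[x & 0xFF]
            + POPCOUNT[(x >> 8) & 0xFF]
            + POPCOUNT[(x >> 16) & 0xFF]
            + POPCOUNT[(x >> 24) & 0xFF])

if __name__ == "__main__":
    print(popcount32(0b1011))
```

The table is small enough to stay in cache and the lookup has no conditional branches, which covers both pieces of advice at once.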
Re: (Score:3)
Michael Abrash's ancient-but-awesome book, "The Zen of Graphics Programming", contains a chapter called "Heinlein's Crystal Ball, Spock's Brain, and the 9 Cycle Dare" that's about Abrash's adventures in coming up with the fastest CPU-driven texture mapper he could write in hand-written assembly for his X-Sharp graphics library (he eventually en
Re: Really? (Score:5, Funny)
I can tell you weren't alive during the Reagan administration.
Was anyone?
Re: (Score:3)
From what I remember, Ronald Reagan wasn't.
Re: (Score:3)
The problem with this is that smaller, cheaper, and less power-hungry hardware these days has an entirely different architecture, called "ARM". (I think MIPS is also alive and kicking, being used in many embedded applications as well as ARM.) This shows why writing stuff in assembly is generally a bad idea: you can't port it to another architecture.
Re: (Score:2)
I'm not really sure that I understand that point. To me, that would sound reasonable for educational or entertainment purposes, but are there any other meaningful reasons for writing an entire OS in assembler?
Today, not that much apart from looking cool. Not a lot of programmers know assembly that well anymore so writing a non-trivial operating system completely with it is definitely something to put on the resume. It used to be necessary to use assembly to get good performance, but since the late 80's and early 90's it's not really necessary anymore on personal computers.
Worse, no Unix or POSIX either (Score:4, Informative)
From their own site [menuetos.net]:
So, if you want to port your own application to it, you'll need to rewrite it too. And you may need to do it in assembly — although there is, apparently, a C-compiler for MenuetOS [goosee.com] it is billed as "low-level", which, I gather, means no (or limited) libc, and other exciting and challenging limitations.
Re: (Score:3)
That sounds like they've been actively un-learning all of the lessons the computer science field has spent decades learning.
Here everyone else has been learning how abstraction can promote collaboration and keep bugs simple, and they've found a way to justify removing abstraction as a way to reduce complexity (lol?).
Re: Worse, no Unix or POSIX either (Score:3)
There is a problem these days with overgeneralization and abstraction to the point where things become unmaintainable and can't be troubleshot. E.g. JavaScript.
Re: (Score:3, Informative)
http://toastytech.com/guis/qnxdemo.html [toastytech.com]
Re:Really? (Score:5, Insightful)
Seriously? There's absolutely nothing suspicious about the AC's claim. Countless hobbyists have done the same thing.
I know the younger crowd seems to think assembly seems a bit like incomprehensible magic, but it's really not.
So give it another 15 years... (Score:5, Funny)
And they'll have something that'd be marginally useable today.
Come a long way (Score:5, Insightful)
Wow, slashdot has come a long way from when I first started reading "chips & dips" in 1997. Even just 10 years ago, a story like this would have been met with enthusiasm and honest support, with a virtual pat on the back to the developers.
Today, a story like this is reduced to a mere platform for chest-beating (see the parent above). As in, "nevermind the lame story, look at me instead". Why in the world are you people even here?
Re: Come a long way (Score:3, Insightful)
After the Beta fiasco and the Systemd wars combined with the 7month long--every day--posts of "all STEM/geek males are severely sexist," all the quality slashdotters left.
Right now we are basically in a post-apocalyptic state here.
Re:Come a long way (Score:4, Insightful)
Wow, slashdot has come a long way from when I first started reading "chips & dips" in 1997. Even just 10 years ago, a story like this would have been met with enthusiasm and honest support, with a virtual pat on the back to the developers.
Today, a story like this is reduced to a mere platform for chest-beating
To be fair, the vast world of computers and software has come a long way since 1997. What might have been an interesting accomplishment in 1997 is now basically an exercise in pointlessness. Sure, it can be done. Sure, it's small and fast. But so what? What was actually accomplished that's worth anything? Processor power and memory advances since 1997 have obviated any reasonable need for an operating system such as the one described here, and the demands made on modern operating systems pretty much dictate that they be a whole lot more maintainable than any assembly code will ever be.
Re:Come a long way (Score:5, Insightful)
"What was actually accomplished that's worth anything?"
I can turn my computer on and pretty much INSTANTLY use it after POST.
The only other OSes to do that have been purely command-line.
The entire OS can fit within the L3 or L2 cache of modern processors, leaving the RAM entirely free for applications. Got a linux distro that will do that?
It also demonstrates the absolute shit bloat code inefficiencies of today.
MenuetOS is the demoscene of operating systems. It does what any operating system can do, faster, with fewer resources, in much smaller, tighter, SUPERIOR code.
Also, because of the code being pure ASM, most nimrod 'hackers' won't even be able to touch it, as they know jack shit about bare-metal programming. It's more secure by its nature, not by obscurity.
Re:So give it another 15 years... (Score:5, Funny)
Well if it ran Emacs it would be able to do anything anyone could possibly want.
Re:So give it another 15 years... (Score:5, Funny)
Except edit text.
Re:So give it another 15 years... (Score:4, Funny)
If you use a small enough font it will.
Re: (Score:2)
The real test for computability is to demonstrate that a Turing machine is Emacs-complete.
Sadly, it still requires a chord-enabled keyboard, which the traditional tape-based Turing machine does not implement.
It can run Doom (Score:3)
and their website looks like it's from 1995 as well!
Re:It can run Doom (Score:5, Funny)
and their website looks like it's from 1995 as well!
So it's not bloated and it is fast too? Seems appropriate. :-)
Re: (Score:2)
For a page so spartan, it shouldn't have needed 3 seconds to load on a 100mbit connection.
Re: (Score:2)
For a page so spartan, it shouldn't have needed 3 seconds to load on a 100mbit connection.
Well it is hosting its own web page and running on a Pentium MMX 200. :-)
Re:It can run Doom (Score:4, Insightful)
and their website looks like it's from 1995 as well!
So it's not bloated and it is fast too? Seems appropriate. :-)
I can state with authority that in 1995, there were exactly zero fast websites. Of this I am 100% certain.
Re:It can run Doom (Score:5, Insightful)
Finally, a web site which doesn't try to overrun your browser with unnecessary rotating images and the latest and greatest shiny because some web designer said, "Why not?"
In other words, a web site which is useful.
Re: (Score:2)
And yet Slashdot loads faster even with all its shit coding.
Looks great for industrial (Score:3)
Re: (Score:2)
if the timing is that tight, it would allow using secondhand PCs instead of ladder logic controllers.
Think high performance computing too. An entire OS that takes a small fraction of the CPU cache.
Re:Looks great for industrial (Score:5, Insightful)
Or a $10 ARM chip, programmed in C.
Not Open (Score:4, Insightful)
http://www.menuetos.net/m64l.t... [menuetos.net]
I might play with it, but if I can't use it for work, play is all it'll be.
Re: (Score:3)
http://www.menuetos.net/m64l.t... [menuetos.net]
I might play with it, but if I can't use it for work, play is all it'll be.
I wonder what commercial uses they're thinking of.
Presumably they're thinking of some super-low footprint embedded devices, but still this seems like a lot more of a fun project than a viable product.
Re:Not Open (Score:5, Interesting)
I wonder what use they're thinking at all. Minix 3 is a step forward: were we to port Linux interfaces for udev, kevents, and such onto Minix, we could drop a Linux userland onto it wholesale, with systemd and all, and benefit from a core operating system which lends itself to drastic rearchitecting.
Consider that running Minix as an OS-level virtualizer--OpenVZ, LXC--is a trivial task, one which requires only providing a different network server and different security features (e.g. users and groups, and their flow through the file system driver and such), largely doable on the existing code base. You could even run Xen on top of Minix with a minor tweak.
Consider that Minix is a collection of services which may be extended by adding other services. It is interfaces and features, not tightly-bonded kernel code. The shape and form of the OS can be changed without commitment: to add services to handle Linux services is not to change Minix, for you may simply not use those services; you could instead add services to make Minix pretend to be OpenBSD. You could replace its threading model or scheduler by swapping out a service, providing a system that uses the advanced features of DragonflyBSD.
There, again, we see a step forward: DragonflyBSD, with its non-locking semaphores, its highly-efficient threading model, its ability to freeze an application and thaw it after a reboot, to checkpoint running applications--a feature Minix does not possess, but could simply by adding a new kernel service--and even to move applications between machines. Extending some of these things would be to bring the features of OpenMOSIX: checkpointing and running an application on a different boot cycle or different machine brings the magic of scheduling applications across a cluster of machines acting as one.
Why, then, do we persist in creating these tinker toys, instead of extending, cannibalizing, or imitating those things which show real progress? Why has Minix not embraced the great strides forward made by Linux and integrated its interfaces so as to integrate its user space in distribution? Why has Linux not subsumed the threading model of Dragonfly BSD? Why has someone chosen to create an OS to no purpose, rather than to create a unified system carrying and integrating the lessons from all prior systems?
Re:Not Open (Score:5, Interesting)
Check out the memory footprint of Linux sometime.
Two megabytes. Four, really, when using a single 4MB huge page to allocate the entire kernel in one go. A few bytes to maintain each task's information, each cached disk page, each handle held by an application.
Linux uses so little memory you can run it on a microcontroller with a megabyte of RAM. When you build up all the services needed to supply network management, graphical systems, user log-on, audio mixing, and so forth, you get maybe 100MB. When you then run a Web browser and go to a few open tabs, you need a gigabyte or three.
Re:Not Open (Score:5, Insightful)
The problem with this thinking is memory is not cheap. Memory has grown, and program memory usage has grown linearly with memory, or greater. The problem is when you run two programs at once: the combined working set is bigger than RAM, and has grown linearly as well. Where you may have had 32MB of RAM and 48MB swapping on and off disk, now you have 16GB of RAM and you're swapping around 10GB of data; but swapping is not now 500 times faster, and so the bloat has slowed the machine.
The growth of the working set means the growth of memory controller latency, the need for RAS and CAS selects on different rows requiring precharges taking up 200 FSB cycles on CPUs which now have multipliers of 10 or 15 instead of 1.5 or 2. Random memory access may now have delays of 3000 cycles, causing an instruction requiring 4 cycles to execute to now take 750 times as long; your typical CAS selection may in fact require 7, 10, even 21 cycles now, multiplied by 10 or 15, so as to take 300 cycles of stall. Modern CPUs are made and broken by their CPU cache efficiency and their predictive execution, and multi-tasking flushes those caches and destroys performance.
Computers have grown to manage immense spans of cheap resources, yet they have not increased their capability to manage resources nearly as fast as they have increased those resources.
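The latency figures in this comment can be checked with simple arithmetic. A minimal Python sketch, using only the numbers quoted above (200 bus cycles, multipliers of 10 or 15, CAS latencies of 7 to 21, a 4-cycle instruction); the helper names are hypothetical and the model ignores caches, prefetching, and overlap between requests.

```python
def stall_cpu_cycles(bus_cycles: int, multiplier: int) -> int:
    """CPU cycles spent waiting while the memory bus takes bus_cycles."""
    return bus_cycles * multiplier

def slowdown(stall_cycles: int, instruction_cycles: int) -> float:
    """How many times longer the stall is than the instruction itself."""
    return stall_cycles / instruction_cycles

if __name__ == "__main__":
    # Row miss with precharge: 200 bus cycles at a 15x multiplier.
    row_miss = stall_cpu_cycles(200, 15)
    print("row miss stall:", row_miss, "CPU cycles")
    print("slowdown for a 4-cycle instruction:", slowdown(row_miss, 4))
    # CAS-only access: 21 bus cycles at a 15x multiplier.
    print("CAS stall:", stall_cpu_cycles(21, 15), "CPU cycles")
```

With these inputs the row miss comes to 3000 CPU cycles, 750 times the 4-cycle instruction, and the worst-case CAS access to 315 cycles, which matches the "300 cycles of stall" estimate.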
Re: (Score:2)
Memory has grown, and program memory usage has grown linearly with memory, or greater.
Because people want more features, not because they were written in a higher language.
now you have 16GB of RAM and you're swapping around 10GB of data; but swapping is not now 500 times faster, and so the bloat has slowed the machine.
I have 8GB of RAM in my PC, and 2GB of swap, none of which is currently used. And compared to my 486 with 16MB, everything is a lot faster, despite the bloat.
Re: (Score:2)
But every time RAM capacity increases and CPU speed increases, the OS gets more bloated and your apps' performance has only marginally increased in 5 years. Hardware improvements should go to the apps, and not get eaten by the OS.
Re:Not Open (Score:5, Informative)
Not Open
Yeah, KolibriOS [kolibrios.org] is an open fork of MenuetOS. It was forked when Menuet was still open source. Although Kolibri hasn't been updated for almost a year.
Not sure why they want to keep the genie in the bottle. Open source would be perfect for this kind of hobby project.
Re:Not Open (Score:5, Informative)
That's the whole point. It's not open source. The 32 bit kernel is old and not developed anymore.
Re: (Score:2)
It's not free software (as in freedom)
Re: (Score:3)
But corporations are people, so...
32-bit is open (GPL) (Score:3)
What next? (Score:5, Funny)
Now that they've got this thing working, what would be really cool is if they could come up with a way of getting it to run on different processor architectures, in case x86 loses out to ARM in the long run.
I'm thinking maybe they could write some sort of abstraction layer whereby the instructions are originally written in some sort of higher level format, which could then be automatically turned into machine code for different hardware using a special program. You could do all sorts of things with that kind of system. I'm surprised nobody's thought of it before, actually.
Re:What next? (Score:5, Funny)
Now that they've got this thing working, what would be really cool is if they could come up with a way of getting it to run on different processor architectures, in case x86 loses out to ARM in the long run.
All they have to do is translate the part of the OS written in x86 assembler...
Re: (Score:2)
and if they use keypunch it'll fit on a punched card
Re: (Score:3)
I'm not sure he's the one who got whooshed :)
Re: (Score:2)
Did you bother reading the whole post? Apparently not since the joke whooshed over your head.
Re: (Score:2)
You mean, start over and rewrite everything they've done so far in ARM assembler?
I wonder if you could do something like have LLVM Clang compile it to byte-code and compile the byte-code to ARM binaries?...
Disappointing (Score:3)
Re: (Score:2, Insightful)
Never wrote any DSP code have you? It's trivially easy to beat compiler optimizations even with naive SIMD assembly.
Re: (Score:2)
It's mostly faster because they left out all the things that were too hard in assembly.
Re: (Score:3)
It's not, and it doesn't run on tiny systems either. It requires ~300MB of RAM just to boot up. It's basically a hack done to stroke someone's fancy. It's not practical in any sense of the word. You could rewrite it in C and it would run just as fast. Its speed is mostly related to the architectural choices, and many of them are just plain wrong. Way too much of the API is blocking - that means that the internal architecture is broken, pretty much, and won't scale, and wastes power. It's basically a demonst
Re: (Score:2)
It's been a long time since I was able to "out-code" a C compiler.
Do anything requiring SIMD to have any decent performance. Then it's pretty easy to beat C compilers since they are all pretty shit at vectorization. And that's even when following the arcane requirements of each compiler that the vendor says improves the chances that auto-vectorization can work, which mostly don't work for anything more than trivial algorithms.
Assembler Scene (Score:2)
Surely this runs so embiggeningly fast that it's unusable?
I remember running some assembler demos (back in the day) that made "legacy" hardware do things that just didn't seem possible. The thought of running an OS with that same performance on modern hardware is frightening. Hopefully they've tuned the input routines so that you don't eeennnddd uuuppp wwwiiittthh kkkeeeyyybbboooaaarrrddd jjjiiitttttteeerrr :o)
Do they still teach assembly language? (Score:3)
Plenty of OSes written in assembly (Score:5, Insightful)
Plenty of OSes have, over the course of history, been written in assembly.
And all of them proprietary, just like this one.
Menuet is cool, but I don't see a compelling reason to use closed source assembly unless it demonstrates some really crazy superpowers. It's also an odd case of a GPL codebase switching to a closed source license a couple years before it becomes useful.
Kolibri forked from the GPLed 32 bit branch, but I don't think it's pure ASM at all.
Clearly this is the Keurig of OS's. (Score:2)
Good work Menuetosites!
30 seconds of music from iTunes. (Score:5, Funny)
It fits on a floppy disk? We are in 2015, right? What is a 'floppy disk'?
It's a unit of storage space measurement equivalent to about 30 seconds of music from iTunes.
Re: (Score:2)
You forgot to convert it to MIDI first, which any good hacker can do.
Re: (Score:3, Funny)
Definition of a "good" hacker; One who knows how to convert Jimi Hendrix to MIDI format, but doesn't.
Re:30 seconds of music from iTunes. (Score:4, Funny)
It's a disk invented by AOL and promoted by them with a free internet offer on the disk.
Re:Floppy disk? (Score:5, Funny)
It fits on a floppy disk? We are in 2015, right? What is a 'floppy disk'?
It's an object lesson in using pure assembly. By the time you get anything useful done, technology has moved on.
Re:Floppy disk? (Score:5, Interesting)
It's an object lesson in using pure assembly. By the time you get anything useful done, technology has moved on.
Not really. I have some computational code that I wrote in assembly 4 Pentium architectures ago. Every new architecture I run it against the C implementation, freshly recompiled with a current compiler. The assembly is still faster given all the hardware and compiler improvements. Now the performance improvement is getting much smaller but it is still a win.
Re:Floppy disk? (Score:4, Insightful)
And you're assembly is probably easy to beat with even pretty crappy SSE2 code.
Re: (Score:2, Insightful)
Damnit, yes it was supposed to be *your*. Typed faster than thinking.
Think of the legacy hardware ... (Score:4, Interesting)
And you're assembly is probably easy to beat with even pretty crappy SSE2 code.
Apparently not by compilers.
You don't seem to understand the purpose of writing in assembly language. It's not to optimize for the current state of the art box. It is to get acceptable performance from old legacy boxes. Some assembly in the right spot(s) can make the difference between an old architecture making the cutoff in terms of acceptable performance, of being able to include that segment of the market in your minimum system requirements.
My point is that such optimizations for the sake of the old boxes don't necessarily do any harm to the new boxes. Worrying about future architectures is a red herring of sorts.
Re: (Score:3)
Apparently not by compilers.
Duh. Compilers are shit at things like vectorization. They can fail at generating decent code even with intrinsics.
Re: (Score:2)
Apparently not by compilers.
Duh. Compilers are shit at things like vectorization. They can fail at generating decent code even with intrinsics.
Thank goodness we still have assembly language. :-)
Re: (Score:2)
For me, when I do any assembly it is writing SSE cores for algorithms. So pretty crappy SSE2 code wouldn't usually beat it.
Re: (Score:2)
The assembly barely wins if I update it every generation with the latest improvements in SIMD instructions.
Then you're doing something horribly wrong and causing all sorts of stalls in the pipeline.
Otherwise, the unchanged c code wins ever since autovectorization has been implemented decently.
Well since autovectorization is still mostly shit this time has yet to arrive. If you think autovectorization is so great, take x264 and compile it with this mythical compiler with all its fancy autovectorization but without any of the assembly optimization. There's no way in hell any compiler is going to beat the hand-written SIMD.
Re: (Score:2)
but without any of the assembly optimization
And by this I mean disabling x264's handwritten assembly optimizations.
Re: (Score:2)
Maybe it takes 10 times longer for the 2-3x performance gain. But with decent apps, it could beat Windows, Linux, OS X, BSD etc in the performance game. In case you haven't noticed, operating systems have stagnated... there haven't been major advances in OS X or Windows for over a decade.
Re: (Score:2)
there haven't been major advances in OS X or Windows for over a decade.
That's because the field has become mature, and there's little improvement to be made. For kernel-level and system-level stuff, all these concepts were mostly invented ages ago. Even the UI stuff was pretty stable by the early 2000s. So now some groups are trying to reinvent the wheel with all-new UI paradigms, as seen with Gnome3 and Windows' Metro UI, usually by trying to tie mobile and desktop UIs together, and the results have bee
Re: (Score:3)
No, NT is not VMS, but a common architect and core developers make for many shared concepts: http://windowsitpro.com/window... [windowsitpro.com]
VENOM bug exploits floppy drivers in KVM, etc. (Score:2)
A Floppy Disk is that device you almost never bother using, but which gets added to your virtual machines by default, at least under VMware (haven't paid attention on OpenStack.) The recently-discovered VENOM vulnerability exploits bugs in the floppy drivers, which have been around for a decade, to let a process on a virtual machine break out into the hypervisor and maybe mess with other virtual machines.
So it's especially timely to have a convenient new platform for using floppies!
Re: floppy disk... (Score:5, Funny)
it grows to 2.5"?
Re:floppy disk... (Score:4, Informative)
Hey buddy, ever wait around near the nurses station at a hospital? Women can and do tell dirty jokes all the time.
Re: (Score:3)
Assembly is not difficult.
And nobody creates chips from scratch any more, but the underlying electronics is still worth learning. If you disagree, go look at the THOUSANDS of Arduino etc. projects. Arduino is a microcontroller, not a processor. It has a pittance of RAM, a pittance of speed (16 MHz?), can't access external memory directly, etc. But the principles behind using it reveal a lot about how the electronics work and the problems associated with them.
Just a digital circuit? Far from it when you h
Re: (Score:3)
Arduino is a microcontroller, not a processor.
Actually, it isn't. The microcontroller is an Atmel AVR. The Arduino project is the processor, plus some standard hardware boards, plus a C++ programming environment that actually shields you from the bare metal.
Re:Turn in your geek card. (Score:4, Informative)
Re: (Score:3)
Doom can run on a Super Nintendo, while Quake requires more system performance.
Doom is a fan favorite, but Quake is a more relevant demo