Data Storage Hardware

Conquest FS: "The Disk Is Dead"

andfarm writes "A few days ago, I sat in on a presentation of what seems to be a new file system concept: Conquest. Apparently they've developed an FS that stores all the metadata and a lot of the small files in battery-backed RAM. (No, not flash RAM. That'd be stupid.) According to benchmarks, it's almost as fast as ramfs. Impressive." The page linked above is actually more of a summary page - there are some good .ps research reports in there.
  • well and good (Score:3, Insightful)

    by iamweezman ( 648494 ) on Monday April 21, 2003 @10:34AM (#5773474)
    this is great. We all have seen this coming, but how is the industry going to take this and implement it? My bet is it won't. The only way it will take hold is if someone finds some small application that can take it and apply it.
    • Re:well and good (Score:5, Insightful)

      by robslimo ( 587196 ) on Monday April 21, 2003 @10:45AM (#5773563) Homepage Journal
      I've predicted and eagerly anticipated the demise (by replacement) of spinning media (magnetic and optical) for 10 or more years now... I've predicted that it will happen, just not when.

      As this new filesystem implicitly admits, the price/MB is still so dramatically lower for HDDs than for solid-state memory that it will take quite a while for this replacement to happen.

      I disagree that some small killer app must come along to make this happen. Yes, solid-state media is coming down in cost and increasing in density, but both need to change by 2 or 3 orders of magnitude before the HDD is dead. What we're waiting for here is the classic convergence of technology and its applications... the apps won't come until the technology can support them, and the tech is driven by our demand for it. Expect another 10 years at least.

      • Re:well and good (Score:5, Insightful)

        by CrosbieSmith ( 550211 ) on Monday April 21, 2003 @10:59AM (#5773656)
        It will happen when the price difference between solid-state devices and magnetic storage gets narrower. That's not happening [jcmit.com].

        This was also pointed out in Saturday's Slashdot story [dansdata.com]:

        A mere $US5,000 would be something of a price sensation by the standards of current large capacity SSDs,
        whose prices aren't dropping nearly as quickly as are those of magnetic media.
        • Re:well and good (Score:5, Insightful)

          by robslimo ( 587196 ) on Monday April 21, 2003 @11:12AM (#5773754) Homepage Journal
          True, and that narrowing will have occurred by the time the cost/density ratio of SSM has improved by 2 or 3 orders of magnitude.

          A couple of reasons I see the death of the HDD as not too imminent:

          (1) Those damned HDD makers keep pulling new physics out of their as^H^H hats and keep pushing storage densities to ridiculous new levels.

          (2) The solid-state memory of the future ain't gonna be Flash as we know it now (with its slow and limited write cycles), and it also will not be battery-backed RAM (unless we go write it all back to disk for 'permanent' storage at some point). I bet on some variation on today's Flash without its limitations, but the tech has some ground to cover before this all happens.

          My other long-term prediction has been that CRTs (vacuum tube, for pete's sake!) will be replaced with LCD or similar tech and we're getting really close.
      • I've predicted and eagerly anticipated the demise (by replacement) of spinning media (magnetic and optical) for 10 or more years now... I've predicted it will happen, not when.
        well, I've been "predicting" that the end of life as we know it will come, just not when.
      • Re:well and good (Score:5, Interesting)

        by iabervon ( 1971 ) on Monday April 21, 2003 @11:24AM (#5773844) Homepage Journal
        One important thing to realize about storage is that people's storage needs for some types of files grow over time, but storage needs for other things do not grow significantly. For example, if you separate attachments and filter spam, you can now buy an SD card which will store all of the email you will get in the next few years; when that runs out, you'll be able to buy a card which will store the rest of the email you will ever receive. There are similar effects for all of the text you'll ever write.

        Furthermore, there are a number of important directories on any system whose total size won't double in the next ten years, because they add one more file of about the same size for each program you install, and they already have ten years of stuff.

        In the cases where you do have exponential growth of storage use, the structure of the stored data is extremely simple; you have directories with huge files which are read sequentially and have a flat structure.

        I see a real opportunity for a system where you have one gig of solid-state storage for your structured data and HDDs (note that you can now add a new HDD without any trouble, because it's only data storage, not a filesystem) for the bulk data.
        • Re:well and good (Score:5, Interesting)

          by DoraLives ( 622001 ) on Monday April 21, 2003 @11:40AM (#5773948)
          I see a real opportunity for a system when you have one gig of solid-state storage for your structured data

          It will be OS-on-a-chip (and a good OS at that), it will go for about twenty bucks a pop down at WalMart or CompUSA and Bill Gates will die of an apoplectic fit when it hits the streets. Hackers will figure out ways to diddle it, but corporations and average users will upgrade by merely dropping another sawbuck on the counter and plugging the damned thing in when they get back to their machine(s). Computers will come with these things preinstalled, so there'll be no bitching about not having an OS with any given machine. High-end weirdness will, as ever, continue to drive a niche market, but everybody else will regard it about the same as they regard their pair of pliers; just another tool. Ho hum.

        • So basically, as the cost of intermediate levels of storage comes down, you can map different levels of persistence onto the different tiers of that storage. Certainly not all data is created equal, and things like temporary file systems and "real" RAM swap can be kicked to an intermediate cache before hitting disk. In fact, IIRC, hard disks already have some memory on board for internal use... it just isn't visible or available for general use to the operating system or the rest of the computer (I could be wrong here). I
      • Re:well and good (Score:5, Insightful)

        by Daniel Phillips ( 238627 ) on Monday April 21, 2003 @12:16PM (#5774232)
        I've predicted and eagerly anticipated the demise (by replacement) of spinning media (magnetic and optical) for 10 or more years now... I've predicted it will happen, not when.

        You may have to keep predicting for some time yet. So far, nobody has managed to come up with a solid-state approach that gets anywhere close to the cost of spinning media, and though solid state gets cheaper over time, spinning media does too.

        For the most part, posters to this thread missed the point of this effort. The authors observed that some relatively small portion of filesystem data - the metadata - accounts for a disproportionate amount of the IO traffic. So put just that part in battery-backed ram, and get better performance. Hopefully, the increased performance will outweigh the cost of the extra RAM.

        The fly in the ointment is that, in the case where there's a small amount of metadata compared to file data, the cost of transferring the metadata isn't that much. But when there's a lot of metadata, it won't all fit in NVRAM. Oops, it's not as big a gain as you'd first think.

        It's surprising how well Ext2 does compared to RAMFS and ConquestFS in the author's benchmarks.
  • Drawback (Score:5, Interesting)

    by ifreakshow ( 613584 ) on Monday April 21, 2003 @10:36AM (#5773491)
    One quick drawback I see in this system is on a computer where you have more small files than available RAM space. How does the system decide which small files to keep on the regular disk and which ones to keep in RAM?
    • Re:Drawback (Score:3, Informative)

      by oconnorcjo ( 242077 )
      How does the system decide which small files to keep on the regular disk and which ones to keep in RAM?

      That is an algorithm issue, but the worst that happens, as far as I can see, is that when the RAM runs short the throughput goes down while it writes more stuff to disk. I highly doubt that will be a big issue, since algorithms for determining priority [they could almost be plucked from a "VM" HOWTO] are abundant, and if they are not all perfect, many are good enough.

    • Re:Drawback (Score:2, Informative)

      by AaronMB ( 136741 )
      Since the metadata is stored in RAM, it'll have fast access to the atime, ctime and mtime fields. Thus, it'd be pretty easy to use an LRU scheme to dump rarely used files to disk periodically.
      -Aaron
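
      A minimal sketch of that idea in Python (not Conquest's actual code; the mount points and RAM budget below are invented for illustration): stat the files in the RAM tier, sort by atime, and spill the least recently used ones to disk until the RAM tier fits its budget.

      import os
      import shutil

      RAM_DIR = "/mnt/nvram"       # hypothetical battery-backed RAM mount
      DISK_DIR = "/mnt/disk"       # hypothetical spill area on the hard disk
      RAM_BUDGET = 64 * 1024**2    # assumed 64 MB budget for the RAM tier

      def evict_lru(ram_dir=RAM_DIR, disk_dir=DISK_DIR, budget=RAM_BUDGET):
          """Move least-recently-accessed files to disk until under budget."""
          entries = []
          for name in os.listdir(ram_dir):
              path = os.path.join(ram_dir, name)
              if os.path.isfile(path):
                  st = os.stat(path)
                  entries.append((st.st_atime, st.st_size, path))
          used = sum(size for _, size, _ in entries)
          # Oldest access time first == least recently used first.
          for atime, size, path in sorted(entries):
              if used <= budget:
                  break
              shutil.move(path, os.path.join(disk_dir, os.path.basename(path)))
              used -= size
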
    • Re:Drawback (Score:5, Interesting)

      by ottffssent ( 18387 ) on Monday April 21, 2003 @11:19AM (#5773809)
      LRU eviction is somewhat costly, but highly effective. Pseudo-LRU can be much cheaper and nearly as effective. The replacement policy is not hard - it is a well-researched problem in cache design.

      What I find telling is that such a system has to be implemented at all. It seems clear to me that the operating system's filesystem, in conjunction with the VM, should implement this automatically. In Linux, this is true - large portions of the filesystem get cached if you have gobs of RAM lying around. Why certain more commonly-used OSes do the exact opposite is beyond me.

      From my perspective, the right way to handle this is obvious. RAM is there to be used. Just as we have multiprogramming to make more efficient use of CPU and disk resources, we should be making the best possible use of available RAM. Letting it sit idle on the off chance the user will suddenly need hundreds of megs of RAM out of nowhere is ridiculous. From the perspective of the CPU, RAM is dog slow, but from the perspective of the disk, it's blazing fast. ANYTHING that can be done to shift the burden from magnetic storage to RAM should be done. Magnetic storage excels in one area and one area only: cheap permanent storage of vast amounts of data. RAM should be used to cache oft-used data. Why is this not painfully obvious to anyone designing an operating system?
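
      A rough sketch of the pseudo-LRU idea mentioned above, here as a simple clock (second-chance) policy; the class and names are invented for illustration, not taken from any particular kernel.

      class ClockCache:
          """Second-chance (clock) replacement: a cheap approximation of LRU."""

          def __init__(self, capacity):
              self.capacity = capacity
              self.slots = []     # each slot is [key, value, referenced_bit]
              self.index = {}     # key -> slot position
              self.hand = 0

          def get(self, key):
              pos = self.index.get(key)
              if pos is None:
                  return None
              self.slots[pos][2] = 1          # mark as recently referenced
              return self.slots[pos][1]

          def put(self, key, value):
              if key in self.index:
                  self.slots[self.index[key]][1:] = [value, 1]
                  return
              if len(self.slots) < self.capacity:
                  self.index[key] = len(self.slots)
                  self.slots.append([key, value, 1])
                  return
              # Sweep the hand, clearing reference bits, until a victim is found.
              while self.slots[self.hand][2]:
                  self.slots[self.hand][2] = 0
                  self.hand = (self.hand + 1) % self.capacity
              del self.index[self.slots[self.hand][0]]
              self.slots[self.hand] = [key, value, 1]
              self.index[key] = self.hand
              self.hand = (self.hand + 1) % self.capacity
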
      • Re:Drawback (Score:4, Informative)

        by Gromer ( 9058 ) on Monday April 21, 2003 @03:01PM (#5775460)
        I think I was at the same talk as the poster.

        In point of fact, Conquest does not use LRU. Conquest uses a very simple rule: files larger than a threshold are stored on disk, and files smaller than the threshold are stored in RAM. The threshold is currently a compiled-in constant (1 MB), but the plan is for it eventually to be dynamic.

        The advantage of this approach is that it eliminates the many layers of indirection needed to implement LRU-type caching, which is one reason Conquest consistently outperforms FSes based on LRU caching.
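
        As a toy illustration of that rule (the 1 MB figure is the compiled-in constant from the talk; the tier names are invented):

        SMALL_FILE_THRESHOLD = 1 * 1024 * 1024   # 1 MB, compiled-in for now

        def choose_tier(file_size_bytes, threshold=SMALL_FILE_THRESHOLD):
            """Conquest-style placement: small files live in battery-backed RAM,
            large files go straight to disk. No LRU bookkeeping involved."""
            return "ram" if file_size_bytes < threshold else "disk"

        # A 4 KB dotfile stays in RAM; a 700 MB ISO goes to disk.
        assert choose_tier(4 * 1024) == "ram"
        assert choose_tier(700 * 1024**2) == "disk"
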
  • by Falconpro10k ( 602396 ) <jmark2&gmail,com> on Monday April 21, 2003 @10:36AM (#5773494) Homepage
    wow... that's all I have to say. Something like this could make waiting a minute or two to boot totally obsolete... sort of like a "turn on" welcome-to-your-OS-of-choice kind of thing... I also see tons of other possibilities, such as high-end graphics work and maybe even phasing out the disk as we know it 100%... all solid state... the possibilities are endless
    • Like loading a device with a hardened linux OS that is only stored on flash ROM.

      Go calculate [webcalc.net] something.

    • "something like this could make waiting over a min or two to [re]boot totally obselete..."

      But my company's IS dept. is still very intent on making me reboot whenever I try to do something "luxurious," like having two applications open at the same time.

      I will post my sig as soon as I finish rebooting.
  • by asdfasdfasdfasdf ( 211581 ) on Monday April 21, 2003 @10:37AM (#5773495)
    The idea of RAM as storage is great and all, but can we work towards the elimination of STORAGE as RAM before we get to RAM as storage?

    I mean, why *DO* we still have pagefiles?

    A MS Gripe: I seriously don't understand why I can't turn it off completely. With multiple GB of RAM dirt cheap, writing to a disk pagefile slows my system down-- It has to!
    • by DigitalGlass ( 513918 ) on Monday April 21, 2003 @10:40AM (#5773521)
      what version of windows are you running? I have had no problem with turning off the pagefile in 2000 and xp, my machines have 1gb in them and they cranked when i disabled the pagefile.

      It should be in Control Panel -> System -> Advanced -> Performance --- look in there for something to set the page file to 0 or to disable it.
      • by Surak ( 18578 )
        So long as data + applications < physical RAM, you won't have a problem. I often have to deal with files that are several hundred megabytes in size and grow once in RAM. Your setup wouldn't work for me.

        That's why we still have swapfiles.
      • by Anonymous Coward
        It does not quite work like that: you can set your pagefile size to 0, but Windows 2000 will create a small pagefile of about 20 MB and still use it!
    • Heh... I've "Ask Slashdot"'ted this one myself (rejected, of course).

      Personally, I agree with you 100%. Get rid of the idea of "pagefiles" and "swap partitions" completely, and enjoy the performance boost resulting from the "loss".

      As for the idea at hand (to make this at least vaguely on-topic)...

      How does this differ from a large RAM disk cache, with slightly tweaked heuristics for keeping something in cache (size, rather than most-recently-read)?
      • When the power goes out you don't lose anything, even if it goes out for several weeks.

        Plus, rather than basing its heuristics on recently used data, it is actually designed around having RAM to store files (as opposed to RAM to store disk blocks).

        This guarantees that entire files are in RAM and are rapidly available rather than just a few blocks of many different large files.

        Plus it stores the metadata in RAM, so the filesystem on disk is much simpler (and thus more robust). If it's smart, it backs up t

    • Correct me if I am wrong, but the only time that it starts to write to your page file is the moment that the amount of data passes the amount of RAM you have. At which point it starts writing information from the ass end of your RAM to the pagefile. While this is not as fast as your RAM data pull it is faster than finding it on the disk, and parsing it back into usable info. It knows where it is already and drops it back to the front of your RAM storage.

      This is just my limited view of what I kind of figure
      • by gl4ss ( 559668 ) on Monday April 21, 2003 @10:53AM (#5773617) Homepage Journal
        With some versions of Windows (recent ones too) it will swap to the pagefile even before the RAM fills up, sometimes putting stuff there that will be needed very shortly. There are some programs that try to battle this, though (I had success with one on Win2k back when I still had only 128MB; supposedly it prohibited Windows from swapping some constantly needed system resources to disk, and it mostly worked that way). I don't have a lot of faith in "memory managing" programs that just do a malloc(some_big_number_of_bytes) every now and then and free it straight after, supposedly resulting in the most useless stuff getting thrown into swap and leaving the RAM free for whatever you're going to run next.
    • Who says that you need to have a pagefile? (Or is that "MS gripe" referring to some certain operating system?)
    • Who are you kidding? (Score:3, Informative)

      by amembrane ( 571154 )
      What do you mean you can't turn it off? I haven't had a pagefile since I hit 1GB of RAM in my desktop. The Windows XP option (System Properties, Advanced, Performance Options, Advanced, Change) is even called "No paging file".
      • Thanks for the info. That must be new with XP. In 2000 (still much more stable) you can't go below 2 MB, and it still insists on using it, even though it's so small. Perhaps there are registry hacks to do it, thanks again for the info!
        • It's not new with xp, it's been there since windows 95. Of course running non-NT based Windows systems without a pagefile was not advisable (too many memory leaks from badly programmed apps and no way for the OS to resolve them).

          Also, please stop with that myth that 2000 is more stable than XP. If you don't like the eye-candy, fair enough, but 2000 crashes as much as XP. No more, no less. And on a properly configured system they crash as much as Linux, that is to say, they don't.

    • Put your swap file on a Ramdisk.

      Palad1, MCSE

    • by Merlin42 ( 148225 ) on Monday April 21, 2003 @10:56AM (#5773642)
      Even with tons of RAM, pagefiles are a GOOD thing, and if used properly (even MS uses them properly these days) they speed up the system on the whole. I have done a *little* OS development, so I may not be an expert, but I do have an idea what I'm talking about.

      The swapfile is where the OS puts things it hasn't used in a while. On Windows this would probably include things such as the portions of IE that are now part of the OS and that you are forced to have loaded even if you are not using the box for web browsing. Having placed these items in the page file frees up room for things that are currently useful, such as IO buffers/cache (disk and/or net) that can dramatically increase speed by storing things such as recently used executables, meta-information... wait, this sounds familiar ;)

      That being said, I think the technology discussed in this article is a bit too single-minded. I think adding an extra level in the storage hierarchy between main RAM and the non-volatile HD is probably a good thing. My idea is to add a HUGE pile of PC100 or similar RAM into a system and have this RAM accessed in a NUMA style, which is becoming very popular. The Nintendo GameCube uses a form of this approach: there are two types of RAM, with a smaller, faster section and a larger, slower section.

      The problem with my idea is that the price difference b/w cheap-slow RAM and fast-expensive RAM is not enough to make it worth the extra complexity currently. But, I would guess that if someone took the effort to design/build cheap slow RAM they could find a niche market for a system accelerator device ... but then again I could just be not well enough informed (a little knowledge is dangerous ;) and rambling like an idiot.
    • Very simple... slow ram isn't what you want in your system memory, but it's cheap enough to justify putting a smaller database on.

      It's being continually backed up to hard disk, but in the background.
    • by pclminion ( 145572 ) on Monday April 21, 2003 @11:02AM (#5773683)
      I mean, why *DO* we still have pagefiles?

      Well, a couple of reasons. Most important, the "pagefile" is there to protect against a hard out-of-memory condition. Modern operating systems are in the habit of overcommitting memory, which means they grant allocation requests even if the available RAM can't fulfill them. The idea is that an app will never actually be using all those pages simultaneously. If things go wrong and all that extra memory is actually needed, the system starts kicking pages to disk to satisfy the cascade of page faults. This means the system will become slow and unresponsive, but it will keep running. But say you didn't have anywhere to swap to. The system can't map a page when a process faults on it, and the process gets killed. But which process gets killed? After all, is it the process's fault if the OS decided to overcommit system memory? The swap space serves as a buffer so a real administrator with human intelligence can come in and kill off the right processes to get the system back in shape.

      Swap is also important because not all data can just be reloaded from the filesystem on demand. Working data built in a process's memory is dynamic and can't just be "reloaded." If there's no swap, that means this memory must be locked in RAM, even if the process in question has been sleeping for days! We all know the benefits of disk caching on performance. Process data pages are higher priority than cache pages. Thus if old, inactive data pages are wasting space in RAM, those are pages that could have been used to provide a larger disk cache.

      You basically always want swap.
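
      A small, Linux-specific demonstration of the overcommit behavior described above (a sketch only; whether the mapping succeeds depends on RAM, swap, and the vm.overcommit_memory setting): a process can map far more anonymous memory than the machine has, and pages only become real when touched.

      import mmap

      # Reserve 4 GiB of anonymous address space. With overcommit allowed, this
      # can succeed on a machine with much less physical RAM, because no pages
      # are actually allocated yet.
      region = mmap.mmap(-1, 4 * 1024**3)

      # Touching a page forces the kernel to back it with a real frame (or, under
      # memory pressure, to start paging things out).
      page = 4096
      for i in range(1024):              # touch only the first 4 MiB
          region[i * page] = 1

      region.close()
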

    • First, pagefiles (or swap partitions) are usually of fixed size, so their existence doesn't necessarily mean they're being used.

      Second, RAM is used for disk cache as well as application memory space. 'Unused' RAM is really being used to cache your slow hard drive, which is a good thing.

      Third, stuff is paged in and out when necessary to free up space for other stuff. It's swapped at the memory page level (4k) based on when data is used. So if you leave your system running for a while, the stuff that gets w
      • Correct me if I'm wrong, but I believe Windows, by default, allows the size of the page file to fluctuate as needed. One of the performance tips I saw recently at MaximumPC was to set this to a constant to avoid unnecessary HD work...
    • I mean, why *DO* we still have pagefiles?

      Because if we didn't, our computers would crash when we ran out of RAM. Seriously, a lot of programs don't cope well with being told: there's no more RAM left on the system, you can't create that object.

      Swap is a good thing. Right now my PC has 512MB of RAM this is more than I need for just about anything I do. Right now my RAM is 31% free and my 2GB swap space is 78% free. Why the hell should so much be in the swap space when I have free RAM? Because that in
    • but can we work towards the elimination of STORAGE as RAM before we get to RAM as storage?

      I mean, why *DO* we still have pagefiles?

      You've obviously never done any real computing [caltech.edu]. There are some computational problems that are still too big to do entirely in RAM, hence pagefiles.

      Actually, Beowulf clusters are spot on topic here. Ever wonder why the concept of a Beowulf cluster was such a breakthrough? (Hint: have you ever tried to crunch 100GB of data from a heat transfer problem?) Beowulfs allowed the hu

    • A MS Gripe: I seriously don't understand why I can't turn it off completely. With multiple GB of RAM dirt cheap, writing to a disk pagefile slows my system down-- It has to!

      This has to do with the way in which Windows NT handles disk cache and paging. Windows treats your swap file as your main RAM, and your RAM as a disk cache, so it can unify the swapping and caching algorithms. If you disable swap, then it will act as if you have no RAM (with the exception of the unpageable RAM reserved for the kernel).

  • by scorp1us ( 235526 ) on Monday April 21, 2003 @10:37AM (#5773497) Journal
    Execute in Place (EIP): currently, your system copies the program into RAM. Here, you'd be copying everything from non-volatile RAM to volatile RAM - a rather wasteful operation, don't you think?

    This is not just for exe's but for datafiles as well...

    • > Execute in Place (EIP)

      Ha! No problem. Just make the file system "sector size" equal to the MMU page
      size, and you can dynamically map the executable file into the machine address
      space.
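
      A user-space approximation of that idea in Python (not true execute-in-place, which needs kernel and MMU support; /bin/ls is just an example path): mmap the binary read-only and let the kernel page it in on demand instead of copying it up front.

      import mmap

      # Map a binary into this process's address space. Nothing is copied eagerly;
      # pages are faulted in from the page cache (or disk) only when touched.
      with open("/bin/ls", "rb") as f:
          image = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
          print(image[:4])   # b'\x7fELF' on a Linux box
          image.close()
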
      • I never said it was hard. Getting changes made to the kernel is the hard part, though, particularly if you are proprietary...

        For exes, you have to setup a stack and heap for the program, so there's still a _little_ work to be done, but it would rule. You could boot Windows in under 10 minutes this way!
    • by hey ( 83763 ) on Monday April 21, 2003 @10:49AM (#5773595) Journal
      Doesn't Linux almost do this?
      It mmap()s executables before running them.
    • No modern operating system "copies" the entire executable into memory when it's run. It's paged in from disk as necessary. If it's already in the cache, then it basically is run from RAM anyway.

      The concept of a paging VM system is so elegant that when it's done right, it makes most other optimizations (such as your EIP suggestion) unnecessary.
    • The inventor of Beowulf is currently working on something called "Processor in memory", the idea being that you embed a number of smaller, slower processors within the memory to speed up the smaller, easier calculations, and send the slower, longer calculations to the main processor.

      For instance, if you were searching through a huge 1000000x1000000 matrix for a single entry to hash, you don't want to have to move each and every entry to the processor to decide whether it's the right one: offload the search
  • Dead? Hardly... (Score:4, Insightful)

    by KC7GR ( 473279 ) on Monday April 21, 2003 @10:37AM (#5773499) Homepage Journal
    This is 'stepping-stone' technology, along the same lines as hybrid gasoline/electric vehicles. They're still depending on hard drives for mass data storage. It's just the executables, libraries, and other application-type goodies that they're sticking into RAM.

    You can do exactly the same thing by sticking an operating program into any sort of non-volatile storage (EPROM, EEPROM, memory card, whatever), and including a hard drive in the same device if need be. The new filesystem they're describing simply shifts more of the load to the silicon side instead of the electromechanical realm.

    In short; The Disk is far from dead. This is just a first step in that direction.

    • In short; The Disk is far from dead. This is just a first step in that direction.

      As far as I am aware, mainframes and other high-end systems have had hard drives like this since even before the eighties. It is just becoming standard in low-end HDs; for about 3(?) years now you've been able to buy an HD with a several-megabyte buffer. The only thing new in this article is a filesystem that (possibly) helps the hardware/software utilize this buffer better.

    • by tsetem ( 59788 )
      This really should be implemented via an HFS approach, i.e., the commonly used files are placed and kept in memory, while less frequently used files are kept on another medium (disk or tape).

      This way, the 20% of your disk that is used 95% of the time is kept in memory forever, while the other 80% is placed on disk. With HFS, as soon as the system saw that you needed the file, it would automatically pull it off of the hard drive and stick it in memory.

      Similar techniques are used for tape & hard-driv
    • This is also a very old idea. I (like many others) used to do the same thing with a RAM Disk on my Amiga.
  • Old news. (Score:5, Informative)

    by Nathan Ramella ( 629875 ) on Monday April 21, 2003 @10:37AM (#5773502) Homepage
    These guys have already done it..

    http://www.superssd.com/products/tera-ramsan/

    Up to a terabyte even.

    -n

    • You are correct. Many disk array vendors have been doing this for years; they have huge amounts of RAM cache backed up by largish batteries, with redundancy of course. This is really nothing new. Even a low- or mid-end disk array that scales perhaps up to around 10 terabytes has this built in.
    • Re:Old news. (Score:2, Informative)

      by cube_mudd ( 562645 )
      Actually, what they've done is novel.

      If you read the article, the point of Conquest is to remove the filesystem complexities that pander to disks. Giant RAM based storage arrays do nothing to simplify and streamline the filesystem code in your kernel.
    • Re:Old news. (Score:2, Informative)

      by Duck_Taffy ( 551144 )
      That RAM-SAN is remarkably inefficient, compared to HDDs. For a comparable hard drive-based SAN, you'd think the manufacturer was insane if they said it required 5,000 watts of electricity to operate. I know it's fast, but I don't want a dedicated 60-amp circuit just for a single storage device. And I can hardly imagine the heat it produces.
  • by Apparition-X ( 617975 ) on Monday April 21, 2003 @10:39AM (#5773517)
    God is dead. (Nietzsche) Tape is dead. (Innumerable pundits) Disk is dead. (Conquest FS) Yet somehow, they all seem to be alive and kicking as of right now. I wouldn't be throwing a wake just yet for any of 'em.
    • Re:Yeah wutever (Score:3, Informative)

      by gsfprez ( 27403 )
      don't forget the most popular one of all time... Apple is dead. (everyone, starting with that stupid ass Dvorak)

      unfortunately, I find any of these calls of "technology death" a waste of time... I'm working at a former TRW (now NG) location - and my cow-orker just brought a floppy disk to my computer because they can't seem to get us network access to the printer.

      nothing is dead - it's all just where you're at.
    • God is dead. (Nietzsche) Tape is dead. (Innumerable pundits) Disk is dead. (Conquest FS)
      Also:
      OS/2 is dead. (Almost everybody)

      The same thing is true for all of the above:
      They are only as alive or dead as the people who depend on them.
    • "Print is Dead" - Egon Spengler
    • Re:Yeah wutever (Score:3, Interesting)

      by NDSalerno ( 465736 )
      God is dead. (Nietzsche)

      Actually, Hegel originally wrote that line. However, society seems to attribute this to Nietzsche because he followed up on the idea and proclaimed it louder than any of the other atheist philosophers.
    • The already beleaguered disk community was hit with a crippling bombshell today... etc etc... Truly an American icon.
  • I wonder if a kernel could realize many of the same performance benefits with current filesystems by identifying directory inodes and small file inodes and lowering the probability of those falling out when it's time to free pages.
    • Solaris is very finicky about ever writing out the pages in working RAM to swap. In particular, it fills free RAM with cached inodes, directories, and small files until free memory hits a watermark, and then it merely starts running the page reclaimer. The page reclaimer is the only way certain files and directories get written back to disk, unless some option is turned on in the UFS layer or the file is opened O_SYNC, AFAIK.

      I imagine linux does this too, I remember reading how the vfs layer and
  • by binaryDigit ( 557647 ) on Monday April 21, 2003 @10:48AM (#5773582)
    How is this different from a very aggressive caching algo? I mean, other than the battery-backed part. You should be able to attain similar performance benefits using a purely software solution, assuming your app doesn't do a lot of "important" writing to "small" files (where Conquest would do it all in RAM and still be able to persist it). But things like DLLs, EXEs and whatnot don't change.

    I guess I can understand the benefits (as minor as they may be relative to price), but the thing that bothers me the most is why does it take 4 years and NSF funds to come up with something that seems so obvious?

    And one major problem would be getting over the fact that if the machine craters, you can't just yank the drive and have everything there, though I assume they have some way to "flush" the ram (can't read the .ps files to check).
    • I'd go further than that. This is like having a caching algorithm that determines what to keep in RAM based entirely on file size and whether the data is FS metadata. That seems unlikely to be the best way to use whatever RAM you have available for caching.
  • by GrimReality ( 634168 ) on Monday April 21, 2003 @10:56AM (#5773639) Homepage Journal
    ...battery backed RAM...

    Pardon my ignorance, but what happens if the battery fails? Of course, this is highly unlikely, but just a scenario.

    In a conventional disk the data would remain even if power is switched off, but RAM would lose the data (or it could get corrupted, or you couldn't be sure the data is exactly the same).

    Thank you.
    GrimReality
    2003-04-21 15:51:18 UTC (2003-04-21 11:51:18 EDT)

    • by Ars-Fartsica ( 166957 ) on Monday April 21, 2003 @11:18AM (#5773795)
      Battery failure in this case is the same as hard drive failure in the nonvolatile model. You either have backups on another medium or you are screwed.

      Note that hard drive failures are still common, and likely to be much more common than battery failures, as it would be trivial to implement a scheme through which battery recharging would be automatic while the computer was plugged in. The battery would only be directly employed when the system was unplugged or the power was out. Even in that case it would also be trivial to implement a continuous/live backup system to a nonvolatile medium like a hard disk, which by that point would be ridiculously cheap.

    • Use it like write back cache.. or write through. Your choice. Less chance of loss, eh?
  • Umm.. (Score:5, Informative)

    by Anonymous Coward on Monday April 21, 2003 @11:02AM (#5773685)
    My raid controller basically does this already..

    It's an old IBM 3H 64 bit PCI model with 32MB of ram and battery backup.. newer 4H models support more ram.. but how is this any different?

    The most used and smallest files stay in the cache.. the rest are called when needed.. and if god forbid the power fails, and the ups fails.. the card has a battery backup to write out the final changes once the drives come back online.

    • Re:Umm.. (Score:3, Informative)

      by Mannerism ( 188292 )
      I believe the researchers' point was that existing filesystems were (understandably) not designed with a RAM component in mind. Therefore, caching controllers and similar solutions are inefficient compared to Conquest, which assumes the presence of the RAM and is optimized appropriately.
  • right.... (Score:2, Redundant)

    by hawkbug ( 94280 )
    And when the battery goes dead, bye-bye data. Stuff like this scares me. I'm concerned enough about losing all my data on a current hard disk that can exist without power - if I had to keep my machine pumped full of electricity all the time, I'd be even more paranoid about losing stuff at random.
    • "And when the battery goes dead, bye-bye data."

      And when the hard drive crashes, bye-bye data.

      Whether using "old-fashioned" hard drives or "newfangled" solid-state storage, the lesson remains: always back up your data.

  • Reliability (Score:3, Interesting)

    by InodoroPereyra ( 514794 ) on Monday April 21, 2003 @11:04AM (#5773694)
    There are several goals in a next-generation filesystem implementation. Low cost and speed are important, and Conquest goes in the right direction. But how reliable is persistent RAM for storage? Hard drives go belly up every once in a while, and we would all like to see some sort of affordable solid-state storage come along and replace the HDD. Persistent RAM seems to be a step in the opposite direction, or am I wrong?
    • Re:Reliability (Score:3, Informative)

      by TClevenger ( 252206 )
      A decent system would maintain an image of the RAM on the HDD. In case of battery failure, replace the battery, boot the system up, and it should rebuild the RAMdisk from the hard disk--just like rebuilding a drive in a RAID.
  • by hpavc ( 129350 )
    A battery-backed ramdisk, just like my TI99 had.

    I guess this might be neat now that computers don't have extra 'cards' of memory, but when that was the way to expand your computer it was quite easy to have this method of storage.
  • Full paper in HTML (Score:5, Informative)

    by monk ( 1958 ) on Monday April 21, 2003 @11:07AM (#5773724) Homepage
    For those who are not PS worthy.
    The paper [ucla.edu]
    Looks like a great server side file system. This is finally a step away from this whole "file" madness. All storage and IO should be memory mapped, and all execution should be in place. Anything else is just silly.
    • All storage and IO should be memory mapped, and all execution should be in place. Anything else is just silly.

      I'm not sure what you mean by this. I would think that this system would merely adjust the distribution of RAM in a computer, the only difference being that some of it has power backup for the memory.
  • Dead? (Score:4, Interesting)

    by Sandman1971 ( 516283 ) on Monday April 21, 2003 @11:07AM (#5773726) Homepage Journal
    Dead? I don't think so. Get back to me when they start using non-volatile RAM and the price per byte is equal to or less than the price of hard drives. Until then, the HD is going to be alive and kicking.

    One thing I've always wondered, though: why not release an OS on an EPROM? It would make boot time and OS operations extremely fast. I'm still surprised to this day that this isn't mainstream. Ahhh, the good ol' days of Commodore, when your OS was instantly on when you turned on the PC.....
    • Re:Dead? (Score:2, Informative)

      by kitty tape ( 442039 )
      Actually, the full name of the talk was "The Disk is Dead, Long Live the Disk". Obviously, the idea of eliminating the disk was meant to be taken with a grain of salt.
  • by ceswiedler ( 165311 ) <chris@swiedler.org> on Monday April 21, 2003 @11:11AM (#5773746)
    Though I don't think it's a useful general-purpose concept to have a RAM-only FS, I'm hoping that fast RAM will catch up to magnetic disks in size. A standard FS/VM will end up caching everything if the RAM is available. I seem to recall that ext3 on Linux, if given the RAM for cache, is faster than many ramfs/tmpfs implementations. Plan9 completely removes the concept of a permanent filesystem versus temporary memory. Everything is mapped in memory, and everything is saved to disk (eventually). It's a neat concept, and it happens to go very well with 64-bit pointers and cheap RAM.

    I'm hoping that hardware people will realize that we need huge amounts of fast memory...whether or not we think we need it. We're stuck in a "why would I need more RAM than the applications I run need?" kind of mindset. I think that the sudden freedom 64-bit pointers will provide to software developers will result in a paradigm shift in how memory (both permanent and temporary) is used. Though like all paradigm shifts, it's difficult to predict ahead of time exactly what the change will be like...
  • While this is great for some environments, it will remain a research toy until several real-world problems and limitations are addressed. Several people have already brought up the issue of having more small files than will fit into the BB-RAM. Another issue is portability. With a traditional filesystem, if a whole machine dies you can slap the disk into another one (of the same type). With Conquest, you have to transplant the BB-RAM as well. How many slots do you think a machine has for BB-RAM, vs. ho

  • I've always thought it'd be great to have a hybrid storage device, coupling RAM, hard disk and tape or optical into a single cartridge. The device would then manage the migration of data between the three mediums transparently based upon access.

    Rather than being filesystem dependent, I'd have the device not know or care about the filesystem, just logical disk sectors. Those that were accessed frequently would stay on the higher-speed medium, and those that weren't would migrate to the slower ones. Large files that were on
  • Ramdrive (Score:2, Interesting)

    My plan-

    Once 64-bit processing becomes mainstream and the price per gigabyte of memory gets better (say, 16 gigs of DDR 3200), store the OS on a small (~5 gig) hard drive partition, and transfer the entire thing to a 5 gig ramdrive on startup. Using serial ATA that shouldn't take too long, and the OS will run at dramatically increased speeds, especially if the swap is housed in the ramdrive as well. On shutdown, transfer the contents of the ramdrive back to the hard drive. With the massive RAM support 64 bit proce
  • Way back when I was growing up, we bought an Apple IIgs. After a while, my dad went for the memory upgrade with a battery-backup option. Remember, this was about 12 years ago. It was nice because you didn't need to wait anywhere near as long for the system to boot.

    Additionally, laptops take a similar concept and save the system memory image to hard drive and just read that in order to make your boot time a little shorter when you are away from the machine and it powers down.

  • ram storage (Score:2, Interesting)

    Back in the 1980s, Applied Engineering produced an Apple II card called "RamKeeper", IIRC. It was a memory card backed up with a battery, so you had a permanent RAM disk, which standard software could read/write from as if it were a normal device.
  • by mseeger ( 40923 ) on Monday April 21, 2003 @11:58AM (#5774065)
    Hi,

    does anyone have statistics on HD and RAM prices over the last few years?

    I only have a feeling that in recent years RAM prices have fallen more quickly than HD prices. That will naturally lead to developments like the one mentioned here.

    I think it would be very interesting to study these technical developments in light of the price developments. My bet is that most inventions are caused not by bright minds but by the need for them. For most technical breakthroughs, the mind is not the cause but the catalyst ;-).

    CU, Martin

  • by wowbagger ( 69688 ) on Monday April 21, 2003 @12:25PM (#5774310) Homepage Journal
    I wonder what improvement this would have (if any) over using a journaling file system with the journal stored on a battery-backed up volume + a large amount of system RAM.

    If you used full journaling (data writes journaled as well as metadata journaled), then writes will happen at RAM speeds (with the journal flush happening "later" when the system isn't busy).

    Meanwhile, files that are being used will be in the VFS buffer cache (evicted as they age or as the system needs the RAM for other purposes), thus making reads fast (after the initial read from disk).

    It would seem to me that my approach would automatically tune itself to what you are doing, rather than trying to tune things by hand.

    (Granted, this assumes your OS has
    1. Journaling file systems
    2. the ability to place the journal on a different block device than the main file system
    3. A decent buffer cache system

    but given those assumptions...)
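
    A bare-bones sketch of the write path described above (paths are hypothetical; a real journal also needs checksums, ordering guarantees and replay-on-mount): append the write to a journal that could live on a battery-backed volume, fsync it, and apply it to the main file lazily.

    import json
    import os

    JOURNAL = "/mnt/nvram/journal.log"    # hypothetical battery-backed volume

    def journaled_write(path, offset, data):
        """Log the write durably first; the slow in-place update happens later."""
        record = json.dumps({"path": path, "offset": offset,
                             "data": data.hex()}) + "\n"
        with open(JOURNAL, "a") as j:
            j.write(record)
            j.flush()
            os.fsync(j.fileno())          # the write is durable once this returns

    def replay_journal():
        """Apply logged writes to their target files, then truncate the journal."""
        if not os.path.exists(JOURNAL):
            return
        with open(JOURNAL) as j:
            for line in j:
                rec = json.loads(line)
                with open(rec["path"], "r+b") as f:   # assumes the file exists
                    f.seek(rec["offset"])
                    f.write(bytes.fromhex(rec["data"]))
        open(JOURNAL, "w").close()
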

  • by pingbak ( 33924 ) on Monday April 21, 2003 @12:25PM (#5774312)
    Given that I sit next to Andy (Conquest's author) and have seen this research progress over the last couple of years, I'd like to clear up a few misconceptions.

    First off, Conquest uses the system's RAM. It's not attached by an external bus or network system, e.g., fibre channel. Not that one would really want to make fibre channel a CPU-RAM bus in the first place. So pointing to products from people "who done this already" doesn't apply if it's not done in the system's RAM.

    Secondly, Conquest removes all of the disk-related complexity (buffer management, I/O cache management, elevator algorithms, etc.) from the kernel. This allows Conquest to operate at close to theoretical disk I/O bandwidth. Pages go right from RAM to disk. Minimal metadata to update, no inode arrays to traverse.

    There is currently a 1MB threshold that defines the difference between a "large" file and a "small" file. Conquest doesn't decide to pull in only shared objects, libraries and executables; in fact, emacs falls into the "large" file category. However, Andy noted that most large files have "stylized" access, e.g., MP3s, where the first thing is a seek to the end of the file to read its metadata. The same is true of executables. Conquest has the concept of recursive VMs (VMs in VMs) that handle the different stylized accesses. Not that he's implemented all of them, since he's managed to graduate and is teaching the OS course this quarter.

    Lastly, Conquest checkpoints RAM out to disk periodically. No, it's probably not the smartest strategy, but it does work. Thus, if the battery dies or the OS chokes, one can roll back to a reasonable state.

    HTH.
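
    As a rough illustration of that checkpoint-and-roll-back idea (not the actual Conquest mechanism; the file name and interval are invented): periodically serialize the in-RAM state to disk atomically, so a dead battery or a crashed OS only loses changes made since the last checkpoint.

    import os
    import pickle
    import threading

    CHECKPOINT = "/mnt/disk/conquest.ckpt"    # hypothetical on-disk checkpoint

    def checkpoint(state, path=CHECKPOINT):
        """Write the in-RAM state atomically: temp file, fsync, then rename."""
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)                 # atomic on POSIX filesystems

    def restore(path=CHECKPOINT):
        """Roll back to the last checkpoint after a battery or OS failure."""
        with open(path, "rb") as f:
            return pickle.load(f)

    def checkpoint_every(state, seconds=60):
        """Re-arm a background timer that checkpoints periodically."""
        checkpoint(state)
        timer = threading.Timer(seconds, checkpoint_every, args=(state, seconds))
        timer.daemon = True
        timer.start()
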
    • by pingbak ( 33924 ) on Monday April 21, 2003 @12:33PM (#5774375)
      And while I'm at it: Conquest is not a RAM disk. If it's anything, it's disk-backed RAM. Conquest removes the disk-related components of the file system. The closest you can get to this today, without Conquest, is by copying the files you need from some other FS into ramfs and putting symlinks in the ordinary file system. The difference between the two approaches is script maintenance (Conquest == automagic, today == scripts).

      Of course, putting anything into ramfs also eats up your swap file, whereas Conquest doesn't.
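
      The manual workaround described above might look roughly like this (the mount point and file list are placeholders): copy the hot files into a RAM-backed mount and leave symlinks behind, which is exactly the kind of script maintenance Conquest is meant to eliminate.

      import os
      import shutil

      RAMFS = "/mnt/ramfs"                # assumed RAM-backed mount (ramfs/tmpfs)
      HOT_FILES = ["/var/lib/app/index.db", "/etc/app/app.conf"]   # placeholders

      for src in HOT_FILES:
          dst = os.path.join(RAMFS, src.lstrip("/").replace("/", "_"))
          shutil.copy2(src, dst)          # stage a copy in RAM
          os.remove(src)                  # swap the original for a symlink
          os.symlink(dst, src)            # reads now hit the RAM copy
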
  • by one-egg ( 67570 ) <geoff@cs.hmc.edu> on Monday April 21, 2003 @12:58PM (#5774568) Homepage
    Well, since I'm the person who gave the talk referenced in the original post, I suppose I ought to clear up a few misconceptions for folks. I'm not going to address every objection that's been raised, because most of them have been well addressed in our papers. I'll just highlight the most common misunderstandings.

    First, the full title of the talk was "The Disk is Dead! Long Live the Disk!" We make no claim that disk manufacturers are going to go out of business tomorrow; history suggests that the technology will survive for at least a decade, and probably more than two. Talk titles are intended to generate attendance, not to summarize important research results in 8 words.

    Second, the most common objection to the work boils down to "just use the cache". This point has been raised repeatedly on Slashdot over the past few years. However, if you read our papers or attend one of my colloquium talks (UCSC, May 22nd -- plug), you'll learn that LRU caching is inferior for a number of reasons. We were surprised by that result, but it's true. Putting a fake disk behind an IDE or SCSI interface is even worse, since that cripples bandwidth and flexibility.

    Third, for people worried about battery failures, the only question of interest is the MTBF of the system as a whole. All systems fail, which is why we keep backups and double-check them. If your disk failed every 3 days, you couldn't get work done, but there was a time when we dealt with a failure every few months. Conquest's MTBF hasn't yet been analyzed rigorously, but I believe it to be more than 10,000 hours, which is good enough to make it usable.

    Finally, I have chosen not to put my talk slides on the Web, at least not for the moment. But you're welcome to mail me with questions: geoff@cs.hmc.edu. It might take me a few days to answer, so be patient.
