GNOME GUI Data Storage

'Storage' to Replace Traditional Filesystems?

JigSaw writes "OSNews is reporting on Storage, an innovative project which aims to replace the traditional hierarchical filesystems with a new document store which is database-based (PostgreSQL). The current implementation, built under Gnome 2.x for now, offers natural language access, network transparency, and a number of other features. The project is currently in alpha (screenshots already available), and it is part of the next major generation of Gnome. It is currently developed by Seth Nickell, the person responsible for the enhanced Gnome usability on 2.x and its HIG, among other things."
This discussion has been archived. No new comments can be posted.

  • by kubla2000 ( 218039 ) on Friday September 05, 2003 @09:02AM (#6878435) Homepage

    Yeah, and as Longdong gets pushed back and delayed and delayed and pushed back and postponed and delayed, it'll be last to market but microsoft will still have been the first to announce it. I guess that's more innovative than they've been in the past when they'd simply wait for someone to do something interesting before buying them out.

    It's not enough to say. One has to do. Microsoft has proved many times over that it often makes grand announcements only to provide something far more watered down by the time they get to market.

    We'll see what their DB-based file system really is when (and if) it gets here.
  • My major concern with all these database type filesystems is that the gains are always shown as things like, "Find all films directed by Steven Spielberg", and yet this is not information that the computer can necessarily gather for itself.

    Outside of a work environment, I've rarely encountered anyone who keeps consistent, useful filenames, let alone metadata indexes; it seems to me that people will skimp on the metadata, and thus limit the usefulness to metadata that the computer can collect automatically ("All movies that last under 90 minutes"). It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order. I suspect the same will happen with these metadata systems; people won't do the work needed to make them truly useful.

  • by dabadab ( 126782 ) on Friday September 05, 2003 @09:07AM (#6878500)
    "Integrated mime-types. No more relying on file extensions and other hacks. The mime-type (and subsequent viewer) is right there in the query"

    And how does that metadata get to the db? Oh, right, it will rely on file extensions and other hacks :)
  • by azaroth42 ( 458293 ) on Friday September 05, 2003 @09:08AM (#6878510) Homepage
    Obvious disadvantages:

    SQL is slow compared to things like BerkeleyDB

    We already have journaled file systems that can save metadata (though not user defined, I think)

    Your database becomes corrupt, you lose everything.

    Sorry, give me something that gives me back my data -fast-. If I want to do selects for files, I'll use locate and xargs.

    --Azaroth
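
A minimal sketch of what a "select for files" without a database might look like, in the spirit of locate and xargs. It is written in Python rather than shell; the function name `select_files` and its parameters are illustrative assumptions, not part of any tool mentioned in the thread.

```python
import fnmatch
import os


def select_files(root, pattern="*", min_size=0):
    """Yield paths under root whose name matches pattern and whose size
    is at least min_size bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch.fnmatch(name, pattern):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_size:
                yield path
```

Every call walks the whole tree, which is the trade-off against an indexed store: nothing to corrupt, but each query costs a full traversal.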

  • by Wdomburg ( 141264 ) on Friday September 05, 2003 @09:10AM (#6878540)
    It seems silly to tie the implementation to a single database, when gnome-db is fast approaching 1.0.
  • by Viol8 ( 599362 ) on Friday September 05, 2003 @09:12AM (#6878558) Homepage
    "What this world needs is a really big injection of original thought"

    There are original ideas, they just don't make it into the PC world where MS dominates. MS come up with as many original ideas as McDonalds and since all KDE & Gnome (and frankly most open source projects) are doing is playing catchup with MS then originality is never going to be a prime concern.
  • Re:Windows? (Score:5, Insightful)

    by Zocalo ( 252965 ) on Friday September 05, 2003 @09:18AM (#6878610) Homepage
    Not quite, NTFS is a traditional file table with some bells and whistles, but it's not a "database" in the sense meant here(1). The next version of Windows, "Longhorn", is supposed to introduce a new file system called WinFS that will use a version of SQLServer as its backend. Whether they will actually deliver or not is another matter, since we were promised this in 1995 with Cairo and Taligent (remember them?), and now that Longhorn appears to have been pushed back...

    There are also issues with gaining acceptance for the change in the way things work. This kind of thing has not really been done on a large scale in the wild before, on any OS, so whether people will be willing to accept the security and reliability issues that may ensue is another matter. For example, what are the implications of a compromise in the database engine? MS is planning on using SQL, so if things go awry and it becomes possible to maliciously inject raw SQL to the filesystem interface... Oops. On the other hand, the benefits for data retrieval are *huge*. Imagine being able to find any audio files on your entire system by Justin Timberlake or Britney Spears and delete them all in one go by searching on the tag fields! ;)

    (1) Technically, all filesystems are databases, it's just that current ones are a collection of flatfile database tables that can point to each other, generally in a hierarchical manner. When people say "database" in the same sentence as "filesystem" they usually mean "relational database". As an aside however, high end databases usually forgo the need for a file system and provide the ability to write their tables directly to disk on a dedicated partition.

  • by kfg ( 145172 ) on Friday September 05, 2003 @09:21AM (#6878637)
    "Well, where do you go?"

    "Stanford."

    "No problemo, I'm heading that way later and I can grab it for you. What's your room?"

    "Dorm 5, Room 109. It's the desk on the left."

    ( We didn't bother to state earth.us because we were already inside those directories)

    Yes, yes we do think hierarchically. Most of the history of human thought has been fitting everything we can lay our filthy little brain cells on into hierarchies, whether they wish to fit into them or not. It's intuitive.

    As for natural language didn't we learn about that with COBOL? Natural language only speeds the learning process slightly ( the majority of the learning still lying in the realm of understanding the basic concepts involved), but then becomes a pain in the ass forever afterward.

    Looking at the screenshots it's also ugly as all sin. The physicist in me can't help but feel that a model that ugly can't possibly be correct.

    I think this makes just about as much sense as using a document preparation language (XML) as the basis of a database.

    Which is to say, none.

    KFG
  • by Tom ( 822 ) on Friday September 05, 2003 @09:21AM (#6878642) Homepage Journal
    That's why we have community products. For music, CDDB works pretty good and is a working real-life example.

    Other metadata is automatically inserted. When you install OpenOffice, it asks for your name and inserts that as the author into any new documents you create, for example.

    Sure, the metadata on my personal machine will never be comparable to what a library could do. But it doesn't have to be - it has to be useful for me, not - like the library - for thousands of people with very different interests and approaches.
  • by noselasd ( 594905 ) on Friday September 05, 2003 @09:25AM (#6878674)
    Right. Individuals hate having their stuff messed with. So files are kept in a mess, but _you_ know where they are. At least until you use Storage, which will mess them up.
  • by Anonymous Coward on Friday September 05, 2003 @09:35AM (#6878766)
    There are two kinds of metadata: intrinsic and extrinsic. Intrinsic metadata, as the name would imply, is information that's contained entirely within the file.

    Some intrinsic metadata can be extracted with automation. For example, it's pretty trivial to examine a TIFF and tell you that it's X by Y pixels in a given color space. It's harder, but still possible, to tell you that the TIFF is predominantly red and green. It's impossible for the computer to tell you that it's a picture of a barn.

    The same is true of extrinsic metadata: some of it can be extracted automatically, but not much. An example of extrinsic metadata would be a relationship. The computer can tell you that main.c and somefunction.c are both C language source code files, but it may or may not be able to tell you that they're both part of the same program. If the two files are explicitly related to each other through a makefile or some such, then the computer can know that they're related. But consider the collection of JPEGs I just copied to my home directory from my digital camera. A dozen of those pictures were taken in Fiji. The computer cannot know this unless I tell it, nor can it know which pictures were taken in Fiji and which were taken in my back yard last Tuesday. Thanks to my camera, the computer can know what aperture and focal length were used for each picture. In theory, if my camera had a GPS receiver in it, the computer might even be able to tell me where on earth the camera was when it snapped a given photo. But these are just automatic methods of telling the computer about the pictures. They're conceptually no different from sitting down and typing the information in. The point remains that the computer can figure out some things on its own, but it cannot know most things unless it is told.

    You don't have to strain your imagination to think about this stuff. Consider your MP3 collection. Your computer can tell you that a given MP3 is 6:15 long, and that it's 192 kbps, and that it's stereo. It can't know that it's "Treefingers" by Radiohead unless somebody tells it first.

    So you're basically right: automatically extracted metadata is marginally useful, but the really useful stuff has to be manually entered. And generally speaking, even in business environments, that sort of information simply never makes it into the database. It exists exclusively in people's heads.

    That's the biggest challenge of digital asset management--which is, incidentally, essentially what we're talking about here. The biggest challenge is how to take information that people have in their heads and store it in some structured, persistent form. That form might be a three-ring binder or a card catalog or an Oracle database; the challenge is the same even if the technology is different.

    Bottom line: this technology is really neat, and has limited applications in which it's very useful. But it is not generally useful, nor does it have widespread applications.
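
A rough sketch of the "intrinsic metadata" case described above: pulling pixel dimensions out of an image header with no outside help. It uses PNG instead of the TIFF from the comment, only because the PNG header layout is simpler to parse; `intrinsic_metadata` is a hypothetical helper, not something from Storage.

```python
import os
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def intrinsic_metadata(path):
    """Return metadata that can be derived from the file alone."""
    meta = {"size_bytes": os.path.getsize(path)}
    with open(path, "rb") as f:
        header = f.read(24)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk,
    # whose first two fields are width and height as big-endian uint32.
    if header[:8] == PNG_SIGNATURE and header[12:16] == b"IHDR":
        width, height = struct.unpack(">II", header[16:24])
        meta.update(format="PNG", width=width, height=height)
    return meta
```

Nothing here can say that the picture shows a barn or was taken in Fiji; that kind of field would still have to be supplied by a person or by the device that created the file.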
  • by Anonymous Coward on Friday September 05, 2003 @09:43AM (#6878839)
    XFS limits user-defined extended attributes to 32 KB. Big, but not unlimited.

    Also, extended attributes are fundamentally broken because they're stored in the inode. They do not survive, for example, a copy operation. Worse, they do not survive an open/save cycle in most cases, because most programs do not write to open files. Instead, they open a new file under a temporary name, write the data into it, close it, unlink the original file, then rename the temporary file to the original file's name. That way the data is safe if the program or computer fails during the save operation. This creates a new inode for the file data, however, which means extended attributes go bye-bye.

    Extended attributes are not the answer. I'm not sure exactly what the question is, but I'm sure extattr are not the answer.
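
The save pattern described above can be sketched as follows, assuming a POSIX-style filesystem. `safe_save` and `inode_changes_on_save` are illustrative names; the sketch only shows that the saved file ends up on a different inode, which is why anything attached to the old inode is lost unless the program copies it across explicitly.

```python
import os
import tempfile


def safe_save(path, data):
    """Write data to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on failure
        raise


def inode_changes_on_save(path, data):
    """Return (inode_before, inode_after) around a safe_save call."""
    before = os.stat(path).st_ino
    safe_save(path, data)
    after = os.stat(path).st_ino
    return before, after
```

Because the temporary file and the original exist at the same time, the two inode numbers differ, even though the file name never changes from the user's point of view.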
  • Nope (Score:5, Insightful)

    by varjag ( 415848 ) on Friday September 05, 2003 @09:49AM (#6878901)
    > SQL is slow compared to things like BerkeleyDB

    BerkeleyDB is a hierarchical database. SQL is godzillion times faster on complex searches.

    > Your database becomes corrupt, you lose everything.

    Your filesystem becomes corrupt, you lose everything.

    And yeah, I know about journaling, so don't bother :) But modern RDBMSes have integrity control facilities as well.
  • by nuggz ( 69912 ) on Friday September 05, 2003 @09:54AM (#6878962) Homepage
    What we need is to get some information storage/retrieval experts to provide some guidance to the developers of these ideas.

    Librarians have been working on these problems for centuries, why not start with what they know?
  • by selderrr ( 523988 ) on Friday September 05, 2003 @09:54AM (#6878972) Journal
    Very true and insightful !

    Another argument to prove you right is the "rate this song" option in iTunes. With that feature, one can assign 1 to 5 stars to a song so that later, you can quickly select your favourites. Such a system is flawed in 2 ways:

    - I have yet to encounter anyone who uses it exhaustively. Most folks rate a few dozen songs. I have a library with 9000 mp3s and sure as hell I'm not going to spend a whole week rating them.
    - I have yet to encounter someone who uses it consistently. Today I might consider Chris Isaak a 4-star song since I'm in a depressed mood and it's raining outside. Tomorrow the weather might be beautiful and I mod him down to 2 stars cause he's a bloody negative wanker.

    This is of course iTunes specific, but it shows that the assignment of metadata is far far far more complex than the methods to search/organize the stuff, which is what the "Storage" software above is about.

    As an extra complication: consider that my metadata might not match someone else's. For instance if I were to label a mail "message from my brother", the same content would be "message from my son" for my father!

    The fact that metadata based filesystems are not on our desktop is perhaps due more to the fact that they are not a valid solution for data on desktop computers. Maybe, just maybe this is not due to MS squashing Be, as someone else was karmawhoring above.
  • by ReelOddeeo ( 115880 ) on Friday September 05, 2003 @10:02AM (#6879059)
    I think Longhorn will be the first Windows with a database filesystem. It will probably be based on SQL Server

    First, about being first. Microsoft will have the First GUI. Microsoft will have the First internet web browser. Microsoft will have the first 32-bit clean API. Back in 1982, some big fat PC magazine (not Byte, but one with PC in the name) said that MS-DOS 2.0 would be the First OS to have a hierarchical filesystem! I think I could go on and on, but I trust my point is clear regarding Microsoft having the first database filesystem which they most certainly do not. (Can you say BeOS.)


    I think Longhorn will be the first Windows with a database filesystem. It will probably be based on SQL Server

    Second, Microsoft wants their database based fileserver to be reliable. So maybe it will be secretly based on MySQL. :-) Ooops, wrong license. I meant PostgreSQL.
  • by Twylite ( 234238 ) <twylite&crypt,co,za> on Friday September 05, 2003 @10:13AM (#6879169) Homepage

    Summary of developments:

    • BeOS has a good idea
    • Microsoft announces a breakthrough in file system technology (around 1996), nothing happens
    • newdocms [m-arriaga.net] announced on Slashdot [slashdot.org] in January 2003. Integrates with KDE, so no-one cares
    • Microsoft announces WinFS plans for Longhorn. Slashdot decides that Microsoft sucks.
    • Initial release of Haystack [mit.edu] from MIT. Screenshot has XP interface so no-one gives a toss
    • WinFS is reviewed, Slashdot [slashdot.org] has a flame war about file system layout, and concludes that MS sucks and a database file system is a stupid idea anyway and no-one wants one
    • YADFS (Yet Another Database File System) announced calling itself "Storage". Integrates with GNOME. FLOSS community bows and worships the superiority, leadership and sheer innovativeness of the application.
  • Re:i think (Score:4, Insightful)

    by glwtta ( 532858 ) on Friday September 05, 2003 @10:20AM (#6879241) Homepage
    and users weren't forced to learn the artificial concept of a "file"

    Um, artificial as they may be, these so called "files" have been around for some time, in fact long before computers. Users can quite intuitively understand the concepts of "file" and "folder." I really think you are trying to make the difference seem greater than it actually is. (on the user side, that is)

  • Re:Windows? (Score:3, Insightful)

    by Zocalo ( 252965 ) on Friday September 05, 2003 @10:22AM (#6879244) Homepage
    However, I do suspect that any robust interface would take a look at the tags, and if they are empty attempt to parse the filename.

    Actually, I was just thinking about this problem, and you know what would make a *really* easy solution and is readily available already? P2P! Think about it; a new file arrives on the system by whatever means, so the file system has zero idea about its nature beyond what's available from the file. We probably know the type of file from its header, extension or whatever other "file" command type trick was required. We also know its size, any tag type information that may be present, the filename, and we could maybe calculate a checksum too. So we fire off a P2P query with what we have and what we want to know, then wait for responses.

    Sure, you will probably get responses that conflict, so some kind of progressive weighting and elimination system is required. If you search on Kazaa and look at the meta info returned, it's fairly easy to see what is correct and what is not; automating this analysis is the next step. There is also the probability of CDDB type services springing up to act as the "Supernodes" of such a system, or as dedicated standalone services.

    Of course, you probably wouldn't want the OS doing this for you automatically. Imagine the fun and games that would ensue if you started getting Bill G. sending out P2P queries to fill in the meta tag blanks on a document about "increasing revenue through tweaking our licensing strategy again"! ;)
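
A small sketch of the two halves of that idea, under stated assumptions: `fingerprint` gathers what the local system already knows about a file (name, size, checksum), and `resolve` is a deliberately naive stand-in for the "weighting and elimination" step, using a simple majority vote. The network query itself is left out, and both function names and the `min_share` threshold are hypothetical.

```python
import hashlib
import os
from collections import Counter


def fingerprint(path, chunk_size=65536):
    """Collect what the system can know about a file without outside help."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "name": os.path.basename(path),
        "size": os.path.getsize(path),
        "sha256": digest.hexdigest(),
    }


def resolve(responses, field, min_share=0.5):
    """Pick the most common value for field if enough peers agree, else None."""
    votes = Counter(r[field] for r in responses if r.get(field))
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    # Accept the winner only if it holds more than min_share of the votes.
    return value if count / sum(votes.values()) > min_share else None
```

A real system would presumably weight peers by reputation instead of counting every answer equally; the majority vote is only the simplest thing that handles conflicting responses.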

  • by fault0 ( 514452 ) on Friday September 05, 2003 @10:25AM (#6879275) Homepage Journal
    Microsoft didn't really put in much investment in Cairo after it was pretty apparent that nobody really cared for it at the time. Most people really don't like novel ways of doing things. There is too much investment in the old ways atm. I guess if the world were different, we would all be using Microsoft Bob right now.

    So, I think this GNOME thing will also fizzle out after a while.
  • Built for gnome? (Score:2, Insightful)

    by adrianbaugh ( 696007 ) on Friday September 05, 2003 @10:39AM (#6879427) Homepage Journal
    Is it just me that sees this as a really bad idea? Nothing against the gnome project, you understand, but I see no earthly reason why a filesystem should require X Windows, let alone a full-blown desktop environment. Surely this kind of thing should be a kernel-level project which userspace tools can hook into as needed, whether from gnome or KDE or the CLI?

    Anyway, I thought Reiser4 was doing exactly what this promises, but with the advantage of a proven track record on high-performance filesystems. Perhaps, if gnome wants this kind of functionality, they should base it on Reiser4 which will at least be widely-used and not locked into the gnome project.
  • by Zocalo ( 252965 ) on Friday September 05, 2003 @10:46AM (#6879488) Homepage
    Yeah, but as I mentioned in an earlier post, *all* filesystems are databases of some type, it's just a matter of context. Generally, when someone says a "database filesystem" today, what they actually mean is "a relational database driven, virtual filesystem providing an infinite variety of views onto a soup of metadata". I think I prefer the former and leaving the rest up to inference, but I'm sure that when these new products finally ship the marketroids are going to think otherwise.

    I do deserve my wrists slapping though... I'd completely forgotten about BeOS! For shame!

  • Re:ext3 + sql (Score:3, Insightful)

    by stephenbooth ( 172227 ) on Friday September 05, 2003 @10:47AM (#6879498) Homepage Journal

    It also solves the problem of where to file things: if I have a letter to Smith and Sons (Builders) about a bridge construction project, do I store it under letters/Smith&Sons/Bridge or Smith&Sons/letters/bridge or bridge/Smith&Sons/letters or projects/builders/bridge/smith&sons/letters or what? With a database based file system you store it once and tag it as a letter, to Smith and Sons (Builders), about the bridge construction project (and any other tags you would like to apply to it), and then when you want to find it again you can just go through whichever access path you like. Also you can then find all letters, all documents relating to bridge construction, all documents relating to Smith&Sons, &c.

    Last time I posted about something like this a couple of people posted responses stipulating an 'ideal' directory path. To preclude this I must point out that whatever method of classification makes most sense to you and the tasks you do won't make sense to someone else and the tasks they do. For example, to a project manager it makes sense to classify documents by the projects they are part of, then probably by document type, as they are only interested in the projects they manage and don't want to have to navigate through multiple branches of a directory tree to locate all of the files for their projects (if they are working on the bridge construction project they want all the documents to be under a directory called 'bridge_construction'). But for a financial director it makes sense to classify the documents by type, as they are only interested in financial documents and don't want to have to navigate through multiple branches of a directory tree to find all of them (if they are looking for all invoices from Smith and Sons they want to find them all under invoices/smith&sons). A database based storage system allows everyone to view the data in the way that makes most sense to them and the way that they work.

    Stephen
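
The "store it once, tag it, retrieve it by any path" idea can be sketched with a plain relational schema. This uses Python's built-in sqlite3 module purely for illustration (the Storage project itself is described as PostgreSQL-based), and the table names, tag strings and helper functions are all assumptions.

```python
import sqlite3


def open_store():
    """Create an in-memory store with documents and a many-to-many tag table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tags (
            doc_id INTEGER NOT NULL REFERENCES documents(id),
            tag TEXT NOT NULL,
            PRIMARY KEY (doc_id, tag)
        );
        """
    )
    return db


def add_document(db, title, tags):
    """Store the document once and attach any number of tags to it."""
    doc_id = db.execute("INSERT INTO documents (title) VALUES (?)", (title,)).lastrowid
    db.executemany("INSERT INTO tags (doc_id, tag) VALUES (?, ?)", [(doc_id, t) for t in tags])
    return doc_id


def find(db, *tags):
    """Return titles of documents carrying every one of the given tags."""
    placeholders = ",".join("?" * len(tags))
    rows = db.execute(
        f"""
        SELECT d.title FROM documents d
        JOIN tags t ON t.doc_id = d.id
        WHERE t.tag IN ({placeholders})
        GROUP BY d.id
        HAVING COUNT(DISTINCT t.tag) = ?
        ORDER BY d.title
        """,
        (*tags, len(set(tags))),
    )
    return [title for (title,) in rows]
```

With this layout the project manager can ask `find(db, "bridge")` and the financial director `find(db, "invoice", "smith-and-sons")`, and neither view requires the documents to be filed under a particular directory.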

  • by ryanvm ( 247662 ) on Friday September 05, 2003 @10:54AM (#6879573)
    Good points, but you're just talking about subjective metadata. The usefulness of which is certainly debatable. But what about factual metadata? Consider a downloaded movie that may have fields like: Year, Director, Rating, Genre, Studio, Cast, etc.

    Granted the end user is not going to be likely to maintain this information, but that doesn't really matter. The end user is also not likely the source of the material in the first place. I contend that metadata is more useful for material that the user has downloaded or purchased. That data SHOULD have accurate metadata and could be extremely useful.

    Consider the following meta-search: ALL MOVIE FILES WITH YEAR>2007 AND RATING>X AND GENRE=SILENT AND CAST='BRITNEY SPEARS'
  • Re:i think (Score:3, Insightful)

    by mangu ( 126918 ) on Friday September 05, 2003 @10:57AM (#6879602)
    What programmers want to be able to do is manipulate data structures and store them persistently


    What programmers want is to be able to manipulate data. Period. What's so good about unix is that everything is a "file"; for instance, you can manipulate data coming from a sound card with exactly the same code you use to manipulate a sound file. You can't do this with this so-called "storage".
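
A tiny illustration of the "everything is a file" point: the function below only needs an object with a `read` method, so the same code works on a file on disk, a pipe, an in-memory buffer, or (on systems that expose one) a sound device node. The sample format, unsigned 8-bit with silence at 128, is an assumption made to keep the sketch short.

```python
import io


def peak_amplitude(stream, chunk_size=4096):
    """Return the largest deviation from silence (128) in unsigned 8-bit samples."""
    peak = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Iterating over bytes yields integers in the range 0..255.
        peak = max(peak, max(abs(sample - 128) for sample in chunk))
    return peak
```

For example, `peak_amplitude(io.BytesIO(data))` and `peak_amplitude(open(path, "rb"))` go through exactly the same code path.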

  • Re:Nope (Score:2, Insightful)

    by ednopantz ( 467288 ) on Friday September 05, 2003 @12:02PM (#6880222)
    Real world non-Geek use:

    Sales rep needs the revised proposal on the Henderson account!

    Computer: Get all Word documents emailed by Bob between last Thursday and today that contain the word 'Henderson' except where they also contain the phrase 'unreasonable, demanding client who is not worth our time'. --A snap for a db.

  • Re:Nope (Score:3, Insightful)

    by nosferatu-man ( 13652 ) <spamdot@homonculus.net> on Friday September 05, 2003 @12:52PM (#6880633) Homepage
    Who cares about the user? The system itself would have capabilities finally expressible in terms more advanced than those constrained by what some drug-addled graduate student with no maths decided was sufficient in 1971.

    In other words, the user might never think to do that, but it'd be so cheap for the operating environment that all kinds of new applications would appear.

    'jfb
  • by EarthTone ( 12574 ) on Friday September 05, 2003 @01:02PM (#6880722)
    Well sure, but it hasn't successfully been done on a desktop OS yet.
  • by penguin7of9 ( 697383 ) on Friday September 05, 2003 @01:04PM (#6880737)
    Of course, databases are very useful for organizing user data. People already keep PIM info, images, and lots of other stuff in databases. Lotus Notes is built entirely on databases.

    But "replacing the traditional file system" carries with it the notion of ripping ext3 out of the kernel and putting a relational database there. That's a very bad idea. Databases don't belong into the kernel. They are far too inefficient to handle most storage needs, they are far too complex to go into the kernel, and they just don't need to be in the kernel. Operating system kernels need simple, fast storage systems. Something like ext3. ReiserFS is pushing the limits. PostgreSQL would be going too far.

    As an aside, this is an idea that just about every nerd has when they learn about databases and retrieval. It's been tried various times since the 1960's. There are probably good reasons why interfaces don't use them. Perhaps most importantly, keep in mind that the vast majority of files on your system are not user files, they are bits and pieces of the operating system. And for the files that actually are used by users (mail, PIM info, images, text, etc.), they usually already have special-purpose database interfaces available to them as part of the applications that users use to access them.
  • by FooBarWidget ( 556006 ) on Friday September 05, 2003 @01:45PM (#6881170)
    "WinFS is reviewed, Slashdot [slashdot.org] has a flame war about file system layout, and concludes that MS sucks and a database file system is a stupid idea anyway and no-one wants one"

    Wrong! Slashdot concluded that WinFS will make computing soooo much easier that it will blow the competition out of the sky and that if Linux doesn't catch up fast it will die off.

    Open your eyes people. Slashdot is not an anti-MS site anymore!!!
  • Re:Windows? (Score:3, Insightful)

    by netsharc ( 195805 ) on Friday September 05, 2003 @01:59PM (#6881324)
    It's innovative because it's an idea implemented on Linux, whereas when it's to be implemented on Windows it's a lousy idea (well, lousy because of 3rd party compatibility nightmares).
  • by smallpaul ( 65919 ) <paul @ p r e s c o d . net> on Friday September 05, 2003 @03:39PM (#6882185)
    The file should be self-describing. It should have a header saying its type. You can never trust intermediary software to properly keep data and metadata together. The problem isn't just other operating systems. It is file formats like ZIP and protocols like FTP. Plus there is a problem that the file type a user gives a file on their computer may be just a means of triggering a bit of software (e.g. change a JSP file to HTML so it launches your HTML editor). But the intrinsic type of the file should not be corrupted by these user preferences.
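
A sketch of the difference between a file's declared type and its intrinsic type. `declared_type` uses Python's standard `mimetypes.guess_type`, which looks only at the name; `sniff_type` checks a handful of well-known leading byte sequences. The short signature table is illustrative and far from complete.

```python
import mimetypes

MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]


def declared_type(path):
    """Guess the type from the file name alone, as extension-based systems do."""
    return mimetypes.guess_type(path)[0]


def sniff_type(path):
    """Guess the type from the file's leading bytes, ignoring its name."""
    with open(path, "rb") as f:
        head = f.read(16)
    for magic, mime in MAGIC_NUMBERS:
        if head.startswith(magic):
            return mime
    return None
```

A PDF that a user has renamed to `page.html` is declared as HTML but still sniffs as PDF, which is the sense in which the header survives renames, archives and transfers.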
  • Re:i think (Score:1, Insightful)

    by Anonymous Coward on Friday September 05, 2003 @03:40PM (#6882202)
    Except Unix never decided on a standard representation of these data objects. That led to all sorts of bizarre and flaky program problems when the files were just so, and a very low level of integration between different programs.

    Any attempt to standardize file data objects (recently, XML) inevitably devolved into a discussion of "MY Favorite Format" and "Look Ma, I wrote another parser!" and "Awk skilz make my dick look bigger" and "At least it's not The Registry!".

    Thus, we must declare Unix's approach a failure: Not so much on the technical level and the pure genius of its simplicity, but a failure on the social level in that we gave Unix 20 years to play with its file streams and it did not come back with the applications that people wanted to use.
  • Re:BeFS (Score:3, Insightful)

    by cpeterso ( 19082 ) on Friday September 05, 2003 @05:06PM (#6883003) Homepage

    there's hope that MacOS X's filesystem will start incorporating the rich-metadata, dynamic view model of the world.

    you mean like Mac OS 9 and earlier?
  • by Tracy Reed ( 3563 ) <treed AT ultraviolet DOT org> on Saturday September 06, 2003 @02:10AM (#6885948) Homepage
    Not really a comment on "storage" but just a comment on something that has constantly bugged me when someone says "let's put it in a database!"

    A filesystem is a special case of a database. So it is perfectly acceptable to store your data into a filesystem. Some people seem to think everything has to be put into a relational database or that it is somehow cool to do so. I have seen people store loads of graphics as BLOBS in databases. Someone once suggested storing a ton of MP3's in a database. Most recently someone said (and this isn't the first time) that we should store all of the emails in a database. It's just another unnecessary layer of complication, especially when you are going to be referencing the email/graphic/mp3 by name all the time anyway (fs's like reiserfs index on name so it's blazingly fast) and not by a bunch of other pieces of meta-data. And if you are going to need to do lookups by various bits of meta-data then store the meta-data in a db and also store a record pointing to the actual file on disk. I have done that lots of times and it works great.
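
The "metadata in a db, file on disk" arrangement might look roughly like this. The sketch assumes sqlite3 as the database and records only metadata the system can gather itself (extension, size, modification time); `build_index` and `read_matching` are hypothetical names.

```python
import os
import sqlite3


def build_index(db, root):
    """Record one row per file under root; the bytes themselves stay on disk."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, ext TEXT, size INTEGER, mtime REAL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            ext = os.path.splitext(name)[1].lower()
            db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (path, ext, info.st_size, info.st_mtime),
            )
    db.commit()


def read_matching(db, ext):
    """Look up paths by metadata, then read each file from the filesystem."""
    rows = db.execute("SELECT path FROM files WHERE ext = ? ORDER BY path", (ext,)).fetchall()
    contents = {}
    for (path,) in rows:
        with open(path, "rb") as f:
            contents[path] = f.read()
    return contents
```

The database stays small because it never holds the payload, and a lookup by name still goes straight to the filesystem without touching the index at all.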
