
'Storage' to Replace Traditional Filesystems? 599

Posted by michael on Friday September 05, 2003 @07:52AM from the find-porn dept.

JigSaw writes "OSNews is reporting on Storage, an innovative project which aims to replace the traditional hierarchical filesystems with a new document store which is database-based (PostgreSQL). The current implementation, built under Gnome 2.x for now, offers natural language access, network transparency, and a number of other features. The project is currently in alpha (screenshots already available), and it is part of the next major generation of Gnome. It is currently developed by Seth Nickell, the person responsible for the enhanced Gnome usability on 2.x and its HIG, among other things."

This discussion has been archived. No new comments can be posted.


i think (Score:2, Interesting)

by Tirel ( 692085 ) writes:

it's better for programs to abstract data like that, the fs should only provide access to the medium, nothing else.
- Re:i think (Score:2)
  
  by eric76 ( 679787 ) writes:
  
  I agree completely.
  
  On the surface, using a database type file system where files are just objects stored in the database along with other things seems like a great idea.
  
  But I think that the result will probably be less resilient to damage and result in an increased possibility of losing your data or finding them corrupted.
- Re:i think (Score:5, Interesting)
  
  by laird ( 2705 ) writes: <lairdp@gmaiTIGERl.com minus cat> on Friday September 05, 2003 @08:39AM (#6878811) Journal
  
  I disagree, strongly. Files are an artifact of a bunch of bad implementation decisions when stripping Multics down to produce UNIX. What programmers want to be able to do is manipulate data structures and store them persistently. What files force you to do is waste tons of time writing code to take your data structures and write them out as sequences of bytes and read them back in.
  
  One OS that solved this nicely was NewtonOS. If you wanted to manipulate persistently stored data you opened a "soup" that contained objects. So if you wanted to, say, set up an appointment with someone for lunch, you could find the person in the address book "soup" and then create an entry in the databook "soup" recording the appointment, which would immediately appear in all other apps that dealt with appointments (because apps accessed the same data structures, and were notified of changes so that they could update). So your data was not trapped in a particular application's proprietary format, and users weren't forced to learn the artificial concept of a "file" but instead could think about "my appointments" or "my address book".
  
  If you haven't tried it, don't knock it. As a developer, and as a user, it was wonderful -- much more straightforward than "files" and "directories".
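
A minimal Python sketch of the shared-store idea described above, not NewtonOS's actual API: `Soup`, `subscribe`, `add` and `query` are hypothetical names chosen for illustration. It shows two pieces of the description: applications look up entries in a common store instead of parsing each other's files, and a subscribed application is notified as soon as another one adds an entry.

```python
class Soup:
    """Hypothetical shared object store, loosely modeled on the 'soup' idea."""

    def __init__(self, name):
        self.name = name
        self.entries = []
        self._listeners = []

    def subscribe(self, callback):
        """Register a function to be called with every newly added entry."""
        self._listeners.append(callback)

    def add(self, entry):
        """Store an entry and notify every subscriber immediately."""
        self.entries.append(entry)
        for callback in self._listeners:
            callback(entry)
        return entry

    def query(self, **criteria):
        """Return entries whose fields equal all the given criteria."""
        return [
            entry for entry in self.entries
            if all(entry.get(key) == value for key, value in criteria.items())
        ]


address_book = Soup("address book")
datebook = Soup("datebook")

# A second "application" that only displays appointments.
calendar_view = []
datebook.subscribe(calendar_view.append)

address_book.add({"name": "Alice", "phone": "555-0100"})
person = address_book.query(name="Alice")[0]
datebook.add({"what": "lunch", "with": person["name"], "when": "2003-09-08 12:00"})
```

After the last line, `calendar_view` holds the lunch entry even though the calendar code never polled or re-read anything.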
  
  - Re:i think (Score:4, Insightful)
    
    by glwtta ( 532858 ) writes: on Friday September 05, 2003 @09:20AM (#6879241) Homepage
    
    and users weren't forced to learn the artificial concept of a "file"
    Um, artificial as they may be, these so called "files" have been around for some time, in fact long before computers. Users can quite intuitively understand the concepts of "file" and "folder." I really think you are trying to make the difference seem greater than it actually is. (on the user side, that is)
    
  - Re:i think (Score:3, Insightful)
    
    by mangu ( 126918 ) writes:
    
    What programmers want to be able to do is manipulate data structures and store them persistently
    
    What programmers want is to be able to manipulate data. Period. What's so good about unix is that everything is a "file"; for instance, you can manipulate data coming from a sound card with exactly the same code you use to manipulate a sound file. You can't do this with this so-called "storage".
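
To illustrate the point above, here is a small Python sketch in which one function consumes raw 16-bit PCM samples from anything that offers `read()`. The function name `peak_amplitude` and the sample values are made up for the example, and a pipe stands in for a device node, since no real sound card is assumed.

```python
import io
import os
import struct


def peak_amplitude(stream, chunk_size=4096):
    """Return the largest absolute 16-bit little-endian sample in a byte stream."""
    peak = 0
    leftover = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = leftover + chunk
        usable = len(data) - (len(data) % 2)  # only whole 2-byte samples
        for (sample,) in struct.iter_unpack("<h", data[:usable]):
            peak = max(peak, abs(sample))
        leftover = data[usable:]
    return peak


samples = struct.pack("<4h", 100, -2000, 1500, 7)

# Source 1: an in-memory buffer, standing in for a sound file.
from_memory = peak_amplitude(io.BytesIO(samples))

# Source 2: a pipe, standing in for a device that produces samples.
read_fd, write_fd = os.pipe()
os.write(write_fd, samples)
os.close(write_fd)
with os.fdopen(read_fd, "rb") as pipe:
    from_pipe = peak_amplitude(pipe)
```

The function never asks what kind of object it was handed; both sources give the same answer.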
  - Re:i think (Score:4, Interesting)
    
    by spitzak ( 4019 ) writes: on Friday September 05, 2003 @11:20AM (#6880381) Homepage
    
    "Files" are not a bad idea. It is nice to have an interface of commands that is limited in size and easily serialized (ie open/read/write/seek). If Unix had instead mmap'd files in its original design there would probably not be transparent access to network file systems or many of the other things we take for granted today. So the design of files was actually a huge win.
    
    1. The primary problem is implementation. Filesystems today are designed to store small numbers of very large files (ie more than 1K in size). Anybody who wants to store "objects" that are smaller than about 1K in size (like if you are implementing a "registry", for instance) is forced to write or use a database program, with needless complexity, to force all this data into a single file, so that it can be stored efficiently. What we need is a design where tiny files (like 4 bytes) can be stored efficiently.
    
    Supposedly ReiserFS addresses this, but it is not clear if it does the necessary level of compression: ideally if you had 100 files with the same 50-bytes name and 1 byte stored in them, all those names would be in the same 50 bytes on the disk.
    
    Sadly NOBODY seems to be trying this, and keep spouting "attributes" and "registry" and "config file". Those are all work-arounds for poor file systems.
    
    All files must have the capability of being a "folder" and having subfiles. Any time anybody says "attributes" this should mean this sort of subfile.
    
    2. The other problem is the blinders so people believe the "filename" is some sort of user-friendly data. This leads to brain-dead ideas like "case independence" and "wide characters" and the fact that certain bytes like "/" and zero are disallowed. This requires programs to cook data in strange ways to use it as indexes into the filesystem. This used to be true of the *data* in old systems, and we know now how horrid that was (only a rudimentary piece of that old stupidity remains in Windows text/binary distinction but I hope newer Windows systems will move that out of the kernel).
    
    The filesystem should identify files with a counted length of bytes, just like the data in the file. In fact "name" should be a subfile of any file, and you rename it by writing a new "name". I don't think this can be solved without fixing existing filesystems.
    
    (For "user friendly" names some form of quoting is going to be necessary. Since Windows has made "\" useless I would use that for quoting. "\0" is a null, "\\" is a backslash, "\/" is a forward slash. Just "/" itself indicates a break between hierarchy levels. For semi-Windows compatibility you can also make just "\" followed by an unassigned code also mean a break between hierarchy levels.)
    
    3. The other thing that is needed (but could be done atop existing implementations) is to change the model of files. They should be "atomic" in that when you open a file for writing, you get an empty file, and this is invisible to any other program. The file only appears at the moment you close it, and only to programs that then open it for reading (programs with the same name already open continue to see the old file). Current files where you can replace a block in the middle are a special case that only a few programs use, and support can be operating-system dependent (and while you are at it, try making it so you can insert or delete data and not just overwrite).
    
    4. As for "database" this can all be done with symbolic links (which can be implemented atop any file system which efficiently stores identical small blocks of data).
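
Point 3 above can be approximated on current filesystems. A minimal sketch, assuming a POSIX system where a rename within one filesystem is atomic; `atomic_write` is a hypothetical helper, not an existing library call.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to a hidden temporary file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk first
        os.replace(tmp_path, path)  # the new file "appears" only here
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "notes.txt")
    atomic_write(target, b"old contents")

    with open(target, "rb") as reader:  # opened before the replacement
        atomic_write(target, b"new contents")
        seen_by_old_reader = reader.read()

    with open(target, "rb") as fresh:  # opened after the replacement
        seen_by_new_reader = fresh.read()

    leftovers = os.listdir(workdir)
```

On POSIX the reader that already had the file open keeps seeing the old bytes, and no partially written file is ever visible under the target name. This is the same write-then-rename pattern many editors use; the proposal above would make it the default rather than something each program implements.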
    
so is everyone copying BeOS (Score:4, Interesting)

by Anonymous Coward writes: on Friday September 05, 2003 @07:57AM (#6878394)

It's really sad that there was a perfectly good implementation of a database file system, but the company wasn't able to topple a monopoly and got squashed. MS really should have just bought BeOS and ported everything over to it. They could have just called it LongHorn and released it this year instead of waiting until 2006.

- Re:so is everyone copying BeOS (Score:5, Insightful)
  
  by Twylite ( 234238 ) writes: <twylite@cr[ ].co.za ['ypt' in gap]> on Friday September 05, 2003 @09:13AM (#6879169) Homepage
  Summary of developments:
  
  BeOS has a good idea
  
  Microsoft announces a breakthrough in file system technology (around 1996), nothing happens
  
  newdocms [m-arriaga.net] announced on Slashdot [slashdot.org] in January 2003. Integrates with KDE, so no-one cares
  
  Microsoft announces WinFS plans for Longhorn. Slashdot decides that Microsoft sucks.
  
  Initial release of Haystack [mit.edu] from MIT. Screenshot has XP interface so no-one gives a toss
  
  WinFS is reviewed, Slashdot [slashdot.org] has a flame war about file system layout, and concludes that MS sucks and a database file system is a stupid idea anyway and no-one wants one
  
  YEDFS (Yet Another Database File System) announced calling itself "Storage". Integrates with GNOME. FLOSS community bows and worships the superiority, leadership and sheer innovativeness of the application.
  - Re:so is everyone copying BeOS (Score:3, Insightful)
    
    by FooBarWidget ( 556006 ) writes:
    
    "WinFS is reviewed, Slashdot [slashdot.org] has a flame war about file system layout, and concludes that MS sucks and a database file system is a stupid idea anyway and no-one wants one"
    
    Wrong! Slashdot concluded that WinFS will make computing soooo much easier that it will blow the competition out of the sky and that if Linux doesn't catch up fast it will die off.
    
    Open your eyes people. Slashdot is not an anti-MS site anymore!!!
  - Re:so is everyone copying BeOS (Score:3, Informative)
    
    by leandrod ( 17766 ) writes:
    
    > BeOS has a good idea
    
    No!
    When Codd created the relational model, there wasn't the current Unix filesystem idea... the relational model was always intended to store data, and files are data.
    System R, SQL and DB2 prototype, was intended to be the basis for IBM FS.
    IBM realised this in OS/400, which being proprietary hasn't the influence it deserves.
    MS also wanted Jet to be the building block of its OSs since its inception, that is, sometime before MS Access release.
    > newdocms announced on Sl
  - Why so cynical? (Score:4, Interesting)
    
    by Spy Hunter ( 317220 ) writes: on Friday September 05, 2003 @05:38PM (#6883857) Journal
    
    Storage is more advanced, at least in concept, than all of these other options. That is why it is more interesting. The first interesting thing about Storage is that it uses natural language parsing instead of a predefined query language. This is essential to wide acceptance. The second interesting thing about Storage is that it goes out of its way to find and catalog useful metadata, whereas in most other systems you must input the metadata yourself (a tedious task that no one likes). An example given is using the name and length of a Divx file to search IMDB to get all the relevant information about a film. In this way, Storage solves what I see as the two main problems with database filesystems (it remains to be seen how well it works in practice, however). A third interesting thing about Storage is that it is backwards compatible with all GNOME applications through the VFS layer. A kioslave could allow it to work with KDE too.
    
- Re:so is everyone copying BeOS (Score:5, Interesting)
  
  by jilles ( 20976 ) writes: on Friday September 05, 2003 @10:00AM (#6879635) Homepage
  
  Be-os deserves some credit for merging meta data with a file system. However, a real database goes a few steps further in terms of the ability to query, to do replication, remote data access etc.
  
  Essentially, the Be-OS filesystem, while much richer than other filesystems, is still a filesystem. This Storage thing is a full blown SQL database in the first place.
  
  Essentially a normal filesystem is a hierarchical database where as modern databases are relational or object databases. Relational databases have proven themselves for storing complex data over the past few years.
  
  Some scenarios to give you a clue as to why the distinction matters: you can set up a database trigger to track changes in wordprocessor documents (i.e. automatically update some table with version info whenever you click save); you can involve external databases when doing a query on your own database (e.g. the imdb example in the Storage proposal, a tv guide); emulate a hierarchical file system by associating directory attributes with an object; emulate multiple orthogonal hierarchical filesystems; integrate security policies and encryption into the database (could also be used for DRM, I know this is a sensitive topic); make the objects themselves database records (e.g. contact information); use report generators and queries to dynamically generate complex documents (e.g. software documentation, financial overviews, etc.). Use special purpose software to browse specific types of information (e.g. a picture album, movie library or an old fashioned filebrowser).
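
The first scenario above (a trigger that records version info on every save) can be sketched in a few lines. This uses SQLite through Python's `sqlite3` module purely so the example is self-contained; Storage itself is described as PostgreSQL-based, and the table and trigger names here (`documents`, `versions`, `keep_history`) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    body TEXT NOT NULL
);

CREATE TABLE versions (
    doc_id   INTEGER NOT NULL,
    old_body TEXT NOT NULL,
    saved_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Every time a document body is overwritten, keep the previous body.
CREATE TRIGGER keep_history AFTER UPDATE OF body ON documents
BEGIN
    INSERT INTO versions (doc_id, old_body) VALUES (OLD.id, OLD.body);
END;
""")

# The "application" only ever saves; it knows nothing about versioning.
conn.execute("INSERT INTO documents (name, body) VALUES (?, ?)", ("report", "draft 1"))
conn.execute("UPDATE documents SET body = ? WHERE name = ?", ("draft 2", "report"))
conn.execute("UPDATE documents SET body = ? WHERE name = ?", ("final", "report"))

history = [row[0] for row in conn.execute("SELECT old_body FROM versions ORDER BY rowid")]
current = conn.execute("SELECT body FROM documents WHERE name = 'report'").fetchone()[0]
```

The history accumulates inside the store, so any application that saves through it gets versioning without being modified.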
  
Finally something new to play with! (Score:4, Interesting)

by Trigun ( 685027 ) writes: <evil@ev[ ]mpire.ath.cx ['ile' in gap]> on Friday September 05, 2003 @07:58AM (#6878398)

Hopefully they plan on extending this to the networked environment, allowing multiple domain/realm file permissions, authentication, and encryption.

Anything to replace NIS and its bastard stepchildren.

Replacement for ls (Score:3, Funny)

by Anonymous Coward writes: on Friday September 05, 2003 @07:58AM (#6878403)

SELECT * FROM MY_FILES

- Re:Replacement for ls (Score:5, Funny)
  
  by Mwongozi ( 176765 ) writes: <slashthree&davidglover,org> on Friday September 05, 2003 @08:50AM (#6878911) Homepage
  
  SELECT * FROM Users WHERE Clue > 0
  0 rows returned
  Ah, humour.
  
- Re:Replacement for ls (Score:5, Funny)
  
  by sharkey ( 16670 ) writes: on Friday September 05, 2003 @09:36AM (#6879383)
  
  SELECT * FROM MY_FILES
  WHERE TYPE = 'video/x-mpeg'
  AND (TITLE LIKE '%tit%' OR TITLE LIKE '%blonde%')
  ORDER BY PERV_RANK
  
Screenshots ? (Score:2)

by makapuf ( 412290 ) writes:

I know, I'm the first to look for screenshots, but antialiased filesystems are a bit too much, maybe.

That's a good thing I think to separate filesystem and document storage. Better than vfs : either it's plain fs (simple == good for admin), or sophisticated document retrieval architecture with richer semantics than a tree (or graphs if you count links).

And then, do not let GUI apps show you the filesystem, only storage system.
- Re:Screenshots ? (Score:3, Funny)
  
  by tsetem ( 59788 ) writes:
  
  > I know, I'm the first to look for screenshots, but antialiased filesystems are a bit too much, maybe.
  
  Reminds me of an internal joke we have here. Our ClearCase file server was an SGI.
  
  Why?
  
  Because the filenames were rendered so much prettier than on a Linux or Sun box...
Obvious advantages (Score:5, Interesting)

by tsetem ( 59788 ) writes: <tsetem@noSpaM.gmail.com> on Friday September 05, 2003 @08:01AM (#6878429)
There's lots of advantages to this kind of system, especially if interfaces are written for other OS's (Windows, Solaris, OSX)
- Networked file system. No more NFS/SMB hacks. Everyone accesses the data in a common way, and can access the same data
- Integrated mime-types. No more relying on file extensions and other hacks. The mime-type (and subsequent viewer) is right there in the query
- Integrated version control. Have and keep a history of all of your files as they were managed and maintained through their lives, as well as a history of who modified them. If this aspect could be enhanced with branching & merging, then would make other CM Systems (CVS, ClearCase) obsolete?
Of course it's only wishful thinking. I'd be nervous to see exactly how this integrates into other "Legacy" applications. I can also see there being performance penalties since you are now querying a database, rather than looking at a simple file structure...
- Re:Obvious advantages (Score:5, Insightful)
  
  by dabadab ( 126782 ) writes: on Friday September 05, 2003 @08:07AM (#6878500)
  
  "Integrated mime-types. No more relying on file extensions and other hacks. The mime-type (and subsequent viewer) is right there in the query"
  
  And how does that meta data get to the db? Oh, right, it will rely on file extensions and other hacks :)
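
The objection is easy to demonstrate. As a small illustration, Python's standard `mimetypes` module guesses a type from the file name alone and never opens the file; the file names below are made up.

```python
import mimetypes

# A private instance that starts from the module's built-in extension table.
guesser = mimetypes.MimeTypes()

as_photo = guesser.guess_type("holiday.jpg")    # judged purely by ".jpg"
as_page = guesser.guess_type("holiday.html")    # same imaginary file, renamed
no_extension = guesser.guess_type("holiday")    # nothing to go on
```

Renaming the file changes the answer and removing the extension removes the answer, which is exactly the weakness being pointed out: a type column in a database is only as good as whatever filled it in.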
  
  - Re:Obvious advantages (Score:5, Interesting)
    
    by laird ( 2705 ) writes: <lairdp@gmaiTIGERl.com minus cat> on Friday September 05, 2003 @09:43AM (#6879470) Journal
    
    "And how does that meta data get to the db? Oh, right, it will rely on file extensions and other hacks :)"
    
    Like it has in MacOS for 20 years -- when applications write files, they tell the OS the filetype. The only time MacOS looks at extensions is if it's dealing with files transferred from operating systems that don't have relevant metadata. Unfortunately, that would be nearly every other OS. :-) But if Linux started transferring filetype metadata that would be a nice step in the right direction.
    
    - Re:Obvious advantages (Score:4, Insightful)
      
      by smallpaul ( 65919 ) writes: <paul@POLLOCKprescod.net minus painter> on Friday September 05, 2003 @02:39PM (#6882185)
      
      The file should be self-describing. It should have a header saying its type. You can never trust intermediary software to properly keep data and metadata together. The problem isn't just other operating systems. It is file formats like ZIP and protocols like FTP. Plus there is a problem that the file type a user gives a file on their computer may be just a means of triggering a bit of software (e.g. change a JSP file to HTML so it launches your HTML editor). But the intrinsic type of the file should not be corrupted by these user preferences.
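
A sketch of what "self-describing" means in practice: many formats already begin with a fixed signature, so the type can be recovered from the bytes even when the name and any external metadata are lost. `sniff_type` and the short signature table are illustrative assumptions, not a complete detector.

```python
# Leading byte signatures of a few common formats.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
]


def sniff_type(data):
    """Guess a MIME type from the first bytes of the content, ignoring any name."""
    for signature, mime_type in MAGIC_NUMBERS:
        if data.startswith(signature):
            return mime_type
    return "application/octet-stream"


# The content identifies itself no matter what the file was called
# or which transfer protocol carried it.
pdf_type = sniff_type(b"%PDF-1.4\n% rest of the document")
gif_type = sniff_type(b"GIF89a" + b"\x00" * 16)
unknown_type = sniff_type(b"just some text")
```

Plain text has no signature, which is why the last call falls back to a generic type; header-based typing only works for formats designed with a header.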
      
      - Re:Obvious advantages (Score:3, Informative)
        
        by Chelloveck ( 14643 ) writes:
        
        Amen, brother! You just can't rely on metadata stored separately from the file itself. If I ZIP a file, or transfer it via XMODEM, or copy it onto an obsolete FAT-formatted floppy, that file should retain all it needs to be usable.
        
        Some metadata is bound to be lost, such as its modification time or even its filename. If you can afford to lose this sort of metadata, then go ahead and store it separately. But if the file can't afford to lose this stuff you'd better make sure it's part of the data, not ju
  - Re:Obvious advantages (Score:4, Informative)
    
    by jeti ( 105266 ) writes: on Friday September 05, 2003 @10:09AM (#6879712)
    
    You are aware that almost all internet protocols transfer a MIME-type with each file?
    
  - - Re:Obvious advantages (Score:4, Interesting)
      
      by ryanvm ( 247662 ) writes: on Friday September 05, 2003 @09:41AM (#6879445)
      
      You don't happen to be familiar with the Mac's old "fork" filesystem do you? Metadata was kept in a separate file (or fork). It made downloading or transferring files with non-Macs a bitch.
      
- Re:Obvious advantages (Score:5, Insightful)
  
  by azaroth42 ( 458293 ) writes: on Friday September 05, 2003 @08:08AM (#6878510) Homepage
  
  Obvious disadvantages:
  
  SQL is slow compared to things like BerkeleyDB
  
  We already have journaled file systems that can save metadata (though not user defined, I think)
  
  Your database becomes corrupt, you lose everything.
  
  Sorry, give me something that gives me back my data -fast-. If I want to do selects for files, I'll use locate and xargs.
  
  --Azaroth
  
  - Nope (Score:5, Insightful)
    
    by varjag ( 415848 ) writes: on Friday September 05, 2003 @08:49AM (#6878901)
    
    > SQL is slow compared to things like BerkeleyDB
    
    BerkeleyDB is a hierarchical database. SQL is godzillion times faster on complex searches.
    
    > Your database becomes corrupt, you lose everything.
    
    Your filesystem becomes corrupt, you lose everything.
    
    And yeah, I know about journaling, so don't bother :) But modern RDBMSes have integrity control facilities as well.
    
    - Re:Nope (Score:3, Interesting)
      
      by azaroth42 ( 458293 ) writes:
      
      > BerkeleyDB is a hierarchical database. SQL is
      > godzillion times faster on complex searches.
      
      Great, but who is going to often do complex enough searches for files that makes any sort of RDBMS worthwhile? The vast majority of searches would be simple keyed terms.
      - Re:Nope (Score:4, Interesting)
        
        by Saint Stephen ( 19450 ) writes: on Friday September 05, 2003 @09:37AM (#6879400) Homepage Journal
        
        Just wait till you see the way "Pivots" work in the new Longhorn shell. The canonical example is sorting thousands of mp3s by artist, but it'll be A-FUCKIN-MAZIN.
        
        Face it: databases rock. You never know how many interesting questions you didn't ask because you couldn't think in sets until you do it, and then it's FAST as all get out.
        
      - Re:Nope (Score:3, Interesting)
        
        by varjag ( 415848 ) writes:
        
        > How often does the average user do that?
        > Like never?
        
        No, like, when he suspects his system is infected with a trojan or worm and he wants to get the list of executable files installed in the last five days.
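
The query described above can be answered on a plain filesystem, but only by walking the whole tree. A rough sketch, assuming a POSIX system and using modification time as a stand-in for "installed" (filesystems do not record installation dates); `recent_executables` is a hypothetical helper.

```python
import os
import stat
import tempfile
import time


def recent_executables(root, days=5, now=None):
    """List regular, executable files under root modified within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or cannot be examined
            is_regular = stat.S_ISREG(info.st_mode)
            is_executable = bool(info.st_mode & 0o111)  # any execute bit set
            if is_regular and is_executable and info.st_mtime >= cutoff:
                found.append(path)
    return sorted(found)


# Demo on a throwaway directory with one recent tool, one old tool, one text file.
with tempfile.TemporaryDirectory() as root:
    for name, mode, age_days in [
        ("new_tool", 0o755, 0),
        ("old_tool", 0o755, 30),
        ("notes.txt", 0o644, 0),
    ]:
        path = os.path.join(root, name)
        with open(path, "w") as handle:
            handle.write("demo\n")
        os.chmod(path, mode)
        stamp = time.time() - age_days * 86400
        os.utime(path, (stamp, stamp))
    suspects = [os.path.basename(p) for p in recent_executables(root)]
```

Every file is examined on every run. With the same attributes held in an indexed table, the question becomes a single query, which is the trade-off this subthread is arguing about.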
        
      - Re:Nope (Score:3, Insightful)
        
        by nosferatu-man ( 13652 ) writes:
        
        Who cares about the user? The system itself would have capabilities finally expressible in terms more advanced than those constrained by what some drug-addled graduate student with no maths decided was sufficient in 1971.
        
        In other words, the user might never think to do that, but it'd be so cheap for the operating environment that all kinds of new applications would appear.
        
        'jfb
  - - Re:Obvious advantages (Score:3, Insightful)
      
      by Anonymous Coward writes:
      
      XFS limits user-defined extended attributes to 32 KB. Big, but not unlimited.
      
      Also, extended attributes are fundamentally broken because they're stored in the inode. They do not survive, for example, a copy operation. Worse, they do not survive an open/save cycle in most cases, because most programs do not write to open files. Instead, they open a new file under a temporary name, write the data into it, close it, unlink the original file, then rename the temporary file to the original file's name. That way
How does the metadata get into the database? (Score:5, Insightful)

by farnz ( 625056 ) writes: <slashdot.farnz@org@uk> on Friday September 05, 2003 @08:02AM (#6878442) Homepage Journal

My major concern with all these database type filesystems is that the gains are always shown as things like, "Find all films directed by Steven Spielberg", and yet this is not information that the computer can necessarily gather for itself.
Outside of a work environment, I've rarely encountered anyone who keeps consistent, useful filenames, let alone metadata indexes; it seems to me that people will skimp on the metadata, and thus limit the usefulness to metadata that the computer can collect automatically ("All movies that last under 90 minutes"). It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order. I suspect the same will happen with these metadata systems; people won't do the work needed to make them truly useful.

- Re:How does the metadata get into the database? (Score:5, Funny)
  
  by henbane ( 663769 ) writes: on Friday September 05, 2003 @08:07AM (#6878504)
  
  "It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order"
  Call yourself a geek? How can you possibly put something on a shelf without first checking to see that it's in a proper place observing the subtle cross reference system that backs up the obvious system. Man, I hate it when people move my stuff.
  
  - Re:How does the metadata get into the database? (Score:2)
    
    by rusty0101 ( 565565 ) writes:
    
    you mean you don't order the books by the color of their binding? How unstylish of you.
    
    -Rusty
    - - Re:How does the metadata get into the database? (Score:3, Funny)
        
        by hoggoth ( 414195 ) writes:
        
        > Organizing by color works better with CDs than books
        
        Geez, do you have to count how many times you wash your hands? Do you make sure your right turns always equal your left turns?
        
        Try this:
        Leave a big cardboard box on the floor. Dump all your CDs into it. When you want to listen to music, fish around and enjoy the surprise at what you find.
        Enjoy the extra free time you have now that you are not obsessively organizing your collection.
  - Re:How does the metadata get into the database? (Score:3, Insightful)
    
    by noselasd ( 594905 ) writes:
    
    Right. Individuals hate having their stuff messed with. So files are kept in a mess, but _you_ know where they are. At least until you use Storage, which will mess them up.
- Re:How does the metadata get into the database? (Score:5, Insightful)
  
  by Tom ( 822 ) writes: on Friday September 05, 2003 @08:21AM (#6878642) Homepage Journal
  
  That's why we have community products. For music, CDDB works pretty good and is a working real-life example.
  
  Other metadata is automatically inserted. When you install OpenOffice, it asks for your name and inserts that as the author into any new documents you create, for example.
  
  Sure, the metadata on my personal machine will never be comparable to what a library could do. But it doesn't have to be - it has to be useful for me, not - like the library - for thousands of people with very different interests and approaches.
  
- Re:How does the metadata get into the database? (Score:4, Insightful)
  
  by Anonymous Coward writes: on Friday September 05, 2003 @08:35AM (#6878766)
  
  There are two kinds of metadata: intrinsic and extrinsic. Intrinsic metadata, as the name would imply, is information that's contained entirely within the file.
  
  Some intrinsic metadata can be extracted with automation. For example, it's pretty trivial to examine a TIFF and tell you that it's X by Y pixels in a given color space. It's harder, but still possible, to tell you that the TIFF is predominantly red and green. It's impossible for the computer to tell you that it's a picture of a barn.
  
  The same is true of extrinsic metadata: some of it can be extracted automatically, but not much. An example of extrinsic metadata would be a relationship. The computer can tell you that main.c and somefunction.c are both C language source code files, but it may or may not be able to tell you that they're both part of the same program. If the two files are explicitly related to each other through a makefile or some such, then the computer can know that they're related. But consider the collection of JPEGs I just copied to my home directory from my digital camera. A dozen of those pictures were taken in Fiji. The computer cannot know this unless I tell it, nor can it know which pictures were taken in Fiji and which were taken in my back yard last Tuesday. Thanks to my camera, the computer can know what aperture and focal length were used for each picture. In theory, if my camera had a GPS receiver in it, the computer might even be able to tell me where, on earth, the camera was when it snapped a given photo. But these are just automatic methods of telling the computer about the pictures. They're conceptually no different from sitting down and typing the information in. The point remains that the computer can figure out some things on its own, but it cannot know most things unless it is told.
  
  You don't have to strain your imagination to think about this stuff. Consider your MP3 collection. Your computer can tell you that a given MP3 is 6:15 long, and that it's 192 kbps, and that it's stereo. It can't know that it's "Treefingers" by Radiohead unless somebody tells it first.
  
  So you're basically right: automatically extracted metadata is marginally useful, but the really useful stuff has to be manually entered. And generally speaking, even in business environments, that sort of information simply never makes it into the database. It exists exclusively in people's heads.
  
  That's the biggest challenge of digital asset management--which is, incidentally, essentially what we're talking about here. The biggest challenge is how to take information that people have in their heads and store it in some structured, persistent form. That form might be a three-ring binder or a card catalog or an Oracle database; the challenge is the same even if the technology is different.
  
  Bottom line: this technology is really neat, and has limited applications in which it's very useful. But it is not generally useful, nor does it have widespread applications.
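
The MP3 example above can be made concrete. The sketch below uses a WAV file instead, only because Python's standard library ships a `wave` parser and no MP3 parser; `describe_wav` is a hypothetical helper. Duration, channel count and sample rate fall out of the file itself, while title and artist are simply not there to be read.

```python
import io
import wave


def describe_wav(stream):
    """Extract the metadata that the audio data itself can provide."""
    with wave.open(stream, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": frames / rate,
        }


# Build two seconds of stereo silence in memory to have something to inspect.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(2)      # stereo
    out.setsampwidth(2)      # 16-bit samples
    out.setframerate(8000)   # 8 kHz
    out.writeframes(b"\x00" * (2 * 2 * 8000 * 2))  # channels * width * rate * seconds
buffer.seek(0)

info = describe_wav(buffer)
```

Nothing in `info` says what the recording is or who made it. That part has to come from a person, or from a community database such as the CDDB example mentioned elsewhere in this thread.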
  
- Re:How does the metadata get into the database? (Score:5, Insightful)
  
  by selderrr ( 523988 ) writes: on Friday September 05, 2003 @08:54AM (#6878972) Journal
  
  Very true and insightful !
  
  Another argument to prove you right is the "rate this song" option in iTunes. With that feature, one can assign 1 to 5 stars to a song so that later, you can quickly select your favourites. Such a system is flawed in two ways:
  
  - I have yet to encounter anyone who uses it exhaustively. Most folks rate a few dozen songs. I have a library with 9000 mp3s and sure as hell I'm not going to spend a whole week rating them.
  - I have yet to encounter someone who uses it consistently. Today I might consider Chris Isaak a 4star song since I'm in a depressed mood and it's raining outside. Tomorrow the weather might be beautiful and I mod him down to 2 stars cause he's a bloody negative wanker.
  
  This is of course iTunes specific, but it shows that the assignment of metadata is far far far more complex than the methods to search/organize the stuff, which is what the "Storage" software above is about.
  
  As an extra complication : consider that my metadata might not match someone elses. For instance if I were to label a mail "message from my brother", the same content would be "message from my son" for my father !
  
  The fact that metadata based filesystems are not on our desktop is perhaps more due to the fact that it's not a valid solution for data on desktop computers. Maybe, just maybe this is not due to MS squashing Be, as someone else was karmawhoring above.
  
  - Re:How does the metadata get into the database? (Score:3, Insightful)
    
    by ryanvm ( 247662 ) writes:
    
    Good points, but you're just talking about subjective metadata. The usefulness of which is certainly debatable. But what about factual metadata? Consider a downloaded movie that may have fields like: Year, Director, Rating, Genre, Studio, Cast, etc.
    
    Granted the end user is not going to be likely to maintain this information, but that doesn't really matter. The end user is also not likely the source of the material in the first place. I contend that metadata is more useful for material that the user has down
- Re:How does the metadata get into the database? (Score:3, Interesting)
  
  by gr8_phk ( 621180 ) writes:
  
  "I've rarely encounter anyone who keeps consistent, useful filenames, let alone metadata indexes".
  If it allowed natural language interaction with the machine, people might just provide more information. Since it begs for a voice interface, why not have the machine ask a few questions about a document while you're editing/viewing it? When a new file comes in via email with no metadata, the machine says "what's this all about?". You'll naturally describe it using words similar to those you'll use to retrieve
ext3 + sql (Score:2, Interesting)

by Dreadlord ( 671979 ) writes:

I don't know how a database system can improve a file system's performance, especially with the unnecessary overhead associated with it. The ext3 file system in its current state is doing quite well, and updatedb/locate works fine for me.
What would really interest me is something like updatedb/locate but with SQL syntax support; this could be awesome.
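
A rough sketch of what "updatedb/locate with SQL syntax" could look like, assuming an SQLite index stands in for the locate database. The `build_index` and `sql_locate` names and the `files` table layout are invented here for illustration; neither locate nor Storage exposes anything like them.

```python
import os
import sqlite3


def build_index(root, db_path=":memory:"):
    """Walk `root` and record every file in a hypothetical `files` table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (full, name, st.st_size, st.st_mtime),
            )
    conn.commit()
    return conn


def sql_locate(conn, where, params=()):
    """Return paths matching a caller-supplied WHERE clause, sorted by path."""
    query = "SELECT path FROM files WHERE " + where + " ORDER BY path"
    return [row[0] for row in conn.execute(query, params)]
```

Something like `sql_locate(conn, "name LIKE ? AND size > ?", ("%.mp3", 1000000))` would then cover the "locate, but with conditions" case. The WHERE clause is pasted in as-is, so this is only reasonable for a trusted local user typing their own queries.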
- Re:ext3 + sql (Score:5, Informative)
  
  by rtaylor ( 70602 ) writes: on Friday September 05, 2003 @08:45AM (#6878854) Homepage
  
  It won't improve performance if you know exactly what you are looking for. The goal is to improve performance when you only have a vague idea of what you want.
  
  This isn't a place to store config files or cronned shell scripts which have definitive locations and content.
  
  This is a replacement for that 5TB corporate filestore with a 50 directory hierarchy that nobody can figure out, and a content based find takes days to complete.
  
  - Re:ext3 + sql (Score:3, Insightful)
    
    by stephenbooth ( 172227 ) writes:
    
    It also solves the problem of where to file things: if I have a letter to Smith and Sons (Builders) about a bridge construction project, do I store it under letters/Smith&Sons/Bridge or Smith&Sons/letters/bridge or bridge/Smith&Sons/letters or projects/builders/bridge/smith&sons/letters or what? With a database-based file system you store it once and tag it as a letter, to Smith and Sons (Builders), about the bridge construction project (and any other tags you would like to apply to it) and then when you want to
AS400 did this 20 years ago: (Score:5, Informative)

by mikepb78 ( 704506 ) writes: on Friday September 05, 2003 @08:05AM (#6878476)

The filesystem on AS400 is actually a DB2 database, and it works quite well.

Natural language interface? Hmm... (Score:2)

by Viol8 ( 599362 ) writes:

Not looking at screenshot 15 it isn't!

"select DISTINCT(recordid) from AttrSoup...."

Well call me old fashioned, but in my day we called that SQL. Why do I get the feeling this is just yet another database dressed up to look like it's providing some always-wanted-but-until-now-folks-it-just-wasnt-possible-but-with-new-WizzoFS- etc etc

Call me a cynic but I've seen it all before. Besides, databases are inefficient for manipulating filesystems at a low level so expect your PC to crawl if you use this on
BeFS hello?? (Score:2)

by OmniVector ( 569062 ) writes:

It's about time we started playing catchup with BeOS's filesystem. Though this seems more user-land when a function like this (file systems) should be more kernel-land.

This is essentially what Longhorn's tacked-on SQL extensions are going to provide, and I had no idea there was ongoing progress to have this functional in *nix so soon! By the time 2005 rolls around, I have a feeling this will be a lot further along than Microsoft's implementation.

*cracks whip* On zerocool, on uberh4ck3r, on coding monk
I18n? (Score:2)

by MSBob ( 307239 ) writes:

This is all great but how is this project going to work with languages other than English?
Storage (Score:2)

by SuperBanana ( 662181 ) writes:

OSNews is reporting on Storage, an innovative project which aims to replace the traditional hierarchical filesystems with a new document store which is database-based (PostgreSQL).
I have a new way to get between point A and B. I call this product "Car". To fuel it, I've started a fuel company called "Gas". Of course, people will abuse "Car", so I've also created something to keep them in line, called "Fuzz". Fuzz will be powered by what I call "Donut".
(hey, it's Friday, gimme a break :-)
Patents? (Score:2)

by SynKKnyS ( 534257 ) writes:

Is this method of finding and storing files patented?
Why link directly against libpq? (Score:3, Insightful)

by Wdomburg ( 141264 ) writes: on Friday September 05, 2003 @08:10AM (#6878540)

It seems silly to tie the implementation to a single database, when gnome-db is fast approaching 1.0.

- Re:Why link directly against libpq? (Score:4, Informative)
  
  by rtaylor ( 70602 ) writes: on Friday September 05, 2003 @08:41AM (#6878822) Homepage
  
  Their feature list says it will work with Oracle and other SQL99 compliant databases, so I would assume it isn't linked against libpq directly.
  
ok, (Score:2)

by noselasd ( 594905 ) writes:

wtf is this going to help? People are as much boneheads in putting the right attributes/keywords on a "file" as they are in categorizing them in a hierarchy (traditional filesystems). Why is this so great?
I spot a pirate! (Score:2)

by meshko ( 413657 ) writes:

U2 albums, Spielberg movies, Nicole (Kidman?) movies... I think someone is in trouble!

As for the idea of database file systems -- I don't think we need this yet. Both file systems and database research should concentrate on distributed/mobile aspects (even Coda, AFS and friends are not yet widely accepted/ready for prime time).
ReiserFS future (Score:2)

by realnowhereman ( 263389 ) writes:

I personally like the way reiserfs is roadmapped. If I understand it correctly it will be a superset of existing filesystems. That is /home/myname/documents/report/2003/ will still work, but then so will /documents/reports/2003/myname; and so on.

Multiple paths to the same object seems perfect to me.
"Damn, I left that on my roommate's desk" (Score:5, Insightful)

by kfg ( 145172 ) writes: on Friday September 05, 2003 @08:21AM (#6878637)

"Well, where do you go?"

"Stanford."

"No problemo, I'm heading that way later and I can grab it for you. What's your room?"

"Dorm 5, Room 109. It's the desk on the left."

( We didn't bother to state earth.us because we were already inside those directories)

Yes, yes we do think hierarchically. Most of the history of human thought has been fitting everything we can lay our filthy little brain cells on into hierarchies, whether they wish to fit into them or not. It's intuitive.

As for natural language didn't we learn about that with COBOL? Natural language only speeds the learning process slightly ( the majority of the learning still lying in the realm of understanding the basic concepts involved), but then becomes a pain in the ass forever afterward.

Looking at the screenshots it's also ugly as all sin. The physicist in me can't help but feel that a model that ugly can't possibly be correct.

I think this makes just about as much sense as using a document preparation language (XML) as the basis of a database.

Which is to say, none.

KFG

- Re:"Damn, I left that on my roommate's desk" (Score:4, Interesting)
  
  by lawpoop ( 604919 ) writes: on Friday September 05, 2003 @08:48AM (#6878884) Homepage Journal
  
  Human beings can and do think hierarchically, but that doesn't mean it's the end-all-and-be-all of organization.
  I think the examples he shows are pretty good. In my mp3 collection, I would like to see "All bluegrass songs" or "all remixes of Parliament Funkadelic stuff". How do you propose to do this in a hierarchical filesystem? Most of my bluegrass artists are under 'bluegrass', but then there are some bluegrass songs that were in non-bluegrass artists and albums folders.
  In my workplace we are having the same problems. On our shared folders, we have shipping documents in each client's folder. But then, what if we want to see all shipping documents from a particular vendor? Currently, we would have to go into each customer's folder (which is also broken down by year archives) and grab all documents which *might* be from said supplier, and then open each one, and look to see, because the supplier name isn't in the filename. It's horribly broken, which is why we are moving to a database storage system for such documents.
  
Random thought for the day... (Score:5, Funny)

by fluxrad ( 125130 ) writes: on Friday September 05, 2003 @08:26AM (#6878685)

Am I the only one that isn't totally into the idea of "googling" data on my hard drive?

Granted, it's mostly pr0n on there, so it's almost the same thing, but still...

Oracle IFS (Score:4, Informative)

by rhinoX ( 7448 ) writes: on Friday September 05, 2003 @08:39AM (#6878806)

It was called IFS and Oracle did it like, almost four years ago.

Versioning and various other metadata existed. It could be exported via SMB, NFS, FTP, and as a regular "local" windows filesystem.

And, why is this such a great big deal? I don't see the same stink raised over the possibility of Longhorn having a DB for a filesystem.

Microsoft Attempts for decade,GNOME Does in months (Score:5, Interesting)

by NZheretic ( 23872 ) writes: on Friday September 05, 2003 @08:45AM (#6878850) Homepage Journal

1994 Cairo Takes OLE to New Levels [byte.com]
The next version of Windows NT, code-named Cairo and targeted for release sometime in 1995, will be built around the concepts of objects and component software. It will have a native OFS (Object File System) and distributed system support.

1995 Signs to Cairo [byte.com]
Cairo, Microsoft's object-oriented successor to Windows NT, will begin beta testing in early 1996 for release in 1997. Although Microsoft is not revealing the full details of Cairo yet, there are enough clues within current Microsoft OSes to yield a good idea of how it might work.

1996 Unearthing Cairo [byte.com]
At the first NT developers conference in 1992, Bill Gates announced that Cairo would arrive in three years and would incorporate object-oriented technologies, especially an object file system. Since then, we've seen Windows NT 3.1, NT 3.5, NT 3.51, and most recently NT 4.0. None is object oriented, none has an object file system, none is Cairo. It seems that Cairo is Microsoft's sly way of promising the world. "Will we see Plug and Play in NT?" "Oh yes, of course, in Cairo." "Will NT ever produce world peace and cheap antigravity?" "You bet -- in Cairo."
The so-called Longhorn WinFS directory is just another reincarnation of the Cairo object-oriented file system.
September 1, 2003 Eweek 'Longhorn' Rollout Slips [eweek.com]
Microsoft Corp. has once again shifted the schedule for the release of "Longhorn," the company's next major version of Windows, leaving some users up in the air about an upgrade path.

Microsoft executives from Chairman and Chief Software Architect Bill Gates on down have long described Longhorn as the Redmond, Wash., company's most revolutionary operating system to date. The product was originally expected to ship next year. Then in May of this year, officials pushed back the release date to 2005. But now executives are declining to say when they expect the software to ship.
"We do not yet know the time frame for Longhorn, but it will involve a lot of innovative and exciting work," said Gates at a company financial analyst meeting this summer. Since then, other Microsoft officials have neither retracted nor clarified Gates' statement.
Microsoft have been attempting this type of functionality since 1991, over a decade. Meanwhile, one open source GNOME developer, with help from the other core GNOME developers, provides most of the features within months [gnome.org].

- Re:Microsoft Attempts for decade,GNOME Does in mon (Score:3, Insightful)
  
  by fault0 ( 514452 ) writes:
  
  Microsoft didn't really put in much investment in Cairo after it was pretty apparent that nobody really cared for it at the time. Most people really don't like novel ways of doing things. There is too much investment in the old ways atm. I guess if the world were different, we would all be using Microsoft Bob right now.
  
  So, I think this GNOME thing will also fizzle out after a while.
GREAT! If it is done well... (Score:5, Interesting)

by evilviper ( 135110 ) writes: on Friday September 05, 2003 @08:45AM (#6878856) Journal

People don't seem to see how great this is. Maybe it's because most people don't have all that much data.

On my home systems, I have over 250GB online. That doesn't even count my music or videos/movies, which I keep on separate, removable, optical storage.

I can tell you from experience, that managing that much data is a huge hassle. Let's say you've got your files organized well. You probably have hundreds of folders for each subject, and you have to browse to each one with each new file you save. I have a folder (several actually, for various subjects) where I save things that I haven't taken a look at yet. Let's say it's a program that I haven't installed. Well once I do install it, I need to clean up all the temporary files, then browse around to another folder (takes a minute or two when you have hundreds of folders), where I save installed programs, and browse to the appropriate sub-folder, and save it. But then I end up doing the same thing with a video clip... Watching it, deciding to delete or save it, then browsing to a sub-sub-sub-sub folder to move it.

Of course, that's enough of a hassle, but things get complicated when I want to move things to another system, which obviously isn't going to have the same filesystem. Merging each individual folder into each different folder is seriously time-consuming, and tedious. Without fail, there always ends up being a couple folders in the wrong place, because they were a sub-folder of something else, that I didn't happen to see when I copied the contents of the folder.

Then matters are even further complicated, because I may choose to delete older content months later or so, and locating everything is a huge mess.

Personally, I would like to save everything in one place, not having to change folder to folder for each file. When saving something, I could just enter a handful of keywords (eg. "picture penguin snow") which would be much less work than moving to directories or even typing in a long filename. From there, a simple database system would be able to know what type of file it is, how large it is, and how old it is. That would make it incredibly easy to manage. Whenever I want a file, I type in "images older than 1 year" or "programs marked as archived" and I get EVERYTHING I'm looking for in a fraction of the time. Not only that, but it makes pruning out old data as easy as it could possibly be. Just search for "linux" and delete older versions, no worries about what folder it's in... If it's in a temporary folder and you haven't used it yet, or if it's archived and been in-use on your system forever. Obviously you'll be able to see that information, but unlike in our current systems, it won't stand in your way when you want to find things.

It's absolutely no work at all to transfer files, since the info should stay with them, and it will automatically integrate perfectly with your local file management/organization scheme. What's more, data like marking something as "archived" is great in that your system could automatically move it over the network where you archive your files. Since your filesystem would be a smart database, when you search for the file, it could still turn up in the search results, and be automatically moved back where you need it, when you need it.

Personally, I think this would not only save time and effort, but money as well, because so many people wouldn't be dealing with their file problems by just throwing more space in their systems, instead of spending time on figuring out where every file is, what they can get rid of, duplicate files, and junk like that.

With this, I should be able to say "tar -xjf 'newest version of mplayer'" However, this will need to be in the actual filesystem to be useful, not just supported for GNOME applications.
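
A minimal sketch of the keyword-on-save idea described above, assuming a plain SQLite store in place of a real filesystem. The `items`/`keywords` tables and the `save` and `search` helpers are hypothetical names made up for this example and are not part of Storage.

```python
import sqlite3


def open_store():
    """Create an in-memory store: one row per item, one row per keyword."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, saved_at REAL);
        CREATE TABLE keywords (item_id INTEGER, word TEXT);
        """
    )
    return conn


def save(conn, name, keywords, saved_at):
    """Save an item with a space-separated keyword string, no folder needed."""
    cur = conn.execute(
        "INSERT INTO items (name, saved_at) VALUES (?, ?)", (name, saved_at)
    )
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO keywords VALUES (?, ?)",
        [(item_id, w) for w in {w.lower() for w in keywords.split()}],
    )
    return item_id


def search(conn, keywords, older_than=None):
    """Return names of items carrying ALL the keywords, newest first."""
    words = sorted({w.lower() for w in keywords.split()})
    if not words:
        return []
    marks = ",".join("?" * len(words))
    sql = (
        "SELECT i.name FROM items i JOIN keywords k ON k.item_id = i.id "
        f"WHERE k.word IN ({marks}) "
    )
    params = list(words)
    if older_than is not None:
        sql += "AND i.saved_at < ? "
        params.append(older_than)
    sql += "GROUP BY i.id HAVING COUNT(DISTINCT k.word) = ? ORDER BY i.saved_at DESC"
    params.append(len(words))
    return [row[0] for row in conn.execute(sql, params)]
```

With results sorted newest first, "newest version of mplayer" reduces to taking the first hit of `search(conn, "mplayer")`. Parsing the English phrase itself is the hard part and is not attempted here.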

- Re:GREAT! If it is done well... (Score:3)
  
  by poofmeisterp ( 650750 ) writes:
  
  I can see it now... 'penis enlargement guaranteed' popping up at random places in the database.
  You'll have to type "I do not want a bigger penis" to remove them all.
  Heh.
Ask a librarian how (Score:3, Insightful)

by nuggz ( 69912 ) writes: on Friday September 05, 2003 @08:54AM (#6878962) Homepage

What we need is to get some information storage/retrieval experts to provide some guidance to the developers of these ideas.

Librarians have been working on these problems for centuries, why not start with what they know?

Ah yes, the infamous relational filesystem... (Score:5, Interesting)

by Millennium ( 2451 ) writes: on Friday September 05, 2003 @09:16AM (#6879200)

Although this is an interesting idea, an all-relational filesystem would prove to be a usability nightmare.

The relational zealots are quick to point out that a relational system can model any sort of data. Indeed, it can do this. This does not, however, mean that it's always good at doing this. Sometimes it's the right tool for the job, and sometimes it's not. In this case, it is very much not a good tool for sole access to files on the system (though it can make an excellent tool for complementary methods of access).

The reason that hierarchical filesystems have survived for so long is due to one thing: navigability. It's relatively easy for any user to browse what's on the system and get a good idea of how it is organized.

You can't navigate a relational system, which will prove to be the downfall of any all-relational system which comes into being. You can, of course, do a SELECT * FROM volume if you really want to, but that does exactly that: it gives you all the data, with no particular organization. Examining the entire "sea of data" suddenly becomes cumbersome in the extreme. So while User A might be able to set up an all-relational filesystem completely according to his own tastes, User B will be totally lost on that same system. This is, to say the least, a nightmare for anyone working in a shared environment.

This is not to say that the relational model isn't necessarily a useful thing for filesystems. On the contrary, it can be very useful for certain kinds of searches. As time goes on, I believe we'll see more relational-style searching technology incorporated into file managers and search tools. However, there also needs to be a means of hierarchical navigation. Humans tend to think of things in terms of locus, and a means of providing that kind of reference point has to be maintained.

Luckily, this can actually still be emulated using relational-style tables, even though it's somewhat less efficient than classical storage techniques. Some filesystems already do something similar to this, and the results are promising. Look at Be's filesystem for an example of that.

The best way to go, moving forward, is something not unlike what BeOS did, with both hierarchical and relational methods of examining data. This allowed for the best of both worlds. The default method of getting at data is still the hierarchical paradigm, but relational searches can be applied to create what some have called "smart folders" (perhaps "boxes" might be a better term?) Systems like this "Storage" should be focusing on complementing traditional systems in this way, rather than replacing them.
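
To make the "hierarchy emulated in relational tables, plus smart folders" point concrete, here is a small sketch assuming an adjacency-list table in SQLite. The `nodes` table and every function name below are illustrative inventions; this is not how BFS or Storage is actually implemented.

```python
import sqlite3


def make_volume():
    """One table holds the whole tree: each row points at its parent row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "name TEXT, kind TEXT)"
    )
    return conn


def add_node(conn, parent_id, name, kind="folder"):
    cur = conn.execute(
        "INSERT INTO nodes (parent_id, name, kind) VALUES (?, ?, ?)",
        (parent_id, name, kind),
    )
    return cur.lastrowid


def children(conn, parent_id):
    """Hierarchical browsing: names directly under one folder (None = root)."""
    rows = conn.execute(
        "SELECT name FROM nodes WHERE parent_id IS ? ORDER BY name", (parent_id,)
    )
    return [r[0] for r in rows]


def path_of(conn, node_id):
    """Rebuild a slash-separated path by following parent links upward."""
    rows = conn.execute(
        """
        WITH RECURSIVE up(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM nodes WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent_id, n.name, up.depth + 1
            FROM nodes n JOIN up ON n.id = up.parent_id
        )
        SELECT name FROM up ORDER BY depth DESC
        """,
        (node_id,),
    )
    return "/" + "/".join(r[0] for r in rows)


def smart_folder(conn, kind):
    """Relational view: every node of one kind, wherever it sits in the tree."""
    rows = conn.execute("SELECT id FROM nodes WHERE kind = ? ORDER BY id", (kind,))
    return [path_of(conn, r[0]) for r in rows]
```

`children` gives the familiar browse-by-location view, while `smart_folder` is a saved query over the same rows, so the two access styles complement each other without duplicating data. The recursive walk per lookup is part of the efficiency cost mentioned above.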

No, no, no!!! (Score:3, Interesting)

by master_p ( 608214 ) writes: on Friday September 05, 2003 @10:25AM (#6879886)

No, this isn't what is needed. Hierarchical object-oriented persistent object trees is what is needed.

Let me explain.

Information in real life is organized in trees. It is obvious anywhere one can look. From the organizational chart of a company to the chair that you are sitting on, everything is a tree; a tree of information, where each little piece of information consists of other pieces of information.

If you check computer applications, almost every application contains some sort of tree. Take a Word document, for example: the master document, the contents, the heading 1 and subheading paragraphs, the pictures, the drawings. Everything is a tree, and the document can be browsed as a tree.

Take your favorite mail client. Information is organized in a tree: inbox, outbox, sent, trash; each of these contain an e-mail. An e-mail itself is composed of subject, body, attachments. The body consists of paragraphs; the attachment consists of files.

Take your favorite drawing program: the picture consists of layers; each layer consists of shapes; each shape may consist of other shapes.

Take your favorite 3d/cad program: a 3d object consists of other 3d objects.

Take the gui: a window consists of other windows.

Relational databases don't provide tree organization. I don't want a freaking flat table to store my documents. I want to organize them in trees. That's why a filesystem has subdirectories.

The biggest problem of the current filesystem technology is that a 'file' is as dumb as it gets: it is just a collection of bytes, waiting to be manipulated by some other program. It is even untyped, for God's sake!!! One program may view it as a series of bytes, another program may view it as text, another program may view it as code!!! The file itself can't tell you anything about its properties, about its contents, about the way it is supposed to function....

If a file could tell the outside world how to be operated, then the world would be a much better place. If a file could tell me its properties, if it provided me with the tools to manipulate it, then I would make any type of app that processes the file as needed.

The above is essentially object orientation on the filesystem level. RDBMSs don't offer that kind of functionality!!! At best, an RDBMS offers an index on a key for quick searching, and that's it!!! There is no notion of a tree, nor does each file expose its properties/methods/functionality to its users!!!

So I say a big 'NO' to relational filesystems. We can immediately move to the upper level:

1) each node of information is AN OBJECT.

2) the object specification is defined at the filesystem level. Much like COM or .NET or CORBA.

3) each object can contain other objects, if it inherits and implements a specific interface.

4) each object is PERSISTENT. The filesystem takes care of persistence, according to attributes of the object's fields. Complex objects that are composed of other objects are also managed in the same way.

5) the parent object provides the storage implementation. The storage implementation would be object-oriented!!! An object could implement an RDBMS-like storage capability with indexes, keys, etc.

At each given time, the information model inside the computer could be:

1) split across multiple computers.

2) shared by multiple users.

3) checked for security in ONE place, inside the operating system.

4) provided as a framework to programming languages.

5) replicated across sites with minimum of code

6) a unified GUI could handle it

7) searching through it will become a breeze!!! (for example show me all MP3 with artist = Elvis and title = rock)

After all, 90% of the programming goes to load/store and display information. It is silly not to provide a unified mechanism for that. And a simple SQL-based RDBMS does not cut it.
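
A toy sketch of the object-tree idea, assuming nodes are plain in-memory Python objects; persistence, security and replication are left out entirely. The `Node` class and its `find` method are hypothetical names used only for illustration, not an existing filesystem API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A typed object that exposes its properties and can contain other objects."""
    name: str
    kind: str = "folder"
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child object and return it, so calls can be nested."""
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find(self, kind, **wanted):
        """Return nodes of `kind` whose props contain each wanted value
        (case-insensitive substring match)."""
        return [
            n for n in self.walk()
            if n.kind == kind
            and all(
                str(v).lower() in str(n.props.get(k, "")).lower()
                for k, v in wanted.items()
            )
        ]
```

The "all MP3 with artist = Elvis and title = rock" query then becomes `root.find("mp3", artist="Elvis", title="rock")`, with each object answering for its own properties instead of a parser guessing at raw bytes.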

why the relational model is not right (Score:5, Informative)

by hansreiser ( 6963 ) writes: on Friday September 05, 2003 @11:11AM (#6880298) Homepage

www.namesys.com/whitepaper.html [namesys.com] describes why the relational model is not the right one for large heterogeneous stores (filesystems), and describes the approach ReiserFS (a Linux filesystem used mostly in Europe) is taking instead.

Hans

Voice recognition? (Score:3, Interesting)

by gr8_phk ( 621180 ) writes: on Friday September 05, 2003 @11:58AM (#6880680)

If I'm to have a natural language interface to find my files, I'd really like to make spoken requests instead of typing a long sentence. Do they have plans for that in GNOME?

as long as it stays in user space (Score:4, Insightful)

by penguin7of9 ( 697383 ) writes: on Friday September 05, 2003 @12:04PM (#6880737)

Of course, databases are very useful for organizing user data. People already keep PIM info, images, and lots of other stuff in databases. Lotus Notes is built entirely on databases.

But "replacing the traditional file system" carries with it the notion of ripping ext3 out of the kernel and putting a relational database there. That's a very bad idea. Databases don't belong in the kernel. They are far too inefficient to handle most storage needs, they are far too complex to go into the kernel, and they just don't need to be in the kernel. Operating system kernels need simple, fast storage systems. Something like ext3. ReiserFS is pushing the limits. PostgreSQL would be going too far.

As an aside, this is an idea that just about every nerd has when they learn about databases and retrieval. It's been tried various times since the 1960's. There are probably good reasons why interfaces don't use them. Perhaps most importantly, keep in mind that the vast majority of files on your system are not user files, they are bits and pieces of the operating system. And for the files that actually are used by users (mail, PIM info, images, text, etc.), they usually already have special-purpose database interfaces available to them as part of the applications that users use to access them.

Only as good as your data (Score:5, Interesting)

by kstumpf ( 218897 ) writes: on Friday September 05, 2003 @01:00PM (#6881339)

One thing I haven't seen mentioned yet is that a filesystem of this type is only useful if there is quality metadata accompanying every file you expect to find. Searching for "all jazz music" would return nothing unless the filesystem was told about each file that qualifies as "jazz music". What if I wanted to be more specific and say "jazz horn music"? Even more specific, "jazz trumpet solo"? The filesystem would have to know all of this data to be effective.
Where does this metadata come from? I assume I have to enter it myself. This means the more files I have, the more detailed and specific my data entry becomes. And that much more tedious.
Even worse is the uncertainty that would arise. Is my search for "horn solos" not returning results because there are no such files, or because the filesystem does not have metadata describing the files I want as such?
At this point, hierarchical organization once again becomes much more appealing.

Comments from Seth (aka Storage's designer) (Score:5, Informative)

by nullity ( 115966 ) writes: on Friday September 05, 2003 @07:25PM (#6884534) Homepage
I suppose it is probably too late to inject comments and have them moderated to the point of visibility as the madness has largely subsided... but here's to futile acts ;-) I was not really intending Storage to make a big splash right now, I wanted to keep it low-key, but I guess the damage is done so I might as well comment. I'm sorry that I didn't have time to put up a more technically-oriented exposition of Storage. *shrug*
- Slashdot has focused almost exclusively on the "database backing". Guys, this is an implementation detail. It's an important one, but I didn't start off this design thinking "let's write a database backed filesystem store". A set of design goals was established (largely mirrored in the features page). Storage is a lot more than just a database backed XML store. Please read the features page [gnome.org]. The "searchable" stuff is nice, but equally important is providing persistent objects, uniform access (the same URI for a local storage node works globally assuming your computer has a publicly accessible IP address), an improved model for revision and "saving", the ability to localize filesystem resources, and, due to a standard object format, greater transparency of filesystem resources to the OS which will be useful in weakening the barrier between "apps" and "desktop" found in PCs (and not so much in, say, cell phones and PDAs). This is also a key piece in an overall design of the desktop's interaction structure which I haven't had time to write up for the web.
- I'm not trying to make any claims to being the first or being highly innovative, but I am happy to make claims about improving the user experience. That said, contrary to what people are saying, to my knowledge other than the superficial layer of database backing, Storage's features do not have a "one to one" correspondence with any existing system, BFS and the only vaguely specified Windows Future Filesystem included. Most importantly these components do not seem to be a part of the same overall interaction design model that Storage is intended to support. Storage is just a stepping stone, albeit a pretty disruptive one.
- I've been quiet about this project, even inside GNOME. Storage as written today was primarily written by a team of Stanford students as their CS senior project. I've since been working with a few good GNOME developers including the person working on Medusa (Curtis) and the Epiphany maintainer (Marco). They were independently developing a metadata system for GNOME, which it looks like we may implement on top of Storage as a first major test of its capabilities. But nothing is certain right now. The short story is that although Storage is being developed by GNOME developers and I serve as usability project lead, it's not an official GNOME module at this point. GNOME developers would need to corporately buy into both the Storage vision and the overall desktop design. This may never happen, and if it does, it's going to be very slow in the coming.
Some technical notes... that site is sparse on technical information so I'll fill in some for the curious.
- The data store is backed by PostgreSQL. PostgreSQL rocks, though some of the features like instant notification of object changes and live queries do not fit well with existing SQL. We have ways to do all of this using PostgreSQL extensions, but sometimes it's a little tricky and/or hackish.
- A lot of the proposed interface will rise and fall based on the quality of the NL processing. Storage is currently using some pretty cutting edge linguistics theories and tools... notably working within the basic LinGo framework [stanford.edu]. This includes using theories/systems like HPSG (Head-driven Phrase Structure Grammar), MRS (minimal recursion semantics), and being able to use a set of existing wide-coverage grammars such as the ERG (English Resource Gramm
A filesystem *IS* a database! (Score:3, Insightful)

by Tracy Reed ( 3563 ) writes: <treed@u l t r a v i o l e t .org> on Saturday September 06, 2003 @01:10AM (#6885948) Homepage

Not really a comment on "storage" but just a comment on something that has constantly bugged me when someone says "let's put it in a database!"

A filesystem is a special case of a database. So it is perfectly acceptable to store your data in a filesystem. Some people seem to think everything has to be put into a relational database or that it is somehow cool to do so. I have seen people store loads of graphics as BLOBs in databases. Someone once suggested storing a ton of MP3s in a database. Most recently someone said (and this isn't the first time) that we should store all of the emails in a database. It's just another unnecessary layer of complication, especially when you are going to be referencing the email/graphic/mp3 by name all the time anyway (fs's like reiserfs index on name so it's blazingly fast) and not by a bunch of other pieces of meta-data. And if you are going to need to do lookups by various bits of meta-data then store the meta-data in a db and also store a record pointing to the actual file on disk. I have done that lots of times and it works great.
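
For anyone who wants to see the shape of that "meta-data in a db, file on disk" setup, here is a minimal sketch. It assumes SQLite through Python's standard `sqlite3` module, and the `files` table, its columns, and the helper names `index_file` and `find_by_artist` are all made up for illustration; they are not from Storage or any real project.

```python
import os
import sqlite3
import tempfile


def index_file(conn, path, artist, title):
    """Record metadata plus a pointer (the path) to the file on disk."""
    conn.execute(
        "INSERT INTO files (path, artist, title) VALUES (?, ?, ?)",
        (path, artist, title),
    )
    conn.commit()


def find_by_artist(conn, artist):
    """Return the on-disk paths of every file tagged with this artist."""
    rows = conn.execute(
        "SELECT path FROM files WHERE artist = ? ORDER BY path", (artist,)
    )
    return [row[0] for row in rows]


# Hypothetical schema: the database holds only metadata and the path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (path TEXT PRIMARY KEY, artist TEXT, title TEXT)"
)

# The actual bytes stay in an ordinary file on the filesystem.
workdir = tempfile.mkdtemp()
song_path = os.path.join(workdir, "song.mp3")
with open(song_path, "wb") as f:
    f.write(b"not really an mp3")

index_file(conn, song_path, "Some Artist", "Some Song")

# Lookup by metadata goes through the db; reading goes through the fs.
paths = find_by_artist(conn, "Some Artist")
with open(paths[0], "rb") as f:
    data = f.read()
```

The point of the split is that lookups by name never touch the database at all, while lookups by tag never have to scan the disk. The obvious cost is that the two can drift apart if a file is moved or deleted behind the database's back.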

- Re:Windows? (Score:2, Informative)
  
  by henbane ( 663769 ) writes:
  
  Longhorn will be database based.
  Why don't these people just put some effort in reiserFS?
  - Not exactly (Score:5, Informative)
    
    by gilesjuk ( 604902 ) writes: <giles.jones@z[ ]co.uk ['en.' in gap]> on Friday September 05, 2003 @09:13AM (#6879172)
    
    http://theregister.com/content/4/30670.html
    
    "The oft-misunderstood Windows Future Storage (WinFS), which will include technology from the "Yukon" release of SQL Server, is not a file system," reports Thurrott. "Instead, WinFS is a service that runs on top of - and requires - NTFS."
    
  - Re:Windows? (Score:3, Funny)
    
    by __past__ ( 542467 ) writes:
    Why don't these people just put some effort in reiserFS?
    
    Because some people value their data
    
    Because some people think "free software" doesn't mean "software you are free to modify as long as it doesn't interfere with Hans Reiser's business interests"
- Windows' filesystem (Score:3, Informative)
  
  by mic256 ( 702811 ) writes:
  
  I think Longhorn will be the first Windows with a database filesystem. It will probably be based on SQL Server
  - Re:Windows' filesystem (Score:2, Interesting)
    
    by Serapth ( 643581 ) writes:
    
    Yes, from what I have read, that is true. MS plans to use SQL server 2k3 as the underlying technology for the file system for longhorn. What I just don't get though... if SQL is going to be used as the file system... then every Longhorn PC in a sense either needs to have SQL (or MSDE) or needs access to a SQL server, which seems unlikely as you would bottleneck on the network speed.
    
    What then happens to SQL as a MS product? If it's built in to every OS, why then would anyone buy it? I've seen MS build o
    - Re:Windows' filesystem (Score:4, Informative)
      
      by cyclist1200 ( 513080 ) writes: on Friday September 05, 2003 @08:03AM (#6878453) Homepage
      
      The filesystem will be based on SQL Server 2003, but it won't be a fully functional version of SQL Server.
      
      - Re:Windows' filesystem (Score:5, Funny)
        
        by simonecaldana ( 561857 ) writes: on Friday September 05, 2003 @10:17AM (#6879789) Homepage
        
        > The filesystem will be based on SQL Server 2003, but it won't be a fully functional version of SQL Server.
        
        you mean it will be a standard version of SQL server? :)
        
    - Limitations in the home edition (Score:5, Informative)
      
      by yerricde ( 125198 ) writes: on Friday September 05, 2003 @08:04AM (#6878465) Homepage Journal
      
      What then happens to SQL as a MS product? If it's built in to every OS, why then would anyone buy it?
      
      Remember how Windows XP Home and Pro editions can serve files only to fewer than a dozen simultaneous clients? This is to boost sales of the IIS bundled with Windows 2000 Server and now Windows Server 2003. Microsoft SQL Server Home Edition will probably be limited.
      
      - SQL Server Desktop Engine (Score:3, Informative)
        
        by illsorted ( 12593 ) writes:
        
        My guess is that they'll use MSDE, which is already freely available [microsoft.com] and "royalty free". I think it's basically just the core of SQL Server without any of the extra tools.
    - Not SQL Server Directly (Score:5, Informative)
      
      by Watts ( 3033 ) writes: on Friday September 05, 2003 @08:28AM (#6878710)
      
      Having SQL Server as the underlying filesystem technology doesn't mean that you're going to be running SQL Server directly. I mean, if you currently use NTFS, there isn't an NTFS daemon that the kernel connects to when it does filesystem transactions. Just like every other filesystem, the support will be built into the kernel. Instead of writing data as NTFS does, the structure will look a lot more like how SQL Server stores data -- with built-in indexes, etc.
      
      Many database servers already have some fairly optimized code when it comes to file access. This just implements it at the kernel level, rather than having it sit on top of a traditional fs.
      
    - Re:Windows' filesystem (Score:5, Informative)
      
      by Pfhreakaz0id ( 82141 ) writes: on Friday September 05, 2003 @08:56AM (#6878987)
      
      My guess is it will be something like the MSDE [microsoft.com] engine. So it will be limited. For those who don't know, MSDE is just an embedded, single-user version of the SQL engine. I worked on an app once that used it for laptop users who were offline from the network and would have a copy of the database to search and enter orders in, which would auto-replicate with the master SQL server when it got back on the LAN. It was pretty neat.
      
  - Re:Windows' filesystem (Score:3, Interesting)
    
    by lurvdrum ( 456070 ) writes:
    
    Who owns the patent on this type of filesystem implementation - there must be one? Microsoft, IBM, Seth...SCO?
    - Re:Windows' filesystem (Score:5, Interesting)
      
      by Zocalo ( 252965 ) writes: on Friday September 05, 2003 @08:25AM (#6878679) Homepage
      
      Good guesses. Replace "SCO" with "Apple" and you probably have the right triumvirate. All three were working on this in 1995 or so - Microsoft was going it alone with "Cairo" (should have been Win2K) and IBM/Apple were working together on "Taligent"/"Pink". Neither project saw the light of day, although whether this was because of the system requirements or a marketing decision based on the paradigm shift is a matter of opinion.
      The idea was probably stolen from Xerox Parc in the first place, of course.
      
      - Re:Windows' filesystem (Score:4, Insightful)
        
        by Zocalo ( 252965 ) writes: on Friday September 05, 2003 @09:46AM (#6879488) Homepage
        
        Yeah, but as I mentioned in an earlier post, *all* filesystems are databases of some type, it's just a matter of context. Generally, when someone says a "database filesystem" today, what they actually mean is "a relational database driven, virtual filesystem providing an infinite variety of views onto a soup of metadata". I think I prefer the former and leaving the rest up to inference, but I'm sure that when these new products finally ship the marketroids are going to think otherwise.
        I do deserve my wrists slapping though... I'd completely forgotten about BeOS! For shame!
        
- Re:Windows? (Score:5, Insightful)
  
  by Zocalo ( 252965 ) writes: on Friday September 05, 2003 @08:18AM (#6878610) Homepage
  
  Not quite, NTFS is a traditional file table with some bells and whistles, but it's not a "database" in the sense meant here(1). The next version of Windows, "Longhorn", is supposed to introduce a new file system called WinFS that will use a version of SQL Server as its backend. Whether they will actually deliver or not is another matter, since we were promised this in 1995 with Cairo and Taligent (remember them?), and now that Longhorn appears to have been pushed back...
  There are also issues with gaining acceptance for the change in the way things work. This kind of thing has not really been done on a large scale in the wild before, on any OS, so whether people will be willing to accept the security and reliability issues that may ensue is another matter. For example, what are the implications of a compromise in the database engine? MS is planning on using SQL, so if things go awry and it becomes possible to maliciously inject raw SQL to the filesystem interface... Oops. On the other hand, the benefits for data retrieval are *huge*. Imagine being able to find any audio files on your entire system by Justin Timberlake or Britney Spears and delete them all in one go by searching on the tag fields! ;)
  (1) Technically, all filesystems are databases, it's just that current ones are a collection of flatfile database tables that can point to each other, generally in a hierarchical manner. When people say "database" in the same sentence as "filesystem" they usually mean "relational database". As an aside however, high end databases usually forgo the need for a file system and provide the ability to write their tables directly to disk on a dedicated partition.
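
  To make both halves of that concrete (the tag search and the injection worry), here is a rough sketch. It assumes SQLite via Python's standard `sqlite3` module rather than anything WinFS or Storage actually does, and the `files` table, the sample paths, and `delete_by_artist` are invented for the example. It only removes index rows; unlinking the real files would be a separate step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, artist TEXT)")
conn.executemany(
    "INSERT INTO files (path, artist) VALUES (?, ?)",
    [
        ("/music/a.mp3", "Artist A"),
        ("/music/b.mp3", "Artist B"),
        ("/music/c.mp3", "Artist A"),
    ],
)


def delete_by_artist(conn, artist):
    """Delete every row tagged with `artist`; return how many were removed.

    The value is passed as a bound parameter, so it is treated as data
    and never parsed as SQL.
    """
    cur = conn.execute("DELETE FROM files WHERE artist = ?", (artist,))
    conn.commit()
    return cur.rowcount


# A hostile "artist" string that would wipe the table if it were pasted
# into the SQL text. As a bound parameter it simply matches nothing.
hostile = "x' OR '1'='1"
removed_hostile = delete_by_artist(conn, hostile)

# The legitimate tag search removes only the matching rows.
removed_a = delete_by_artist(conn, "Artist A")
remaining = [row[0] for row in conn.execute("SELECT path FROM files")]
```

  Whether a real database-backed filesystem would expose raw SQL to applications at all is an open question here; the sketch only shows that the "Oops" case depends on building queries by string concatenation, which bound parameters avoid.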
  
  - BeFS (Score:4, Informative)
    
    by laird ( 2705 ) writes: <lairdp@gmaiTIGERl.com minus cat> on Friday September 05, 2003 @08:32AM (#6878745) Journal
    
    Actually, Be had two flavors of "filesystem as database" in widespread deployment. OK, not as widespread as Windows, but certainly thousands of users. The first version of Be's filesystem, by Benoit Schillings, was very database-like, but performance was so-so. The second version of BeFS, by Dominic Giampaolo, was less general in implementation, but had the same metadata-driven capabilities. There's an interesting article on this at http://www.theregus.com/content/4/24485.html. Basically, Be did everything that this project is talking about, years ago. That's not to take anything away from the project -- it's cool if more mainstream operating systems catch up to the innovations of niche players, because more people benefit. Dominic is working at Apple, so there's hope that MacOS X's filesystem will start incorporating the rich-metadata, dynamic view model of the world. And while MS has (I think) pushed the "filesystem as database" out of the next version of Windows NT/XP/whatever, it's still planned for the next version after that, so perhaps in a decade or so we'll all be able to do what Be did back in '91. And of course, Palm owns the Be code, so perhaps PalmOS will lead the way?
    
    - Re:BeFS (Score:3, Insightful)
      
      by cpeterso ( 19082 ) writes:
      
      there's hope that MacOS X's filesystem will start incorporating the rich-metadata, dynamic view model of the world.
      
      you mean like Mac OS 9 and earlier?
  - - Re:Windows? (Score:3, Insightful)
      
      by Zocalo ( 252965 ) writes:
      
      However, I do suspect that any robust interface would take a look at the tags, and if they are empty attempt to parse the filename.
      Actually, I was just thinking about this problem, and you know what would make a *really* easy solution and is readily available already? P2P! Think about it; a new file arrives on the system by whatever means, so the file system has zero idea about its nature beyond what's available from the file. We probably know the type of file from its header, extension or whatever o
  - - Re:Windows? (Score:3, Insightful)
      
      by netsharc ( 195805 ) writes:
      
      It's innovative because it's an idea implemented on Linux, whereas when it's to be implemented on Windows it's a lousy idea (well, lousy because of 3rd party compatibility nightmares).
- Re:Windows? (Score:3, Funny)
  
  by Anonymous Coward writes:
  
  SELECT * FROM videos WHERE name LIKE '%porn%'
- - Re:Ahead of the game. (Score:3, Insightful)
    
    by kubla2000 ( 218039 ) writes:
    
    Yeah, and as Longdong gets pushed back and delayed and delayed and pushed back and postponed and delayed, it'll be last to market but microsoft will still have been the first to announce it. I guess that's more innovative than they've been in the past when they'd simply wait for someone to do something interesting before buying them out.
    
    It's not enough to say. One has to do. Microsoft has proved many times over that it often makes grand announcements only to provide something far more watered down by the t
    - Re:Ahead of the game. (Score:2, Interesting)
      
      by Serapth ( 643581 ) writes:
      
      To my understanding, the delay in Longhorn's release is a result of the Trustworthy Computing initiative...
      
      This, IMHO, is a good thing. The big difference between MS and Open Source on something like this... in Open Source land, you can often see progress from day one... no matter how unstable it is. With MS, you won't see anything until the whole product is done... Not saying one is better than the other, but...
- Re:Hmm (Score:4, Informative)
  
  by UnuMondo ( 642324 ) writes: on Friday September 05, 2003 @08:03AM (#6878445) Homepage
  
  No, because doing away with the root filesystem, user stuff in /home, config files in /etc, and so forth would break a number of Unix standards, and Linux's big advantage of being able to run many Unix apps (if you compile from source) would disappear. Storage will apparently be an interface to the existing real filesystem. Joe User won't know the difference.
  
- Re:filesystem is a database (Score:4, Insightful)
  
  by Viol8 ( 599362 ) writes: on Friday September 05, 2003 @08:12AM (#6878558) Homepage
  
  "What this world needs is a really big injection of original thought"
  
  There are original ideas, they just don't make it into the PC world where MS dominates. MS come up with as many original ideas as McDonalds, and since all KDE & Gnome (and frankly most open source projects) are doing is playing catchup with MS, originality is never going to be a prime concern.
  
- Re:Backups (Score:2)
  
  by realnowhereman ( 263389 ) writes:
  
  I'm hoping that you are a bit out of date, pg_dumpall works fine for me. Since about 7.1 it's done large objects as well. I'm a bit worried that it's not working fine for me and I'm living an illusion. What exactly does it not back up reliably?
  - Re:Backups (Score:2)
    
    by Florian Weimer ( 88405 ) writes:
    
    What exactly does it not back up reliably?
    
    It sometimes dumps database objects in the wrong order, and restore fails as a consequence.
- Re:woot woot (Score:2)
  
  by forsetti ( 158019 ) writes:
  
  But where would you store your CVS file? ;)
