Tom's Hardware Looks At WinFS

Alizarin Erythrosin writes "Tom's Hardware Guide has an article about the new WinFS file system. The article talks first about some of the problems and advantages of FAT[16|32] and NTFS, then talks briefly about WinFS. Here is the summary: 'Microsoft is breaking new ground with Longhorn, successor to XP. The upcoming WinFS file system will be the first to be context-dependent, and promises to make long search times and wasted memory a thing of the past. Today, THG compares it to FAT and NTFS.' Personally, I still have reservations about using a relational database to keep track of files. Unless they can keep the overhead to a minimum, I can't see it being as efficient as a file system should be."
  • by double_plus_ungod ( 678733 ) on Wednesday June 18, 2003 @12:38AM (#6229907) Journal
    Yeah, but you still get a choice--I don't use Mac OS X's journaling because of the overhead--you don't have to use WinFS if the performance penalty is too high.
  • Again? (Score:1, Interesting)

    by strider44 ( 650833 ) on Wednesday June 18, 2003 @12:41AM (#6229927)
    God - I remember hesitating going up to FAT32 - I had only just gotten Win95 and it was nagging me like hell. Then I was nervous about upgrading to NTFS, then discovered how horrible it was not being able to use the drive with anything but WinXP. I have a triple boot with WinXP, Linux and Win98SE. Really, there's no difference between the file systems for the normal user! Why upgrade when it doesn't make your life any better and it takes a lot of bother and reduces compatibility?
  • db filesystem (Score:5, Interesting)

    by Horny Smurf ( 590916 ) on Wednesday June 18, 2003 @12:41AM (#6229928) Journal
    Personally, I still have reservations about using a relational database to keep track of files.

    BeOS used indexing for certain attributes, and it is GREAT. Maybe someone is just sour that linux didn't do it first?

  • by tha_mink ( 518151 ) on Wednesday June 18, 2003 @12:41AM (#6229933)
    It just can't be good. Using MS SQL as a database is bad enough, I couldn't imagine depending on it as a file system.
  • by localghost ( 659616 ) <dleblanc@gmail.com> on Wednesday June 18, 2003 @12:42AM (#6229941)
    Journaling doesn't reduce performance much, and at least for me, it's well worth it for the peace of mind and the lack of fsck. Hard drive space is hardly at a premium; most people can spare the 10%, and without it, I spend 15 minutes scanning my 40GB disk every so many boots (or whenever it's not shut down right). If I used Windows, I'd at least give WinFS a try.
  • Re:db filesystem (Score:5, Interesting)

    by Osty ( 16825 ) on Wednesday June 18, 2003 @12:44AM (#6229959)

    BeOS used indexing for certain attributes, and it is GREAT. Maybe someone is just sour that linux didn't do it first?

    I gathered that the quote was alluding to the fact that while BFS did initially use a full relational database backend, it performed very poorly. Be replaced the backend with a more conventional one, but kept the SQL-like interface to it. That increased performance, but it just wasn't quite as cool anymore. Maybe now that PCs have increased in power by several orders of magnitude since Be last tried this, Microsoft may actually be able to pull it off.

  • nothing new (Score:3, Interesting)

    by g4dget ( 579145 ) on Wednesday June 18, 2003 @12:45AM (#6229970)
    This will be better than FAT32 and NTFS, but it is hardly "breaking new ground". A number of operating systems have used more-or-less relational databases as their file systems; it's a special-purpose technology and has no place in a general-purpose OS. I think ReiserFS makes the right kind of compromise here: it uses a little bit of database technology, but it mostly remains a traditional file system.
  • You think? (Score:1, Interesting)

    by ElectricPoppy ( 679857 ) on Wednesday June 18, 2003 @12:46AM (#6229985)
    I find the 4GB (or is it 2GB?) file size limit rather annoying when doing video stuff.
  • by cant_get_a_good_nick ( 172131 ) on Wednesday June 18, 2003 @12:49AM (#6230003)
    I keep thinking back to my Amiga when a 40 Mb hard drive was huge.

    I ran a Mac lab where a lot of the machines had 20MB drives, and that wasn't all that long ago. They used to sell a 10MB drive (I forget how ungodly much it cost) for Apple ][s. Apple DOS 3.3 could only recognize floppy-size chunks, about 140KB IIRC, so the thing had to be partitioned into something like 80 pseudo-drives. I never saw one physically, but I can imagine what a P.I.T.A. that was.
  • by xWeston ( 577162 ) on Wednesday June 18, 2003 @12:49AM (#6230007)
    As far as I was concerned, WinFS was not actually a real file system but something that just runs on top of an NTFS filesystem.

    This was actually confirmed at WinHEC:

    "Microsoft has scaled back its 'Big Bang', and its Future Storage initiative will build on, rather than supersede the NTFS file system, when the next version of Windows 'Longhorn' appears in 2005."

    "WinFS is not a file system

    NTFS will be the only supported file system in Longhorn, from a setup and deployment standpoint, though the OS will, of course, continue to support legacy file systems like FAT and FAT32 for dual-boot and upgrade purposes. The oft-misunderstood Windows Future Storage (WinFS), which will include technology from the "Yukon" release of SQL Server, is not a file system, Mark Myers told me. Instead, WinFS is a service that runs on top of--and requires--NTFS. "WinFS sits on top of NTFS," he said. "It sits on top of the file system. NTFS will be a requirement."

    Interestingly, when WinFS is enabled, drive letters are hidden from the end user, though they're still lurking there under the covers for compatibility with legacy applications. This reminds me of when Microsoft added long file name (LFN) support in Windows 95, but kept using short (8.3) file names under the covers so 16-bit applications would still work. Expect this to be the first step toward the wholesale elimination of drive letters in a future Windows version."
  • Re:Again? (Score:4, Interesting)

    by AvitarX ( 172628 ) <me@brandywinehund r e d .org> on Wednesday June 18, 2003 @12:50AM (#6230008) Journal
    Ummm, FAT32 has enormous cluster sizes for large drives. That does affect the normal user (see the arithmetic sketch after this comment).

    FAT16 was limited to 2GB partitions; that affects normal users too.

    Now, if the database file system works the way I imagine it would, it will be a bad thing for the normal user's more tech-savvy friend.

    I have spent years explaining to relatives that the same file name in 2 places is 2 different files.

    Now I must spend time explaining that if you browse to Documents, then Taxes, and edit file blah, it will affect the important file blah everywhere else it shows up.

    People will be confused by this, I believe. And I also think the techies saying it is stupid would benefit from it greatly; I know I would love to organize things with tons of logical ways to browse them.

    But I am not some overpaid market researcher so what do I know.
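    A quick back-of-the-envelope sketch of that cluster-size arithmetic, in Python (the 65,525-cluster ceiling comes from FAT16's 16-bit cluster numbers minus reserved values; exact figures vary slightly by implementation):

    ```python
    # FAT16 addresses clusters with 16-bit numbers, so a volume holds at
    # most ~65,525 clusters. Bigger volumes therefore need bigger clusters,
    # and every file wastes about half a cluster of slack on average.
    MAX_CLUSTERS = 65525

    for cluster_kb in (2, 4, 8, 16, 32):
        max_volume_gb = MAX_CLUSTERS * cluster_kb * 1024 / 2**30
        avg_slack_kb = cluster_kb / 2
        print(f"{cluster_kb:>2} KB clusters -> max volume {max_volume_gb:5.2f} GB, "
              f"~{avg_slack_kb:4.1f} KB wasted per file")

    # The 32 KB row is the familiar 2 GB FAT16 ceiling; 10,000 small files
    # on such a volume waste on the order of 160 MB in slack.
    ```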
  • Oh so informative! (Score:5, Interesting)

    by BasharTeg ( 71923 ) on Wednesday June 18, 2003 @12:59AM (#6230095) Homepage
    I read this article hoping for some real information on the WinFS file system, and instead I got an amateur's review of the Microsoft file systems I grew up with.

    "There has been much speculation"

    Uh huh.

    "Win FS is modeled on the file system of the coming SQL server"

    Uh huh.

    "In its latest build (M4), Longhorn contains few hints of the technology's imminent implementation."

    Uh huh. You're saying you don't know anything, yeah, I'm getting that part.

    "One of those is more than 20 MB in size and bears the name winfs.exe."

    Neat.

    "In the end, Win FS will probably emerge as an optional file system beside FAT and NTFS. It's also possible that Win FS will supersede its predecessors, however."

    So in the end, it'll be A... but it is also possible it'll be B. I see.

    "That would most likely produce problems for multi-boot systems"

    An astounding feat of logic Mr. Spock!

    This is the most uninformative article I've ever had the displeasure of reading on Tom's Hardware. These people know exactly nothing more about WinFS than any of the rest of us have heard in rumors and vague press releases.
  • by mrklin ( 608689 ) <ken...lin@@@gmail...com> on Wednesday June 18, 2003 @01:06AM (#6230146)
    Normal users do not triple-boot their systems, nor do they bother with two versions of Windows.

    NTFS has tons more advantages than FATxx. The official list can be found here [microsoft.com]. Granted, this benefits the corporate user more than the home user.

    At the very least, NTFS offers a quicker way to hide porn than FAT32.

  • by SWTP_OS9 ( 658064 ) on Wednesday June 18, 2003 @01:06AM (#6230149)
    Not the only one!

    With an MS database for records, they have got to be kidding! I know that Windows CE uses a DB format for storage, but I want to see it under max load, with "n" tasks accessing it and a planet's worth of data to pull from, a good percentage of it changing. Then crash it and try to restore the mess. What would the resulting speed be? The recovery time?

    I guess you will need a 4 to 6 GHz system with an insanely fast HD array and memory up the wazoo!

    Instead of revamping the wrapper, why not improve the survivability of data, OS, and programs! When will they get it into their heads that the OS should not be a Swiss Army knife with cheap blades that are dull, useless, break, and are hard to open!

    I can't remember exactly, but there was something based on something called "tumblers" that was a way to access data. Read something about it in an ancient issue of Byte magazine. Had to do with objects and wandering content.
  • Good idea (Score:4, Interesting)

    by Bodrius ( 191265 ) on Wednesday June 18, 2003 @01:07AM (#6230153) Homepage
    Hopefully this will encourage more competitors (including open source) to go for the RDBMS-based filesystem model.

    I don't understand the concerns of the poster regarding performance (at least without evidence of truly dismal performance): no one is forcing anyone to use the FS if they are not satisfied with performance.

    For most users, the main bottleneck in storage is their own organizational faculties. I used to be exasperated when users didn't know where they put their files, but once you get past the 100GB mark, it becomes very understandable.

    Consider what most people use their massive storage for these days: videos, music, multimedia, games. Not only is this the kind of content that SHOULD be stored in a database, it's the kind of content that is ALREADY being handled through a database because the filesystem is not enough: people are using their media players, P2P programs and other software to handle their files, up to the point they rarely ever interact with the filesystem unless they lost a file.

    For most users, the performance penalty is well worth the price.

    For those for whom it is not, it doesn't take a genius to realize you can use more than a single filesystem, and perhaps rediscover the joy of proper partition organization: keep the OS and applications separate from your data, and you can use your highly efficient filesystem for the first and your metadata-loaded one for the second.

  • Better, not best (Score:5, Interesting)

    by Peaker ( 72084 ) <gnupeaker@nOSPAM.yahoo.com> on Wednesday June 18, 2003 @01:09AM (#6230171) Homepage
    Relational databases are better than conventional file systems in both performance and transaction management/journalling.

    However, the best solution is that used by EROS [eros-os.org], which is for the kernel not to provide a file system at all, but instead provide Orthogonal Persistence.

    This is a much simpler layer for applications, since it doesn't require them to explicitly access memory and disk separately. It is also much simpler to recover from, because the entire state of the disk is always known to be coherent with itself at any given point in time, without an expensive journal.

    In terms of performance - it beats the hell out of explicit disk access systems (both conventional and database systems) because it performs big continuous reads and writes (that don't move the head much) rather than small writes on metadata and file data that forcibly jump the disk head around.

    In EROS then, on top of the Orthogonal Persistence, you can create any arbitrary Objects you want easily - because they're just normal processes with normal memory. Conventional File Systems become useless and objects implemented by processes become a much better and more powerful alternative to files.

    A relational database of the user objects is then much more powerful than a string hierarchy, but this is all the user's choice - and not hardcoded into a kernel.
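    For a feel of what orthogonal persistence means to application code, here is a very loose Python analogy (shelve is only a stand-in: EROS checkpoints entire process images in the kernel, it does not persist individual objects through a library):

    ```python
    import shelve

    # Under orthogonal persistence the program just mutates ordinary
    # in-memory objects and the system keeps them across restarts; the
    # application never opens files or serializes anything explicitly.
    with shelve.open("app_state") as state:   # backing path is arbitrary
        todo = state.get("todo", [])
        todo.append("review WinFS article")
        state["todo"] = todo                   # the "checkpoint"
        print(state["todo"])
    ```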
  • Truth be told... (Score:5, Interesting)

    by Da VinMan ( 7669 ) on Wednesday June 18, 2003 @01:11AM (#6230185)
    I'm looking forward to this! I personally am sick and tired of filesystems as we know them today. Today's filesystems are a strict hierarchy, a structure that was only necessary in the systems of yesteryear.

    A filesystem based on a relational database will have some characteristics to which today's filesystems can only aspire:

    1. ACID - In every way that the underlying database supports Atomicity, Consistency, Isolation, and Durability [techtarget.com], so now will the filesystem. Insofar as the database is robust, the filesystem will be robust. Please spare me the comments about the supposed unreliability of SQL Server. It's certainly more reliable than NTFS, which is itself very good.

    2. As an offshoot of the above - Imagine multiple file updates to a filesystem which is transactional! Imagine that transaction failing and being able to just roll back the changes without touching every file in your program! Imagine being able to make file changes programmatically without having to worry about locking because the engine will do it for you (just handle any exceptions)! Yeah, you could do all that today if you like, but it takes extensive work to make it happen (see the transaction sketch after this comment).

    3. Operational characteristics - We can run queries against databases. We can index them. We can cluster them. We can replicate them. We can access them easily from any development platform you can imagine. Now your filesystem is a database. The possibilities make me shiver! :+) Maybe the initial implementation won't get all this right. But at least it stands a chance.

    4. Another offshoot from #3 - Security. Databases are inherently better than filesystems (IMNSHO) at enforcing security and enabling administration of security.

    I only have reservations about one issue with the database-as-filesystem idea: recovery. Currently, all good and low-tech filesystem recovery tools really are based on the filesystem allocation table sort of scheme. Obviously, databases usurp this category of tried and true tools. However, good tools already do exist that allow recovery of relational databases. It's just a matter of getting easily accessible tools of this sort into the hands of the professionals that need them. It's more of a training issue I guess, but it will still need addressing.

    I know many people will have a knee jerk reaction to this idea, and I understand why. But I would encourage people to keep an open mind to this. While there will probably be some issues with the idea, there's so much more that could easily be done with a filesystem on top of a database than could be done easily (or well) with a traditional filesystem.

    And for you hard-core naysayers out there, you have to ask yourself this: If this is such a bad idea, then why did Oracle provide this as a feature too? [oracle.com]
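    To make point 2 concrete, here is a minimal sketch with SQLite standing in for the database engine (the table layout and file names are invented; WinFS's real API is not public, so this only illustrates the transactional property being claimed):

    ```python
    import sqlite3

    # Files as rows in a relational store: multi-"file" updates become one
    # atomic transaction, so a failure part-way through leaves no
    # half-upgraded state to clean up by hand.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, data BLOB)")
    db.executemany("INSERT INTO files VALUES (?, ?)",
                   [("app/config.ini", b"v1"), ("app/data.bin", b"v1")])
    db.commit()

    try:
        with db:  # one transaction: both updates commit, or neither does
            db.execute("UPDATE files SET data=? WHERE path=?", (b"v2", "app/config.ini"))
            db.execute("UPDATE files SET data=? WHERE path=?", (b"v2", "app/data.bin"))
            raise RuntimeError("simulated crash mid-upgrade")
    except RuntimeError:
        pass  # the with-block rolled both updates back

    print(db.execute("SELECT path, data FROM files ORDER BY path").fetchall())
    # -> [('app/config.ini', b'v1'), ('app/data.bin', b'v1')]
    ```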
  • by Future Shock ( 634657 ) on Wednesday June 18, 2003 @01:11AM (#6230190) Journal
    Near-Term: this thing should be just as stable as every other MS product prior to version 3.0 of it. (In short, damned lousy). To make it worse, it probably also enables DRM at a file system level...

    Mid-Term: FS finally works, and allows easier retrieval by relevance, author, source, etc. in ways that we can just dream of now. It's the kind of thing we didn't realize we needed until we had it...until it inevitably blows up, as all MS products must do eventually. But when it works, we will be fairly happy to have it...especially end users, most of whom can't figure out a hierarchical file system in the first place.

    Far-Term: FS is finally able to use its relational roots to distribute filesystems over multiple processors in a cluster or over a network. Such a system would support atomic, distributed file updates by threads of processes on differing processors (including HyperThreaded procs). Imagine a virtual filesystem that can span your whole-house network, with a single file system image...in WINDOWS.

    So I guess my view is: painful in the near-term, but may be cool to have when they get it right.
  • by Zenki ( 31868 ) on Wednesday June 18, 2003 @01:17AM (#6230225)
    Does it matter? HPFS was created at MS. I guess that explains why HPFS hasn't been improved in recent OS/2 releases beyond HPFS386, and why JFS is now an optional FS on OS/2. JFS is probably a much more capable FS than HPFS anyhow.

    http://www.cs.wisc.edu/~bolo/shipyard/hpfs.html
  • Fast != Fast (Score:4, Interesting)

    by WoodstockJeff ( 568111 ) on Wednesday June 18, 2003 @01:18AM (#6230232) Homepage
    I can see where a more efficient directory structure might be helpful, but... will they continue to sacrifice file access speed for file search speed?

    I recently installed a Win2K server that is blindingly fast at finding documents and such... but horridly slow at serving up portions of files, for things like legacy database programs. Three of the customer's applications started running at 1/4 speed.

    It got so bad, even after all the "fix win2k speed" patches, that we re-introduced the 200MHz NT4 server to feed the database apps, and the dual-processor 2GHz system just serves up documents!

  • by Nucleon500 ( 628631 ) <tcfelker@example.com> on Wednesday June 18, 2003 @01:42AM (#6230381) Homepage
    Most of what we say is guessing, because we don't know the question MS is trying to answer. I can't think of any goals best met by WinFS.

    A directory tree is a very useful structure, at least to the software. Similar stuff is grouped together, and easily cached. It provides a very clean and simple way of putting data somewhere and getting it back later. This should not be lightly cast aside.

    So, you want to use a relational database to keep track of files? Go for it, but instead of keeping track of the files themselves, keep track of their paths. Let the filesystem do the efficient storage, and the database do the efficient lookups. The database can be made faster and smaller, the filesystems can remain as fast as they are, and the files are still there even if the database gets corrupted.

    Put hooks wherever necessary to update the database when the filesystem changes. For example, put a database in the root of each filesystem. Use a stacked mount to mount that disk, so when interesting things happen, the kernel tells a userspace process that updates the database. Then, make some standard libraries that use the database. Make file browsers that can query it, but pass the path to programs. Make save dialogs that can also save metadata about the file, and open dialogs that can search for it. Use LUFS or FUSE to make directories that correspond to queries.

    This is just as effective as what MS is doing [theregister.co.uk], but it's more efficient, it's more compatible, and it doesn't reinvent the wheel.
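    A minimal sketch of that split, using only the Python standard library (the schema and function names are mine, and a one-shot os.walk stands in for the dnotify/FAM-style hooks described above):

    ```python
    import os, sqlite3

    # The filesystem keeps the bytes; a relational index keeps paths plus
    # metadata for fast lookups. If the index is lost or corrupted, the
    # files are untouched and the index can simply be rebuilt.
    def build_index(root, db_path="index.db"):
        db = sqlite3.connect(db_path)
        db.execute("""CREATE TABLE IF NOT EXISTS files
                      (path TEXT PRIMARY KEY, name TEXT, size INT, mtime REAL)""")
        with db:  # one transaction for the whole scan
            for dirpath, _dirs, names in os.walk(root):
                for name in names:
                    p = os.path.join(dirpath, name)
                    try:
                        st = os.stat(p)
                    except OSError:
                        continue  # vanished or unreadable; skip it
                    db.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                               (p, name, st.st_size, st.st_mtime))
        return db

    def find(db, pattern):
        # answered entirely from the index, without touching the disk tree
        return [row[0] for row in
                db.execute("SELECT path FROM files WHERE name LIKE ?", (pattern,))]

    db = build_index(".")
    print(find(db, "%.txt"))
    ```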

  • by xWeston ( 577162 ) on Wednesday June 18, 2003 @01:47AM (#6230418)
    I believe they originally planned to have it as an entirely new filesystem, but realized they wouldn't be able to hit the mark with it...

    The article I got some of that information from was in The Register: http://www.theregister.co.uk/content/4/30670.html [theregister.co.uk]

    Also, there is more information here: http://www.winsupersite.com/showcase/longhorn_preview_2003.asp [winsupersite.com]
  • New FS (Score:3, Interesting)

    by rf0 ( 159958 ) <rghf@fsck.me.uk> on Wednesday June 18, 2003 @02:16AM (#6230613) Homepage
    New file systems always worry me, especially with regard to data loss. FAT and NTFS are old, but they are stable. I've never seen the OS lose data during normal operation; after a sudden crash, yes, but not otherwise.

    New FS = New corruption?

    Rus
  • by Anonymous Coward on Wednesday June 18, 2003 @02:38AM (#6230706)
    fsck is in OS X. Just hold Command-S at startup for single-user mode and run it.
  • by ComputerSlicer23 ( 516509 ) on Wednesday June 18, 2003 @02:40AM (#6230718)
    Hmmm, several things.

    First, async means that not all reads and writes are synchronous, which is an incredibly good thing for speed. Try putting your UFS/FFS filesystem into fully sync mode and then talk about performance; I'm willing to bet that UFS/FFS isn't sync by default either. However, calling fsync in the mail server (normally sendmail) on Linux will actually make it sync before returning, so no worries about RFC 1123. It's the SMTP server's job to tell the filesystem to make sure the bits are on the disk (a minimal sketch of that contract follows this comment). If Linux didn't have the ability to ensure bits were actually on the disk, nobody would use it. That's why in Moshe Bar's series comparing Linux, FreeBSD, and OS X, he always said he recompiled after removing the fsync calls; otherwise you'd just be comparing how fast the disks in each system were.

    For goodness sakes, Oracle ships on Linux, if Linux couldn't get the bits on the disk Oracle would have never ported to it. Not a chance. If Linux tells you the bits are on the disk, they are on the disk in my experience.

    I've heard of people losing UFS filesystems while running them under NFS, or losing them due to any number of nefarious VM race conditions. So what? Welcome to the real world: people lose data, so buy a tape drive and make backups. Knew a guy who got really good at rebuilding filesystems by using dd on Solaris to recover email for customers.

    Oh, and as I recall, async actually affects directories more than files; if you put the sync modifier on the filesystem, it only affects directories, not the file data, for ext2/3. In ext3, directory writes are always journaled as I recall, so it shouldn't make much difference.

    Now, from what I've heard of Linux and FreeBSD, until the late 2.2.x and early 2.4.x kernels there were certain jobs Linux couldn't do, like running big Usenet news services or really disk-intensive applications, because the filesystem buffering was really hard to get right and might cause corruption. The guy who ran a local ISP always said FreeBSD never did that when he was running the Usenet server on it, but Linux did with some regularity.

    ext2 hasn't lost any data of mine in my 7 years of using Linux, including running a 120GB Oracle database for the past 30 months. Ext3 has never lost any data since I started using it. I've lost disk drives, I've lost mirrors, I've lost files, but I've never lost a complete ext2 filesystem unless the disk just stopped spinning. Lost a couple of ReiserFS filesystems after installing Red Hat 7.0. Never tried most of the other journaling filesystems.

    Kirby
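    A minimal sketch of the fsync contract described above (POSIX assumed; the path is made up, and drives with lying write caches can still defeat this):

    ```python
    import os

    # write() alone only reaches the kernel's page cache; fsync() blocks
    # until the kernel reports the bits are actually on the disk. This is
    # what an MTA does before acknowledging mail per RFC 1123.
    def durable_write(path, data):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, data)
            os.fsync(fd)          # don't return until the data is on disk
        finally:
            os.close(fd)
        # strictly, a newly created file also needs its directory entry synced
        dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)

    durable_write("/tmp/queued-mail", b"accept the RCPT only after this returns\n")
    ```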

  • by Jugalator ( 259273 ) on Wednesday June 18, 2003 @02:49AM (#6230755) Journal
    Also, WinFS is no file system like FAT and NTFS. It's just a service running on top of NTFS.

    It's really funny how they try to compare it with a file system, since they're just looking at NTFS with a layer giving the user an easier time doing certain things.
  • 160,000 Files (Score:2, Interesting)

    by canowhoopass.com ( 197454 ) * <rod@nOsPaM.canowhoopass.com> on Wednesday June 18, 2003 @02:53AM (#6230774) Homepage
    I have ~160,000 files taking up 55 gigs on my NTFS-partitioned hard drives. It took over 5 minutes on my 1.6GHz machine to come up with that number.

    To search for a specific file often takes much longer.

    Personally I look forward to a better, faster file system on Windows. Although I'd still hold off judgement on the new system until it becomes available.

    -
    Rod
  • by Gldm ( 600518 ) on Wednesday June 18, 2003 @03:11AM (#6230841)
    I did use command-line copy on both OSes. I've also tried tons of "accelerator" programs that claim to be faster at copying. I tried same partition, different partitions, changing cluster size to every possible setting it allows, etc. With Atto, the drive performs well once write size gets over 256K, but the OS's internal copy routines for both command line and drag/drop apparently want to write one 512-byte cluster at a time and confirm it went to disk before writing the next one, which cripples write performance because the writes won't cache or stripe.

    Just "right click, turn on write cache etc" (from a previous post to this) DOESN'T WORK. If you'd care to READ what I originally posted I mention that it indicates write cache is enabled when IT ISN'T. It's pretty obvious whether it is or isn't based on the write performance. When read is 80-90MB/s and write is 1/10th of that, there's a problem. It's called the OS is forcing write-through, i.e. confirm all data to physical disk instead of just write back to the cache.

    As for the controller, it's a 3Ware Escalade 7500-4, not one of those POS promise things. The drives are 4 Western Digital 1200JBs in RAID5. My previous Escalade 6400 and 4 75GXPs in RAID0 had the same problem. (this was asked in one of the previous posts)
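    For what it's worth, here is a rough way to demonstrate that effect on a POSIX box (os.O_SYNC is not available on Windows, so this is an analogy for the write-through behavior described, not a diagnosis of that controller):

    ```python
    import os, time

    # Compare buffered writes against writes that must hit the platters
    # before returning. A huge gap between the two, like the 10x read/write
    # gap described above, is the signature of forced write-through.
    BLOCK, COUNT = 512, 2048   # mimic tiny cluster-at-a-time copies

    def throughput(flags):
        fd = os.open("/tmp/probe.bin",
                     os.O_WRONLY | os.O_CREAT | os.O_TRUNC | flags)
        buf = b"\0" * BLOCK
        t0 = time.perf_counter()
        for _ in range(COUNT):
            os.write(fd, buf)
        os.close(fd)
        return BLOCK * COUNT / (time.perf_counter() - t0) / 1e6  # MB/s

    print(f"buffered:      {throughput(0):8.1f} MB/s")
    print(f"write-through: {throughput(os.O_SYNC):8.1f} MB/s")
    ```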
  • by Anonymous Coward on Wednesday June 18, 2003 @03:14AM (#6230856)
    Copying DLLs between two points on the same machine probably isn't a copyright violation. And, even if it is, copyright law has an exception for repair software.
  • by adamsc ( 985 ) on Wednesday June 18, 2003 @03:15AM (#6230860) Homepage
    I think the performance of WinFS will tell us how serious Microsoft is about really changing the way files are used. Performance is just a question of time and engineering resources - OS X's journaling is slow, but HFS+ is an antique filesystem; in contrast, BeOS had BFS, a journaled filesystem with all of the indexing buzzwords WinFS claims except free-text context searches, and it was also extremely fast.

    The difference isn't features - BFS supported everything HFS+ does plus arbitrary attributes, journaling, much larger file/filesystem support, and indexing, and it was still faster. Be simply made performance a much higher priority than Apple has so far; fortunately Apple has hired the BFS lead developer, and perhaps 10.3 will have some surprises.

    Another good example is ReiserFS - while some of their choices reflect overall design goals (e.g. targeting large numbers of small files instead of BFS's massive videos) they've largely passed the traditional filesystems in most areas despite having to do more work to keep all of the extra features going.

    Microsoft has a number of engineers who do understand performance; the question is simply whether it'll be a significant priority for them to make WinFS fast enough that we'll realistically be able to use it.
  • by DarkEdgeX ( 212110 ) on Wednesday June 18, 2003 @04:20AM (#6231101) Journal
    Yeah, but would they be 100% compatible with NT's internal representation of how ACLs are supposed to work? Highly unlikely, so that would still require Microsoft to embrace and extend, and, with the GPL involved, be forced to give out the changes that were necessary to make the open-source FS work with an NT-based OS.
  • by Twylite ( 234238 ) <twylite&crypt,co,za> on Wednesday June 18, 2003 @04:59AM (#6231220) Homepage
    It's a hierarchical organization; it doesn't matter how big the drive gets.

    One word: FAT. You are making three assumptions here. The first is that the underlying implementation is capable of supporting near-infinite extension without degradation. Invalid for FAT, valid for the FS types mentioned in the grandparent, and the reason for what I said. The second is that the file system will be used as a hierarchy, which is invalid for most end users. The third is a combination of the first and second, being that the file system extends without unreasonable degradation to a vast number of files in a single directory, while performing operations (esp. searches) on them quickly. This is invalid for all of these file systems, because of how they store metadata.

    Why would I want to do "a simple name search for a file across an entire drive"? The file name is meaningless outside the context of its hierarchy.

    Again, you're assuming a technically savvy user like yourself. End users don't behave like this. By and large they use meaningful file names in a single directory. If you're looking for a document someone else did, it will be in their single directory, not in a common folder for documents relating to that topic. If you don't know who worked on the document, you need to do a broad search based on keywords.

    Yes, and it's fine for what it is. I certainly wouldn't want the overhead of updating the "locate" database every time a file changes somewhere.

    Which shows how little you've thought about the implementation of this system. You only have to make a change if the file metadata changes. In many file systems you already have to write that change in a different location from changes to the file itself (if you don't, your metadata search time goes out the window). If your "locate" database is a relational database, making a change has trivial overhead (sketched after this comment).

    Well, welcome to the club. For years, Linux has had several implementations, among them FAM, dnotify, and changedfiles, with hooks into indexing systems, and Linux is hardly the first.

    Actually, this isn't what I was meaning. I was referring to the relationship between the data in the FS and in the locate database (or any other metadata search database), and indicating that WinFS (in theory) takes out the step of building a separate database by using the database as the "index" of the file system. Unfortunately in this incarnation of WinFS (the current implementation) MS will not be implementing it quite in that fashion.

    But to answer your point ... Win32 systems have had file change notification in their APIs from day 1 (NT 3.1/Win95 and later have FindFirstChangeNotification; NT 3.51 and later have ReadDirectoryChangesW).

    [snip] you are far better off with a real document database

    And that's pretty much what MS is doing by converging a traditional file system with a metadata view.

    Microsoft should focus on creating a robust file system with decent performance, not get side-tracked with gimmicks. NTFS could still stand a lot of improvement.

    Of course, WinFS was intended for client operating systems, not servers. And while NTFS could still be improved, it doesn't make a lot of sense to do so: most high-data-volume applications store their data in structured files, and don't require much from the file system in any place where performance could be significantly improved.
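    A sketch of that trivial-overhead claim (polling os.stat stands in for the dnotify/FAM/ReadDirectoryChangesW hooks mentioned above; the schema is invented):

    ```python
    import os, sqlite3

    # Keeping a relational "locate" database fresh costs one cheap row
    # write per file whose metadata actually changed; unchanged files
    # cost nothing but the comparison.
    db = sqlite3.connect("locate.db")
    db.execute("""CREATE TABLE IF NOT EXISTS files
                  (path TEXT PRIMARY KEY, size INT, mtime REAL)""")

    def refresh(root):
        with db:
            for dirpath, _dirs, names in os.walk(root):
                for name in names:
                    p = os.path.join(dirpath, name)
                    try:
                        st = os.stat(p)
                    except OSError:
                        continue
                    row = db.execute("SELECT mtime FROM files WHERE path=?",
                                     (p,)).fetchone()
                    if row is None or row[0] != st.st_mtime:  # changed?
                        db.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                                   (p, st.st_size, st.st_mtime))

    refresh(".")
    ```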

  • by EddWo ( 180780 ) <eddwo@[ ]pop.com ['hot' in gap]> on Wednesday June 18, 2003 @06:06AM (#6231439)
    I think that is what it will end up being: NTFS underneath, with WinFS running as a service. Most applications will still use normal file system access calls. WinFS seems mainly to be for shell integration, searches, etc. If the files are still saved on NTFS they will still be accessible from Linux, minus the fancy queries. The only problem will be that you won't be able to update the database structures from Linux, at least not right away.
  • Re:db filesystem (Score:3, Interesting)

    by a_n_d_e_r_s ( 136412 ) on Wednesday June 18, 2003 @06:54AM (#6231556) Homepage Journal
    To stop competitor DB software firms like Oracle.

    "You dont need a database when you have our file system." they now can say.

    MS is basically putting everything including the kitchen sink into the operating system so that no one can compete with them. A practice they started after they could no longer use hidden system calls to make MS's own applications faster than competitors' and could no longer force others to ship IE.

    This way all other software companies go belly up, since they can't offer anything that isn't already part of the operating system.

  • by illtud ( 115152 ) on Wednesday June 18, 2003 @07:15AM (#6231616)
    Nice information, but it has nothing to do with what I was talking about.

    The term for that is 'non sequitur', and you've just posted a lovely example of one. Let's go back to my post and see what I was replying to (hint: nothing to do with XP at all) - it's the bit you snipped:

    As for '(I belive)2Gb', you are referring to the FAT16 installation of NT4. It doesn't apply to WindowsXP.

    That's what I was replying to. I was attempting to clarify that the limit (4GB, not 2GB) also applied to an NT install in which you specified NTFS (your post seemed to imply FAT16 only).

    I don't think we're disagreeing. I was clarifying a point you made which could imply something which wasn't the case.

  • by rekkanoryo ( 676146 ) <rekkanoryo AT rekkanoryo DOT org> on Wednesday June 18, 2003 @08:48AM (#6232147) Homepage
    NTFS is actually not bad with fragmentation. I'm running a Win2k Pro box and a WinXP Pro box; neither's partition has ever reached more than 12% fragmentation, and even that was after having the partitions 98% full. Most people won't notice NTFS file fragmentation as a problem until it reaches 50% to 60% anyway. FAT32, however, is quite a different story; I can start to notice performance hits around 9% fragmentation or so. Also, according to my MCSE training kit, the main cause of filesystem fragmentation on Windows machines is using a page file that does not have a static size. Using a statically sized page file can decrease fragmentation dramatically. (If you don't believe me, set your paging file to a static size between 1.5 and 3 times the amount of physical RAM you have, reboot, defragment, and test with every FS torture you can think of.)
  • by zeugma-amp ( 139862 ) on Wednesday June 18, 2003 @08:53AM (#6232190) Homepage

    Now storing the metadata in a database, which is essentially what WinFS and such are doing, is not as clear a benefit. Personally I can imagine that it would be a very practical FS for keeping movies and MP3s on. I don't really see the benefits of running the OS files on that FS, though. A lot of unnecessary overhead. (I don't search for files in my OS partition very often.)

    Indeed. It seems like what they are claiming as an improvement (i.e., faster searching for files) does not appear to help with what people actually do most of the time. It is similar to the claims of "boots much faster!" that you used to hear about new versions of Windows. I would think the things that would be important to people are data integrity and access efficiency. I know my primary concern is "how safe is my data?"

    I also question the need to include the overhead of a database frontend to the filesystem. Seems like a catastrophe just waiting to happen.

    Also, since the DB is always active, what issues do you have with backups? I'd be concerned about backup and restore issues with this type of filesystem. I haven't seen that addressed at all.

  • by Salamander ( 33735 ) <jeff@ p l . a t y p.us> on Wednesday June 18, 2003 @10:04AM (#6232855) Homepage Journal
    Oh, and forget softupdates, they are _not_ comparable to journaling filesystems, for instance you still need to fsck, it's just faster.

    That's true, but misleading. If soft updates are done right, the only reason to fsck is to reclaim resources (orphaned blocks etc.). It is not necessary to get your filesystem into a usable state, and can therefore be done in the background after you've come up. Journaling filesystems also still need to fsck, it's just faster and it's called a log redo, and that is necessary to make the filesystem usable. I'd say the two are very comparable, and soft updates come out slightly ahead. BTW, I'm one of those guys who writes filesystems, the ones you say are not so dumb. :-P

  • Exchange File System (Score:2, Interesting)

    by kyoko21 ( 198413 ) on Wednesday June 18, 2003 @10:08AM (#6232896)
    Not having read Tom's Hardware review of WinFS, but from the sound of the post, if WinFS is supposedly using a relational database to keep track of files, this already sounds like what MS is implementing in some of their existing software, such as Exchange and SharePoint. SharePoint utilizes an Exchange-like file system where you can store files into the SharePoint repository. In the first release of SharePoint there was an upper bound on the size of an individual file to be stored (I believe it was a 4GB limit), but in the current release I believe there is no upper limit. From what someone told me, if I recall right, SharePoint utilizes the SQL engine to keep track of the files that are stored in it. Maybe MS is just taking what they have learned from SharePoint and making it more 'general purpose' for day-to-day use.
  • Re:db filesystem (Score:3, Interesting)

    by aphor ( 99965 ) on Wednesday June 18, 2003 @10:28AM (#6233123) Journal
    Maybe now that PCs have increased in power by several magnitudes since Be last tried this, Microsoft may actually be able to pull it off.

    CORRECTION: Now that PCs ... anyone may actually be able to pull it off.

    What I don't see addressed here is the added complexity of the bootstrap required to support RDBMS-based files. Where are you going to stick this bootstrap? I see a tightly controlled licensing arrangement between motherboard suppliers and MS: "Thou shalt not boot except through WinFS bootstrap code, which is licensed to you for this purpose. We will revoke your license to distribute the WinFS bootstrap if you make us cry. We will take OUR ball and go home, and you will not be able to sell any PCs to our captive users."

  • by Salamander ( 33735 ) <jeff@ p l . a t y p.us> on Wednesday June 18, 2003 @11:12AM (#6233630) Homepage Journal

    BTW, cool name. If you ever decide to abandon that identity, let me know. ;-)

    Is it right to assume that softupdates are faster, but journaling might save a little bit more data in case of a crash?

    I'd have to say it depends. The beauty of soft updates is that they require exactly zero additional writes beyond what you'd be doing anyway; you're just being careful about the order in which you do them. Performance is fine, but this pretty much does nothing to ensure that data is consistent without some sort of sync/flush. With journals the picture is more complicated. Yes, there are additional writes, but they can be overlapped with the writes you're already doing so they often don't impact performance that much. Also, there are usually more opportunities to combine/strategize the metadata writes. Ultimately, the performance ends up being affected very little. As far as data protection, it's a big tradeoff. Most journaling filesystems only journal metadata, so they provide the exact same non-guarantee regarding data that soft updates would. If you want to journal data as well you get a better guarantee but worse performance, and it's rarely done; if you're heading in that direction you might as well go all the way to a log-structured filesystem.

    There are certainly ways that either journaling or soft-update filesystems can be tweaked to provide guarantees for data or metadata. In either case, you write to a "clean" set of blocks (never write in place) and take care of the metadata updates in such a way that if the metadata makes it the new data automatically comes along for the ride and if it doesn't then the blocks containing new data get reclaimed. This can be useful in certain cases, but it can also suck massively for performance if you have a lot of sub-block updates.

    As you can see, it's an interesting set of tradeoffs. It gets even better when your filesystem is distributed. No matter what, though, I tend to prefer soft updates due to greater storage efficiency and less need for provisioning/tuning.
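    A toy illustration of the ordering contract under discussion (Python; the "metadata write" is just a print, and a real journal would also need commit records, checksums, and log truncation):

    ```python
    import json, os

    LOG = "/tmp/meta.journal"

    def apply_record(rec):
        # stand-in for the real in-place metadata write
        print("applying metadata update:", rec)

    def journaled_update(rec):
        with open(LOG, "a") as log:
            log.write(json.dumps(rec) + "\n")
            log.flush()
            os.fsync(log.fileno())   # the record is durable BEFORE...
        apply_record(rec)            # ...the in-place write happens

    def redo():
        # crash recovery: replay the log; replaying already-applied
        # records is harmless as long as updates are idempotent
        if os.path.exists(LOG):
            with open(LOG) as log:
                for line in log:
                    apply_record(json.loads(line))

    redo()  # on mount: the "fast fsck" the post above calls a log redo
    journaled_update({"op": "rename", "src": "a.txt", "dst": "b.txt"})
    ```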

  • by c13v3rm0nk3y ( 189767 ) on Wednesday June 18, 2003 @11:53AM (#6234077) Homepage
    ...many of these features are already available in one form or another and no one uses them

    For most file data, perhaps.

    I will use this, and to good effect, as well.

    The point to take into consideration is that the context will also change depending on the metadata available. Your view of the aggregate file objects changes depending on the context. Not to mention that this same metadata will be available, in the same format, to all participating applications. Your apps can all have the same view, if you like.

    What this means in concrete terms is that your carefully sorted directory of MP3's can look like a file library in iTunes. There are searchable, sortable columns for Title, Album, bitrate, Cover Art, year, label, and whatever (note I did not say "filename", which is just another attribute under a modern filesystem). This is possible with only the most basic gestures on the part of the user, and is remembered for the next time you visit this same view.

    Similarly, a tree of photographs appears in any participating file browser with whatever columns you want (bit depth, format, date taken, date published, ICC info). It's important to consider that you can do this with any arbitrary collection of data, even ones you define yourself (to take the BeFS example, anyway).

    So you can take your collection of widgets, define attributes about these widgets, and your file browser applet works the same for the same user in all applications. It should, anyway. This is why we have APIs.

    To cite your example, why visually grep through a bunch of thumbnails looking for a particular photo when you can just indicate with a few gestures the "type" of photo you are looking for? I like the iPhoto interface when I'm browsing photographs, but if I want a particular photo of the GF from a rough date, taken at night, I certainly don't want to browse through 1000s of images, especially when some of them can be hard to discern at thumbnail resolutions (a query sketch follows this comment). I certainly don't want to do this repeatedly when I'm assembling a photo album on a specific subject.

    Let the computer do the grunt work of selecting a result set that matches my criteria, and then I can use my human abilities to select the object I want, or refine the search.

    Most of us already keep our aggregate file types in associated groups on the filesystem. In most cases, the tree structure of most filesystems is sufficient. All this does is extend the functionality of the filesystem so that you can choose to abstract aggregate file objects and treat them in a myriad of different ways. In the most basic sense, you tell the OS: "Look, when I have the Explorer/Finder open on this directory of MP3s, make sure you change the column view so it shows this, this, and that. In icon view, make sure that mouse-over pop-ups (if enabled) display this, that, and that. Default sort is alphabetical by artist's last name. I don't want to see the filename, as that doesn't contain any useful information."

    That is, you don't have to do anything special to make use of the file attributes in this way. You just tell the app that all of us use the most (the operating system's file browser) to treat certain directories in a different manner.
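    A sketch of that query-instead-of-browse idea, with SQL standing in for a BFS/WinFS-style attribute index (schema and values are invented for illustration):

    ```python
    import sqlite3

    # Files carry typed attributes; the browser issues a query and shows a
    # short result set instead of making you eyeball thousands of thumbnails.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE photos (path TEXT, taken TEXT, light TEXT, subject TEXT)")
    db.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
        ("dcim/001.jpg", "2003-05-17 22:40", "night", "gf"),
        ("dcim/002.jpg", "2003-05-18 13:02", "day",   "beach"),
        ("dcim/003.jpg", "2003-05-17 23:10", "night", "gf"),
    ])

    # "a particular photo of the GF from a rough date, taken at night"
    hits = db.execute("""SELECT path FROM photos
                         WHERE subject = 'gf' AND light = 'night'
                           AND taken BETWEEN '2003-05-15' AND '2003-05-20'""").fetchall()
    print(hits)  # two candidates to pick from, not thousands
    ```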

  • by platypus ( 18156 ) on Wednesday June 18, 2003 @11:54AM (#6234099) Homepage
    BTW, cool name. If you ever decide to abandon that identity, let me know. ;-)

    Hehehe, I forgot to compliment you for the name of your homepage ;).

    Most journaling filesystems only journal metadata, so they provide the exact same non-guarantee regarding data that soft updates would.

    I once read a paper about softupdates (quite old; I think it's the paper presenting the idea of softupdates for the first time, at least it reads that way), where they (completely IIRC) talk about an "update daemon" which writes the dirty in-memory metadata blocks to disk at regular intervals. That led me to conclude that softupdates lose somewhat more metadata in a crash. But OTOH there's a lag in writing to a journal, too.

    As you can see, it's an interesting set of tradeoffs.

    And as if those weren't enough things to think about, I've heard that there are drives which plainly lie about what has really been written to the platters.

    No matter what, though, I tend to prefer soft updates due to greater storage efficiency and less need for provisioning/tuning.

    Oh come on, in reality you prefer softupdates because you are a BSD zealot. ;)))

    Thanks for your explanations, and if I ever decide to sell my slashdot handle and the attached wellness of super positive karma on ebay, I'll make you a special offer ;).

"May your future be limited only by your dreams." -- Christa McAuliffe

Working...