The Design Of The Google File System

Posted by timothy on Monday September 29, 2003 @06:32PM from the library-of-babel dept.

Freddles writes "This is an interesting paper (PDF) describing the design approach to Google's file system. The design had to take account of requirements for huge file sizes, a highly responsive infrastructure and an assumption that hardware components will always fail."

This discussion has been archived. No new comments can be posted.

In case you don't like PDF (Score:5, Informative)

by Brahmastra ( 685988 ) writes: on Monday September 29, 2003 @06:34PM (#7089397)

Here's the html link [216.109.117.135]

- Re:In case you don't like PDF (Score:3, Funny)
  
  by redJag ( 662818 ) writes:
  
  In case you hate highlighting as I do:
  try this :) [216.109.117.135]
  
  I wish I had enough RAM to use as a harddisk. Then I could...well no, I wouldn't do anything useful. It would be cool, in a geeky way.
  - Re:In case you don't like PDF (Score:5, Funny)
    
    by Short Circuit ( 52384 ) writes: <mikemol@gmail.com> on Monday September 29, 2003 @07:16PM (#7089717) Homepage Journal
    
    A ramdisk would make for a great swap partition. :)
    
    - Re:In case you don't like PDF (Score:2)
      
      by proj_2501 ( 78149 ) writes:
      
      actually i used to use a RAM disk on my mac as my browser cache.
- In case you don't like links at all (Score:3, Funny)
  
  by Bingo Foo ( 179380 ) writes:
  
  In case you don't like reading stories and links before posting, remember this is Slashdot.
- - Re:In case you don't like PDF (Score:2)
    
    by gl4ss ( 559668 ) writes:
    
    well, most probably he meant it as a joke(hiding the name behind blank number and all).
    
    as a sidenote, here is the google pdf-to-html cache of it: http://www.google.fi/search?q=cache:m0TMQYgIlIoJ:www.cs.rochester.edu/sosp2003/papers/p125-ghemawat.pdf+&hl=fi&ie=UTF-8 [google.fi]
  - Re:In case you don't like PDF (Score:3, Funny)
    
    by bugnuts ( 94678 ) writes:
    
    How ironic, that the HTML-ized file on Google is available from Yahoo!...
    
    Yahoo uses the evil Anti-Google FS. It's the 1's complement called GllgOe. It can store 01111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 bytes of data.
Thoughtful... (Score:5, Funny)

by Anonymous Coward writes: on Monday September 29, 2003 @06:37PM (#7089412)

It was thoughtful of the poster to link to google.com for those that have never heard of it.

- Re:Thoughtful... (Score:5, Funny)
  
  by Queuetue ( 156269 ) writes: <queuetue AT gmail DOT com> on Monday September 29, 2003 @06:40PM (#7089450) Homepage
  
  Absolutely - I was about to go look google up on teoma and askjeeves...
  
- Re:Thoughtful... (Score:5, Funny)
  
  by Anonymous Coward writes: on Monday September 29, 2003 @06:48PM (#7089510)
  
  Last week I had a co-worker ask how to spell it. He is MS cert'd for Win2k Pro. Don't mod this funny, it's sad.
  
  - Re:Thoughtful... (Score:4, Funny)
    
    by daeley ( 126313 ) * writes: on Monday September 29, 2003 @07:33PM (#7089891) Homepage
    
    Last week I had a co-worker ask how to spell it
    
    I-T. Really now, how hard is that?
    
  - Re:Thoughtful... (Score:2)
    
    by Pfhreakaz0id ( 82141 ) writes:
    
    save him some time... www.google.com/microsoft for windows troubleshooting issues rocks!
    
    that is sad though.
  - Re:Thoughtful... (Score:2)
    
    by lostchicken ( 226656 ) writes:
    
    Yeah. We need a +1 Sad moderation.
    We can use it whenever people know just too much about trivial things.
    
    +5 would show up as (Score:5, No Life)
- Re:Thoughtful... (Score:2)
  
  by OneFix ( 18661 ) writes:
  
  While everyone should have already heard of google, it's kinda dumb to use one search engine when you can use a meta-engine like Turbo10 [turbo10.com] that uses all of the main search engines and some lesser known...
  
  Of course, AllTheWeb is giving Google a run for its money...in the race [searchenginewatch.com] to make it to 4 billion pages indexed, so Google may fall back down for a while...
  
  However, I don't think many ppl will switch because of a few thousand pages...
  - Re:Thoughtful... (Score:4, Insightful)
    
    by xoboots ( 683791 ) writes: on Monday September 29, 2003 @08:11PM (#7090214) Journal
    
    There's a reason not every search engine is considered the same. Try a simple search for a popular item. I searched for "PHP" on the three sites you mentioned. The top returned results are as follows:
    
    Google:
    - top result: php.net
    - 2nd place was php.net/downloads
    
    AllTheWeb:
    - top result: Hands-On PHP Training - 4 days $1695 (also ranked #10 on Turbo10, but not ranked in the top 20 at Google) -- oops, that is a sponsored link, but in AllTheWeb's default view, it looks like a normal link. php.net is actually ranked #1, but it appears 4th in the list of available links.
    
    Turbo10:
    - will not provide ANY results without Javascript turned on (BOO!)
    - top result: GBF Masonry Cleaning Services..Stone Cleaning
    - php.net ranked 5
    
    Draw your own conclusions, but meta-search engines existed prior to Google yet even at its launch it excelled over them in terms of provision of relevant links. It appears that it still does. At least for a first pass :)
    
    I suspect that one of the reasons that Google can bring higher quality links to the forefront is that being #1, they have a wider and more generous revenue base and therefore don't have to be as generous to "paying patrons" *cough cough*.
    
    Another problem is that meta engines have to mix "high-quality" results (say from Google) with lower quality results (say from some dippy paid for advertising search engine).
    
    - Re:Thoughtful... (Score:2)
      
      by Steeltoe ( 98226 ) writes:
      
      I suspect that one of the reasons that Google can bring higher quality links to the forefront is that being #1, they have a wider and more generous revenue base and therefore don't have to be as generous to "paying patrons" *cough cough*.
      
      Not just that. Google revolutionized the web-search stage with their Pagerank software and other improvements. It's not something new, librarians have used such algorithms for a long time. However, it consistently gives "better" results than most of the competition.
      
      I sus
- Re:Thoughtful... (Score:2, Funny)
  
  by Morosoph ( 693565 ) writes:
  
  In case of Slashdotting [google.com]
  
  Take note: "Google is not affiliated with the authors of this page nor responsible for its content."
You mean FAT don't cut it no more? (Score:1, Redundant)

by inertialmatrix ( 675777 ) writes:

I say screw the innovation and let's all just move back to FAT16!
Weeeeeeeeeeeeeeeeeeeeee!
- Re:You mean FAT don't cut it no more? (Score:3, Funny)
  
  by Russ Steffen ( 263 ) writes:
  
  I think you mean "WEEEEEEE.EEE." Or possibly "WEEEEEE~1.EEE."
  - Re:You mean FAT don't cut it no more? (Score:5, Funny)
    
    by Wumpus ( 9548 ) writes: <IAmWumpus@gmail. c o m> on Monday September 29, 2003 @06:45PM (#7089496)
    
    Surely you mean "WEEEEE~1.EEE".
    
    - - Re:You mean FAT don't cut it no more? (Score:2)
        
        by Wumpus ( 9548 ) writes:
        
        Even if there wasn't, it'd still be WEEEEE~1, not WEEEE~1.
Story summary (Score:4, Funny)

by slash-tard ( 689130 ) writes: on Monday September 29, 2003 @06:37PM (#7089416)

Google uses MS access as a backend to store all of its cache files. It is redundant by having a batch file setup with the windows "at" command to "xcopy" the data to another backup server.

- Re:Story summary (Score:2)
  
  by radish ( 98371 ) writes:
  
  Heh! Someone should show them how to use robocopy!
PDF mirror (Score:5, Informative)

by Tyler Eaves ( 344284 ) writes: on Monday September 29, 2003 @06:37PM (#7089418)

PDF mirror on my server [tylereaves.com] /Feels sorry for the Rochester cs server

Interesting... (Score:3, Insightful)

by petermdodge ( 710869 ) writes: <petermdodge.canada@com> on Monday September 29, 2003 @06:37PM (#7089421) Homepage

It's an interesting enough read; it certainly is interesting to see how one of the biggest-volume server operations out there copes. Now, the question is, what can us little server guys do to implement the ideas therein to our server? What can we take from it?

- Re:Interesting... (Score:2)
  
  by Mister Black ( 265849 ) writes:
  
  Now, the question is, what can us little server guys do to implement the ideas therein to our server? What can we take from it?
  
  Nothing, or you'll be sued for copyright infringement.
  - Re:Interesting... (Score:1)
    
    by petermdodge ( 710869 ) writes:
    
    Are we talking SCO or Google.com? - plus, I always thought that you cannot copyright ideas, just patent them. (And thus the EU patent issue is resurrected.. bleh)
  - Re:Interesting... (Score:2)
    
    by LiquidCoooled ( 634315 ) writes:
    
    I believe the grand-parent was referring to learning from those who succeed.
    
    There is nothing wrong with following and learning from our ancestors.
    
    Google have given a great deal of thought into their filesystem, and most likely made some huge mistakes along the way. In the end they have a stable workable system that still gives me the shivers occasionally.
    
    I would see these as guidelines for a further next generation filesystem rather than ripping the code from underneath them and calling it our own.
Just to make it clear.. (Score:5, Informative)

by Doodhwala ( 13342 ) writes: on Monday September 29, 2003 @06:38PM (#7089427) Homepage

Okay, so I read this paper as a part of the SOSP reading group here [cmu.edu] at school [cmu.edu]. Just want to make it clear that this is not the file system used by the front end that we all see. It is used by internal dev groups as well as the web spiders that they employ. Their unique usage has definitely led to a number of interesting choices (such as the atomic appends) for the file system design. Read the paper for more details :-)

- Re:Just to make it clear.. (Score:4, Insightful)
  
  by Klaruz ( 734 ) writes: on Monday September 29, 2003 @08:07PM (#7090190)
  
  Could you cite your source please? In the first page of the paper linked:
  
  "It is widely deployed within Google for the generation and processing of data used by our service as well as research and development that requires large data sets."
  
  - Re:Just to make it clear.. (Score:4, Informative)
    
    by Doodhwala ( 13342 ) writes: on Tuesday September 30, 2003 @01:21AM (#7091342) Homepage
    
    And if you read that statement, it does not mention the front-end. Generation and processing all takes place offline as most of the query results are only updated once a month (the Google-dance). And this question was asked of Howard Gobioff (one of the co-authors) at a presentation on the Google File System (GFS) at Carnegie Mellon.
    
    - Is there still a Google dance? (Score:3, Interesting)
      
      by harmonica ( 29841 ) writes:
      
      I thought the Google dance was history, and the index is now being updated more continuously (how exactly, I don't know)?
Hmmm. (Score:4, Funny)

by Pig Hogger ( 10379 ) writes: <(moc.liamg) (ta) (reggoh.gip)> on Monday September 29, 2003 @06:38PM (#7089434) Journal

I'd like to see a beow...
Never mind.

Everything's stolen nowdays. (Score:2, Funny)

by Anonymous Coward writes:

Why the google file system is nothing but a waffle iron with a phone attached.
- Re:Everything's stolen nowdays. (Score:1)
  
  by marine_recon ( 652565 ) writes:
  
  that may be so, but have you ever seen such a useful waffle iron? i think not mmmmmmm, e-waffles
Only a file system? (Score:5, Interesting)

by jrrl ( 635743 ) writes: on Monday September 29, 2003 @06:39PM (#7089441)

Back in the early days at Lycos [lycos.com], Danner Stodolsky, now at Akamai [akamai.com], used so many weird little tricks to make things faster that we used to joke that we'd end up with a custom operating system. The supposed name? LycOS.
Luckily the world was saved from this possibility.
-John (now, one of those "why, back in my day..." story telling guys... sigh.)

- Re:Only a file system? (Score:2)
  
  by Alethes ( 533985 ) writes:
  
  Luckily the world was saved from this possibility.
  
  Not Really. [lycoris.com] :)
  - Re:Only a file system? (Score:3, Interesting)
    
    by FireBreathingDog ( 559649 ) writes:
    
    Nice menu [lycoris.com]: not alphabetized, and "Use a digital camera" appears twice with two different icons. Then there's the inexplicable and unexplained "scribus" menu item, the only item that is neither a phrase nor capitalized.
    Steve Jobs must be shitting in his pants.
Is it open source? (Score:4, Funny)

by The Ancients ( 626689 ) writes: on Monday September 29, 2003 @06:39PM (#7089446) Homepage

I need something for my p...err, book collection.

- Re:Is it open source? (Score:3, Funny)
  
  by daeley ( 126313 ) * writes:
  
  book collection
  
  Ah, yes. You want a new-fangled "ShelFS" system.
Word processor? (Score:2, Interesting)

by Anonymous Coward writes:

What word processor/text editor is used to write all of these technical papers? Almost every paper I've seen looks like it's written in the same program.
- Re:Word processor? (Score:1)
  
  by jrrl ( 635743 ) writes:
  
  Way back when, when I was in academia at CMU, it seemed like most conference papers were done in LaTeX (or straight TeX, for the fearless).
  Nowadays, who knows? Probably Word (shudder).
  -John (managing to not be nostalgic for LaTeX hackery).
  - Everyone still uses Latex in university. (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    Just for covering their penis, not reading papers.
  - Re:Word processor? (Score:2)
    
    by Jeremy Erwin ( 2054 ) writes:
    
    It looks like LaTeX to me, though the macros aren't the default ones. The tables are very much in LaTeX's style.
  - Re:Word processor? (Score:2)
    
    by Jason Earl ( 1894 ) writes:
    
    I also was curious to see what software they had used to write the paper. It looked like a LaTeX document to me. Sure enough a quick peek at the document info reveals:
    Title: paper.dvi
    Application: dvips(k) 5.86 Copyright 1999 Radical Eye Software
  - Re:Word processor? (Score:2)
    
    by LinuxHam ( 52232 ) writes:
    
    Exactly. I helped build NYU CompSci's very first web site and spent many days converting the technical paper collection to PS when electronically available and scanned to TIFF when it wasn't.. like for papers dating back to the late 60's.
    
    There was some cool stuff buried in there.
- Re:Word processor? (Score:2)
  
  by gloth ( 180149 ) writes:
  
  I think it's FrameMaker.
- Re:Word processor? (Score:2, Informative)
  
  by Saunalainen ( 627977 ) writes:
  
  The PDF file claims to have been made by dvips, so it was written in Latex. It was then converted to PDF using Distiller.
- Re:Word processor? (Score:2)
  
  by SamBC ( 600988 ) writes:
  
  It's probably LaTeX [latex-project.org], which can be prepared from your favourite text editor, and rendered to print or PDF (or postscript) by entirely open-source software.
  
  It's very nice.
- LaTex is not a word processor (Score:3, Informative)
  
  by maxmg ( 555112 ) writes:
  
  It's more of a "text compiler" where you concentrate on writing the content and leave all of the formatting to a template that is responsible for transforming the content into (normally postscript) output. Anybody who has worked with LaTex and then moved to Word, only to have that stupid piece of sh*t bunch all images in a document together, on top of each other, on the first or last page of their document will appreciate the LaTex workflow. And LaTex absolutely rocks when it comes to formulas.
  
  That being
  - Re:LaTex is not a word processor (Score:2)
    
    by hanssprudel ( 323035 ) writes:
    
    That being said, LaTex comes with a significant learning curve, and due to its nature misses some of the features that are important in a business environment (most notably changes tracking).
    
    For changes tracking, why not just use cvs?
html version (Score:4, Informative)

by kaan ( 88626 ) writes: on Monday September 29, 2003 @06:43PM (#7089476)

thanks to, ehh, Google, here's an html version [216.239.39.104] of the article

I didn't read the whole article (kinda lengthy) but it seems pretty informative. I found their assumptions interesting, as they reveal some of the essence of what makes Google such a great search tool. Here are a few from the article:

- The system is built from many inexpensive commodity components that often fail. It must constantly monitor itself and detect, tolerate, and recover promptly from component failures on a routine basis.

- High sustained bandwidth is more important than low latency. Most of our target applications place a premium on processing data in bulk at a high rate, while few have stringent response time requirements for an individual read or write.

- The workloads primarily consist of two kinds of reads: large streaming reads and small random reads. Successive operations from the same client often read through a contiguous region of a file.

Various hardware life expectancies? (Score:3, Interesting)

by The Ancients ( 626689 ) writes: on Monday September 29, 2003 @06:47PM (#7089504) Homepage

...and an assumption that hardware components will always fail.
I think perhaps this is something we could all take a little more seriously. Part of me realises this is a comment on the sheer data being manipulated, but then something else that sprung to mind is the gradual reduction of warranties on HDDs, for example. I wonder what sort of stats an operation of this size could gather on various hardware components, and their varying propensities to wither and die.

- Re:Various hardware life expectancies? (Score:2, Interesting)
  
  by forevermore ( 582201 ) writes:
  
  Gradual reduction of hard drive warranties? Didn't Maxtor just bump up the warranty on their drives to 5 years? And WD and Seagate both have 3 year warranties on their drives. Granted, I'm talking about the "good" (SATA, 8 meg cache, etc.) drives, not the cheap ones that most of us users are using rebates to get for really-cheap.
  - Re:Various hardware life expectancies? (Score:2)
    
    by Alizarin Erythrosin ( 457981 ) writes:
    
    Google probably (most likely) uses SCSI drives anyways, which most often carry a 5 year warranty regardless of the company who makes it. Enterprise users wouldn't settle for less.
    - Re:Various hardware life expectancies? (Score:2)
      
      by CrystalFalcon ( 233559 ) writes:
      
      Enterprise users wouldn't settle for less.
      
      No, Enterprise users won't settle for interruptions. It's the IT guy's work to figure out how to make a noninterruptible environment as cheap as possible.
      
      Such a solution may well involve ultra cheap drives (one-third the cost of reliable ones) in a redundant RAID setup with hotspares, for example.
Interactive demo (Score:1, Funny)

by javaaddikt ( 385701 ) writes:

Check out the interactive demo [google.com] of how GFS works.
Fabulous Insights (Score:5, Informative)

by dolo666 ( 195584 ) writes: on Monday September 29, 2003 @06:57PM (#7089570) Journal

I really enjoyed that read about the file system Google uses. The fact that they usually append to their files, is of special note. By appending data you only need to know a simple pointer address. Seems quick enough. Add a bunch of threaded concurrent writes and you could get into trouble on other systems... The "atomic append" seems interesting because of the use of multiple machines to append simultaneously (hazard free).
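
For anyone wondering what an append where the storage side picks the offset looks like, here is a minimal single-process sketch in Python. It is not Google's code: the class name `AppendOnlyChunk`, the `record_append` method, and the `demo` helper are all made up for illustration, and a plain lock stands in for whatever coordination the real system does across machines. The point is only that writers hand over a record, the chunk decides where it lands, and concurrent writers never interleave bytes inside one record.

```python
import threading

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size discussed in this thread


class AppendOnlyChunk:
    """Toy stand-in for one chunk (hypothetical, single process only)."""

    def __init__(self, capacity=CHUNK_SIZE):
        self.capacity = capacity
        self.data = bytearray()
        self._lock = threading.Lock()

    def record_append(self, record: bytes):
        """Append record as one unit and return the offset chosen for it.

        Returns None when the record does not fit, in which case the
        caller would retry on a fresh chunk.
        """
        with self._lock:
            if len(self.data) + len(record) > self.capacity:
                return None
            offset = len(self.data)  # the chunk, not the writer, picks the offset
            self.data += record
            return offset


def demo(writers=8, per_writer=100, record_len=16):
    """Run several threads appending to one chunk; return chunk and results."""
    chunk = AppendOnlyChunk(capacity=1 << 20)
    results = []
    results_lock = threading.Lock()

    def worker(tag):
        record = bytes([tag]) * record_len  # each writer has a distinct byte
        for _ in range(per_writer):
            offset = chunk.record_append(record)
            with results_lock:
                results.append((offset, record))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return chunk, results
```

After `demo()` returns, every `(offset, record)` pair should read back intact from `chunk.data`, which is the property that makes appends from many writers safe without the writers coordinating among themselves.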

64meg chunk size is pretty huge, but I'm guessing that's blocked out based on continual threads of data, not typical files.

At first glance, this file system seems fairly wasteful. But hey, Google likely require speed and reliability over cost. Right?

This reminds me of the discussions about not-so-far-off database filesystems coming to an OS near you.

- Re:Fabulous Insights (Score:2)
  
  by harmonica ( 29841 ) writes:
  
  64meg chunk size is pretty huge, but I'm guessing that's blocked out based on continual threads of data, not typical files.
  
  64 MB is the maximum chunk size. The assumptions section at the beginning talks about typical read/write operations working on about 1 MB.
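
To put the 64 MB vs. 1 MB numbers side by side, here is a small Python sketch of the arithmetic for mapping a byte offset in a file onto chunks. It assumes files are cut into fixed 64 MB pieces numbered from zero; the function names `locate` and `chunks_touched` are hypothetical and only illustrate the calculation, not any real client library.

```python
CHUNK_SIZE = 64 * 2**20  # 64 MB per chunk (assumed fixed for this sketch)
MB = 2**20


def locate(offset, chunk_size=CHUNK_SIZE):
    """Return (chunk_index, offset_within_chunk) for a byte offset in a file."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, chunk_size)


def chunks_touched(offset, length, chunk_size=CHUNK_SIZE):
    """Split a read of `length` bytes at `offset` into per-chunk pieces.

    Each piece is (chunk_index, start_within_chunk, nbytes).
    """
    pieces = []
    end = offset + length
    while offset < end:
        index, start = locate(offset, chunk_size)
        nbytes = min(chunk_size - start, end - offset)
        pieces.append((index, start, nbytes))
        offset += nbytes
    return pieces


# A 1 MB read usually stays inside a single chunk...
print(chunks_touched(10 * MB, 1 * MB))
# ...and only straddles two when it crosses a 64 MB boundary.
print(chunks_touched(63 * MB + MB // 2, 1 * MB))
```

Under that assumption, a typical 1 MB operation touches one chunk and occasionally two, so the large chunk size mostly affects how much bookkeeping is needed per file rather than how much data moves per request.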
When will it be in the kernel? (Score:4, Funny)

by caluml ( 551744 ) writes: <slashdot@spamgoe ... minus herbivore> on Monday September 29, 2003 @07:00PM (#7089590) Homepage

I hope they're going to release it to us mere mortals. I mean, they're probably the only people that need millions of gigabyte+ files floating around thousands of machines, but it would be nice to see

[ ] Google File System.

in the kernel config.
Must be 12pm - the updatedb script is running.

- Re:When will it be in the kernel? (Score:2)
  
  by Jellybob ( 597204 ) writes:
  
  Must be 12pm - the updatedb script is running.
  
  Someday I'll set that to a time when I won't be sat at my computer developing.
  
  Maybe 11am.
  - Re:When will it be in the kernel? (Score:2)
    
    by MikeFM ( 12491 ) writes:
    
    I set mine to run once a week. I think on Sunday afternoons. I also limit it to things I'm likely to want to locate and won't know where to look.. so it ignores my huge disks full of ripped movies, porn, mp3's, etc.
- Re:When will it be in the kernel? (Score:2)
  
  by mOdQuArK! ( 87332 ) writes:
  
  I mean, they're probably the only people that need millions of gigabyte+ files floating around thousands of machines
  
  Actually, since it's designed for lots of hardware which is expected to die regularly, I wonder if any of the technology could be applied to P2P networks?
And starting with Linux 2.7... (Score:5, Funny)

by JessLeah ( 625838 ) writes: on Monday September 29, 2003 @07:01PM (#7089597)

...the Linux kernel will have googlefs support. It will be marked (EXPERIMENTAL), though, and will only run on 10,000-node Babelfish clusters...

- Re:And starting with Linux 2.7... (Score:3, Interesting)
  
  by Anonymous Coward writes:
  
  Actually this sounds exactly like the sort of file system that would be useful in a render farm.
  
  How long before ILM or Weta has a GFS disk array?
  - Apples vs. oranges (Score:2)
    
    by SoupIsGoodFood_42 ( 521389 ) writes:
    
    Rendering doesn't need super-fast storage. It may need lots of storage for the whole movie, but the render farms spend far more time rendering than they do outputting data.
    - - Re:Apples vs. oranges (Score:2)
        
        by SoupIsGoodFood_42 ( 521389 ) writes:
        
        Not sure, but I do know that Lemons run Windows 98.
- Re:And starting with Linux 2.7... (Score:2)
  
  by __past__ ( 542467 ) writes:
  
  10,000-node Babelfish clusters
  Is that the successor of the Shakespeare cluster system based on an infinite number of monkeys with typewriters?
they published it ... (Score:5, Interesting)

by trick-knee ( 645386 ) writes: on Monday September 29, 2003 @07:02PM (#7089606) Homepage

... which may not have happened from just any company of google's prominence. I mean, they have highly successful business and technical infrastructure models and they didn't HAVE to share it with anyone.

I wonder what they believe will protect their business from poaching of these ideas?

- Re:they published it ... (Score:2)
  
  by MoobY ( 207480 ) writes:
  
  I wonder what they believe will protect their business from poaching of these ideas?
  
  It's called "creating prior art" without patenting the stuff. That's good. It's not evil. It's the google folks.
- Re:they published it ... (Score:2, Insightful)
  
  by Anonymous Coward writes:
  
  The catch up Law.
  
  Basically it says that if you spend all your time playing catch up you'll never be first.
  
  If the other search engines use the GoogleFS then you know they aren't the leader. Sort of like if kernel.org was running Windows 2003 or if www.msn.com was running on Linux.
  
  Now if they go and create a FS so they can be the same as google then they are just catching up. Once they catch up to Google, Google will be somewhere else.
  
  The other thing is there are lots of Clustered file systems around so it
- They have no reason for worry (Score:2)
  
  by ttyp0 ( 33384 ) writes:
  
  It's apparent that Google employs by far the best programmers in the world. Google has published numerous white papers detailing their infrastructure and technology. By the time a competitor has time to implement, Google would already be far ahead with new innovations.
  Show your hate for SCO [anti-tshirts.com]. Get a cool t-shirt and donate to the Open Source Now Fund.
- Re:they published it ... (Score:5, Insightful)
  
  by hankaholic ( 32239 ) writes: on Monday September 29, 2003 @10:02PM (#7091017)
  
  I wonder what they believe will protect their business from poaching of these ideas?
  
  Perhaps the fact that it's taken many very smart people a good amount of time to implement and tune the original design, even after having come up with the basic layout?
  
  Go take a look at the ReiserFS Future Vision page [namesys.com] -- you'll see some more interesting discussion of filesystem design, and overall direction. There are a few solid developers working full-time on the concepts discussed in the Reiser docs, and they still have enough work to keep them busy for years to come.
  
  Google releasing information regarding the structure of their systems is a bit like John Carmack discussing the structure of his graphics engines: there's a hell of a distance between a conceptual description and a fine-tuned, tested, working implementation.
  
  Given Google's history, I'd also imagine that they're on the lookout for up-and-coming young researchers. As such, if some grad student takes their work and extends it, they can certainly benefit.
  
RAIC?? (Score:3, Interesting)

by More Karma Than God ( 643953 ) writes: on Monday September 29, 2003 @07:13PM (#7089703)

Could we call Google a Redundant Array of Inexpensive Computers?

What else can it be programmed to do? Could this become the basis for a personal computer where you just add computers seamlessly when you need more power?

- RAID (Score:2)
  
  by _ph1ux_ ( 216706 ) writes:
  
  No, it would still be RAID - although the D would denote "Devices"... unless they had a purchasing contract with Dell...
- Re:RAIC?? (Score:2)
  
  by mindriot ( 96208 ) writes:
  
  Wouldn't that be called Grid?
Google cache (Score:5, Funny)

by Skreech ( 131543 ) writes: on Monday September 29, 2003 @07:21PM (#7089764)

In case Google gets slashdotted, here is the Google cache [216.239.41.104] for Google.

- Re:Google cache (Score:3, Funny)
  
  by CableModemSniper ( 556285 ) writes:
  
  Best part is the disclaimer at the top:
  
  Google is not affiliated with the authors of this page nor responsible for its content.
GFS and GWS? (Score:2, Funny)

by cpopin ( 671433 ) writes:

They designed their own file system as well as Web server? Did they design their own receptionists? If so, I want to work there!
Prevayler anyone? (Score:2, Informative)

by 12357bd ( 686909 ) writes:

The in-memory master behaviour described in the paper closely resembles the Prevayler [prevayler.org] software.
GooFS? (Score:2, Funny)

by hajejan ( 549838 ) writes:

Yeah, that'll definitely sell.
PC #1782563 (Score:2, Interesting)

by can56 ( 698639 ) writes:

See Verity Stob's article -- Cold Comfort Server Farm -- in the August/2003 edition of Dr. Dobb's Journal, for the sad truth about Google's server farm. Sniff ;-(
Chunkservers... (Score:2)

by HotNeedleOfInquiry ( 598897 ) writes:

and chunkhandles. I love it. Great read.
user-mode? PVFS? (Score:2)

by penguin7of9 ( 697383 ) writes:

I can't quite tell from a quick reading of the paper, but this seems to be a user-mode file system. That is, if you call the regular POSIX "open" call, you probably can't open a file in the GoogleFS. It appears that some library code linked directly into the application handles all file system operations. A number of distributed file systems take that approach--it can be more efficient.

I wonder how it compares to PVFS [clemson.edu]. It seems like GoogleFS deals more aggressively with component failure. Any ideas?
Ironically Google has been down all day... (Score:2)

by quinkin ( 601839 ) writes:

Well ok, at least from Oz... and it seemed to be a backbone routing issue (Sydney Telstra Reach.com)... but don't ruin my fun with logic and facts! :)
Q.
Failure (Score:2)

by jetmarc ( 592741 ) writes:

Interesting.. Just yesterday the google groups database suffered failures. A lot of threads appeared in the search results, but couldn't be browsed.
People, people (Score:3, Funny)

by Epistax ( 544591 ) writes: <epistax@g[ ]l.com ['mai' in gap]> on Tuesday September 30, 2003 @08:41AM (#7092922) Journal

The question really on all our minds is can you play doom on it?

What a waste.... (Score:2, Insightful)

by abramsh ( 102178 ) writes:

Should have just bought one of these: SGI SAN 3000 [sgi.com] It would be easier and cheaper to manage, scales better, and you wouldn't have to spend the money to create and maintain the file system.
- Re:great. now, deal with the spam issue (Score:5, Funny)
  
  by winkydink ( 650484 ) * writes: <sv.dude@gmail.com> on Monday September 29, 2003 @06:52PM (#7089529) Homepage Journal
  
  how many times have you searched for something on google, only to find that the search engine spammers have taken over almost every top 10 result?
  Ummm... not very many. Then again, I try not to search on "teen panties" very often. :)
  That reminds me of the winter I spent in Chicago. I needed some galoshes to protect my shoes and keep my feet dry. Back in New England, we called them "rubbers" (I am not making this up). Needless to say, a google search on "buy rubbers" did not yield the intended results.
  
  - Re:great. now, deal with the spam issue (Score:2)
    
    by tconnors ( 91126 ) writes:
    
    how many times have you searched for something on google, only to find that the search engine spammers have taken over almost every top 10 result?
    
    Ummm... not very many. Then again, I try not to search on "teen panties" very often. :)
    
    Hmmm, searching for help on LaTeX can sometimes be... distracting.
  - - Re:great. now, deal with the spam issue (Score:2)
      
      by Alizarin Erythrosin ( 457981 ) writes:
      
      There's an Indian (not Native American, from India) guy here at work who one day asked a coworker if he could "bum a fag"... I don't know if the guy ever figured out he was asking to bum a cigarette or not... I was laughing so hard.
- Re:great. now, deal with the spam issue (Score:2)
  
  by hondo77 ( 324058 ) writes:
  
  how many times have you searched for something on google, only to find that the search engine spammers have taken over almost every top 10 result?
  
  Err, never. Even searches for porn images are still pretty useful (as useful as porn images are, I guess). Dozens of non-porn searches a day and always useful.
- let them know (Score:2)
  
  by Therlin ( 126989 ) writes:
  
  I've come across that situation a couple of times. They have an address for that type of complaint. I let them know both times and a human got back to me within 48 hours and said that they would look at the issue. Sure enough, a week later it was taken care of.
- Re:great. now, deal with the spam issue (Score:2)
  
  by MikeFM ( 12491 ) writes:
  
  Most of the people I see having trouble searching just don't know how to search for things properly. My parents are a prime example. They know how to get to Google but not how to pick the combination of keywords most likely to return the result they're looking for. I wish I could think of a way to put into code my mental process for doing this.. if I could then maybe Google would hire me. :)
  
  The other major problem is that many webpages aren't made to be easy to locate. At times they don't even include th
- Re:google groups mostly down all day (Score:2)
  
  by Threni ( 635302 ) writes:
  
  "They could use a more robust file system then. It seems like postings within the past 48 have headers, but google dies when accessing the body."
  
  Sure! Also, some of the counts of messages per thread are optimistic. I guess they've been told 1000 times already..or maybe I should mail them about it too?
- Re:Thank God (Score:2)
  
  by X ( 1235 ) writes:
  
  This is also a pretty strong indication of just how noteworthy this article is. This kind of stuff has been done time and again. Things like OceanStore [berkeley.edu] are far more innovative. But of course that stuff isn't from Google, which is what makes this article noteworthy. ;-)
  
  Tomorrow's slashdot headline: Google proves definitively that 1 + 1 = 2
- well... (Score:2)
  
  by pr0ntab ( 632466 ) writes:
  
  it's not really a clustered filesystem. It's sort of like uber-intelligent iSCSI.
  
  A "real" GFS has multiple masters, as far as I'm concerned. This is a very specific app tied to a specific need for Google's web collection system.
  
  So I think you're okay, even so. :-)
  
  Also, the article was published before Sept. 17 (earliest commentary I saw), so this is moot.
  
  But anyway, kids, listen to him, don't procrastinate! And if you do, make sure you have adequate forged documentation on your 17 grandparents gruesome
- Re:This sounds like a GPL violation to me! (Score:2)
  
  by ZigMonty ( 524212 ) writes:
  
  No, they don't have to provide source code unless they distribute binaries. Even if they did, they'd only have to provide source to those they distribute to.
- Re:Google FS? (Score:2)
  
  by Captain Large Face ( 559804 ) writes:
  
  Duh! Everyone knows that the common benchmark for application growth is a web browser!
- - Re:Story Summary (Score:2)
    
    by Anonvmous Coward ( 589068 ) writes:
    
    I metamodded it as unfair. Damn these peeps have no sense of humor.
