
How Facebook Stores Billions of Photos

David Gobaud writes "Jason Sobel, the manager of infrastructure engineering at Facebook, gave an interesting presentation titled 'Needle in a Haystack: Efficient Storage of Billions of Photos' at Stanford for the Stanford ACM. Jason explains how Facebook efficiently stores ~6.5 billion images in 4 or 5 sizes each, totaling ~30 billion files and 540 TB, and serves 475,000 images per second at peak. The presentation is now online here in the form of a Flowgram."

  • Re:I dunno. (Score:3, Informative)

    by OverlordQ ( 264228 ) on Wednesday June 25, 2008 @11:12AM (#23935235) Journal

    To view the slideshow . . err I mean 'flowgram' (whatever the fuck that's supposed to mean), you don't need to register.

  • Ahh haha (Score:2, Informative)

    by Bored MPA ( 1202335 ) on Wednesday June 25, 2008 @11:14AM (#23935277)
    Perhaps I should turn on audio, or they should have a less friggin confusing UI.
  • Very interesting (Score:3, Informative)

    by phase_9 ( 909592 ) on Wednesday June 25, 2008 @11:17AM (#23935331) Homepage
    Fascinating presentation for those of you who actually bother to watch the hour or so of content.
  • Re:I dunno. (Score:4, Informative)

    by aproposofwhat ( 1019098 ) on Wednesday June 25, 2008 @11:28AM (#23935497)
    Not only that, but the UK Facebook site has been down most of the afternoon - some infrastructure, huh?
  • Re:FLASH?! (Score:2, Informative)

    by Mathiasdm ( 803983 ) on Wednesday June 25, 2008 @11:44AM (#23935737) Homepage
    I do have Javascript installed, and am running Adobe Flash (Linux version). Doesn't work :(
  • Re:FLASH?! (Score:3, Informative)

    by nahdude812 ( 88157 ) * on Wednesday June 25, 2008 @11:50AM (#23935857) Homepage

    Worked for me from Ubuntu.

  • by Ralph Spoilsport ( 673134 ) * on Wednesday June 25, 2008 @11:59AM (#23936011) Journal
    equals about 18k per image?

    RS

  • by JuanCarlosII ( 1086993 ) on Wednesday June 25, 2008 @12:28PM (#23936543)
    A quick survey of the most recent images on my profile tells me a full size image comes in at 50-60k and a standard thumbnail at ~5k so given the other sizes of thumbnail as well I'd say 18k per image is about right.
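
The 18k figure can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in the summary; treating 1 TB as 10**12 bytes and 1 KB as 1000 bytes is an assumption, and the variable names are made up for illustration.

```python
# Figures quoted in the story summary (approximate).
TOTAL_BYTES = 540 * 10**12        # 540 TB, assuming decimal terabytes
TOTAL_FILES = 30 * 10**9          # ~30 billion stored files (all sizes)
TOTAL_IMAGES = 6_500_000_000      # ~6.5 billion uploaded images


def average_kb(total_bytes, count):
    """Average size in KB (1 KB = 1000 bytes) across `count` items."""
    return total_bytes / count / 1000


avg_file_kb = average_kb(TOTAL_BYTES, TOTAL_FILES)     # per stored file
avg_image_kb = average_kb(TOTAL_BYTES, TOTAL_IMAGES)   # per image, all sizes summed
sizes_per_image = TOTAL_FILES / TOTAL_IMAGES           # roughly "4 or 5 sizes each"

print(f"{avg_file_kb:.1f} KB per file")
print(f"{avg_image_kb:.1f} KB per image across all sizes")
print(f"{sizes_per_image:.1f} files per image")
```

So 18k is the average per stored file, not per uploaded photo; each photo accounts for more once all its sizes are added together.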
  • Already been done. (Score:5, Informative)

    by sirrube ( 622137 ) on Wednesday June 25, 2008 @01:01PM (#23937089) Homepage

    This stuff is cool either way, even if it is just "childish spam." Many of us only dream to work on something that will become this large scale.

    ...

    Fortune 500 companies could probably learn a thing or two...

    This Fortune 500 company could teach a thing or two on this subject. [datatree.com] DataTree has been doing this since before 1999. With over 40 billion land records online and 600+ TB of data, they deliver many millions of images daily. Not to put down Facebook's implementation, but DataTree does not need to run 10k web servers and 1800 SQL databases to provide images. It is nice to see the scalability factor of their design, but that does not mean it is the most efficient way to do things, or the one to follow and learn from.
  • Re:Akamai? (Score:2, Informative)

    by Anonymous Coward on Wednesday June 25, 2008 @01:27PM (#23937481)
    ...you still have to do the part of uploading to Akamai. And if Akamai brings on a new node, it has to refresh most of the content from you anyway (yeah, it's a tiered caching network that usually loads from other nodes, but sometimes it doesn't). Cache hit rates with them tend to be in the 97%+ range if done right, but still, 97% of 8 Gb/s+ leaves 240 Mb/s+ you have to serve yourself. Akamai is a cache, not a content store. What you suggest is akin to saying it's OK to pull the RAID array once things are loaded into RAM, because the OS just keeps the data there. You still have to keep the storage, with redundancy and backups, and the bandwidth to serve cache refreshes to Akamai. It does greatly reduce the problem, but it is not a complete solution in itself. It also does not work for most dynamic content: since it doesn't store your DB for you, those requests still have to go home, so you still need the storage, capacity, DB horsepower, etc. to serve them, including the requests that actually point the requester at Akamai for the static bits.

    I don't work for FB, but for a company that does make use of 3rd-party caching networks for very large content distributions.

    Tm
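
The cache-miss arithmetic above can be sketched as follows. The function name is hypothetical, and the 8 Gb/s and 97% values are simply the round numbers from the comment, not measurements.

```python
def origin_bandwidth_mbps(total_gbps, hit_rate):
    """Bandwidth (Mb/s) the origin must still serve when the CDN misses.

    total_gbps: total traffic in Gb/s (1 Gb/s = 1000 Mb/s assumed).
    hit_rate:   fraction of requests answered from the cache, 0.0 to 1.0.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return total_gbps * 1000 * (1 - hit_rate)


# 8 Gb/s of traffic with a 97% hit rate: the remaining 3% goes to the origin.
miss_traffic = origin_bandwidth_mbps(8, 0.97)
print(f"{miss_traffic:.0f} Mb/s still served from origin")
```

Even a small drop in hit rate moves the origin load a lot: at 95% the same traffic would leave about 400 Mb/s to serve.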

  • server load fixed (Score:2, Informative)

    by gobaudd ( 897341 ) on Wednesday June 25, 2008 @01:37PM (#23937647)
    we had some problems in the beginning but the server should be much better now.
  • by Kimos ( 859729 ) <kimos.slashdotNO@SPAMgmail.com> on Wednesday June 25, 2008 @03:28PM (#23939383) Homepage
    Get the Big Photo [facebook.com] application.

    It's not ideal, but it works quite well. A friend of mine is a professional photographer and she puts all her work up there. Works well for her.
  • User-mode GoogleFS (Score:5, Informative)

    by Panaflex ( 13191 ) <<convivialdingo> <at> <yahoo.com>> on Wednesday June 25, 2008 @03:53PM (#23939753)

    (summarizing the big long presentation)

    They basically want to make a user-mode GoogleFS. Their biggest problem is reducing reads, which are hampered by POSIX file semantics (inodes, metadata, etc.).

    Instead they use a database-like index/data file arrangement. The index stays in memory and the photos are stored together in large contiguous regions of a single file. It's possible to use a LUN for storage, but they're not there yet.

    There... where's my cookie?

    (Oddly enough - I'm writing the exact same code they are... bazaar world, eh??)
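
The index/data file arrangement summarized above can be illustrated with a toy store. This is a minimal sketch under stated assumptions, not Facebook's actual format: the class name `TinyHaystack`, the key scheme, and the absence of headers, checksums, deletion, and index recovery are all simplifications made up for illustration.

```python
import os
import tempfile


class TinyHaystack:
    """Toy append-only blob store: one big data file, index kept in memory."""

    def __init__(self, path):
        self.index = {}                    # key -> (offset, size), lives in RAM
        self.writer = open(path, "ab")     # blobs are only ever appended
        self.reader = open(path, "rb")     # separate handle for lookups

    def put(self, key, blob):
        self.writer.seek(0, os.SEEK_END)   # make the append position explicit
        offset = self.writer.tell()
        self.writer.write(blob)
        self.writer.flush()                # make the bytes visible to the reader
        self.index[key] = (offset, len(blob))

    def get(self, key):
        # No per-photo file lookup: one in-memory dict hit, one seek, one read.
        offset, size = self.index[key]
        self.reader.seek(offset)
        return self.reader.read(size)

    def close(self):
        self.writer.close()
        self.reader.close()


# Usage: two "photos" stored back to back in a single file.
workdir = tempfile.mkdtemp()
store = TinyHaystack(os.path.join(workdir, "haystack.dat"))
store.put("photo1_thumb", b"\x89fake-thumbnail")
store.put("photo1_full", b"\x89fake-full-size-image")
thumb = store.get("photo1_thumb")
full = store.get("photo1_full")
full_location = store.index["photo1_full"]
store.close()
```

The point of the design is that a read costs one dictionary lookup plus one seek into an already-open file, instead of the directory and inode lookups a file-per-photo layout needs.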

  • Re:Akamai? (Score:3, Informative)

    by Anonymous Coward on Wednesday June 25, 2008 @04:28PM (#23940395)

    That won't work considering the number of files. Given the quote we got from those idiots (which required nearly a year of hassle with the Akamai morons and signing an NDA, thus the AC post), it would cost us almost $200k/year given our bandwidth use to store ~1,000 files. Facebook has 30 billion files and, assuming the same price per file as we were quoted, Akamai would charge $6,000,000,000,000/year to host them. To put that number in perspective, that's more than the GDP of Germany plus that of the UK. The Akamai VP (something Danzig, I remember the name because of the band of the same name) I talked to just wasn't able to comprehend why we wouldn't consider paying that much to host a few files. We ended up renting five 1U servers in four different countries for about $15k/year. While we have a little less total bandwidth and it requires more management time to maintain, it's only 7.5% of the cost of Akamai and we can store many (1,000 times?) more files than Akamai would let us at that price point. To say that the Akamai guys don't understand math is an understatement.
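
The extrapolation above works out as follows. This sketch only reproduces the arithmetic; scaling the quote linearly per file is the comment's own assumption (the quote was tied to bandwidth use), and the constant names are made up for illustration.

```python
QUOTE_USD_PER_YEAR = 200_000        # "almost $200k/year", rounded up
QUOTED_FILES = 1_000                # "~1,000 files"
FACEBOOK_FILES = 30_000_000_000     # ~30 billion files from the summary
SELF_HOSTED_USD_PER_YEAR = 15_000   # five rented 1U servers

# Price per file implied by the quote, then scaled linearly (an assumption).
usd_per_file = QUOTE_USD_PER_YEAR / QUOTED_FILES
extrapolated_usd = usd_per_file * FACEBOOK_FILES

# Self-hosting cost as a fraction of the quoted price.
cost_ratio = SELF_HOSTED_USD_PER_YEAR / QUOTE_USD_PER_YEAR

print(f"${usd_per_file:,.0f} per file per year")
print(f"${extrapolated_usd:,.0f} per year at 30 billion files")
print(f"self-hosting is {cost_ratio:.1%} of the quote")
```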
