Scaling Facebook To 140 Million Users
1sockchuck writes "Facebook now has 140 million users, and in recent weeks has been adding 600,000 new users a day. To keep pace with that growth, the Facebook engineering team has been tweaking its use of memcached, and says it can now handle 200,000 UDP requests per second. Facebook has detailed its refinements to memcached, which it hopes will be included in the official memcached repository. For now, their changes have been released to github."
Re:Thank goodness (Score:2, Insightful)
Well, I think it's kind of cool that they are putting back, so to speak. If they can use that tweak, so can everyone else. If your requirements all fit on one host server, then that server might now be able to do much more. Perhaps the next changes should be to allow a setting that penalizes retail advertisements by adding some arbitrary delay of greater than 10 seconds?
Blaming Linux... (Score:5, Insightful)
We discovered that under load on Linux, UDP performance was downright horrible. This is caused by considerable lock contention on the UDP socket lock when transmitting through a single socket from multiple threads. Fixing the kernel by breaking up the lock is not easy. Instead, we used separate UDP sockets for transmitting replies (with one of these reply sockets per thread). With this change, we were able to deploy UDP without compromising performance on the backend.
I bolded the quote to show what their real problem was. They had a shit load of threads trying to use a single socket, and of course there was huge overhead involved due to the mutex lock (a semaphore on the kernel side) on a shared resource (the socket). So they blame Linux instead of themselves for such a half-ass implementation of sending out packets from multiple threads with a single socket. They would have gotten the exact same result if they tried it with a single TCP connection socket and attempted to have multiple threads firing off packets with that. If you want multiple threads sending out packets, use multiple sockets... Wow, what a concept!
Sorry for my ranting, but it just pisses me off when moron programmers blame the operating system for their own stupidity.
Anyway, haven't nearly all MMOs gone with UDP inside the game cluster network and TCP externally to reduce latency and network overhead? So this is nothing new to me.
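For anyone who wants to see the shape of the fix in the quoted passage, here is a minimal Python sketch of the one-reply-socket-per-thread idea. It is not Facebook's or memcached's code (memcached is written in C). `run_demo`, the loopback receiver, and the message format are all made up for illustration, and Python's GIL means this shows the structure only, not the performance gain.

```python
import socket
import threading


def run_demo(num_threads=4, replies_per_thread=5):
    """Send UDP replies from several threads, each using its own socket.

    Hypothetical demo: returns the set of messages the receiver collected.
    """
    # A receiver standing in for the client, bound to an ephemeral
    # loopback port chosen by the OS.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    dest = receiver.getsockname()

    def worker(thread_id):
        # One reply socket per thread: no socket is shared between
        # threads, so senders never queue up on the same per-socket lock.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as reply_sock:
            for i in range(replies_per_thread):
                reply_sock.sendto(f"{thread_id}:{i}".encode(), dest)

    threads = [
        threading.Thread(target=worker, args=(t,)) for t in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    expected = num_threads * replies_per_thread
    received = set()
    try:
        while len(received) < expected:
            data, _addr = receiver.recvfrom(64)
            received.add(data.decode())
    except socket.timeout:
        # UDP gives no delivery guarantee; return whatever arrived.
        pass
    finally:
        receiver.close()
    return received
```

On an idle loopback interface, `run_demo(4, 5)` should normally hand back all 20 messages, but since this is UDP nothing guarantees it.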
Re:Blaming Linux... (Score:2, Insightful)
Linux is pretty terrible for performance multi-threading, that's a fact. It features unreliable file IO too, but I digress..
In the case of Facebook, it's true that it's not the OS's fault, since mutexes are always slow anyway.
There are lockless libraries that lock the CPU(s) for one cycle so that the program doesn't need to lock a mutex to increment a counter, for example. Thousands of times faster...
But these wouldn't have helped there. Like you said, it just seems like a design problem in the software. Still, 140M users is very impressive.
Re:Blaming Linux... (Score:5, Insightful)
That is *not* a Facebook problem (Score:5, Insightful)
It's just a standard trojan with an unusual delivery method of using fake Facebook profiles run by trojan bots. I can't see how this is Facebook's problem any more than it's your email program's fault that you clicked on a dodgy link without checking it.
Re:Blaming Linux... (Score:3, Insightful)
Wow, you're uninformed on multiple levels with this post.
1. "They" didn't write memcached. LiveJournal did, and then they open-sourced it. "They" didn't provide a half-assed implementation. They pushed a piece of open source software further than it had been pushed before, and found problems.
2. If you'd read the next sentence right after your bold line, you'd notice they were talking about a kernel lock, not a lock in memcached. That's a totally valid reason to blame Linux.
If you bothered to actually spend some time programming hugely complex high-performance applications, you'd realize quite quickly that the Linux kernel, while damn near the best kernel out there, isn't perfectly suited to your application. I can list off five or six things right now that I have problems with in the Linux kernel. But -every- application designer with sufficient experience (especially in large-scale performance apps) can probably do the same.
Before you say: well why don't you fix it yourself.. look at the quote you referenced again. They considered it, and took a different route.
Re:... And Yet Very Lacking From a Security Angle (Score:5, Insightful)
It can't be addressed... because it's not a security issue with the site. It's an issue that the user needs to be trained on how to spot, and good luck getting that to happen.
I mean, come on, banks have the "problem" you described, and most banks aren't what we'd call insecure.
Re:... And Yet Very Lacking From a Security Angle (Score:1, Insightful)
Now, if /. allowed me to post the (fake) link above, how are they any more at fault than facebook is for allowing potentially dodgy links to be shared via their service?
This is ridiculous. If you can't think of a way to combat this, you don't have a very good imagination. The fact that Slashdot includes a [i-promise.org] after your link is one very simple way to inform the user.
... would it really be that hard for them to test it against a known malicious links database like Firefox's phishing extension does?
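As a rough sketch of the two ideas above (showing a link's real host next to it, and checking links against a list of known bad hosts), assuming a hypothetical in-memory blocklist. `KNOWN_BAD_HOSTS`, `is_flagged`, and `annotate` are illustrative names, not anything Facebook, Slashdot, or Firefox actually exposes, and this version prints the full hostname rather than just the registered domain:

```python
from urllib.parse import urlsplit

# Hypothetical blocklist for illustration only; a real site would load a
# maintained feed of known malicious hosts instead of hard-coding names.
KNOWN_BAD_HOSTS = {"malware.example", "phish.example"}


def link_host(url):
    """Return the lowercase hostname of a URL, or '' if it has none."""
    host = urlsplit(url).hostname
    return host.lower() if host else ""


def is_flagged(url, bad_hosts=KNOWN_BAD_HOSTS):
    """True if the URL's host is a listed host or a subdomain of one."""
    host = link_host(url)
    return any(host == bad or host.endswith("." + bad) for bad in bad_hosts)


def annotate(text, url):
    """Render link text followed by the host it really points to."""
    return f"{text} [{link_host(url)}]"
```

So `annotate("my bank", "http://www.i-promise.org/x")` gives `my bank [www.i-promise.org]`, and a link can be held back or given a warning when `is_flagged` returns true.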
Facebook already notifies you that you're leaving Facebook when you click on mail or an instant message inside Facebook
You are creating a product for 140 million users, I would expect you to be doing all you can to protect their security and safety. Right now, it's becoming a hotbed for crime.
Don't get me wrong, it's WAY better than any other social networking site but if someone can overcome these problems, they're going to be more secure than Facebook.
Yes, (Score:4, Insightful)
Being able to find old friends you haven't been able to contact in years.
Having a central spot to pull information from, rather than the push model of spamming every email address you have with pics of the new baby, house, car, toaster.
A central and standardized organization spot for arranging informal gatherings with friends, like parties.
Re:Blaming Linux... (Score:5, Insightful)
This statement is just downright disingenuous and wrong. UDP performance in general on Linux is comparable to or better than that of other operating systems. What he found out is that accessing a single UDP socket on Linux requires a lock, and that when multiple threads contend for that lock you have a performance issue. Welcome to intro-level operating systems.
This has nothing to do with UDP performance, which I define as either throughput or in some cases packets per second. He then goes on to imply that he worked around some issues in Linux, when in actuality he attacked the problem from the wrong angle and through trial and error found the obvious solution. Why would you even think to use the same socket in a connectionless protocol like UDP in the first place?
I do agree that in general the article was written in more or less praise of Linux, but reading that sentence makes my blood boil.
Re:Blaming Linux... (Score:4, Insightful)
Too often the people that are left to explain the problem in detail to the press are not the engineers that worked on the solution for that problem. If we had a discussion with one of them, we would hear a totally different story!
Re:140 million (Score:3, Insightful)
According to a poster further up, the figure is based on the number of users that have logged in in the last 30 days. While that number will still be a bit high it shouldn't be awful.
Re:Blaming Linux... (Score:5, Insightful)
[...] So they blame Linux instead of themselves for such a half-ass implementation of sending out packets from multiple threads with a single socket.[...]
Sorry for my ranting, but it just pisses me off when moron programmers blame the operating system for their own stupidity.
The point is that it wasn't their own stupidity. They took someone's open source project and improved it so it could better handle high loads. I don't see them blaming Linux, I see them recognising the limitations of the system they are using and coming up with a solution and then sharing it. Normally, this is cause to say "Yay! Open source!" rather than calling them "moron programmers".
Re:[Unintelligible] Facebook [Unintelligible] (Score:1, Insightful)
Actually, you're right in that it's not "millions." I meant to make that point and completely forgot in trying to remember the hyphen issue.
I know you might have been going for the comedy thing... but if "talking like a human being" means speaking incorrectly, then I'll pass, thanks. Not that I don't use colloquialisms or always use formal English, but I like trying to avoid grammar, spelling, and pronunciation errors...
Re:Blaming Linux... (Score:1, Insightful)
> Therefore you need a lock around updates to that data structure.
I guess you haven't heard of lockless queues?
Re:Wow (Score:3, Insightful)
What they know about you can fill a warehouse.
What they know about you is only what you tell them.
Re:Yes, (Score:2, Insightful)
You left out 'providing a commercial data mine for companies to exploit'.