Facebook Rewrites PHP Runtime For Speed
VonGuard writes "Facebook has gotten fed up with the speed of PHP. The company has been working on a skunkworks project to rewrite the PHP runtime, and on Tuesday of this week, they will be announcing the availability of their new PHP runtime as an open source project. The rumor around this began last week when the Facebook team invited some of the core PHP contributors to their campus to discuss some new open source project. I've written up everything I know about this story on the SD Times Blog."
Re:is this being used now? (Score:5, Informative)
Try the "Lite" [facebook.com] version. It's much faster, and doesn't have that annoying chat bar.
Misleading Summary (surprise!) (Score:5, Informative)
From TFA: UPDATE: After sifting through the comments here and elsewhere, I'm inclined to agree with the folks who are saying that Facebook will be introducing some sort of compiler for PHP.
Not a fork. Not as newsworthy as implied.
Was revealed 3 weeks ago by insider (Score:5, Informative)
Re:High performance in scripting languages? (Score:1, Informative)
Not true. When you start using memcache (which FB relies on heavily), the latency of data retrieval drops below the time it takes to execute the PHP opcodes.
PHP has a big problem: it executes scripts from scratch on every request. Every request has to load the same configuration data over and over again, and even if that data comes from memcached there is TCP latency involved.
Applications that use regex-based URL mapping to controllers suffer heavily in PHP because on each request the interpreter has to compile the regex and then do the matching.
PHP does all this instead of preparing the app-specific configuration once at startup and then using it as process-local data.
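To illustrate the "prepare once at startup" idea, here is a minimal Python sketch of a regex-based URL router in a long-lived process. The route patterns and controller names are invented for illustration and are not taken from any real application; the point is only that compilation happens once and each request does nothing but matching.

```python
import re

# Hypothetical route table: URL pattern -> controller name (invented for illustration).
ROUTE_PATTERNS = [
    (r"^/profile/(?P<user_id>\d+)$", "profile_controller"),
    (r"^/photos/(?P<album>\w+)$", "photo_controller"),
]

# Done once at process startup: compile every pattern and keep the result
# in process-local memory for the lifetime of the worker.
COMPILED_ROUTES = [(re.compile(pattern), name) for pattern, name in ROUTE_PATTERNS]


def dispatch(path):
    """Per-request work: match the path against the already compiled patterns."""
    for regex, controller in COMPILED_ROUTES:
        match = regex.match(path)
        if match:
            return controller, match.groupdict()
    return None, {}
```

Calling dispatch("/profile/42") walks the precompiled list and returns the controller name together with the captured parameters; nothing is recompiled between calls.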
Re:Misleading Summary (surprise!) (Score:2, Informative)
If that's so, then they are reinventing the wheel, since there is already a PHP compiler available, which is also open source: http://www.roadsend.com/home/index.php [roadsend.com]
Re:Gotten (Score:3, Informative)
http://wiki.answers.com/Q/Is_gotten_correct_grammar [answers.com]
Can't you?
Re:High performance in scripting languages? (Score:2, Informative)
You should check your facts first.
I did not say PHP had to PARSE each script per request -- which would happen without an opcode cache. It has to EXECUTE the script, opcode cached or otherwise, from scratch on each request.
So there is no way for PHP to hold application data in memory between requests, except by using shared memory or a memcache. Other languages like Python, Java, etc... allow you to instantiate your classes and their data upon startup, and then call methods per request, without having to instantiate all the classes over and over again in each request.
Of course, sometimes you have to re-instantiate them, but PHP does not even allow you to pre-instantiate the classes that do not need it.
So on each request, the application has to load its configuration, even if it is stored as a PHP array in some config.inc.php. It has to re-evaluate the arrays (construct the array, build hash keys, etc...) even when opcode cached.
Granted, certain important extensions do keep pools of resource handles between requests, like PDO, memcached, etc...
Also, I did not mention Apache. I said (PHP) applications that map URL patterns to controllers, i.e. a central index.php that dispatches URLs to controllers and their methods. That is different from Apache rewriting, which maps URL patterns to PHP scripts.
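The difference between the two execution models can be sketched in a few lines of Python. Everything here (the Config class, its settings, the handler names) is hypothetical and only models the argument above: one handler rebuilds its configuration on every request, the other builds it once when the process starts.

```python
class Config:
    """Stand-in for application configuration that is costly to construct."""

    builds = 0  # counts how many times the configuration was constructed

    def __init__(self):
        Config.builds += 1
        # Hypothetical settings, invented for illustration.
        self.settings = {"db_host": "localhost", "cache_ttl": 60}


def handle_request_from_scratch(path):
    """Models the per-request execution described for PHP: rebuild everything."""
    config = Config()
    return f"{path} ttl={config.settings['cache_ttl']}"


class LongLivedApp:
    """Models a persistent process: configuration is built once at startup."""

    def __init__(self):
        self.config = Config()

    def handle_request(self, path):
        return f"{path} ttl={self.config.settings['cache_ttl']}"
```

Serving three requests through handle_request_from_scratch constructs Config three times, while a single LongLivedApp instance serving the same three requests constructs it once.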
Re:High performance in scripting languages? (Score:5, Informative)
At some point they had so much traffic from their web servers to their backend systems that they saturated their internal network and were dropping UDP packets.
That's the kind of problem and scale they deal with. I'm surprised PHP wasn't their biggest bottleneck before (they did some work on PHP already, but not something like this).
After all, Facebook is the number two site after the www.google.com search page (which handles 'just' one task), and Google has a pretty much custom-built platform.
Re:High performance in scripting languages? (Score:1, Informative)
That's the kind of problem and scale they deal with. I'm surprised PHP wasn't their biggest bottleneck before (they did some work on PHP already, but not something like this).
It is easy to add new front-end servers, since they all connect to the same pool of backend resources. Backend coordination is the most fragile point in the life of a request. You can't just add backends, as they realized with memcached, where adding more did not help.
However, if they used a faster language, which is the point of TFA, they would need FEWER front-end servers and would save money that way.
The backend is already working as fast as possible in terms of the programming language used: native code on the CPU. I guess the backend could get faster only if they reimplemented MySQL and memcached, taking out the functionality that is not required, or adding critical new functionality. Oh, wait... they did.
So that leaves removing PHP from the chain.
Re:Is compiled PHP even possible? (Score:3, Informative)
For a recursive Fibonacci calculation, implementing the same algorithm in C, Objective-C and Smalltalk took 2.35, 6.60, and 5.69 seconds, respectively (calculating fib(30) 100 times), with about a 50% variation margin on successive runs. The Smalltalk version was not always faster than the ObjC version (it was most of the time; not sure why, probably some weirdness with the Smalltalk code happening to line up with cache boundaries better), so it's safe to consider them roughly the same speed.
It's worth noting that the Smalltalk version, unlike the C and Objective-C versions, will never suffer from integer overflow. Tweaking the benchmark so that it computes fib(47) in the three versions, the timing results are: 50, 130, and 280 seconds, respectively.
The difference is that the Smalltalk version generates the correct answer, while none of the others do. Personally, I'd rather have slow code generating the correct answer, but maybe that's just me. It is, of course, possible to write code in C that would check for overflow (in this case it's relatively easy, you can just test whether the sign bit flipped because you're just adding two positive integers), but returning something that is either an integer or an arbitrary-precision value in C is a bit harder and you'd end up with at least four times as much code to make the C version, and a lot more potential for bugs.
By the way, calculating fib(47) with a sensible algorithm in Smalltalk takes a tiny fraction of a second, highlighting the fact that good algorithms are usually more important than good compilers.
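For readers who want to see both points (algorithm choice and overflow) side by side, here is a small Python sketch. It is not the benchmark code from this comment; it assumes the usual definition with fib(0) = 0 and fib(1) = 1, and it assumes the C versions used a 32-bit signed int, which wrap_int32 imitates.

```python
def fib_recursive(n):
    """Naive doubly recursive definition (exponential time)."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """Linear-time version; Python integers grow as needed and never overflow."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def wrap_int32(value):
    """Reinterpret value as a signed 32-bit integer (two's complement wraparound).

    Assumption: this imitates what a 32-bit signed int typically does on overflow.
    """
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


correct = fib_iterative(47)    # 2971215073, larger than 2**31 - 1
wrapped = wrap_int32(correct)  # -1323752223: the sign bit has flipped
```

fib_iterative(47) returns immediately, while fib_recursive(47) would make billions of calls. The negative wrapped value is the sign-bit flip that an overflow check in C could test for.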
The compiler targets the (GNU) Objective-C ABI, so Smalltalk and Objective-C classes can be used interchangeably (you can, for example, subclass an Objective-C class with Smalltalk and then call the Smalltalk methods from Objective-C). Some of the improvements I've recently made to the Objective-C runtime mean that the compiler can now emit code to do polymorphic inline caching and speculative inlining. It doesn't yet do either, but in benchmarks these reduce the cost of a message send to within a hair of the cost of a C function call. For most uses, Objective-C is already fast enough, so I'll probably only implement these as profile-driven optimisations and enable them for hot code paths where the message sending overhead is actually important.
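For anyone unfamiliar with polymorphic inline caching, the general idea can be modelled in Python. This is only a conceptual sketch and not how the compiler described here works: a real implementation patches or emits machine code at the call site, whereas this hypothetical CallSite class just keeps a small per-site table from receiver class to method.

```python
class CallSite:
    """Models one message-send site with a small polymorphic inline cache."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector
        self.max_entries = max_entries
        self.cache = {}        # receiver class -> method found for that class
        self.slow_lookups = 0  # how often the full lookup had to run

    def send(self, receiver, *args):
        cls = type(receiver)
        method = self.cache.get(cls)
        if method is None:
            # Cache miss: fall back to the full (slow) method lookup.
            self.slow_lookups += 1
            method = getattr(cls, self.selector)
            if len(self.cache) < self.max_entries:
                self.cache[cls] = method
        return method(receiver, *args)


class Dog:
    def speak(self):
        return "woof"


class Cat:
    def speak(self):
        return "meow"


site = CallSite("speak")
sounds = [site.send(animal) for animal in (Dog(), Cat(), Dog(), Cat())]
```

After the four sends above, only two slow lookups have happened (one per receiver class); the repeated sends hit the cache.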
I'm giving a talk about this at FOSDEM in the GNUstep developer room next weekend.
Re:Resin Quercus (Score:3, Informative)
In summary:
It is open source, 100% Java, and it brings all the advantages of using a JVM to PHP: performance (JIT), safety, scalability (clustering/load balancing) and quality tools (development, profilers). One can use most of the Java technologies in PHP to ease development even further, for example XA transactions, JNDI, connection pooling and object caching.
Besides, improving performance of this pure Java PHP implementation ought to be easier than improving the PHP runtime. (From Java 6 onwards, the available tools to debug and optimize Java applications have made significant progress. jmap/jhat, easy heap dumps on OutOfMemory, Object Query Language, etc. already come bundled with the JVM, and then there are the Eclipse and NetBeans GUI profilers.)
Also worth checking out is Dr. Cliff Click's extensive Java vs. C performance blog post: http://blogs.azulsystems.com/cliff/2009/09/java-vs-c-performance-again.html [azulsystems.com]
Re:Screw PHP, I write everything in C (Score:3, Informative)
Re:is this being used now? (Score:3, Informative)
Re:They should spend more on the upload tool (Score:5, Informative)
Scale your photos down to 604x453, which is the size Facebook displays them at, and you will get to control the sharpness and image quality.
Upload at any other size, and Facebook will re-sample them with some very cheap algorithm and apply aggressive compression and they will look like ass.
Try it, you'll be amazed how much better your photos suddenly look.
I normally use "convert -strip -sharpen 0.3 -quality 85 -geometry 604x604" before uploading - it just takes a second, and makes a huge difference.
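To see why a 604x604 geometry lines up with the 604x453 display size mentioned above, here is a small Python sketch of the aspect-ratio arithmetic. It assumes the geometry argument means "fit inside this box while keeping the aspect ratio"; the function name is made up, and ImageMagick's own rounding may differ slightly from Python's round().

```python
def fit_within(width, height, box=604):
    """Scale (width, height) to fit inside a box x box square, keeping aspect ratio."""
    if width >= height:
        # Landscape or square: the width becomes the limiting side.
        return box, max(1, round(height * box / width))
    # Portrait: the height becomes the limiting side.
    return max(1, round(width * box / height)), box


landscape = fit_within(4000, 3000)  # a 4:3 photo
portrait = fit_within(3000, 4000)   # the same photo rotated
```

A 4:3 landscape photo comes out at 604x453, and the same photo in portrait orientation comes out at 453x604, so one box size covers both orientations.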
Re:High performance in scripting languages? (Score:3, Informative)
There's no need to use a system language for everything. Facebook is probably using PHP on its own and that's just not wise for a site like that.