The Environmental Impact of PHP Compared To C++ On Facebook
Kensai7 writes "Recently, Facebook provided us with some information on their server park. They use about 30,000 servers, and not surprisingly, most of them are running PHP code to generate pages full of social info for their users. As they only say that 'the bulk' is running PHP, let's assume this to be 25,000 of the 30,000. If C++ had been used instead of PHP, then 22,500 servers could be powered down (assuming a conservative ratio of 10 for the efficiency of C++ versus PHP code), or a reduction of 49,000 tons of CO2 per year. Of course, it is a bit unfair to isolate Facebook here. Their servers are only a tiny fraction of computers deployed world-wide that are interpreting PHP code."
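The arithmetic behind the summary's claim can be laid out as a short sketch. The server counts, the ratio of 10, and the 49,000-ton figure are the submitter's own assumptions; the per-server CO2 value is only what those numbers imply and is not stated anywhere in the source.

```python
# Figures from the story summary; all of them are the submitter's assumptions.
PHP_SERVERS = 25_000        # guess at what "the bulk" of 30,000 servers means
SPEEDUP = 10                # assumed efficiency ratio of C++ over PHP
CO2_SAVED_TONS = 49_000     # claimed yearly CO2 reduction


def servers_saved(php_servers: int, speedup: int) -> int:
    """Servers that could be powered down if each remaining one did `speedup` times the work."""
    servers_still_needed = php_servers // speedup
    return php_servers - servers_still_needed


saved = servers_saved(PHP_SERVERS, SPEEDUP)

# Derived, not stated in the source: the per-server figure the claim implies.
tons_per_server = CO2_SAVED_TONS / saved

print(saved)                      # 22500
print(round(tons_per_server, 2))  # 2.18
```

The whole result scales linearly with both guesses, so changing either the 25,000 or the ratio of 10 changes the headline number proportionally.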
Assumes PHP Dev Effort = C++ Dev Effort (Score:5, Funny)
What about all the cycles compiling and debugging C++ code? Or all the trees torn down for C++ books? Or the environmental impact of C++ developers? I mean, have you ever had to share a cube with one of them? Pheewww.
Re:Assumes PHP Dev Effort = C++ Dev Effort (Score:5, Insightful)
I know you're being funny but you've got a good point. Developing and maintaining C++ code is not like developing and maintaining PHP script. Which of course is why we have PHP to begin with. It's designed for the web and ease of implementation. Sure C++ would be faster running but not necessarily more efficient in terms of dollars.
Re:Assumes PHP Dev Effort = C++ Dev Effort (Score:4, Funny)
Have no fear, turning devs into disposable resources will ensure a bright future for efficiency being judged only in hardware terms.
Incompetent developers require more servers (Score:4, Interesting)
It's a phenomenon we have also noted.
Sure C++ would be faster running but not necessarily more efficient in terms of dollars.
I think you'll find that the servers come out of the operational budget, not the development one. So the costs of running 10x more servers don't factor into development effort. The costs should of course be charged back to the dev teams.
It isn't just one server (Score:4, Insightful)
Running a server is cheap.
Paying a developer is not.
Civilisation is largely about the multiplication of human effort through the consumption of energy and automation. So, we multiply this developer's effort by a couple of thousand when running one machine and then do the same on another several hundred machines beyond. Each costs several thousand dollars to purchase and several thousand more every year in electricity, in cooling, networking, management and maintenance.
So, the effects of developer incompetence are also multiplied several thousand times often across hundreds or thousands of systems. Millions if we're really lucky.
So it isn't just one server, it's just one extra datacenter. It often pays to hire better people.
running a server for a day - $1
You think you get a real server for that? You get a tiny division of a server for that kind of money.
2) why don't these big server farms start looking at migrating code from PHP to C or C++ when the PHP+web design is solid?
The network effect. They migrate to Java instead.
Speed to delivery is nearly always of primary importance.
Indicating speculative projects and disposable code.
Re: (Score:3, Insightful)
If you take APC or similar compile caches into account, I think you'll find that the gap is remarkably smaller than you'd expect. It'll never close entirely, but given that I've seen 20x speedups on some pages, the benefit is huge.
Plus, C++ is an environment-hostile choice (Score:4, Funny)
Imagine if every website was implemented as an ASIC. Then we could talk about efficient datacenters. Maybe, if you're really strapped for cash, you could implement each website in an FPGA. But that should only be a stopgap measure until you can afford a proper implementation.
Re:c++ is 'write-only' code (Score:5, Insightful)
Sure, when the code is written by someone who really knows how to use C++. Ever read bad PHP code? Bad Java code? I have seen programmers do things like this:
int int1, int2, int3, int4, int6, int7;
No, that is neither a joke nor an exaggeration, and the missing number is deliberate. This is a declaration I saw on a recent project. This kind of poor coding is language agnostic, and it is entirely irrelevant whether someone is using C++, PHP, or even a language like Haskell (bad Haskell code is worse than the worst C++ code I have ever seen -- if you use a functional language, get it right!).
On the other hand, I have seen some maintainable C++ code, with appropriate and useful comments, well thought out classes and class relationships, and expert use of the STL. I once worked on a project with C++ code that dated back to the early 90s, and had been continuously updated to support new features and needs, to make use of the STL (yes, this can be written into old code without causing a disaster), and to support systems that did not even exist when the code was originally written.
Don't blame the language, blame programmers who never learned about good programming practices. Blame computer science programs that give people degrees they do not deserve. Blame an industry that will hire anyone who can write a hello world program and then assume that they are capable of writing a maintainable system with millions of lines of code. The best programming language in the world will not solve the problem of poor programmers and poor coding practices.
Re: (Score:3, Insightful)
Ever read bad PHP code?
My hobby is refactoring PHP code. Note I say hobby, and not job.
After cutting my teeth with C, I moved on to web development with Perl. I was really annoyed at all the quirks in that language, namely, bizarre subroutines instead of functions, and clever regular expressions everywhere. Perl was just a pain, and I still don't like it! So, I decided to give PHP a spin, and I liked it because it was closer to the C code I used to write.
It didn't take long for me to realize there was something seriously wron
Re:c++ is 'write-only' code (Score:4, Insightful)
Maybe you should learn the language first. It seems there are an awful lot of people who love to comment on the complexity and performance of C++, who never bothered to really learn the language. Yet this doesn't stop them from pretending to be experts on it.
Re:c++ is 'write-only' code (Score:5, Insightful)
What C++ has always lacked, and PHP, Java and others do not, is a bundle of standard libraries that let you do things like process XML, talk to databases, and make templating EASY.
That's it. php does the same things C++ does, but goes one step beyond and adds a rich library and, of course, the ability to skip the "compile" step in the write -> compile -> test
I agree with you, but there's one small thing I don't get.
Faced with this piece of information, someone thought the logical thing to do was to, er, write an entirely new language?
Re:c++ is 'write-only' code (Score:5, Interesting)
It's more like you decide you want a whole new room dedicated to watching movies, but in order to add that to your current house you'd have to spend tens of thousands of dollars and get approval from city hall and your homeowner's association. Just for a fairly small addition.
So instead you decide to go build a new house the way you like it, from the ground up, and while you're at it you add ethernet outlets into the planning because you always wanted that in your old house but you would have had to take down the drywall in order to get them where you wanted.
Re: (Score:3, Insightful)
Strings are a perfect example. The C++ standard defines a string type that is decent enough and fixe
Re: (Score:3, Insightful)
What C++ has always lacked, and PHP, Java and others do not, is a bundle of standard libraries that let you do things like process XML, talk to databases, and make templating EASY.
I agree with you, but there's one small thing I don't get.
Faced with this piece of information, someone thought the logical thing to do was to, er, write an entirely new language?
What? Your logic is circular. PHP did not have standard libraries for XML (etc.) until after it existed, obviously.
PHP was invented as a lightweight server-side preprocessor as an alternative to CGI, not as a general-purpose systems-engineering low-level compiled language.
(I don't disagree with your gist that PHP is not well suited to many of the jobs it's used for today, but I wanted to clarify the history.)
-b
Re: (Score:3, Funny)
"Faced with this piece of information, someone thought the logical thing to do was to, er, write an entirely new language?"
by my understanding, the whole new language slant is because of the nightmare of c++ code out there to reuse, with unintended consequences. php is very web centric and java the last attempt at a 'universal' coding setup. python is an example of new language and how more complicated new language implementation is.
Are you suggesting that they wrote PHP to avoid code reuse, that there hasn't been an attempt at a cross-platform language since Java, and that Python is complicated, all in the same paragraph?
Ridiculous (Score:3, Insightful)
Re:Ridiculous (Score:5, Funny)
What about the impact of whole classes of C++ bugs that don't exist in C++
I've spent many a sleepless night worrying about C++ bugs that don't exist in C++. I'm glad I'm not alone.
Re: (Score:3, Insightful)
Developers that are diligent enough to make only 1 memory-related bug/year can certainly spell variable names correctly.
If you have a statically typed language, you rely on types. If you have a dynamic one, you rely on unit tests. Both are probably equally slow :)
Languages not for everyone (Score:4, Insightful)
Re:Languages not for everyone (Score:4, Funny)
A PHP programmer who turns out good PHP code
The Easter Bunny, Santa Claus, a PHP programmer who turns out good PHP code, and Steve Ballmer are in the four corners of a room. In the center of the room is a chair. Who throws the chair first?
Steve Ballmer, because the other three don't fucking exist!
Re:Languages not for everyone (Score:5, Insightful)
> a PHP programmer who turns out good PHP code
Yeah, that'd be me. Hi! We do exist, and there are plenty of us.
Granted, we tend to be outnumbered 100:1 by the PHP programmers who produce complete crap. The same is probably true of nearly any language.
Re:Languages not for everyone (Score:5, Funny)
A self-proclaimed good PHP programmer... yeah, there are about 100 of those to every 1 that doesn't do that.
Re: (Score:3, Funny)
A PHP programmer who turns out good PHP code
Ontological argument: A good PHP programmer is better than a PHP programmer that doesn't exist. Therefore a good PHP programmer must exist.
Re: (Score:3, Interesting)
Unfortunately, the C++ programmer who writes bad C++ code is more common than the C++ programmer who writes good C++, and the bad C++ is probably harder to rework than bad php.
I once rewrote a bit of software that some MIT grads did. Theirs was 20K lines of C++, used 110 MB ram (constantly newing and deleting), used dozens of threads (constantly spawning and harvesting), and drove the system to its knees (90% system, 10% user load). My 2K (yes, one-tenth) lines of straight C used 5 threads (preallocated)
10:1... Really? (Score:5, Insightful)
Re: (Score:3, Informative)
PHP's primary issue in the database department is that it doesn't have a clean way of, say, maintaining prepared statement declarations across connection instances. Which is frustrating. APC's handling of shared memory is not the best, either, and the memcached extensions for it need polish. Don't get me started on how PHP treats constants.
Where PHP really fails, however, is in memory usage. It takes up dozens of times as much RAM as a well-built C program would. Facebook would not reduce their computer count by
Re: (Score:3, Informative)
In terms of total page delivery latency for a typical I/O bound application, sure. In terms of actual cpu usage, 10x overhead for any dynamically typed language is to be expected. If the application servers are CPU bound, that means a lot more servers.
In addition, dynamic languages do not compile or JIT well, compared to statically typed languages, which severely limits the overhead reduction achievable.
First post from TFA nails it (Score:2, Informative)
The REAL solution (Score:5, Funny)
Just serve up plain text files. Anything else is pure decadence!
where did he get this factor? (Score:5, Insightful)
Re:where did he get this factor? (Score:4, Insightful)
I could copy and paste each paragraph of your post followed by some comment to the effect I have no idea how the paragraph is relevant, but I will spare the readers and just pick a couple general points of confusion.
You seem to confuse static (plain html, something that does not enter into the conversation _at_ _all_) with server generated (precisely the sort of thing PHP is used for). E.g.,
in fact, for static content, it can be highly efficient. . . . the majority were serving static content that was pregenerated and refreshed every so often
Again, no one is talking about static content, including GP who was talking about forms, and server generated pages.
You also seem to be confused about the general load on the server (factoring in things like total MB served or something) as opposed to precisely the CPU load (again this is what we are talking about: no one cares about the fact that more complex web sites need CSS and JS files served up). E.g.,
The AJAX code itself must be sent, at the very least, as well as the various UI elements and CSS that are necessary with AJAX -- all of which is still being served like a static page.
Re: Server side overhead with AJAX applications (Score:3, Insightful)
Ergo, it is going to reduce the processing necessary on the server to do any given job
Any given job, yes. But if there are a lot more "jobs" (i.e. more requests that require server side processing), the efficiency of the language used on the server side tends to become more critical, not less, especially if the per request overhead is significant, something that happens to be one of Facebook's primary complaints about PHP.
No. (Score:5, Insightful)
Simply put: no.
The reason why they have so many servers is because Facebook contains so much data. The servers are there for a reason, and the reason is CACHING.
The overhead of PHP is very small for a platform that is all about sharing data and the bulk of processor time surely goes towards fetching that data in the first place. What, do you seriously think that when you hit your home page on Facebook, there are database queries issued for that? Lulz.
Besides, I'm almost sure that FB uses something like Zend Accelerator, which increases code execution speed a lot.
Anyway, just no.
Re: (Score:3, Informative)
Very true: they are a big contributor to projects like memcached.
Please (Score:2, Informative)
I don't care about your environmentalism.
Why stop there? (Score:5, Insightful)
Re: (Score:3, Insightful)
Seriously, if you take into account templates and inlining, there is a good chance that moderately good C++ code will run faster than moderately good assembly, on x86-64 of course - simply because an assembly coder would not have the patience to take all opportunities for inlining.
You might think so, but it's not as simple as all that. You also have to take into account the CPU's caching behavior; large numbers of inlines can (i.e., I've seen it happen) make the size of the working set too large to fit in L1 (or even L2) cache. That in turn means that you're taking a substantial performance hit. What's better, the size of those caches is dependent on exactly what sort of processor you're working with, so compilers don't take them into account.
Inlining is a trade-off, as it increases
Just think how much greener they could be... (Score:2, Funny)
...were they to rewrite it all in assembly language!
Interpreted Languages... (Score:3, Insightful)
For something that is deployed to tens of thousands of machines..
Is there some reason why these languages couldn't be compiled and optimized? Code is just the programmer's will expressed as text that the machine can somehow interpret, right? If there is so much PHP out there, why wouldn't/couldn't there be an efficient compiler (by which I mean something that produces executables and not just "executables that are really just an interpreter tacked onto a script")?
The dearth of such compilers on the market suggests to me that the gains wouldn't be as great as claimed for the majority of applications where interpreted languages are used.
Re:Interpreted Languages... (Score:4, Informative)
For example, consider the following. Say bad things about PHP all you want (it deserves it) but one of the things you don't generally see with PHP code is a buffer overflow, where you try to copy a bunch of strings and concatenate them together and you run out of room and don't notice it and you go clobbering memory. That's because the string manipulation code goes through a bunch of checks when you're appending strings. You can't just skip these checks and hope that everything will work the same. You may know that such and such a code-path isn't going to need all the bounds checking because you're, say, idunno, assembling fixed-length ZIP+4 codes or something, but the scripting language can't be informed of that fact using any extant mechanism (nor is it clear how you could integrate such a mechanism with the powerful abstraction that lets you not worry about the rest of your strings to begin with).
Moreover, as has already been pointed out, a lot of the computational price of rendering a web page is database queries and memory-cached-object queries which employ compiled code already. The string-manipulation overhead isn't all that significant compared to the abstraction that it buys you. It's probably a better idea to track down logic issues, where your code does stupid useless computations that it doesn't need that make it slow, or could do certain computations in advance to make it faster, or such.
I think there's a lot more potential for interesting machine optimization of code for things coming from the functional paradigm, where you can mathematically show the equivalence of certain portions of code with its optimized replacement, and that this paradigm will be making a resurgence in some places during the upcoming era of 128-core processors. This might be interesting.
Re:Interpreted Languages... (Score:5, Informative)
Actually, Facebook uses APC [facebook.com] to compile and optimize the code in the shared memory so it doesn't have to be compiled over and over again.
There are other libraries for caching PHP functions on many different levels as well, and they're open source, for the most part. Some real bright minds from Facebook and other large PHP applications have contributed to them.
Bottom line: PHP is quite powerful and efficient when built and extended properly.
Umm... no. (Score:4, Insightful)
Does the author seriously believe that Facebook isn't running some sort of PHP compiling/caching service, like APC or something similar?
It would be ridiculous for them NOT to be running something like that, which eliminates much of the advantage C++ would enjoy through being pre-compiled. While there still may be a reduction if Facebook were magically changed to precompiled C++ code, the reduction would be fairly minimal. In addition to that, you'd need to factor in the debugging and coding/compiling times, which would exceed the PHP times by an order of magnitude at least.
Re:Umm... no. (Score:5, Interesting)
The author is pulling numbers out of his ass and has no clue about what uses most time (waiting for database results mostly), about PHP accelerators and about caching systems like memcached.
He's comparing performance of php script running on a raw PHP installation versus running a C++ version of the same script, doing calculations that almost never apply to real world scenarios.
I don't see how any company would use C++ to develop their whole systems except maybe for some CGI scripts. Not even Google does it, afaik they use Perl and Python a lot.
Anyway, the number of servers has no direct correlation to the programming language. Out of those thousands of systems, lots of them are read only database servers in a cluster, lots are only serving static files (thumbnails, images used in CSS files on people's pages and so on), some servers are used solely for memcached instances and content used very rarely, some are load balancers....
Basically, the author has no clue.
I always found Livejournal's presentation about scaling very insightful, especially as it's a pretty big site, just like Facebook and other big time sites. The second link gives a lot of details about how they fine tune mysql and other parts of the system, which just goes to show how the apparent speed improvement of C++ versus PHP can overall be actually insignificant.
http://video.google.com/videoplay?docid=-8953828243232338732&ei=3VUuS5-hLaKi2ALXqanJBQ&q=livejournal# [google.com]
http://www.danga.com/words/2004_mysqlcon/mysql-slides.pdf [danga.com]
Re: (Score:3, Informative)
the author...has no clue about what uses most time (waiting for database results mostly)
Like many here, you are confusing page delivery latency with total processor overhead. If you need more than one processor for page processing, how many you need has little or nothing to do with how much latency there is elsewhere in the system.
Even if PHP is running 10 times slower... (Score:2)
I'm assuming the claim about 10 times is true, which I don't really think is the case...
But they could have done something - like precompile the PHP, just like Java's JIT, to make it faster or on par with a compiled C program.
There are PHP accelerators like Zend Accelerator for that.
A trolling weak argument (Score:5, Insightful)
What a troll. Any point or argument based on assumptions is very weak. Here there are two: "..Let's assume this to be ..." and "...assuming a conservative ratio of 10...".
Don't make stuff up.
-Foredecker
Assuming... (Score:5, Insightful)
"assuming a conservative ratio of 10 for the efficiency of C++ versus PHP code"
ARRRRRGGGGHHHHHHHHHHHHH
Why? On what evidence? I mean, I hate PHP as much as the next guy, but last time I wrote a web application platform in C++, I got to the end, analysed the result and went "Great, I've made the fast bit even faster. Now, about that database engine..."
Re: (Score:3, Informative)
Latency is a different question than efficiency. If your page generation efficiency is bad, on a small setup the difference may be imperceptible. On a large installation, i.e. one with a large number of servers dedicated to page generation, the efficiency of those servers makes a big difference. Holding latency constant, in a large installation less efficient page generation means more servers. In a small installation, not so much.
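A rough sizing sketch makes the distinction concrete. Every number here (request rate, CPU time per page, database wait, cores per server) is hypothetical and chosen only for illustration; the point is that database wait shows up in latency while CPU time per page is what drives the server count.

```python
import math


def servers_needed(requests_per_sec: float, cpu_ms_per_request: float,
                   cores_per_server: int) -> int:
    """CPU-bound sizing: time spent waiting on the database occupies no core."""
    cpu_seconds_per_sec = requests_per_sec * cpu_ms_per_request / 1000.0
    return math.ceil(cpu_seconds_per_sec / cores_per_server)


def latency_ms(cpu_ms_per_request: float, db_wait_ms: float) -> float:
    """What one user sees: page generation time plus time blocked on the database."""
    return cpu_ms_per_request + db_wait_ms


# Hypothetical workload, not taken from any real installation.
RPS = 100_000
DB_WAIT_MS = 15
CORES = 8

slow_servers = servers_needed(RPS, 10, CORES)  # 10 ms of CPU per page
fast_servers = servers_needed(RPS, 1, CORES)   # 1 ms of CPU per page

print(latency_ms(10, DB_WAIT_MS), slow_servers)  # 25 125
print(latency_ms(1, DB_WAIT_MS), fast_servers)   # 16 13
```

In this made-up case the user-visible latency improves by about a third, while the page-generation tier shrinks by roughly a factor of ten. With a request rate low enough to fit on one box, both variants need exactly one server, which is the small-installation case.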
Hell's about to freeze over ... (Score:5, Interesting)
.. because I didn't ever think I'd be defending PHP.
However, it is a much better choice for a web application than C or C++ - and I say that as someone who codes C, C++ and Java for a living. There are no decent web frameworks for C++, memory management is still an issue despite the STL, and the complexity of the language means both staff costs and development time are inflated. Peer review is harder, as the language is fundamentally more difficult to master than PHP. Compared to Java, the development tools are poorer, and things like unit testing are more complicated despite the availability of things like CppUnit. There are no "standard" libraries for things like database access, and no literature that I am aware of that describes how you would go about designing a framework for C++. You'd most likely end up porting something like Spring to C++, and even if you published your code on the web, I doubt much of a community would build up around it.
If you want a less contentious argument, and one which can be backed up with hard evidence, then argue that PHP should be replaced with Java. A well written Java web application, using a lightweight framework such as Spring or PicoContainer, should outperform ad-hoc C++ code.
He's not wrong.... But... (Score:3, Interesting)
Seriously, years ago I started working on a c++ version of j2ee (not just servlets, the whole kit) and i mean providing similar functions not identical methods of execution obviously. It wasn't terribly hard actually. But it all falls apart really quickly cause of several reasons:
1) platform architecture - the dependence here, even between different versions of the same distribution was a pain and essentially spelt the end of my work. So I was stuck with "do i make web apps c++ source, or shared library binaries?" to which there is only one real answer for portability - source.
2) it's a systems language - dear god that makes it painful for so many reasons.
There are caveats to both of those, but the reality is that php exists because it fulfils a need and it does it quite well. To compare the two (c++ and php) is a little ridiculous and ultimately this article just reeks of "please everyone advertise my c++ web tool kit for me!". Sure, facebook (and trillions of others) MIGHT move to a c++ web tool kit, but find me a dev that knows how to code an app in it, now find me 2, now find me 200 cause that's how many i'd need to write and maintain facebook apps in c++.
Even accepting as accurate the OP's assumptions that c++ is 10 times more efficient at what php does, that you could actually code facebook in it, and that php vs c++ is a one-to-one relationship for things like code maintenance, you're still stuck with "how many APIs am i going to have to re-write and how many php APIs do i use that don't even exist in c++". It's ludicrous to assume that you could drop-in replace php with witty without ending up coding tonnes of c++ code just to do things that PHP already provided. Not to mention the zillions of little extensions that revolve around php to accelerate its web-abilities (memcached for example). The number of things that can be used alongside php for web-related things and the number of APIs built into php just mean witty is never even going to be viable as an alternative. Let's also not forget there are millions of people round the globe using php for web stuff - which ultimately leads to php being a good web language (i.e. security problems being found, optimizations, etc etc).
Of course, wouldn't facebook be using something like zend to compile php pages? I mean seriously, if the 25000 servers are running php and not running zend the waste here just in cost of servers would be unbelievable - sheer idiocy on facebook's part (if it were true, and i'd very much doubt it) and I imagine zend would have almost given it away for free just so facebook could say "we got a x% improvement using the zend compiler".
So, I wonder how many people are now learning about witty for the first time (which seems like the only real reason for the article to begin with). Better advertising than adwords!
Author needs a clue about metrics (Score:5, Informative)
Yes, PHP is a heck of a lot slower on processor-bound tasks than C++. In a pure benchmarking contest, no doubt C++ will win.
But what about when both languages have to query a database (be it mysql/postgres/oracle, etc)? In this case, both are blocked on the speed of the database. A 15 ms query takes 15 ms no matter what language is asking. Facebook is not calculating pi to 10 gazillion digits, and it is not checking factors for the Great Internet Mersenne Prime Search. It is serving up pages containing tons of customized data. This is not processor-bound... it is I/O bound both on the ins and outs of the database and the ins and outs of the http request. It is also processor bound on the page render, but the goal of this many machines is to cache to the point where page renders are eliminated.
Once a page is rendered, it can be cached until the data inside of it changes. For something like facebook, I bet a page is rendered once for every ~10 times it is viewed by someone. Caching is done in ram, and large ram caches take a lot of machines.
So let's look at those 30,000 machines not by their language, but by their role. We can argue the percentages to death, but let's assume 1/3rd are database, 1/3rd are cache, and 1/3rd are actually running a web server, assembling pages, or otherwise dealing with the end users directly (BTW, I think 1/3rd is way high for that.)
So 1/3rd of the machines are dealing with page composition and serving pages. If they serve a page ~10 times for every render request, then about 1/10th of the page requests actually cause a render... the rest are being served from cache. Those page renders are I/O bound, as in the example above - waiting on the database (and other caches, like memcached), so even if they are taking a lot of wait cycles, they are not using processor power on the box. The actual page composition (which might be 20% of the processing that box is doing) would be a lot faster in C++... So of those 10,000 servers, the virtual equivalent of 2,000 are generating pages using php, and could be replaced by 200 boxes running code written in C++.
So the choice of using php is adding ~1800 machines to the architecture, or ~6% of the total 30,000. Given that a php developer is probably 10x more productive than a developer in C++, is the time to market with new features worth that to them? I bet it is.
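The estimate above can be followed step by step in a short sketch. The one-third split, the 20% page-composition share, and the 10x speedup are all the poster's guesses, not measured figures; the code only reproduces the arithmetic.

```python
TOTAL_SERVERS = 30_000

# The poster's guesses, restated as named constants.
WEB_TIER_DIVISOR = 3        # a third of the machines face end users
COMPOSITION_PERCENT = 20    # share of a web box's work that is page composition
SPEEDUP = 10                # assumed C++ advantage on that composition work

web_tier = TOTAL_SERVERS // WEB_TIER_DIVISOR                  # 10,000 boxes
php_equivalent = web_tier * COMPOSITION_PERCENT // 100        # 2,000 boxes' worth of PHP
cpp_equivalent = php_equivalent // SPEEDUP                    # 200 boxes' worth in C++
extra_machines = php_equivalent - cpp_equivalent              # 1,800
extra_percent = 100 * extra_machines / TOTAL_SERVERS          # 6.0

print(web_tier, php_equivalent, cpp_equivalent)  # 10000 2000 200
print(extra_machines, extra_percent)             # 1800 6.0
```

Compared with the story summary's 22,500 servers, the difference comes almost entirely from applying the speedup only to the page-composition share instead of to every PHP box.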
Re: (Score:3, Insightful)
You cache pages on your server so that instead of going to the database to fetch info, the info is already there.
Until you have a good reason to believe the info has changed. Say, the user updated something or someone posted a message. Then you go back and get new data and cache it again.
You also cache page components. Parts of the page that are on a different update schedule than other parts of the page may be cached separately or not at all (like ads).
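A minimal sketch of the render-once, invalidate-on-change pattern described above. The `PageCache` class, the `render_profile` function, and the in-memory `PROFILES` dictionary are hypothetical stand-ins for illustration; a real site would keep the cache in something like memcached and the data in a database.

```python
class PageCache:
    """Caches rendered pages by key and re-renders only after invalidation."""

    def __init__(self, render):
        self._render = render   # function that builds a page from fresh data
        self._pages = {}        # key -> rendered page
        self.renders = 0        # how many times the expensive path actually ran

    def get(self, key):
        if key not in self._pages:
            self._pages[key] = self._render(key)
            self.renders += 1
        return self._pages[key]

    def invalidate(self, key):
        # Called when there is good reason to believe the data has changed.
        self._pages.pop(key, None)


# Hypothetical data store standing in for the database.
PROFILES = {"alice": "likes C++"}


def render_profile(user):
    return f"<h1>{user}</h1><p>{PROFILES[user]}</p>"


cache = PageCache(render_profile)

for _ in range(10):              # ten views, one render
    page = cache.get("alice")
print(cache.renders)             # 1

PROFILES["alice"] = "likes PHP"  # the user updated something
cache.invalidate("alice")
print(cache.get("alice"))        # <h1>alice</h1><p>likes PHP</p>
print(cache.renders)             # 2
```

Caching page components separately would amount to using finer-grained keys (one per component) and skipping the cache entirely for parts such as ads.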
This is stupid (Score:3, Insightful)
Companies use PHP to develop and run web app functionality because it saves them huge amounts of time and money over rolling out the same thing if you were to write it all in C++. Realize what the cost structure of a company like Facebook is - the amount they pay their engineers, marketing personnel, and so on is significantly more than their amortized server expenses and server operating expenses (including energy costs, etc.).
Furthermore, the 10x speedup assumption seems ridiculous - how much time is spent on their server in compute-intensive PHP loops where huge gains would be made from switching to C++? And how much of the "code" is really database queries of various sorts? Furthermore, you can generally isolate small areas like that in your codebase and rewrite them as modules in C or C++ to be invoked from PHP land - and if they could easily cut their server expenses even in half (let alone by 90%) by having a few engineers spend a few weeks rewriting some components, don't you imagine they've probably set about doing that already?
Re-casting a discussion in terms of greenhouse gas emissions or energy use doesn't change any of this - saving energy generally means saving money, unless it takes more expensive resources (such as 100s of humans, who have to spend hundreds of months re-writing code in C++, while they, their families, and dependents emit tons upon tons of greenhouse gases, use electricity, buy groceries, and so-on and so-forth). The cheapest solution certainly isn't always the most environmentally friendly solution (such as when negative externalities are involved - lower labor and pollution standards in China, for example, that make a less "green" product manufactured there less costly in the US), but a vastly more expensive solution that no company in its right mind would implement isn't necessarily greener just because it might save some electricity and a few servers once it was implemented.
Time for Congress to legislate language efficiency (Score:4, Funny)
This is brilliant! I think it's clear now the direction we must go. Overuse of energy-guzzling languages like PHP has put us on an unsustainable trajectory, fueling out-of-control global warming.
Congress must act to regulate the use of these energy-guzzling languages. No longer will programmers and corporations be permitted to turn out inefficient code with impunity.
PHP, Perl, Ruby, Bash, your days are numbered!
Just wait until we can get the UN involved. Python, you and your CO2-spewing simplicity are next!
Wasted Energy (Score:3, Insightful)
Isn't this "study" a waste of energy?
I am a C/C++ programmer by trade; I'm not fond of PHP. Yet this "C++ saves energy over PHP" argument smells like more selfish politics to me. And selfish politics is what is bringing doom down on humanity's head -- the use of PHP vs. C++ is a sideline, a distraction, and only truly valuable for people who have a philosophical axe to grind.
You want to save a lot of energy? Shut down all the computers running MMOs. And stop wasting cycles looking for alien signals in cosmic radio waves. And get rid of banal YouTube videos... and... the list is endless. The science behind Global Warming is being used to further political and social agendas that have little or nothing to do with adapting our species to a potential environmental change.
In the end, selfish politics will kill us all. We will become a footnote in history if we do not discover enlightened self-interest.
Green Languages?? (Score:3, Insightful)
Ok, this has gone WAY too far .. we all need to just take a step back..
PHP vs. C++ (Score:4, Insightful)
This is idiotic, and is typical of the kind of pseudo-science underlying much of the climate alarmism currently en vogue. Like a lot of things, it is pretty much impossible to quantify which language ultimately uses more power, because of all the variables. As others have pointed out, you might save some power in the deployment of the code, but you would surely use more power in the development of that code. Then, you have to figure out what the total impact of that is, since you'd have more man-hours of coding, using human coders, who sit at desks, in offices, which must be heated and cooled, etc., etc.
This logic is crap (Score:4, Interesting)
It would take a really serious amount of in-depth analysis of the server application to even approach knowing what the efficiency impact of using a compiled language vs an interpreter would be on any specific stack. Or even stacks in general. Plus we don't even know what it really means to be "using PHP". What is PHP doing? Is it processing templates, doing just some post or pre processing with some kind of XML pipeline in the middle, how is the PHP deployed, etc?
It is simply ridiculous to make any assertions and claim accuracy for them. I'm no PHP fan boy by a LONG shot, but I know from hard experience that often a higher level tool which is optimized for a particular job can get the job done quite a lot MORE efficiently than a lower level one that isn't.
Coding C++, savings would be due to the 2015 launch (Score:4, Insightful)
Re:php is bad for the environment (Score:5, Insightful)
Seriously, is somebody taking seriously the 1 to 10 ratio of the story?
I mean, maybe raw execution of pure code is 10 times slower in PHP than in C++ (ouch, I didn't know that), but even then, that is far from implying the same ratio in the number of servers. You have to take into account all the other parameters (disk access, network, I/O, etc.). One would guess those aren't 10 times slower in PHP.
I would be astonished if this ratio were close to the truth. Does anyone have any insight/information on this?
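One way to put rough numbers on that objection is Amdahl's law: if only part of each request is spent executing PHP, a faster language shrinks only that part. The sketch below is a back-of-the-envelope model, not a measurement. The 25,000 figure is the story's own guess, the fractions are invented for illustration, and the function names are hypothetical. It also assumes server count scales directly with time per request.

```python
def overall_speedup(code_fraction, code_speedup):
    """Amdahl's law: only `code_fraction` of each request gets faster."""
    return 1.0 / ((1.0 - code_fraction) + code_fraction / code_speedup)


def servers_after_rewrite(servers, code_fraction, code_speedup):
    """Servers needed for the same load once the code part is sped up.

    Assumes (hypothetically) that server count scales with time per request.
    """
    return round(servers / overall_speedup(code_fraction, code_speedup))


if __name__ == "__main__":
    # 25,000 is the story's guess; the fractions below are made up.
    for fraction in (1.0, 0.5, 0.2):
        print(fraction, servers_after_rewrite(25000, fraction, 10.0))
```

Under these assumptions, the story's 22,500 saved servers only appear when every bit of request time is PHP execution. If PHP execution were 20% of the request, the same 10x language speedup would save about 4,500.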
Re:php is bad for the environment (Score:5, Informative)
Re:php is bad for the environment (Score:4, Interesting)
Some optimized assembler would make a difference (ducks).
But network latencies, number of sustainable TCPs per session, db latency, weird table lookups (even arp drags a server down when you have 20K+ connects) are all at issue. Add in various dirty caches, file locks/unlocks and other OS machinations, and life can be tough for any app written in anything.
Then there are the backup servers, the availability servers, the DNS servers, the coffee servers, it just gets bogged down. A 10:1 efficiency claim is probably just language fanboy-ing..... or a consulting job looking for a spot marked X.
Certainly it's nice to be green... but using better optimization tricks (like GCD) for multi-cores is bound to help.... tickless kernels..... SSDs..... C++ wouldn't be my first pick.
Re:php is bad for the environment (Score:5, Funny)
"even arp drags a server down when you have 20K+ connects"
Are you perhaps a server admin in my company? I swear this is the best excuse for poor performance I've ever heard.
Re:php is bad for the environment (Score:5, Insightful)
It probably is a valid excuse if you have 20,000 client machines connecting locally via ethernet from a class B subnet such that the arp tables on the server keep overflowing.
Of course, if you, as a system administrator, ever let such an environment be set up, you are probably really good at excuses anyway.
Re: (Score:3, Interesting)
Wrong - the language makes a huge difference. Try using the C API and CLIENT_MULTI_RESULTS and CLIENT_MULTI_STATEMENTS and concatenating 10,000 queries into one request, then using mysql_next_result() to get the next result set (no, not the next row, the next result set - 0 or more rows).
One connection. Not 10,000. A BIG difference in execution time. Testing showed that the optimum
So the bindings make a difference? (Score:3, Insightful)
Why is it that a decent PHP (or Python, or Ruby) MySQL binding couldn't do the exact same thing?
Re: (Score:3)
Wrong - the language makes a huge difference. Try using the C API and CLIENT_MULTI_RESULTS and CLIENT_MULTI_STATEMENTS and concatenating 10,000 queries into one request, then using mysql_next_result() to get the next result set (no, not the next row, the next result set - 0 or more rows).
One connection. Not 10,000. A BIG difference in execution time.
Are you trying to imply that PHP establishes an entirely new connection to the database for every query? If so, you basically lose all credibility you might otherwise have.
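The disagreement here comes down to which cost is being counted: connection setup, network round trips, or query execution. A toy cost model keeps the three apart. Every number in it is a hypothetical placeholder rather than a MySQL measurement, and `total_ms` is an illustrative name, not part of any client library.

```python
def total_ms(queries, connects, round_trips,
             connect_ms=1.0, rtt_ms=0.2, exec_ms=0.05):
    """Toy model: time = connection setups + round trips + query execution.

    All per-unit costs are hypothetical defaults chosen for illustration.
    """
    return connects * connect_ms + round_trips * rtt_ms + queries * exec_ms


if __name__ == "__main__":
    n = 10000
    print("new connection per query :", total_ms(n, connects=n, round_trips=n))
    print("one connection, n trips  :", total_ms(n, connects=1, round_trips=n))
    print("one connection, one batch:", total_ms(n, connects=1, round_trips=1))
```

In this model the batching gain comes from fewer round trips, which depends on what the client binding exposes rather than on the language calling it.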
Figures off by a factor of 10 to 100 (Score:3, Informative)
My own experience doing server development in C was that it's a minimum of 30:1 (and in some cases, much greater). Plus the speed differential is huge, and also in favour of C.
There's a big difference between a couple of hundred requests a second and 6,000 - 10,000.
Then again, the PHP code had to be served through Apache, while the C code was served directly by a custom server sitting on a separate socket, so there's no telling how much of the overhead was from Apache.
Even the absolute worst-case
Re: (Score:3, Insightful)
What kind of work were those 10K req/sec on your own custom server doing? Was it a standard db-backed web app, or something more specialized and computationally intensive?
Not that I doubt the difference you saw - but I'm still skeptical that a 10:1 factor would come *just from the programming language* on Facebook's servers, which seem to follow a relatively standard webapp cycle (request -> datastore lookup -> HTML).
Admittedly, I don't do PHP, so the language could be as bad and impossible to scale as you claim.
Re:Figures off by a factor of 10 to 100 (Score:4, Informative)
Those were actual benchmarks run at peak load for 5-minute periods: a sustained rate of over 600,000 queries in 5 minutes, or 2,000 per second (around 2,200 iirc), on absolutely craptastic hardware, against an 8 gig MySQL table. The benchmark ran ab (ApacheBench) against a custom forking server instead of Apache, tested with between 100 and 400 simultaneous requests. Threads were never "reaped", always reused, so it was important that there were no memory leaks, but never having to spawn another thread after initial startup also contributed to the difference.
Contrast to php, where every script has to be loaded, interpreted, then flushed out of the system so it leaves a clean memory footprint for the next script, and where tons of variables that your script may never call have to be initialized each run. Obviously only compiling what you need and loading it once is more efficient :-)
Re: (Score:3, Interesting)
Contrast to php, where every script has to be loaded, interpreted, then flushed out of the system so it leaves a clean memory footprint for the next script, and where tons of variables that your script may never call have to be initialized each run. Obviously only compiling what you need and loading it once is more efficient :-)
You're forgetting all the PHP optimizers, the caching of scripts and chunks of code as bytecode, and the RAM caching of scripts. These make a major difference, but are probably only used on larger websites (like Facebook).
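The idea behind bytecode caching can be shown in any interpreted language. The sketch below uses Python rather than PHP and does not mirror how any particular PHP opcode cache works. `SCRIPT`, `stats`, and both function names are hypothetical and exist only to count how often the source gets compiled.

```python
SCRIPT = "result = sum(i * i for i in range(n))"

_code_cache = {}          # source text -> compiled code object
stats = {"compiles": 0}   # how many times we paid the compile cost


def run_uncached(source, n):
    """Parse and compile the source on every request."""
    code = compile(source, "<script>", "exec")
    stats["compiles"] += 1
    scope = {"n": n}
    exec(code, scope)
    return scope["result"]


def run_cached(source, n):
    """Compile once, then reuse the cached bytecode on later requests."""
    code = _code_cache.get(source)
    if code is None:
        code = compile(source, "<script>", "exec")
        stats["compiles"] += 1
        _code_cache[source] = code
    scope = {"n": n}   # fresh variables per request, shared bytecode
    exec(code, scope)
    return scope["result"]
```

Each request still gets a clean set of variables, but the load-and-compile step is paid once per script instead of once per request.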
Re: (Score:3, Insightful)
Yes, it is harsh, but anyone who has not programmed in c and assembler, and then spouts off nonsense about how php can't possibly be 10x slower, doesn't have the programmer mind-set.
That mindset includes understanding the runtime environment - which means knowing the limitations of your tool - in this case php. That means you'll not "have" to do something in php because "when all you know is php, everything looks like it needs a script" rather than a different tool.
Case in point - generating test data
Re: (Score:3, Insightful)
While you're churning away at your super-optimized C code, which runs faster than god knows what, and finally debugging the library that handles your super-customized TCP/IP replacement, I'll have already rolled out the application you wanted, but in some "non-programming/scripting" language like PHP, Ruby, Python, or hell... even Java.
There's a purpose for every language out there and frankly, writing some form of code to have a computer perform specific tasks is called programming. So please contain your e
Re: (Score:3, Insightful)
Which would be very relevant if Facebook was doing heavy number-crunching. The only numbers on the site are comment and friend counts, which isn't especially taxing work (especially since it's all de-normalized). The majority of FB is database activity and transforming that into HTML and JSON. If you want to place blame for inefficiency, MySQL would probably be your best bet.
Re: (Score:3, Funny)
Seriously, is somebody taking seriously the 1 to 10 ratio of the story?
Only 1 to 10?!? I would have thought 1 to 100.
Re:php is bad for the environment (Score:4, Informative)
From my personal experience: Data-heavy applications run at a complete crawl in PHP. 10 times slower, is, in my opinion, a vast understatement.
Then again, that’s not the point of PHP. The point is that in PHP, provided you already know how to program, you also get things done more than 10 times faster than in C++. Because there is a simple function with defaults and automatisms for literally everything.
Only if those defaults and automatisms differ from what you expect will you get into big trouble. And because the PHP interpreter is truly a horrible piece of shit (I was able to run totally illegal constructs, with plain text right in the middle of the code, and it ran, doing nothing of what I expected it to do), that happens quite a lot.
It’s one reason that drove me to the extreme strictness of Haskell, where you have to get it right upfront, so it doesn’t bite you in the ass later.
Re:php is bad for the environment (Score:5, Funny)
Re:php is bad for the environment (Score:5, Insightful)
"development" also has one.
Not to mention clients. 20K servers is nothing compared to the millions of clients drawing higher power due to running looping flash commercials.
Re: (Score:3, Interesting)
You mean kind of like Roadsend?
http://www.roadsend.com/home/index.php?pageID=compiler [roadsend.com]
Re: (Score:3, Informative)
Re:Sounds like cheap C-- drugs ! (Score:4, Insightful)
While true, it ignores that you're comparing a simple search box with millions of users who post multi-megabyte files to their personal space for everyone to see. Try it some day: save a Facebook user's page locally and see just how much data is coming down that pipe, on top of the scripts that are running.
You're comparing Google's front door with Facebook's entire company. Google probably has that many servers running web crawlers, and twice as many again to store that massive database they use.
Re:people use PHP? (Score:4, Informative)
Re: (Score:3, Insightful)
It was built from day one to integrate with Apache, it's not a nasty bolt-on hack like mod_perl. It's in-process so there's no startup overhead like with CGI
So mod_php is not a nasty bolt-on hack?
Re:people use PHP? (Score:5, Insightful)
mod_php has never integrated into Apache nearly as deeply as mod_perl did. That is, lower level Apache APIs are not exposed to PHP. Using mod_php is an acceptable replacement for CGIs, but mod_perl does a lot more than that. That means taking over the entire server life cycle handlers to the point where, in Apache2, you can implement (say) a Gopher server if you want.
mod_perl is not a hack. PHP, as a language and an API, very much is.
Re: (Score:3, Funny)
I came here for an argument!
Re:people use PHP? (Score:5, Insightful)
Your post is really annoying. Did you mean to be so obnoxious? And +5, Insightful. Come on, php isn't popular with slashdotters but whatever one calls reverse fanboyism it isn't cool either.
No, features that make web development "dead simple" are those that actually do something to make web development simpler...
Absolutely. And PHP does it. That's why it's so popular. There may be even more that can be done but if no popular language is doing it already that argument is kind of pointless.
You contradict yourself.
No he doesn't. You might not like scripting / dynamic languages but taking the best (or a good stab at taking the best) of scripting, C and Perl can actually make some things more straightforward. Need a regular expression? Used to function calls rather than syntactic regex? Need Perl regex? preg_match.
Patently false. PHP has no dependency on Apache now, it originally used CGI, and continues to support CGI, FastCGI, and operation as a module in web servers other than Apache (such as IIS). The CGI startup overhead problem has many solutions, such as FastCGI, AJP, proxying, etc.
Patently missing the point. PHP and Apache go together so well it created the LAMP mindshare space.
But "not in-process" does not imply the use of CGI, and it does not imply the use of any system with long loading times. Furthermore, "in-process" is potentially insecure and can be less reliable - as all code runs in the same process.
Who cares? His point is startup cost which is generally higher for forks vs modules and you're just plain going to get more scalability compared to the traditional perl cgi forking method. Hence mod_perl.
Give me a break. You can dislike anything you want, but why do you even bother when you don't have all the facts?
+5, Insightful. Dear me...
Re:people use PHP? (Score:5, Insightful)
I use it because I can code up relatively fast, relatively secure dynamic websites in a very short amount of time. I can install it on a webserver in seconds and it integrates beautifully with Apache and MySQL. Maybe there is a better solution out there, but PHP has always done what I need it to and I've never had a problem with it. It's never given me a reason to look elsewhere.
What I don't understand is all of the PHP-haters out there. Really, who cares if it is "the script kiddie's substitute for cgi-perl"? Isn't the proper measure of a tool if it does what you need it to and not who else uses it?
Re:people use PHP? (Score:5, Insightful)
Re:people use PHP? (Score:4, Interesting)
Actually, both parent and GP are right. PHP is wonderful for web development, but has more than a few annoying quirks with regard to consistency.
On the flipside, it has hands-down some of the best documentation on the planet, which makes the quirks tolerable, and is a big part of the reason why the language is so popular (especially with new programmers)
I'm seriously hoping that a new PHP release finally clears up all of the inconsistencies in the main namespace once and for all. It'll be painful at first, but a very-good-thing in the long term. Updating old scripts could even be a semi-automated process, given that the necessary changes are extremely superficial.
Re: (Score:3, Informative)
I remember when it was the script kiddie's substitute for cgi-perl. What does it offer from a theoretical and engineering PoV, apart from a Visual Basic learning curve?
Market penetration. From a managerial perspective, you can hire PHP developers a dime a dozen, and replace them very quickly if needed. From a developer's perspective, you can grab any of those "PHP in 10 nanoseconds for complete idiots" books, an Apache+PHP+MySQL bundle installer for Windows, and learn it in a few days to a level sufficient to be hired.
Of course, the typical quality of a PHP solution is what you'd expect from such an approach, but when did that ever stop anyone?
If you mean technological advantages,
Re: (Score:3, Insightful)
And everything exuding heat is perfectly natural, no problems there.
The deaths and environmental changes from heat exchange in rivers near power plants don't happen, nope, uh uh.
Water's perfectly natural you need it to live, no way to drown in it, nope, uh uh.
Re:F1 car in normal street. (Score:5, Funny)
You can go to work in an F1 car, or your normal car.
I wish. My F1 always gets stuck in the gutter at the end of the driveway.
But the same argument could apply (Score:4, Insightful)
Yes. I know the difference. C is an elegant if simple language, which is hard to program properly. C++ is an abomination that attempted to take the elegant, simple nature of C by bolting on spare body parts from dead object-oriented corpses, resulting in a language that is neither simple nor elegant, which is even harder to program properly.
See, I know the difference.
But if the point is to gain efficiency, why would you stop at C++? It's not a magical perfect balance of performance with elegance. C would give better performance than C++.
Sure, there's the non-OO tradeoff (though you could quite easily gain the benefits of OO, though not as elegantly as C++), and then you don't have to deal with fucking templates (which are really nice to program, but a bitch to clean up when someone else has fucked them up for you).
The premise of the article is stupid, and shows a pure lack of understanding of PHP, web service architecture and implementation, and a not-inconsiderable dose of C++ fanboi-ism.
Re:so where is the example of a company doing this (Score:4, Interesting)
I have done projects like this, and received massive speedups and performance increases. The issue is that you need to understand the real reasons why rewriting a program in C and/or assembly gives a massive performance increase. Inevitably, the reason why the C program is so much faster is that a programmer has gone through and rethought the application. The programmer eliminated string copies, string manipulations, data communication overheads, and data manipulation/translation overheads by rethinking the program's design.
For example, imagine a very simple application designed to take a digital input, and display a red/green indicator to a user depending on the input state. Count every time a major string overhead, data communication overhead, or data translation overhead occurs in each of the proposed solutions.
Web Solution
1. Input digital input via PLC (Data Overhead #1)
2. Upload data from input via PLC communications protocol to PC (Data Overhead #2)
3. Make data available to other programs, for example RSSQL makes real-time I/O appear as SQL database queries (Data Overhead #3)
4. Use PHP or ASP to generate a web page based on a SQL query for the real-time input (Data Overhead #4)
5. Use a web browser to query the relevant web page. (Data Overhead #5)
Web Solution performance: it might be able to update the display screen every 1/5 second.
Embedded C Solution
1. Input a data point using real-time I/O
2. Paint the computer's display screen accordingly. (Data Overhead #1)
C Solution Performance: 1/60 second, limited by the refresh rate of the monitor.
Assembly / Microcontroller Solution
1. Input the data point, with INP , AX
2. Output the data point to a Red/Green LED, with OUT AX,
Note: the assembly implementation doesn't have any string manipulation, so it doesn't have any significant data overhead.
Assembly Execution Time: Less than 1 micro-second.
The crucial concept from the above example is that the programmer reduced overhead and execution time, by simplifying program operation. The problem was solved in 3 different ways, and the fastest solution wiped out all the communication/string/data management overhead. If you want to make a computer program very fast, it is necessary to reduce data communication, string manipulation, and complex data structure overhead.
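The overhead-counting exercise above can be mimicked in a few lines. This is only a toy, assuming made-up encodings for each hop (a JSON frame, a row of strings, an HTML fragment). The function names are hypothetical and none of it models a real PLC, RSSQL, or browser.

```python
import json


def layered_indicator(bit):
    """Mimic the 'web solution': the same bit is re-encoded at every hop."""
    hops = 0
    frame = json.dumps({"input": bit})              # PLC -> PC protocol frame
    hops += 1
    row = {"value": str(json.loads(frame)["input"])}  # exposed as a row of strings
    hops += 1
    css = "green" if row["value"] == "1" else "red"
    html = "<div class='%s'></div>" % css           # page generated from the row
    hops += 1
    colour = "green" if "green" in html else "red"  # browser interprets the page
    hops += 1
    return colour, hops


def direct_indicator(bit):
    """Mimic the low-level solutions: read the bit, drive the output."""
    return ("green" if bit else "red"), 0
```

Both functions return the same colour for the same input. The layered one simply pays four string and data translations to get there, which is the kind of cost the rewrite removed.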
Which languages do this and why:
Level 1 - Simplest: Assembly is the best at wiping out string overhead, because engineers willingly migrate complex functionality to hardware before implementing it in assembly. In this case, the display screen was eliminated in favour of a direct output to an LED.
Level 2 - Low-Level: C is remarkably quick at string manipulation programs, because programmers minimize the amount of string manipulation. String manipulation in C sucks, and is difficult to get correct. As such, programmers attempt to minimize it, or use optimized tools like lex/flex or yacc/bison that automate the difficult problems.
Level 3 - Garbage Collected: Java and .NET encourage carefree string use and data structure use. They have automatic garbage collection. As such, minimal penalties exist for the programmer to use strings.
Level 4 - Scripted: PHP, Perl, Python are higher level languages focused on easy programming for high-level tasks. They pretty much assume the programmer doesn't care about the overhead of processing strings or complex data structures. Instead, they make it easy for the programmer to program the complex data structures.
An application like Facebook has to have some complex data structures to do its job. In that case, a migration from PHP to C will likely not produce great benefits, because the C program still has to do all the same work the PHP program does. The old rule was that interpreters were very slow. With modern techniques, just about any language can be sufficiently compiled to
Re: (Score:3, Funny)
Alright wise guy. Explain twitter.