
On PHP and Scaling 245

Posted by CowboyNeal
from the web-pages-served-fresh-all-day dept.
jpkunst writes "Chris Shiflett at oreillynet.com summarizes (with lots of links) a discussion about scalability, brought about by Friendster's move from Java to PHP. Chris argues that PHP scales well, because it fits into the Web's fundamental architecture. 'I think PHP scales well because Apache scales well because the Web scales well. PHP doesn't try to reinvent the wheel; it simply tries to fit into the existing paradigm, and this is the beauty of it.' (The article is also available on Chris' own website.)"

  • by Dozix007 (690662) on Saturday July 03, 2004 @09:42AM (#9599558)
    PHP inherently will not lead to scalability problems; however, if you ever try to create any applications that use a DFS-type algorithm, it can happen. PHP (I know it is web-based, shouldn't ask too much) does not allow for extremely simple solutions in DFS-type algorithms that are apparent to most users. Many will end up with too many "while()" statements and bring down script efficiency exponentially.
    • by julesh (229690) on Saturday July 03, 2004 @10:00AM (#9599644)
      Sorry, your abbreviations are confusing me. DFS? I know Disk File System and Distributed File System, but neither of those seem a good fit for what you're talking about. So... what are you talking about?
      • by Dozix007 (690662) on Saturday July 03, 2004 @10:04AM (#9599655)
        Depth-First-Search. You can use PHP to create a simple search engine by using arrays, fopen, fread, and while() loops. If done improperly, you can eventually loop your script into oblivion, creating big-time inefficiency.
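The runaway-loop risk described here usually comes from revisiting pages that link back to each other. A minimal sketch of a depth-first crawl that stays bounded, written in Python rather than PHP for brevity; `LINKS` and `dfs_crawl` are hypothetical names, and the in-memory link graph is an assumption standing in for pages read with fopen/fread:

```python
# Hypothetical link graph; a real crawler would fetch and parse pages instead.
LINKS = {
    "index": ["about", "news"],
    "about": ["index"],            # links back to index, forming a cycle
    "news": ["about", "archive"],
    "archive": [],
}

def dfs_crawl(start, links):
    """Iterative depth-first traversal that visits each page at most once."""
    visited = []        # pages in the order they were first reached
    seen = set()        # the guard that keeps cycles from looping forever
    stack = [start]
    while stack:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        visited.append(page)
        # Push children in reverse so the first link is explored first.
        for child in reversed(links.get(page, [])):
            if child not in seen:
                stack.append(child)
    return visited
```

Without the `seen` set, the index/about cycle would keep the while loop running indefinitely; with it, the work is bounded by the number of pages plus links.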
        • by Anonymous Coward
          You're not thinking in a PHP architecture.... thinking Java style J2EE does not apply to using PHP.

          What is a PHP "server"... it is the combination of Apache and PHP and a request being served. Since the web is stateless with simple session IDs tying things together it's not really necessary to share memory or resources between requests... hence Rasmus Lerdorf's "share nothing architecture."

          It doesn't make sense to do an olympic-sized web crawling script, and certainly not invoke it in the time of a web requ
          • Actually... a web crawling script is quite small. I am not thinking in a Java mindset, but a CS one. Basic CS theory and knowledge can be applied most anywhere. PHP search scripts are quite useful for internal site search, or a small network of sites. I also think that many should stop downing PHP as an unviable possibility for large projects. It is possible, you just need to be dynamic and well organized when doing so. A well coded site can work quite well, you just need to know what you are doing.
    • Why not try term vector space? I've recently completed a PHP/Mysql term vector space engine (see website).
  • Gah, no! (Score:5, Funny)

    by DrEldarion (114072) on Saturday July 03, 2004 @09:44AM (#9599563)
    it simply tries to fit into the existing paradigm

    All right, he used the word "paradigm", that makes his opinion automatically invalid.
  • by Michalson (638911) on Saturday July 03, 2004 @09:47AM (#9599575)
    The only real argument I could find was "Java doesn't do X well, therefore PHP must be great". The author seems to live in a universe with only two choices, his straw man Java, and his favorite web language, PHP. When he does try to argue PHP's merits on its own, it seems to collapse into a "PHP is good because it's good" argument. I don't see any part of the article addressing how PHP can benefit the developer facing real issues of large scale web development (such as the need for caching systems on high volume websites, or the maintenance challenge of larger code bases on complex sites). While good arguments may exist for PHP, they just don't seem to be here.
    • by lamz (60321) * on Saturday July 03, 2004 @09:57AM (#9599625) Homepage Journal
      I don't see any part of the article addressing how PHP can benefit the developer facing real issues of large scale web development (such as the need for caching systems on high volume websites, or the maintenance challenge of larger code bases on complex sites).

      The article doesn't mention it, but Smarty [php.net] is an excellent PHP library that implements, among other things, caching. I have used it extensively with excellent results.

      • by claar (126368) on Saturday July 03, 2004 @01:09PM (#9600505)
        Um, this is an article about scaling, and therefore performance. Mentioning Smarty in such context is almost off-topic ;-)

        Personally, I find the lighter weight Savant [phpsavant.com] to be a better choice, since it's straight PHP (No syntax to learn either -- bonus!). That removes the need for Smarty's "compile into php" step entirely, which has given me MUCH better performance than when I was using Smarty. IMHO&experience, at least.

        (And if you want caching, it can be done at the PHP engine level rather than in your templating engine -- see any of the PHP accelerators out there)
        • by justMichael (606509) on Saturday July 03, 2004 @02:14PM (#9600894) Homepage
          Yes Smarty compiling the templates into PHP causes some overhead. Compiling templates only happens once (unless you modify the template) so I'm not sure why your performance numbers were so much better with Savant, maybe the config?

          But if you are running a site that can use the output caching that Smarty offers and the code is done properly, you will see huge speed increases as you can skip everything in the page including opening a db connection. Which gives very close to flat HTML performance.

          As to using PHP accelerators, they don't handle output caching by themselves. You can code your own, but my time is better spent doing other things ;)

          Using Smarty and Turck together is pretty impressive.
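The output-caching idea above (serve the finished page and skip the database entirely on a hit) can be sketched roughly as follows. This is Python, not PHP, and it is not Smarty's API; `OutputCache`, `fetch`, and `render_front_page` are hypothetical names, and the 60-second lifetime is an arbitrary assumption:

```python
import time

class OutputCache:
    """Keeps fully rendered pages for a fixed lifetime, keyed by URL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}    # key -> (time stored, rendered html)

    def fetch(self, key, render):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]   # cache hit: render() and its db work are skipped
        html = render()     # cache miss: do the expensive work once
        self._store[key] = (now, html)
        return html

calls = {"db": 0}           # counts simulated database round trips

def render_front_page():
    calls["db"] += 1        # stand-in for opening a connection and querying
    return "<h1>Front page</h1>"

cache = OutputCache(ttl_seconds=60)
first = cache.fetch("/index", render_front_page)
second = cache.fetch("/index", render_front_page)
```

The second request returns the stored HTML without calling the render function, which is where the near-flat-HTML performance would come from. An opcode accelerator addresses a different cost (parsing and compiling the script), so the two are complementary.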
    • "Java doesn't do X well, therefore PHP must be great"

      But then, how can you conclude a language scales well, other than by comparing it? Java is supposedly used by so many sites that it can be used as a measuring standard. If PHP scales better, it must be great. I think that sounds quite reasonable.

      For the record, I like neither language that much. I use PHP every day, but I would rather be using Python or LISP.
      • Well, you can make PHP which scales like crap and Java which scales as high as your bandwidth will allow, and vice versa. Java has architectural differences which make it potentially better suited to scaling high (both in terms of handling lots of users and in managing lots of complexity), but you need to have some clue to actually exploit them.

        It's like comparing MySQL and Oracle; they both do largely the same thing, but Oracle's a lot more advanced and aimed a lot higher. From the article summary, it s
    • Did you read all of the author's opinions? I see many good reasons for the right tool for the right job listed here, and Friendster is obviously one of them.
    • I don't see any part of the article addressing how PHP can benefit the developer facing real issues of large scale web development

      It does: the author talks about how PHP forces you to use scalable mechanisms for state management (in contrast to, say, Java).

      or the maintenance challenge of larger code bases on complex sites

      Well, PHP generally requires much less code than Java to get the same task done, so that's another advantage for PHP.
      • Well, PHP generally requires much less code than Java to get the same task done, so that's another advantage for PHP.
        Nice troll. How many lines of code does multithreading take in PHP? No luck here. How many lines to run on a smartcard? Nope. A Python interpreter in PHP? Niet.
        Use the right tool for the problem.
    • I don't see any part of the article addressing how PHP can benefit the developer facing real issues of large scale web development

      I always tend to think of *accessing data* as where the rubber hits the road in website scalability. Of course, PHP by itself is super-scalable (because each request is processed independently)... but what exactly are you *doing* in that PHP code? If you aren't accessing and displaying data (generally from a database), you've got a pretty unique website.

      I don't see much point
      • Quite. The advantage of Java when combined with a database (and as you rightly point out how often is a webapp *not* combined with a database?), is that you can take advantage of in memory caching, improving scaling up to a point by reducing load on the database, which is typically the slowest part of a web app transaction.

        Personally I use and love both Java and PHP for web apps, horses for courses certainly, but I would be far more comfortable with Java for a large webapp any day.
  • by DavidNWelton (142216) on Saturday July 03, 2004 @09:48AM (#9599582) Homepage
    Perhaps it's not mentioned very often because it's obvious, but I think it's an advantage for systems like PHP, or Rivet [apache.org] that they scale down very well.

    What does this mean? That they don't consume too much in the way of resources, and are very easy to get started with. This puts a dynamic web site within reach of more people, which is a good thing, even if inevitably some of them will, yes, write crappy code. It is another example of the "worse is better" philosophy.

    I just wish they had used Tcl or something else already out there instead of creating a language that in and of itself is nothing very exciting, and has been a bit slow.
    • It is not a good thing that there is a short learning curve on PHP. While it does put the ability for dynamic webcontent at the fingers of most users, it also creates a crapflood of insecure sites. Not to mention when a user may get into more advanced PHP programming and know nothing of basic CS (I know, not a big CS language, but some things must be known). Inefficient scripts will bog down sites, improper loops and insecurity can wreak havoc on a network. I have received several emails in relation to a PHP
      • Quick and dirty; PHP is the VB of XXI century ;-)
      • by zangdesign (462534) on Saturday July 03, 2004 @12:38PM (#9600331) Journal
        It is not a good thing that there is a short learning curve on PHP. While it does put the ability for dynamic webcontent at the fingers of most users, it also creates a crapflood of insecure sites.

        I hate to say it, but the problem exists between keyboard and chair. PHP is not an inherently secure or insecure language. It may still have bugs, but those are a function of age and the serious ones have been taken care of. Rather, the problem is in the way people write software using PHP, without necessarily understanding the nature of the platform they are using.

        It is not the job of the language to enforce security - it is the job of the programmer.
        • I hate to say it, but the problem exists between keyboard and chair. PHP is not an inherently secure or insecure language.

          Well, unless I misparsed the grandparent, it didn't imply that at all; it said "It is not a good thing that there is a short learning curve on PHP". The implication is that the ease of use is the problem, as opposed to "PHP is insecure".
    • by Anonymous Coward
      You're right.

      however, if you wrote the same thing about Visual Basic / ASP, you would have been modded a troll.

  • Another article (Score:4, Informative)

    by Anonymous Coward on Saturday July 03, 2004 @09:49AM (#9599585)
    Here's an article from Jack Herrington on PHP's scalability.

    http://www.onjava.com/pub/a/onjava/2003/10/15/php_scalability.html
  • by ahmetaa (519568) on Saturday July 03, 2004 @09:50AM (#9599588)
    If someone wants to produce a high-performance web site in Java, JSP is a bad choice. Use Velocity, pure Java objects, and a decent DB abstraction mechanism (Hibernate, iBatis). Plus, I used PHP; OK, it is easy to use and can be preferred for small to medium-size web sites. But call me biased, it is nowhere near the elegance of Java.
    • by Anonymous Coward
      You can do ugly things using any language/platform; it depends on the programmer.
    • Correct me if I'm wrong, but isn't JSP just a simplified syntax for creating servlets by embedding Java code within an HTML document?

      If so, why should this cause performance problems?

      (As an aside, I've run a JSP server in the past on a 100MHz pentium, and after the first use of each page performance was OK, so I'm not sure what the big problem is...)
      • That's what it is (on first run they're compiled into something like 'normal' servlets)... but then again the guy didn't really give any reason why JSPs are a bad idea, so it's highly possible he didn't know anything about them.

    • The problems with JSP are to do with writing maintainable code, not speed. There is a principle of software development that suggests that it is a bad idea to embed software logic in presentation code, as this does not allow for easy modification. If you support this principle, JSP (and some ways of using PHP) are not a good idea. However, JSP is not slow: the JSP pages are translated into Java Servlet source code and then compiled. This can result in very fast websites.
      • No...you are not correct. Your point about JSPs is only true under a model 1 implementation/design. A model 2 implementation (where your business logic is done in Java objects) utilizes JSPs exactly like Velocity...only as a template. See Struts and other MVC frameworks that embrace a model 2 implementation of JSPs.
        • No...you are not correct. Your point about JSPs is only true under a model 1 implementation/design.

          You are right - I was talking only about model 1. JSPs can be used very effectively in MVC frameworks.

          What I was trying to describe was JSP use when Java code is embedded, rather than tag libraries.
    • by mabinogi (74033) on Saturday July 03, 2004 @10:43AM (#9599808) Homepage
      JSP on its OWN is a bad idea.

      Just as Velocity on its own would be a bad idea.
      Write your business logic in plain Java, use servlets to manage the flow of control and to call your Java API to create value objects (beans) to place in the request, and then use JSP to format the data.

      You only run into problems if you try to do everything with JSP, which is always a bad idea, just as it's always a bad idea to do everything with any single tool.

      and JSP 2.0 is even better with the JSTL expression language built in.
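The division of labour described above (plain business logic, a controller that builds a value object, a template that only formats) is not specific to Java. A rough sketch of the same shape in Python, using the standard library's `string.Template` as a stand-in for JSP or Velocity; `summarize_orders`, `handle_request`, and `PAGE` are hypothetical names:

```python
from string import Template

# Model / business logic: knows nothing about HTML.
def summarize_orders(order_amounts):
    return {"count": len(order_amounts), "total": sum(order_amounts)}

# View: a template that only formats values it is handed.
PAGE = Template("<p>$count orders, total $total</p>")

def render(bean):
    return PAGE.substitute(bean)

# Controller: manages the flow, builds the value object, picks the view.
def handle_request(order_amounts):
    bean = summarize_orders(order_amounts)
    return render(bean)
```

Because the template never computes anything, the markup can change without touching `summarize_orders`, and the logic can be tested without rendering a page.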
      • by caseih (160668) on Saturday July 03, 2004 @12:36PM (#9600322)
        Even better than JSP and other technologies is to use Jakarta's Tapestry as the presentation layer. Tapestry rocks and I look forward to having something like that on PHP. Right now PHPTal is close. The ability to define a page as components (almost in GUI terms) and then define event call-backs and so forth really makes life better.

        Tapestry for the view, Spring for the control, and Hibernate for the model is a combination hard to beat with php. Sooner or later all these technologies will be used no matter what underlying language.
    • JSR 223: Scripting Pages in Java™ Web Applications
      The specification will describe mechanisms allowing scripting language programs to access information developed in the Java Platform and allowing scripting language pages to be used in Java Server-side Applications. JSR 223 [jcp.org]
  • by TheNarrator (200498) on Saturday July 03, 2004 @09:52AM (#9599600)
    I've seen a friendster stack trace before, when the app was running slow at 5 am. For those of you who don't know what this is, it's when Java runs into an error and tells you where your program died. It was really funny. Basically there was a servlet and a call to Database.java and on line 8000 of database.java they were calling mysql directly. Real nice architecture, NOT!

    • Only proves that:
      -Friendster programmers don't know how to catch an error in Java, something that Java has plenty of ways to do.
      -It is easy to find where the error is in Java. I've seen lots of "Warning: MySQL Connection Failed: Unknown MySQL error in /www/something.php on line nn", so it is a similar thing with PHP. The stack trace usually makes it easier for the programmer to find the problem, but it should not be shown to the user.
    • What do you mean "calling mysql directly"? I can assure you that isn't actually possible in Java. MySQL is a C application, Java can't call C code without some kind of intermediate layer.

      Also, what's "Database.java" -- if it's part of the MySQL/Java interface layer, this would be perfectly appropriate behaviour.
      • What do you mean "calling mysql directly"? I can assure you that isn't actually possible in Java. MySQL is a C application, Java can't call C code without some kind of intermediate layer.

        Well, we could also argue that you couldn't do that in any language, since MySQL is a daemon and the client libraries will open a connection to the daemon to talk to it...

        Even if it were a JNI wrapper around the C libraries, that was not the point of the original poster. The point seems to be that there's a huge monolit

    • /me slaps forehead. no, that was the MYSQL JDBC driver you saw in the stack trace. JDBC is a database-agnostic plugin architecture for communicating with databases that abstracts away a lot of the stuff that makes talking to a database a "to the metal" task. it's like pear_db, only you don't have to hope that your ISP included it when it was compiling PHP. you just plop some JARs in your web application and chug.
    • No-one seems to have picked this up, but it says it all. If any of your support classes has a Line 8000, especially one at which an exception can arise that is not being caught, you may need to go back and do a few basic classes in software design.

      Confession time: the worst Swing based class I have ever committed has about 4000 lines, but about 2/3 of that is Swing.

    • Just to clarify, IMHO their architecture appeared to be jsp model 1 [oracle.com] architecture which IMHO is not a very performance oriented architecture. They could have at least used jsp model 2 and used various caching layers for business objects,etc.
  • by Morgahastu (522162) <{eman sdnab evaf ... egorREZEEW} {ta}> on Saturday July 03, 2004 @09:53AM (#9599608) Journal
    I think the term is subjectable depending on the context in which it's used. Scalable does have many definitions but I don't think that they are all wrong except for one.

    His definition suits him well but it might not be helpful for me.

    I might use scalable just to say that an application can easily (with little or no modification) handle 100x more users. This doesn't necessarily mean that the difference in system load varies a minimal specific amount per each extra request. All that matters is that it will work with higher demand. Who cares how or why.

    I think scalable can also mean that an app can handle 10,000 users when hosted on a single machine but when put on a cluster of computers it can handle exponentially more users. To me that is a scalable application.

    Scalable has no set definition in the contexts of applications.
    • by bangular (736791) on Saturday July 03, 2004 @10:12AM (#9599689)
      The term "scalable" has become an industry buzzword. It is fruitless to argue whether something is scalable or not if there is no clear definition. It's like arguing whether you believe in freedom or not. Of course most people in the world will say they believe in freedom, but if you ask 100 people to define it you will get 100 different answers (the Bush administration has had a field day with this because the minute you oppose them, they accuse you of not believing in freedom; their definition of course).

      It is impossible to say PHP is or is not scalable unless a definition can be agreed on. And with "scalable's" current buzzword status, I don't see that happening very soon.
    • This is just me being a grammar nazi, but you use the word "subjectable" in the first sentence of your post, that's not a word. You're thinking of the word "subjective".
    • It's also good to determine how scalable the code is. Is the code readable? Maintainable? Extensible? Can large teams effectively work on the same code base?

      While this does have more to do with how the code is written, programming languages do contribute to code scalability.

      Does PHP promote scalable code?
  • by jenkin sear (28765) * on Saturday July 03, 2004 @09:54AM (#9599611) Homepage Journal
    Scalability is rarely that much of an issue- any halfway decent architecture (php, java, even .net) will let you scale horizontally- and Moore's law will take care of any performance problems in time.

    My big issue with PHP is maintainability- I see it (perhaps incorrectly) as a glorified templating language, which places it on the same evolutionary track as ASP and cold fusion; developers will tend to munge sql calls into the templates, blow off any MVC separation, and get a system that is very hard to keep going for more than a few revisions.
    • by Anonymous Coward
      you are correct that you are incorrect.... if anything, developers are moving towards MVC, like Mojavi [mojavi.org] - probably PHP's best MVC framework because it doesn't try to port struts to PHP, it writes a very flexible framework using PHP the way it was meant to be used.

      Also, maintainability is not a feature of a language, it's the organization practices of the developer. Java developers are used to throwing files wherever, doing import statements wherever, and once it's compiled, it's organized! Well, you have t
    • by julesh (229690) on Saturday July 03, 2004 @10:13AM (#9599694)
      developers will tend to munge sql calls into the templates, blow off any MVC separation, and get a system that is very hard to keep going for more than a few revisions.

      Yes, that is tempting. But, conversely, it's a very useful capability for small projects. For larger projects, you just need to ensure you have the discipline not to use the capabilities.

      For instance, here [covcen.org.uk] is a site I developed in PHP using a strict model-view separation. There is direct linkage between view and controller and controller and model -- I couldn't be bothered to sort that out for a project of limited size like that one. In a larger project, I'd probably devise some kind of mechanism for that.

      You can write unmaintainable code in any language you choose. Discipline is the key.
    • Any developer tackling a serious project will soon realize the same "problems" with maintainability you have, but fortunately, there are solutions to all of them. First and foremost, is the Smarty templating engine. Then there are the PEAR classes, PECL extensions and if you want to get all CS-ey, there are the OO features of PHP which are expanded and refined in PHP5 (and unlike Perl 6, PHP5 will actually be the release version in the very near future).

      If PHP were as unmaintainable as it seems to all of
    • My big issue with PHP is maintainability-

      True: a lot of big PHP packages look awful and can't be touched without falling apart.

      Sadly, the same is true of a lot of big software packages written in other languages.

      The solution? Hire better programmers or keep your software small and simple. In fact, the former will likely result in the latter.

      blow off any MVC separation

      I think it's an article of faith, not fact, that MVC contributes anything to maintainability.
    • I agree, but I've also seen the opposite - fairly simple projects completely buried in the complexity of multi-tiered architecture.

      An example was a project I "inherited" a few years back that was written with ASP for the presentation layer, business logic in COM objects, MS-SQL stored procedures for the database calls and MS-SQL for the backend database. It needed three developers to maintain all the different parts, and a simple change like displaying an existing database field on a web page meant changin
  • by Lolaine (262966) on Saturday July 03, 2004 @09:55AM (#9599615)
    First of all: every time I see the term "scalability", the writer treats it as if it only meant "bigger". We have to think not only of getting bigger, but of getting smaller too.

    PHP has a wide support for many RDBMS, APIs and Operating Systems, but it is only a Language. A language doesn't scale, it's the platform that scales.

    That's why I see PHP/Apache/Unix scaling far better than (for example) ASP/IIS/NT: the first platform can run on anything from a PDA to a high-performance minicomputer; the second can run on anything from an i686 (Pentium support was removed?) to the best PC-architecture based computer you can buy. That's the difference: a wide-option platform versus a closed-option platform.

    Probably, the first platform will have performance leaks and will not squeeze every point of performance from the machine it runs on, but its scalability potential resides in the fact that it can run on whatever you throw it at. Maybe J2EE or other platforms will run faster on the same hardware than PHP, but PHP will scale there and will be standing shoulder to shoulder with them.

    That's why I don't like to evaluate scalability from the "speed" point of view, but from the "where it runs" point of view.

  • Yahoo. (Score:5, Interesting)

    by downbad (793562) on Saturday July 03, 2004 @09:58AM (#9599630)
    Yahoo is a prime example of PHP's scalability. Although they still use some legacy C code, nearly all of their new developments use PHP and BSD.

    I worked in a small shop developing web apps, and while it wasn't mission critical stuff like banking, it wasn't exactly brainless "dump data from MySQL" stuff either. I was lucky that my boss wasn't picky about languages. But if anyone I work with doubts the power and simplicity of PHP, I usually bring up Yahoo.

    IMHO, PHP rocks. It's suitable for pretty much any and all web development. It can be used for quick hacks, or you can code it like a pro with objects and stuff.

    • Re:Yahoo. (Score:5, Informative)

      by Anonymous Coward on Saturday July 03, 2004 @10:17AM (#9599710)
      Actually that's only partially true. Yahoo uses C/C++ for almost all backend development. PHP is used mostly for what it's good at: Simple web frontends that call on extensions written in C and C++ to do most of the heavy lifting, or access backend systems written in C/C++.

      Yahoo is very much a C/C++ shop first and foremost - PHP is used as a template system (alongside several proprietary systems) to allow easy modification of high level behaviour.

    • For an inside look at why and how Yahoo uses PHP, see Michael Radwin's talks [yahoo.com].
    • by Ogerman (136333) on Saturday July 03, 2004 @03:43PM (#9601326)
      IMHO, PHP rocks. It's suitable for pretty much any and all web development. It can be used for quick hacks, or you can code it like a pro with objects and stuff.

      Yes, PHP is excellent for web development. Yes, PHP can scale to even some large web sites. But since the web is still all the rage, this is unfortunately all that many people think about. Where PHP stumbles is when you need to move off the web or when you need to write complex business logic that is not solely driven by a web tier. PHP also fails when you need to integrate diverse transactional resources in an efficient manner. Not all business applications can be suitably implemented in PHP. As examples:

      - PHP, by its scripted execute-and-terminate nature, cannot schedule the execution of tasks on its own. So, for example, there is no way to schedule an email to be sent at a specified time. If you need this sort of functionality, you'll have to look beyond PHP to ugly hacks like cron jobs that call PHP. (and then PHP scripts that can automatically modify your cron scripts..) Alternatively, you could write your own scheduler in a different language.

      - Somewhat related, PHP is incapable of asynchronous operation. Suppose, for example, that we have a flood of customers placing orders. Our inventory database is fully capable of keeping up with the demand, but the credit card processing system is backlogged and this is out of our control. So we cannot give users an immediate response as to whether their payment was accepted upon placing the order. We also don't want to make them wait 5-10 minutes after hitting the "place order" button for a response. The proper business solution is to accept the order, but send the customer an email later if the payment was rejected. This process requires asynchronous operation -- queueing of the payment validation requests and possible further action separate from user interaction. PHP has no solution for this scenario or the many others like it and thus we must look beyond the PHP domain.

      - PHP is quite weak when it comes to writing a complex business logic layer. This is not to say that it is not possible, but there are no frameworks available comparable to those offered in the Java world (and I'm not just talking about EJB, btw). So this is not a question of languages, but of available tools to do the job efficiently. For example, PHP has no concept of application-level transaction management. (declarative transactions, isolation levels, etc.) Looking towards the cutting edge, it has no support for Aspect Oriented Programming, which is an enormous boon to business logic developers, available in Java, C++, .NET and others.

      - PHP is weak on tools for developing the persistence layer. For example, it has nothing comparable to Hibernate, let alone tools for RAD employing UML.

      - PHP has no pre-built solutions for caching persistent data, and certainly not objects. Once again, it is possible, but developers are left to roll their own solutions using shm extensions or writing out to the database backend. Using the database can be terribly slow and even the shm approach requires (de-)serialization on script load/terminate. While this sort of thing does not limit scalability, it does limit performance (response times).

      - PHP has no means of replicating application state in a cluster other than using the backend database. While this is often of no consequence, some complex business software holds a fair amount of state which need not be persistent.

      - PHP itself cannot reasonably be used to develop non-web clients such as a GUI tool for efficient rapid data entry or greater interactivity, a PDA client, or an embedded device that interfaces with a campus security system. These sorts of clients can talk to PHP scripts via SOAP extensions, but it should be recognized that we have again left the PHP domain to meet these needs and the resulting solution may not be the most efficient.
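The queued payment-validation scenario from the list above can be sketched like this. It is a toy in Python, not a claim about what PHP can or cannot do; `place_order`, `payment_worker`, and `rejection_emails` are hypothetical names, the `card_ok` flag stands in for the answer from a slow card processor, and a real system would use a durable queue rather than an in-process one:

```python
import queue
import threading

payment_jobs = queue.Queue()   # pending payment validations
rejection_emails = []          # stand-in for actually sending mail

def place_order(order_id, card_ok):
    """Accept the order immediately; the payment check happens later."""
    payment_jobs.put((order_id, card_ok))
    return "order %d accepted" % order_id

def payment_worker():
    """Drain the queue independently of any user-facing request."""
    while True:
        job = payment_jobs.get()
        if job is None:        # sentinel: no more work
            break
        order_id, card_ok = job
        if not card_ok:        # hypothetical verdict from the card processor
            rejection_emails.append(order_id)

worker = threading.Thread(target=payment_worker)
worker.start()
replies = [place_order(1, True), place_order(2, False)]
payment_jobs.put(None)         # tell the worker to stop after the backlog
worker.join()
```

The customer-facing call returns at once in both cases; only the worker, running on its own schedule, decides that order 2 needs a follow-up email.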

      So in closing, PHP is great for some thing
    • Re:Yahoo. (Score:3, Interesting)

      by AmVidia HQ (572086)
      Ahem. You copied my comment I wrote some time ago. When you copy, give credit where it's due.
  • by Christianfreak (100697) on Saturday July 03, 2004 @10:04AM (#9599658) Homepage Journal
    PHP's problem is that it quickly becomes unmaintainable in larger projects. That's why it doesn't scale, not because the platform isn't fast enough or Apache can/can't scale.

    PHP will continue to have this problem until someone comes and tells the developers about a nifty invention called 'namespaces'

    Some other things that could help: Standard templating for easier separation of design/content from code, a better module architecture that doesn't require me to recompile just to get some new functionality, some nice standard modules that go with that new architecture.

    Of course if someone did all of that you'd have Perl and since we already have Perl, I'll stick with it.
    • PHP will continue to have this problem until someone comes and tells the developers about a nifty invention called 'namespaces'

      Namespaces are handy, I'll agree, but I don't see them as a golden bullet that is impossible to live without.

      Let's face it, they don't actually achieve anything that a consistent naming strategy couldn't also achieve.
    • You sound like somebody who didn't use PHP long enough. Large PHP projects become plenty maintainable once you start using handy stuff like the Smarty templating engine (which IIRC is included by default now). There are also a myriad of great PEAR classes and PECL extensions. As for a module architecture that doesn't require you to recompile, that would be nice; however, I would bet that most PHP programmers have never recompiled their installation or needed to do so. You're right though, it would be nice.

      For the most part though, I would say that PHP is slightly better equipped for web development, just like Perl is better equipped for general scripting tasks... I'm a Python man myself though ;-)
      • "a module architecture that doesn't require you to recompile"

        You're kidding, right?

        urpmi php-mysql php-pgsql php-curl php-xml php-sockets
        service httpd restart

        See any "make; make install" commands in there?

        How is that not modular?

        Nearly everything in PHP is a module (or, in PHP's terms, an extension) that can be installed or removed without recompiling.

    • PHP only becomes unmaintainable if you don't know what you're doing, or if you don't plan well at the outset. The thing about PHP is that it doesn't force you to do anything, which means it doesn't force you to do anything the right way. This is not a fault. I wouldn't be a PHP developer today were it not for the ease with which I learned to write some very, very bad code. Of course, there's room to grow. The result is that the onus is on the developer, and not the language. So you're right, PHP doesn't scale. Not its job. PHP provides the opportunity to scale, and a toolset that is more than adequate and improving over time.

      This is particularly funny coming from a perl developer. Perl can become unmaintainable on a small project.
      • Perl can become unmaintainable on a small project

        What unique attributes of perl do you believe contribute to your claim that perl "can become unmaintainable on a small project"?

        I regularly program in C and I would say that C has numerous issues that hurt readability and maintainability in the large, but you rarely see anyone heap this scorn on C.
      • PHP only becomes unmaintainable if you don't know what you're doing, or if you don't plan well at the outset.

        In the Real World, you rarely get to write the web app yourself; instead, you get to add features/clean up/fix the web app that the Other Guy wrote. The Other Guy invariably doesn't know what he's doing, nor did he plan well at the beginning. Knowing this is the case, I would much rather he were using an environment that forces some amount of good design on him, since that will save me time and f
  • by bigattichouse (527527) on Saturday July 03, 2004 @10:15AM (#9599701) Homepage
    One of the great boons of PHP is the fact that you can build shell scripts with it. This allowed me to create a large distribution/inventory/control system in PHP, AND do all the back end processing in PHP as well. Sounds inefficient, sure, but it works like a champ - plus any new programmers get to learn the system quite quickly due to consistency.
    • I don't think it's inefficient. I use it. I have an extensive CLI PHP scripting system set up that does it all. It connects to FTP systems, downloads data for updates, runs updates on several databases, and generates plain text reports and CSV (Excel-type) reports. Most of all, combined with crontab-driven calls from other systems, it allows me to share data between two systems that previously were unable to do so.

      This also allows me to move code blocks between different platforms without issue. It also al
  • The reason (Score:2, Insightful)

    by Diclophis (203740)
    HTTP URL wrappers, file_get_contents, serialize and unserialize. With these functions alone you can recreate any CORBA/SOAP/XML-RPC-type remoting. And remoting is good for scalability because it lets you 'outsource' the workload to another machine. Truly N-Tier design (N>3).
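
    The pattern described above (fetch a URL, then unserialize whatever the other machine sent back) can be sketched roughly as follows. This is Python rather than PHP, with JSON standing in for serialize/unserialize; the function names and the worker.example URL are hypothetical, and the "remote" side is an in-process function so the sketch runs without a network.

```python
import json
from urllib.request import urlopen


def worker_endpoint(url):
    """The remote tier: do the work and return the result as serialized bytes.

    In a real deployment this would sit behind a web server on another
    machine; here it is called directly so the example is self-contained.
    """
    n = int(url.rstrip("/").rsplit("/", 1)[-1])
    result = {"n": n, "squares": [i * i for i in range(n)]}
    return json.dumps(result).encode("utf-8")


def http_fetch(url):
    """Fetch a URL over HTTP and return the raw response body."""
    with urlopen(url) as response:
        return response.read()


def call_remote(url, fetch=http_fetch):
    """Front-end side: fetch the URL, then deserialize what came back."""
    return json.loads(fetch(url).decode("utf-8"))
```

    Calling call_remote("http://worker.example/squares/4", fetch=worker_endpoint) exercises the round trip locally; dropping the fetch argument would send the same request over HTTP to a real worker.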
  • rebuttal (Score:4, Informative)

    by Anonymous Coward on Saturday July 03, 2004 @10:20AM (#9599724)

    I will start with mandatory links to the great series of articles that Ace's Hardware ran, describing their server scenario and their migration from PHP to Java/J2EE:

    1. Building a Better Webserver [aceshardware.com]
    2. Building a Better Webserver in the 21st Century [aceshardware.com]
    3. Scaling Server Performance [aceshardware.com]

    The PHP Scalability Myth starts off by defining three types of server architectures. The first, two-tier, and the last, logical-three-tier, are the same conceptually (there is the slight distinction between whether display and business logic code is "mingled", but this is typically not a performance issue, but just an aesthetic or design issue). This two-tier/logical-three-tier architecture is the only one PHP supports natively. The article then proceeds to compare a two-tier PHP architecture against the most elaborate full three-tier Java architecture, which is used rarely in practice, and extremely rarely in the same domain in which a PHP solution is feasible. Instead of comparing apples and oranges (if PHP supported a full three-tier architecture, I would imagine two-tier PHP vs. three-tier PHP would have the same performance discrepancies), let's simply compare the only architecture PHP supports natively, two-tier, against JSP talking directly to a database, as this scenario is the most analogous to the PHP one. Let's also discard any caching as again this is something that Java handily accommodates but is not natively (or at least easily) available in PHP due to lack of state. And let's assume the database is the largest bottleneck.

    The article states:

    At the time when the first versions of the JSP and EJB standards were released, the prevalent web server was (and still is) Apache 1.x, which had a process model that was not compatible with Java's threading model. This meant that a small stub was required on the web server side to communicate with the servlet engine. The remains a non-trivial performance overhead for those that decide to pay it, and was a significant performance overhead when the first scalability comparisons were made.

    I'm not sure what "stub" the article is referring to, but I will assume it means an Apache module which talks a "native" protocol to the servlet engine. The first such module was mod_jserv, which could run the servlet engine both in-process and over a compact protocol called AJP (Apache Java Protocol), which essentially carries pre-parsed HTTP requests. This module, as well as the AJP protocol itself, has gone through several revisions, from mod_jk to mod_jk2. I cannot quite recall, but I think some version of mod_jk might have lost the ability to run in-process. Every other version, including the most current, can, if I recall correctly. This is beside the point, because as far as I know, AJP always has been a trivial performance overhead (I believe recent versions can run over Unix domain sockets). In fact, Apache is routinely used in production as the front-end web server, instead of the built-in servlet engine web server, simply because it is faster at serving static content and the AJP overhead is negligible. If the "stub" referred to in the quote is not the AJP module, then this may not be relevant; nevertheless AJP has always been highly efficient and typically negligible with regard to performance (the same typical connection min/max/idle count configurations apply as do to Apache itself).

    The article goes on to proclaim the complexities of caching and data object persistence which we have eliminated from our comparison. Let's move on to the real bottleneck - the database. The article says "PHP's connectivity to the database consists of either a thin layer on top of the C data access functions, or a database abstraction layer called PEAR::DB. There is nothing to suggest tha

    • Re:rebuttal (Score:5, Informative)

      by julesh (229690) on Saturday July 03, 2004 @10:50AM (#9599851)
      This two-tier/logical-three-tier architecture is the only one PHP supports natively.

      I'm not sure what you're on, but you can build however-many-tiers-you-like applications with PHP. In fact, PHP supports a number of technologies specifically designed to communicate with additional tiers, including CORBA, JavaBeans and SOAP.

      Let's also discard any caching as again this is something that Java handily accommodates but is not natively (or at least easily) available in PHP due to lack of state

      PHP supports persistent state through shared memory blocks trivially. The implementation of data caching schemes that use this feature is not hard.

      17 child threads attempt to connect, one will not be able to. If there are bugs in your scripts which do not allow the connections to shut down (such as infinite loops), a database with only 32 connections may be rapidly swamped

      Why would you limit your database to serving fewer connections than you have limited your web server to?

      PHP supports an option to kill runaway scripts and reclaim their resources after a time limit has elapsed, which handily prevents the infinite loop problems mentioned.
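
      The option referred to is presumably PHP's max_execution_time setting (or set_time_limit() at runtime). The general idea of killing a runaway script after a time limit can be sketched as below. This is a Python analogy using a child process, not PHP's actual mechanism, and run_with_time_limit is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_time_limit(script_source, limit_seconds):
    """Run Python source in a child process.

    Returns the child's stdout, or None if it exceeded the time limit
    and was killed.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", script_source],
            capture_output=True,
            text=True,
            timeout=limit_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so its
        # resources are already reclaimed by the time we get here.
        return None
    return completed.stdout
```

      An infinite loop such as "while True: pass" comes back as None once the limit passes, while a well-behaved script returns its output as usual.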

      Ok, so now we have a bunch of "persistent" connections that hang around with the process. How long do they hang around?

      Until the database closes them or the PHP server process is killed.

      What if two threads in the same process want to use a connection?

      The connection is locked from the moment a thread acquires it (using the *_pconnect function) until the script using it terminates.
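
      The hold-until-the-script-ends behaviour described above can be modelled with a small bounded pool. This is a generic Python sketch and not how PHP implements *_pconnect; the class, the string stand-ins for connections and the simulate_requests helper are all assumptions made for illustration. It also shows what happens when there are more workers than connections: the extra requests wait rather than fail.

```python
import queue
import threading
import time
from contextlib import contextmanager


class ConnectionPool:
    """Fixed set of connections; each is held exclusively while in use."""

    def __init__(self, connections):
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is held
        try:
            yield conn
        finally:
            self._idle.put(conn)  # released only when the "script" ends


def simulate_requests(pool, request_count, work_seconds=0.01):
    """Run concurrent fake requests; return the peak number of connections in use."""
    lock = threading.Lock()
    state = {"in_use": 0, "peak": 0}

    def handle_request():
        with pool.connection():
            with lock:
                state["in_use"] += 1
                state["peak"] = max(state["peak"], state["in_use"])
            time.sleep(work_seconds)  # pretend to run queries
            with lock:
                state["in_use"] -= 1

    threads = [threading.Thread(target=handle_request) for _ in range(request_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"]
```

      With two pooled connections and eight simulated requests, the peak never exceeds two, and every request still completes.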

      In the worst case, persistent connections make your problem much much worse, because now you have many more connections open to your database.

      What does an inactive open connection to the database cost? Not very much, in my experience.

      Your arguments have a little merit, but please try to do your research before ranting about a system.
      • I don't agree with much the original poster of this thread said, but having many persistent connections to your db can be an issue. MySQL tends to get extremely unhappy at the 1000+ connections range. That's on a Dell 2650, dual PIII, 4GB RAM machine, YMMV.

        kashani
      • Why would you limit your database to serving fewer connections than you have limited your web server to?

        I always wondered about that in application design. Just recently I talked with a former software architect from IBM and he gave me an answer that may make some sense. It still feels a little counter-intuitive to me, so any corrections would be welcome.

        The answer is basically a resource management issue if I remember correctly. You want to manage the load on your database servers so it stays relativ
  • Real world examples? (Score:4, Interesting)

    by javab0y (708376) on Saturday July 03, 2004 @10:34AM (#9599778)

    I think a real-world example could settle this debate. Look at the story on the Jboss Nukes Project [onjava.com]. It explains the CPU utilization and speed of the PHP version and how moving to a J2EE implementation decreased the wait times dramatically.

    It's difficult to argue with facts.

    • by Anonymous Coward
      I would be extremely careful with such a statement.

      We know that the PetStore J2EE sample/reference application is ~10 times slower than a sample .NET implementation written by Oracle/Microsoft.

      We also know that the JBOSS people were sending false statements last year using anonymous accounts (around the time when the mentioned article was written).

      So I would be very careful to state that "a J2EE implementation decreased the wait times dramatically".
      I don't think so, not at all!
  • by sfjoe (470510)
    1. PHP scales well.
    2. Java scales well.
    3. Friendster couldn't develop a scalable J2EE application, so they switched to PHP.
    4. What will Friendster switch to when they can't develop a scalable PHP application?

    • seeing how they had pretty smart people writing the J2EE app they started with, I doubt #3. When you have the authors of "Tomcat: The Definitive Guide", and "Mastering Tomcat Development" writing your app, then I would assume at the very least that they are as good as any other Java team. I think people should consider the possibility that Friendster has/had scaling issues that other sites plainly don't have. All of their pages have to be dynamic, and I doubt that any of barely 1% would be cachable, besi
  • by RAMMS+EIN (578166) on Saturday July 03, 2004 @12:30PM (#9600287) Homepage Journal
    While I am personally gratified that someone is making the case for PHP vs. Java, I think the whole idea of attributing scalability (as in, works for lesser and greater numbers) to a language is wrong.

    Scalability depends on how you write your code. If your algorithms are good, your system will scale, and if they aren't, it will not. Any language that doesn't let you write good algorithms cannot be expected to be generally useful, but I think neither PHP nor Java falls in that category.
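
    A small illustration of that point, in Python and with made-up function names: two ways of checking a list for duplicates that give the same answer but do very different amounts of work as the input grows. Both would be just as easy to write in PHP or Java.

```python
def has_duplicates_quadratic(items):
    """Compare every pair: n*(n-1)/2 comparisons when there are no duplicates."""
    comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                return True, comparisons
    return False, comparisons


def has_duplicates_linear(items):
    """Remember what has been seen: at most n set lookups."""
    seen = set()
    lookups = 0
    for item in items:
        lookups += 1
        if item in seen:
            return True, lookups
        seen.add(item)
    return False, lookups
```

    For 1,000 distinct items the first version performs 499,500 comparisons and the second 1,000 lookups, and the gap widens as the list grows.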

    Finally, I think scalability is really not what's important, but rather performance. When developing tailor-made applications, I only care if they require more or fewer resources for the number of requests they actually get, not for higher or lower loads. Of course, for libraries, operating systems, etc. the argument is different.
  • by PhotoBoy (684898) on Saturday July 03, 2004 @01:25PM (#9600578)
    Having developed systems in Java and PHP I think it's wrong to try discussing how well either of them scales without considering the main factors that affect the scalability of projects, namely:

    - The skill of the developers implementing the system
    - The foresight of the original plan/architecture design
    - Understanding of where bottlenecks/growth problems will occur

    Any project that doesn't plan the scalability in from day one will likely struggle to fix the problem when scalability does become an issue.

    IMHO scalability is a design and architectural problem; the language used (within reason) makes no difference. It's the quality and structure of the design itself which will make or break the system.
