The PHP Anthology - Volume II, 'Applications'

sympleko (Matthew Leingang) writes "In Volume I of The PHP Anthology, Harry Fuecks showed some of the basic PHP functionality to solve a few simple problems, including how to object-orient your code, how to use PHP's hundreds of built-in functions, and how to use well-developed existing classes, be they from PEAR or other sites. In Volume II, he intends to 'blow your socks off by tackling some traditionally complex problems with the same principles--to great effect.' It's summertime and I'm sandals-only for the time being, so my socks remain safely in the top drawer. But the volume is nonetheless exciting." (Read on for the rest of Leingang's review, and check out last week's review of Volume I.)

There are seven chapters in this volume, each dealing with real-world problems. Many problems are those you've seen solved on sites you admire and wondered "How did they do that?" Others are frameworks that allow your site to run smoothly, with nobody getting accidentally logged out or having to wait too long while your script gluttonously pulls the same data out of the database for the Nth time. At the end, Fuecks goes back to the beginning, to show how proper design and development can save you time when you start your next project.

Chapter 1: Access Control
Authentication is the process by which users identify themselves. This is difficult in HTTP, a stateless protocol in which the server handles one request at a time and instantly forgets you. Luckily HTTP allows cookies, which are bits of data the server sends to the client for it to reveal upon revisiting. At first cookies were used only to annoy ("Hello, Steve! You have visited this page 3 times"), but a cookie can hold the ID of a session record in a database, which contains any state information that you like.

You can authenticate without sessions via HTTP server configuration, as long as you like the dull dialog box the browser pops up when users enter a restricted area. Oh, and you don't mind the fact that users won't be able to "log out" without quitting their browser, nor can you force a logout after a certain timeout value. Nor can you allow users to register themselves... these are all existing, solved problems and the author shows some of the best solutions. Common tasks like allowing users to change their passwords, recover their passwords if (I mean when) they forget them, and arranging users in groups to which you assign common permissions are also covered.

My favorite example from this chapter is the humans-only registration application. Remember when online voting for the Major League Baseball All-Star Game first started? Anyone who knew how to write a web client could have automated a task to vote as many times as the server could handle, and have his favorite players be the all-star team.* To bring it closer to home, what if somebody decides to bog down your site by automatically registering a huge number of times and filling up your database? You can keep these things from happening by making users look at images which contain text but are hard for computers to "read." PHP is in use at all stages of this game, from writing the registration form's HTML to generating the obscured image on-the-fly.

Chapter 2: XML
XML is a fact of life and, hype aside, is a great way to store and transmit machine-readable data. One of the most visible applications is the thousands of bloggers and news sites providing XML feeds of their headlines. You can write portal sites that grab these headlines, parse them all and present them on your site with links to the full text at the source.

There are two ways to parse XML: with events, or by using the Document Object Model (DOM). The methodologies are similar to reading a plain-text file line-by-line or all at once. Using events you can implement a finite-state machine based on which tags and text come down the pike. Or you can slurp the whole document into memory and find any part of it with ease. The built-in library for the former is based on the popular Simple API for XML (or SAX; don't you like those nested acronyms?), while the latter often uses XPath to find the particular document nodes you want.
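To make the contrast concrete, here is a minimal sketch, not taken from the book, that pulls headlines out of a made-up RSS snippet both ways. It assumes PHP's bundled xml and dom extensions are enabled and uses present-day PHP syntax; the feed contents and the function names headlinesByEvents and headlinesByDom are invented for illustration.

```php
<?php
// Hypothetical feed; a real portal would fetch this over HTTP.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>First headline</title><link>http://example.com/1</link></item>
    <item><title>Second headline</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML;

// Event-based parsing: a small state machine driven by tag and text events.
function headlinesByEvents(string $xml): array
{
    $titles = [];
    $open = []; // stack of currently open element names

    $parser = xml_parser_create();
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$open, &$titles) {
            $open[] = $name;
            if (array_slice($open, -2) === ['item', 'title']) {
                $titles[] = ''; // a new headline starts here
            }
        },
        function ($parser, $name) use (&$open) {
            array_pop($open);
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $text) use (&$open, &$titles) {
            // Text may arrive in several chunks, so append rather than assign.
            if (array_slice($open, -2) === ['item', 'title']) {
                $titles[count($titles) - 1] .= $text;
            }
        }
    );

    xml_parse($parser, $xml, true);
    return $titles;
}

// DOM parsing: load the whole document, then ask XPath for the nodes.
function headlinesByDom(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    $titles = [];
    foreach ($xpath->query('//item/title') as $node) {
        $titles[] = $node->textContent;
    }
    return $titles;
}

print_r(headlinesByEvents($rss));
print_r(headlinesByDom($rss));
```

The event version has to track where it is in the document by hand, while the DOM version hands that job to an XPath query at the cost of holding the whole document in memory.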

The author shows how to parse RSS feeds with both SAX and the DOM, and how to render a feed with DOM. Further, you can use Extensible Stylesheet Language Transformations (known as XSLT) to transform XML -- whether it's to XHTML for regular browser reading, WML (Wireless Markup Language) for viewing on mobile phones, or even SQL to communicate with a database.

Another exciting XML application is in the area of web services, in which agents (often but not necessarily web servers) communicate with each other over an XML-based protocol built on top of HTTP. The two most popular protocols are XML-RPC (the RPC stands for "Remote Procedure Call") and SOAP (which used to stand for "Simple Object Access Protocol" but now is just a name). Frequently changing information such as stock prices and weather is often offered through web services, but they can also be used as an object API between agents over the network. What's cool about using SOAP is you can publish to clients exactly what services you offer and how they can call them using the Web Services Description Language (WSDL).

Chapter 3: Alternative Content Types
If you've ever printed out a web page that was designed for browser viewing, you know the less-than-desired effect. The navigational elements, search boxes, and banners, while necessary for the web page, are useless once a static copy is printed. Furthermore, you need to extend your site to include users with less-featured browsers, such as mobile phones.

Fortunately, PHP has been taught many languages. PDF is the standard for print-quality documents, and there are several libraries (free and non-free) which allow you to generate them. WML is the HTML of cell phone browsers, in which screen space is at a premium and bandwidth scarce. SVG is an XML application which allows vector-based images like PostScript does. The coolest example, however, uses XUL (the XML User Interface Language, not to be confused with Zool) to make full GUI applications that you run through Mozilla. This isn't useful for the outside world where you can't force your users to use Mozilla (sigh), but works well for intranet applications that run on a variety of platforms.

The author also brings up in this chapter an HTML SAX parser he has written. You can process HTML pages chunk-by-chunk and extract the pieces you want. I hadn't known about such a class until I read the book and I'm very excited I know about it now. For sometimes it's necessary to parse a web page meant for humans to read (perhaps to pretend to be a user and automate your all-star voting), and most HTML pages won't validate as HTML, let alone XML.

A good point here is that a well-designed, tiered application will allow you to swap out different presentation classes with little code rewrite. Separating the tasks of extracting the data from the database and presenting it to the user in a variety of formats is a common job that, when done right, becomes easier each subsequent time.

Chapter 4: Stats and Tracking
Once your site is up and running, you'll be interested to know which parts are the most active, and how much traffic you're getting. Into a dynamic page you can obviously insert any logging mechanism, but a great place to put it is inside your site's logo. PHP can send binary data as easily as text. Why would you want to do this?

  • The logo is usually on every page (or it should be). You don't have to cut-and-paste code.
  • You can serve the image, then use the flush command to send the output on and do extra processing. This way logging doesn't get in the way of page rendering.

There are lots of packages available to collect and analyze data. The author goes through phpOpenTracker which is quite rich in features. There are also ways to collect data on what links users follow to leave your site, and to keep requests from search engines from cluttering your log files.

Chapter 5: Caching
Another possible knock against PHP is that, while it's good to have dynamic pages, some pages are unnecessarily so. It is a waste of server resources to keep rendering the same page anew. There are different ways to conserve them.

On the client side, you can use HTTP 1.1 headers like Cache-Control and Expires to tell browsers when it's okay to store cached copies locally.
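As a rough illustration of the client-side approach (not the book's code), the sketch below builds the two header lines for a page that may be cached for a given number of seconds. The function name clientCacheHeaders and its parameters are assumptions made for this example.

```php
<?php
// Build header lines telling browsers how long a cached copy stays valid.
// $maxAge is the lifetime in seconds; $now is the current Unix timestamp.
function clientCacheHeaders(int $maxAge, int $now): array
{
    return [
        'Cache-Control: public, max-age=' . $maxAge,
        // Expires takes an HTTP date, which is always expressed in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
    ];
}

// In a real page, send these before any output:
// foreach (clientCacheHeaders(3600, time()) as $line) { header($line); }
print_r(clientCacheHeaders(3600, time()));
```

Returning the header lines rather than calling header() directly keeps the function easy to check from the command line, where no HTTP response exists.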

On the server side, as can be expected, you have a greater level of control. You can use output buffering to delay sending of output to the browser, then save a copy of the output locally. On subsequent requests, you can serve the file rather than generate the HTML all over again. This can be implemented on a chunk (or block) level, so that you can keep some parts ultra-time sensitive and others not so much. The package PEAR::Cache_Lite can help with this.
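The output-buffering idea can be sketched in a few lines. This is a hand-rolled illustration under simple assumptions (one cache file per block, freshness judged by file modification time), not how PEAR::Cache_Lite works internally; the function name cachedOutput and the cache file path are made up.

```php
<?php
// Return the cached copy if it is still fresh; otherwise run $render,
// capture everything it prints, save that to disk, and return it.
function cachedOutput(string $cacheFile, int $ttl, callable $render): string
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return file_get_contents($cacheFile);
    }

    ob_start();              // start holding output back from the browser
    $render();               // the expensive page or block generation
    $html = ob_get_clean();  // grab the buffered output and stop buffering

    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

// Hypothetical usage: cache a block of headlines for five minutes.
$cacheFile = sys_get_temp_dir() . '/headlines.cache.html';
echo cachedOutput($cacheFile, 300, function () {
    echo '<ul><li>Generated at ' . date('H:i:s') . '</li></ul>';
});
```

Calling it once per block, each with its own file and lifetime, gives the chunk-level control described above.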

Chapter 6: Development Technique
The last two chapters were my favorites of the two-volume set. They are on a higher level of abstraction than the features of PHP's library of functions, or the previous five chapters on real-world solutions. After you've reached a certain level of expertise in PHP coding, you begin to wonder about the "right" way to do things. The author shows how to use Xdebug to find bottlenecks in your code, as well as a few quick optimization tips (for instance, design your flow control so that the first choice is the one most often taken).

He then discusses the principles of N-tiered design. N is usually 5, but the data layer (usually a database or file system) and presentation layer (usually the browser) are most often handled outside of PHP, so you normally have three levels to worry about:

  • Data Access: Getting data from the outside world into your application
  • Application Logic: Doing whatever unique thing your application is supposed to do
  • Presentation Logic: Forming a response in a format acceptable to your client

Keeping these layers separate and restricting them to communicating through well-defined interfaces allows you maximum flexibility. If you need to change databases (say you just got venture capital money and can afford Oracle now), you can do so by changing only one layer. If you want to serve different flavors of HTML, or different markup languages altogether, or binary data, you can do so by changing only one layer. You can even strive for maximum distributability by enabling your layers to "live" on physically independent machines and communicate with XML-RPC or SOAP.

Documenting your code is essential. Anybody who's been programming for over a year has gone back to code he or she's written and thought, "Now what the heck was this supposed to do?" It's even more essential when you write something and wish to distribute it for the benefit of others. You can expect them to grok your code at an even lower rate since they didn't write it the first time.

Luckily, scripted languages like PHP are excellent at parsing text files, including PHP scripts themselves. Using well-defined documentation formats akin to JavaDoc, you can embed documentation in your code inside comments, and use tools like phpDocumentor to extract these documentation blocks and format them as nice, cross-referenced HTML. In fact, writing doc blocks before your code is a good way to think ahead about how you want your classes and methods to work.

Unit Testing, one of the most digestible dogmas of Extreme Programming, is an awesome way to test your code for logic errors. You build up tiny test cases (using mock objects to isolate the class you're testing) and build as many as you like. Once you do this (PHPUnit and SimpleTest are two rich frameworks), you keep your tests and each time you add features, you run your tests to make sure you haven't added bugs as well.

Chapter 7: Design Patterns
Design Patterns is one of the modern classics in information technology. After having done OOP for a while, you will inevitably get the feeling of deja vu that you've solved a problem before. Not so concretely as "I need a database abstraction layer," or "I need a templating system," but as in "I need a way to create objects without specifying exactly what class they belong to," or "I'm tired of writing so many if statements." Design patterns are common object architectures which can be used to solve common (though unique) problems.

Many design patterns are more suited to state-equipped applications with GUIs, but there are plenty to assist the PHP coder. The Factory Method is a pattern through which an object can create other objects of varying classes. So instead of writing mysql_connect everywhere, then having to change every occurrence of that function, you can abstract all database interaction to a class, then instantiate a database connection through a class method of another class: $db = MyApp::getDatabaseConnection(). This is useful when the connection (not just the RDBMS, but the actual database) you want varies depending on whether you are developing, testing, or going live with your application. Factory methods are also a good way to avoid global configuration variables.
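Here is one way the MyApp::getDatabaseConnection() idea could look, as a sketch rather than the book's implementation. The interface, the two connection classes, and the $environment parameter are hypothetical stand-ins; real classes would wrap an actual database driver.

```php
<?php
// Common interface, so callers never care which concrete class they get.
interface DatabaseConnection
{
    public function describe(): string;
}

// Hypothetical stand-ins for real driver wrappers.
class LiveDatabase implements DatabaseConnection
{
    public function describe(): string
    {
        return 'live database';
    }
}

class TestDatabase implements DatabaseConnection
{
    public function describe(): string
    {
        return 'test database';
    }
}

class MyApp
{
    // The factory method: the one place that decides which class to build.
    public static function getDatabaseConnection(string $environment = 'live'): DatabaseConnection
    {
        if ($environment === 'test') {
            return new TestDatabase();
        }
        return new LiveDatabase();
    }
}

$db = MyApp::getDatabaseConnection('test');
echo $db->describe(), "\n";
```

Switching environments, or database vendors, then means touching only the factory method instead of every place that opens a connection.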

The Iterator Pattern and the Observer Pattern are two others mentioned in this chapter. Iterators are used often in paging through database results. Observers are used to let objects notify other objects of changes in their state. This chapter will make you want to go read the whole Gang-of-Four book if you haven't already.
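For a feel of the Observer Pattern, here is a small sketch built on the SplSubject and SplObserver interfaces that ship with PHP. It is an illustration only, not necessarily how the book presents the pattern; the Article and ChangeLog classes are invented for the example.

```php
<?php
// The subject: notifies every attached observer when its state changes.
class Article implements SplSubject
{
    private $observers = [];
    private $title = '';

    public function attach(SplObserver $observer): void
    {
        $this->observers[] = $observer;
    }

    public function detach(SplObserver $observer): void
    {
        $this->observers = array_values(array_filter(
            $this->observers,
            function ($existing) use ($observer) {
                return $existing !== $observer;
            }
        ));
    }

    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    public function setTitle(string $title): void
    {
        $this->title = $title;
        $this->notify(); // state changed, so tell the observers
    }

    public function getTitle(): string
    {
        return $this->title;
    }
}

// An observer: records each change it is told about.
class ChangeLog implements SplObserver
{
    public $entries = [];

    public function update(SplSubject $subject): void
    {
        $this->entries[] = 'title is now: ' . $subject->getTitle();
    }
}

$article = new Article();
$log = new ChangeLog();
$article->attach($log);
$article->setTitle('Hello');
print_r($log->entries);
```

The subject never needs to know what its observers do with the news, which is what keeps the two sides loosely coupled.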

My biggest beef with the book is that this wasn't presented earlier on, perhaps at the beginning of Volume II. As a climax, it leaves me flat, wondering how the rest of the volume could have been derived from this very cool concept. But most PHP books conclude with chapters on how to extend PHP on the C level, or giant case studies involving massive code dumps, and I'm often not satisfied with them. This is a nice philosophical note to go out on. And there's something to be said for the argument that books like these aren't written to be read cover-to-cover.

Appendices
The book closes with the same appendices as in Volume I. Since I don't know the URL of my review of that volume, I'll just copy: You can read about which configuration directives you're probably most interested in (the complete list you can get on PHP's web site), some common security breaches, and how to install PEAR, PHP's version of CPAN. My favorite appendix is the "Hosting Provider Checklist," a great reference for evaluating whether kewlhosting.com is going to give you the freedom and support you need to make a great hosted web site.

This volume was informative, well-written, and inspirational in that it made me want to go out and add cool and useful features to my web sites. Check it out if you can.

*Not really (not that I tried or anything), but they've always been a little bit smarter about it. You get my point, though. This did happen on an ESPN.com Page 2 mascot popularity contest, but they noticed through request headers that millions of votes were coming from the same place, and invalidated all those votes.


In real life, Matthew Leingang is Preceptor in Mathematics at Harvard University. He promises to review any book sent to him for free, and sometimes actually does it. Both volumes of The PHP Anthology are available from SitePoint. Slashdot welcomes readers' book reviews; to see your own review here, carefully read the book review guidelines, then visit the submission page.

This discussion has been archived. No new comments can be posted.

Comments:
  • by Anonymous Coward on Monday August 09, 2004 @03:39PM (#9922900)
    The future of the web is the Semantic Web. XML is just 1/5th of the layer to fully implement it. Go look for things like OWL-S and study that as a supplement to PHP handoff.
  • If you were just starting to code Slashdot.org today, what would you code the site in: PHP or mod_perl?

    Perl.

    Answered by: CmdrTaco
    Last Modified: 10/28/00
    • by killermookie ( 708026 ) on Monday August 09, 2004 @03:50PM (#9922995) Homepage
      PHP version when that question was asked.
      Version 4.0.3
      11-Oct-2000

      So what would CmdrTaco say now for PHP5?
      • by Anonymous Coward
        So what would CmdrTaco say now for PHP5?

        "Whuh? Oh, not now, I'm in the middle of an Anime marathon!"
    • Yet you yourself use PHP for your own site (which you reference in your sig)? Interesting.
      • Maybe not everyone on /. is on a moralistic crusade for their favorite technology.

      • Yeah I do, and I will admit that I was looking to start a holy war (I'm stuck at work and bored to hell ;-) ).

        To be perfectly honest though I have grown sour towards PHP. I find I put more effort into wrestling to get my logic separate from my content, and mustering data into objects than I do writing good code.

        Perl is my language of choice because of this; though I am not simply writing off PHP. I favored it for a while, but now I am weary of it.

        • I just don't see why the PHP hating is going on... PHP is awesome, and from the latest update to your site "Well it's finally here. This is the 4th version of the site and I am much happier with it than all of the others. There has been many improvements to the backend of this site as well which make it much more pleasant to update."

          mod_perl has advantages and disadvantages, but for what people are using the net for right now, and the cost/time of development, PHP is the way to go as far as I'm concerned.
          • I really don't hate PHP, it's just that I find that the language itself conflicts with the way I want to program. I find Perl to be more fluid in that respect.

            Also as far as my own site I did use PHP to make it, and, in fact, it is hosted by phpwebhosting.com. Though the latter is more because they are an excellent host than anything else (seriously, they are sweet). My next revision is going to be in Perl. Growing a site in PHP is too difficult.

            I have two other web related projects (the kind that pay money)

            • I'm just the reverse of you. I use Perl for small hacks and scripts people need to do this or that on their webpage, but when I start to write a large project in Perl I quickly get into a big mess. I find PHP is much more in line with how I code, and PHP5 looks even more promising. I agree with you though, it's not a matter of what is the best, it's a matter of what you know and what is right for the task.
    • Last Modified: 10/28/00

      I think that "Last Modified" date says we might get a different answer now, PHP has changed a lot since 2000.
    • 10/28/00

      Probably would have said the same thing about MySQL considering where Postgres was in 2000.

  • socks? (Score:5, Funny)

    by zelurxunil ( 710061 ) <zelurxunilNO@SPAMgmail.com> on Monday August 09, 2004 @03:48PM (#9922979) Homepage Journal
    It's summertime and I'm sandals-only for the time being, so my socks remain safely in the top drawer. But the volume is nonetheless exciting.

    And you profess to be a geek! Haven't you heard of sandals-and-socks?
  • by bsd4me ( 759597 ) on Monday August 09, 2004 @03:53PM (#9923025)

    Does it have a chapter on how to talk to tech support at a hosting service to:

    • get them to upgrade to a moderately recent version of PHP?
    • get them to install an optional library?
    • get them to modify a system setting for you?
    • ask them why on earth they modified a system setting and didn't notify all of their PHP users?
    • ask them why they didn't notify their PHP users when they upgraded to a PHP version which changes some functions' semantics?

    Kidding aside, I have a love-hate relationship with PHP because of having to support applications where I don't have total control over how PHP is installed.

    • You probably just need a better host. I've used superwebhost.com and deru.net in the past, both of whom have been perfectly willing to install modules, etc. at my request. And both of them were very good about notifying me of any server changes. Nowadays I run a leased server at a hosting facility, so I can do it myself. That's way better, o' course :)
    • Posting way too late here, but for bsd4me:

      Have you considered one of the user mode linux hosts like JVDS? [jvds.com]

      They have BSD as well as Linux, and you're in complete control for about the same price as a space/bandwidth-equivalent shared hosting setup.

      The BSD machines obviously don't run um-Linux, but the equivalent mode of BSD that apparently works very well and has been around a long time.

      I've been using them for a while to host a bunch of small websites. I don't know how PHP performs in that environment
      • Thanks for the info about the hosting companies.

        I have a good hosting company that will work with me for system changes, but in the future I will probably go with a virtual host for the reasons you mentioned.

        Most of my griping was aimed at being in the situation where a client already has a hosting company and won't switch, and I was forced to use them and work inside their boundaries.

  • by Anonymous Coward
    I just shudder when I hear the words PHP and enterprise. I simply would never write an app. that I absolutely depended on in PHP. It's simply too easy to make mistakes in, and thrives on mixing the display (view) of data with business logic. PHP was "designed" by people who never actually thought through exactly what the problem was that they were trying to solve.
    • by PugMajere ( 32183 ) on Monday August 09, 2004 @04:03PM (#9923106) Homepage Journal
      While I tend to like Perl more, I've played with PHP enough to know that mixing the display with the logic is purely a personal choice in PHP.

      Look up the "Smarty" templating engine. That should give sufficient power to get your display out of your business logic, for the most part.
      • by Anonymous Coward
        Please. Not that 200K monster.

        If you want templating, use the PHP engine itself!

        All my .tpl templates are PHP files, and that allows for some tricks that conventional templates can't do.

        And still they work for separating business logic from presentation logic.
        If you like you could read this link:
        http://www.sitepoint.com/article/beyond-template-engine
        • You mean 200kB that are in shared memory because you use mmcache if you really care? BTW Smarty does translate everything to PHP. So actually you're using PHP as template language, when it comes to execution. But you've a little caching logic and some plug-ins that can help you if you know how to use it.

          b4n
        • And still they work for separating business logic from presentation logic.

          Which Smarty doesn't do, it mixes them right back up again.

          Not much use if you want your templates worked on by someone who isn't a programmer, although it certainly looks powerful I fail to see the point. If you want code in the templates just use PHP.

          After experimenting around the way I settled on was a 5-line function using file_get_contents() and str_replace(). I fail to see why a template engine needs 3000 lines of code as sm
    • by dave420 ( 699308 ) on Monday August 09, 2004 @04:31PM (#9923381)
      Wow. Do you actually have a job? Wow.

      Any language is easy to make mistakes in. That's because people code it. True, you can block in HTML and PHP in the same document, but only a fool would think that's all PHP can do (or even that PHP somehow gets off on doing that). There are a plethora of PHP templating engines out there. Can't find one you like? Write your own in under 100 lines of code. It's that simple. Saying that's a major reason why PHP isn't enterprise-ready is just plain stupid.

      I've coded and deployed applications written in PHP throughout the company I work at, and they perform flawlessly. Maybe that's because I have an ounce of understanding what it takes to be a programmer. I'm not great, yet I can manage to pull it off without any problems.

      Then, you go top it off with pure speculation about the ideology behind PHP. Again, this shows you really don't have a clue about PHP.

      PHP is a very versatile, very powerful cross-platform tool. You can make shitty web apps in it, but then you can in any language. It's just as powerful/useful/stable as any other language out there.

      sheesh.

  • by pacman on prozac ( 448607 ) on Monday August 09, 2004 @03:56PM (#9923051)
    ...there's 400 posts from people who don't realise that register_globals has been turned OFF by default for years and only outdated old PHP scripts and guides need it turned on.
    • I guess I'm lame, but I always turn it back on. Is there some better way to pass variables through a link?

      -Erwos
      • by JAgostoni ( 685117 ) on Monday August 09, 2004 @04:23PM (#9923262) Homepage Journal
        You should access your GET/POST variable from their respective associative arrays:
        $_GET["varname"]
        $_POST["varname"]

        Having register_globals turned on is a security risk as it allows hackers to inject settings into your code or adversely affect your variables without you knowing it.
        • Also, if you need it turned on in order to run older PHP scripts you can switch it on per vhost or directory in Apache's configuration file with something like:

          <Directory /home/me/public_html/oldphp>
          php_admin_flag register_globals 1
          </Directory>

          Then you can have it switched off on the rest of the server.

          Have a look at the predefined variables [php.net] bit of the PHP manual, it explains quite well how to avoid needing register_globals.

        • Sort of. If you code correctly, there is no risk to values being injected. Never assume that a variable is NULL to begin with and you'll never have a problem with register globals turned on. Turning it off is just helping you when you code silly.

          • by Anonymous Coward
            Sort of. If you code correctly, there is no risk to values being injected. Never assume that a variable is NULL to begin with and you'll never have a problem with register globals turned on. Turning it off is just helping you when you code silly.

            That's a horrible attitude. What if you forget to null out a variable? What if you make a typo? You want as much to be constant between runs as possible. You want an attacker's control of variables to be as limited as possible. Making it easier to write secure, c

          • While that may be a valid point, there are still risks:
            - You inherited/are-using code from someone else
            - A hacker thinks of something you didn't

            Let's also not forget all the blame that is put on Microsoft for having insecure software. Sure ... if you don't open those attachments it's safe so why bother fixing Outlook/IE?
        • This only works in PHP 4.1.0 and later. $HTTP_GET_VARS and $HTTP_POST_VARS [php.net] will work in all(?) PHP4 versions.

          I spent a fair amount of time this morning debugging a simple script that worked on my test server (PHP 4.3.4), but failed on the production server (at a client's hosting company). After a while, I found out that it was PHP 4.0.x... ug

        • You can also use the $_REQUEST['varname'] to get the data from both.
      • Is there some better way to pass variables through a link?

        A querystring, maybe? <a href="foo.php?bar=baz&qxt=fred">This is a link that passes two variables</a>

        Maybe that's not what you mean, though.

      • I learned this from a slashdot post and am happy to pass it along--just start your page with
        extract($_GET);
        and/or
        extract($_POST);
        and continue as before.
        And it works for cookies, but I'm too lazy to see if it's 'cookie' or 'cookies'.
        This might totally negate the security stuff* but it's handy and you don't have to mess with PHP's settings (i.e., php.ini) which may not be possible on a server you don't own.

        * at the very least: if you're expecting data from a POST, someone can't overwrite it by changing the
        • What about a mechanism to "sign" each form, maybe with an MD5 checksum, and pass that as a hidden field in form? And also have those signatures checked in your code (of course this would work easiest if you have 'static' forms - forms that don't modify).
          • That wouldn't work. When someone enters different values into the form fields the md5 checksum changes.

            Just use $_POST["var"] and $_GET["var"], as the PHP manual has been saying to do for years (at least 4). Using extract($_POST) is just as dangerous as using register_globals. Say your script does something like:

            if ($userIsLoggedIn) { doSecretStuff(); }

            Then all someone has to do is visit
            http://yoursite.com/index.php?userIsLoggedIn=true
            and they've potentially cracked your site. Without register globals or e
            • That's not what I'm saying. What I'm saying is to generate MD5 only for the names of the variables, so you can control what is passed on to the script, so you will not have any variable out of your control (like in your example). Also, a good simple alternative is to use extract() at the top of the script and then simply overwrite any important variable, based on your script.
  • Is the 'e' a typo in Harry's name?
  • by Anonymous Coward
    must... not... comment... on... Harrys... last... name... nnghhh!
  • by Neil Blender ( 555885 ) <neilblender@gmail.com> on Monday August 09, 2004 @04:27PM (#9923312)
    Posts so far:

    offtopic 8
    fuecks jokes 6
    obvious flames 5
    flamebites 6
    agrees with flame 1
    trolls 2
    funnies 2
    explanation of the funnies 1
    refunny the funny 2
    php related 3
    Perl is better 2
    misc 1
  • by LoFat ByLine ( 321449 ) on Monday August 09, 2004 @08:10PM (#9925342)
    Sorry to be so on-topic, but does anyone have a take on how this book compares to George Schlossnagle's Advanced PHP Programming [slashdot.org], which appears to cover a lot of the same ground?
