The PHP Anthology - Volume II, 'Applications'
There are seven chapters in this volume, each dealing with real-world problems. Many problems are those you've seen solved on sites you admire and wondered "How did they do that?" Others are frameworks that allow your site to run smoothly, with nobody getting accidentally logged out or having to wait too long while your script gluttonously pulls the same data out of the database for the Nth time. At the end, Fuecks goes back to the beginning, to show how proper design and development can save you time when you start your next project.
Chapter 1: Access Control
Authentication is the process by which users identify themselves. This is difficult in HTTP, a stateless protocol in which the server handles one request at a time and instantly forgets you. Luckily HTTP allows cookies, which are bits of data the server sends to the client for it to reveal upon revisiting. At first cookies were used only to annoy ("Hello, Steve! You have visited this page 3 times"), but a cookie can hold the ID of a session record in a database, which contains any state information that you like.
You can authenticate without sessions via HTTP server configuration, as long as you like the dull dialog box the browser pops up when users enter a restricted area. Oh, and you don't mind the fact that users won't be able to "log out" without quitting their browser, nor can you force a logout after a certain timeout value. Nor can you allow users to register themselves... these are all existing, solved problems and the author shows some of the best solutions. Common tasks like allowing users to change their passwords, recover their passwords if (I mean when) they forget them, and arranging users in groups to which you assign common permissions are also covered.
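To make the contrast with the browser's dialog box concrete, here is a minimal sketch of session-based login with an explicit logout and a forced timeout. It is not the book's code: the function names and the 900-second timeout are hypothetical, and `$session` stands in for `$_SESSION` as it would exist after calling `session_start()`.

```php
<?php
// Sketch only: $session stands in for $_SESSION (after session_start());
// the function names and the default timeout are illustrative assumptions.

function logIn(array &$session, string $username, int $now): void
{
    $session['user'] = $username;
    $session['last_seen'] = $now;
}

function logOut(array &$session): void
{
    $session = [];   // an explicit logout, no browser restart needed
}

function currentUser(array &$session, int $now, int $timeout = 900): ?string
{
    if (!isset($session['user'], $session['last_seen'])) {
        return null;                       // nobody is logged in
    }
    if ($now - $session['last_seen'] > $timeout) {
        logOut($session);                  // forced logout after inactivity
        return null;
    }
    $session['last_seen'] = $now;          // activity resets the timer
    return $session['user'];
}
```

In a real page you would pass `$_SESSION` and `time()`; taking them as parameters just keeps the logic easy to exercise on its own.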
My favorite example from this chapter is the humans-only registration application. Remember when online voting for the Major League Baseball All-Star Game first started? Anyone who knew how to write a web client could have automated a task to vote as many times as the server could handle, and have his favorite players be the all-star team.* To bring it closer to home, what if somebody decides to bog down your site by automatically registering a huge number of times and filling up your database? You can keep these things from happening by making users look at images which contain text but are hard for computers to "read." PHP is in use at all stages of this game, from writing the registration form's HTML to generating the obscured image on-the-fly.
Chapter 2: XML
XML is a fact of life and, hype aside, is a great way to store and transmit machine-readable data. One of the most visible applications is the thousands of bloggers and news sites providing XML feeds of their headlines. You can write portal sites that grab these headlines, parse them all and present them on your site with links to the full text at the source.
There are two ways to parse XML: with events, or by using the Document Object Model (DOM). The two methodologies are analogous to reading a plain-text file line-by-line or all at once. Using events you can implement a finite-state machine based on which tags and text come down the pike. Or you can slurp the whole document into memory and find any part of it with ease. The built-in library for the former is based on the popular Simple API for XML (or SAX; don't you like those nested acronyms?), while the latter often uses XPath to find the particular document nodes you want.
The author shows how to parse RSS feeds with both SAX and the DOM, and how to render a feed with DOM. Further, you can use Extensible Stylesheet Language Transformations (known as XSLT) to transform XML -- whether it's to XHTML for regular browser reading, WML (Wireless Markup Language) for viewing on mobile phones, or even SQL to communicate with a database.
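The two parsing styles can be compared side by side on a toy feed. This is a sketch in present-day PHP syntax rather than the book's own listings; the feed contents and the function names `saxTitles` and `domTitles` are made up for illustration. It assumes PHP's standard xml and dom extensions are enabled.

```php
<?php
// Hypothetical two-item feed, used only for illustration.
$rss = <<<XML
<rss version="2.0"><channel>
  <title>Example Feed</title>
  <item><title>First headline</title><link>http://example.com/1</link></item>
  <item><title>Second headline</title><link>http://example.com/2</link></item>
</channel></rss>
XML;

// Event-based (SAX-style): a tiny state machine driven by tag events.
function saxTitles(string $xml): array
{
    $titles = [];
    $inItem = false;
    $inTitle = false;
    $buffer = '';

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$inItem, &$inTitle, &$buffer) {
            if ($name === 'item') {
                $inItem = true;
            } elseif ($inItem && $name === 'title') {
                $inTitle = true;
                $buffer = '';
            }
        },
        function ($p, $name) use (&$inItem, &$inTitle, &$buffer, &$titles) {
            if ($name === 'title' && $inTitle) {
                $titles[] = $buffer;
                $inTitle = false;
            } elseif ($name === 'item') {
                $inItem = false;
            }
        }
    );
    xml_set_character_data_handler($parser, function ($p, $data) use (&$inTitle, &$buffer) {
        if ($inTitle) {
            $buffer .= $data;   // text may arrive in several chunks
        }
    });

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $titles;
}

// DOM-style: load the whole document, then pick out nodes with XPath.
function domTitles(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    $titles = [];
    foreach ($xpath->query('//item/title') as $node) {
        $titles[] = $node->textContent;
    }
    return $titles;
}
```

Both functions return only the item titles and skip the channel title. The event version has to track where it is in the document by hand, while the DOM version trades memory for a one-line query.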
Another exciting XML application is in the area of web services, in which agents (often but not necessarily web servers) communicate with each other over an XML-based protocol built on top of HTTP. The two most popular protocols are XML-RPC (the RPC stands for "Remote Procedure Calls") and SOAP (which used to stand for "Simple Object Access Protocol" but now is just a name). Frequently changing information such as stock prices and weather is often offered through web services, but they can also be used as an object API between agents over the network. What's cool about using SOAP is you can publish to clients exactly what services you offer and how they can call them using the Web Services Description Language (WSDL).
Chapter 3: Alternative Content Types
If you've ever printed out a web page that was designed for browser viewing, you know the less-than-desired effect. The navigational elements, search boxes, and banners, while necessary for the web page, are useless once a static copy is printed. Furthermore, you need to extend your site to include users with less-featured browsers, such as mobile phones.
Fortunately, PHP has been taught many languages. PDF is the standard for print-quality documents, and there are several libraries (free and non-free) which allow you to generate them. WML is the HTML of cell phone browsers, in which screen space is at a premium and bandwidth scarce. SVG is an XML application which allows vector-based images like PostScript does. The coolest example, however, uses XUL (the XML User interface Language, not to be confused with Zool) to make full GUI applications that you run through Mozilla. This isn't useful for the outside world where you can't force your users to use Mozilla (sigh), but works well for intranet applications that run on a variety of platforms.
The author also brings up in this chapter an HTML SAX parser he has written. You can process HTML pages chunk-by-chunk and extract the pieces you want. I hadn't known about such a class until I read the book and I'm very excited I know about it now. For sometimes it's necessary to parse a web page meant for humans to read (perhaps to pretend to be a user and automate your all-star voting), and most HTML pages won't validate as HTML, let alone XML.
A good point here is that a well-designed, tiered application will allow you to swap out different presentation classes with little code rewrite. Separating the task of extracting data from the database from the task of presenting it to the user in a variety of formats is a common job that, when done right, becomes easier each time.
Chapter 4: Stats and Tracking
Once your site is up and running, you'll be interested to know which parts are the most active, and how much traffic you're getting. Into a dynamic page you can obviously insert any logging mechanism, but a great place to put it is inside your site's logo. PHP can send binary data as easily as text.
Why would you want to do this?
- The logo is usually on every page (or it should be). You don't have to cut-and-paste code.
- You can serve the image, then call the flush() function to send the output on and do extra processing. This way logging doesn't get in the way of page rendering.
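The logo trick might look roughly like the sketch below. The function names, the log format, and the idea of a script such as `logo.php` are assumptions for illustration, not the book's code. Note also that `flush()` only pushes output on if no output buffering or web server buffering stands in the way.

```php
<?php
// Hypothetical helper: append one tab-separated line per request.
function logHit(string $logFile, array $server): void
{
    $fields = [
        date('c'),
        $server['REMOTE_ADDR'] ?? '-',
        $server['HTTP_REFERER'] ?? '-',     // the page that embedded the logo
        $server['HTTP_USER_AGENT'] ?? '-',
    ];
    // Replace control characters so a crafted header cannot forge log lines.
    $fields = array_map(function ($value) {
        return preg_replace('/[\x00-\x1F\x7F]/', ' ', (string) $value);
    }, $fields);

    file_put_contents($logFile, implode("\t", $fields) . "\n", FILE_APPEND | LOCK_EX);
}

// Hypothetical entry point for a script such as logo.php.
function serveLogo(string $imagePath, string $logFile, array $server): void
{
    header('Content-Type: image/png');
    header('Content-Length: ' . filesize($imagePath));
    readfile($imagePath);       // binary data goes out like any other output
    flush();                    // send the image on before doing the bookkeeping
    logHit($logFile, $server);
}

// Usage inside logo.php (paths are placeholders):
// serveLogo(__DIR__ . '/logo.png', '/tmp/hits.log', $_SERVER);
```

The page then references the script in its image tag instead of the static file, so every page that shows the logo is logged without any per-page code.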
There are lots of packages available to collect and analyze data. The author goes through phpOpenTracker which is quite rich in features. There are also ways to collect data on what links users follow to leave your site, and to keep requests from search engines from cluttering your log files.
Chapter 5: Caching
Another possible knock against PHP is that, while it's good to have dynamic pages, some pages are unnecessarily so. This is a waste of server resources to keep rendering the same page anew. There are different ways to conserve.
On the client side, you can use HTTP 1.1 headers like Cache-Control and Expires to tell browsers when it's okay to store cached copies locally.
On the server side, as can be expected, you have a greater level of control. You can use output buffering to delay sending of output to the browser, then save a copy of the output locally. On subsequent requests, you can serve the file rather than generate the HTML all over again. This can be implemented on a chunk (or block) level, so that you can keep some parts ultra-time sensitive and others not so much. The package PEAR::Cache_Lite can help with this.
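The output-buffering approach can be sketched in a few lines. This is a hand-rolled illustration, not the PEAR::Cache_Lite API; the function name `cachedOutput`, the cache file path, and the lifetime are all assumptions.

```php
<?php
// Return cached output if it is fresh enough; otherwise run $render,
// capture whatever it echoes, store it, and return it.
function cachedOutput(string $cacheFile, int $ttl, callable $render): string
{
    clearstatcache(true, $cacheFile);   // do not trust a stale stat() result
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile);   // cache hit: no rendering at all
    }

    ob_start();                 // hold output back instead of sending it
    $render();                  // the expensive part, e.g. database queries
    $html = ob_get_clean();     // grab the buffered output and stop buffering

    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

// Usage (hypothetical block that is allowed to be five minutes old):
// echo cachedOutput('/tmp/headlines.html', 300, function () {
//     echo '<ul><li>...</li></ul>';
// });
```

Because the function wraps a single callable, it can be applied per block: give the slow, rarely changing parts of a page a long lifetime and leave the time-sensitive parts uncached.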
Chapter 6: Development Technique
The last two chapters were my favorites of the two-volume set. They are on a higher level of abstraction than the features of PHP's library of functions, or the previous five chapters on real-world solutions. After you've reached a certain level of expertise in PHP coding, you begin to wonder about the "right" way to do things. The author shows how to use Xdebug to find bottlenecks in your code, as well as a few quick optimization tips (for instance, design your flow control so that the first choice is the one most often taken).
He then discusses the principles of N-tiered design. N is usually 5, but the data layer (usually a database or file system) and presentation layer (usually the browser) are most often handled outside of PHP, so you normally have three levels to worry about:
- Data Access: Getting data from the outside world into your application
- Application Logic: Doing whatever unique thing your application is supposed to do
- Presentation Logic: Forming a response in a format acceptable to your client
Keeping these layers separate and restricting them to communicating through well-defined interfaces allows you maximum flexibility. If you need to change databases (say you just got venture capital money and can afford Oracle now), you can do so by changing only one layer. If you want to serve different flavors of HTML, or different markup languages altogether, or binary data, you can do so by changing only one layer. You can even strive for maximum distributability by enabling your layers to "live" on physically independent machines and communicate with XML-RPC or SOAP.
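A small sketch shows what "changing only one layer" means in practice. Every class and interface name here is hypothetical, and an in-memory array stands in for the database so the example is self-contained.

```php
<?php
// Data access: the only tier that knows where headlines come from.
interface HeadlineStore
{
    public function all(): array;
}

final class InMemoryHeadlineStore implements HeadlineStore
{
    private $rows;
    public function __construct(array $rows) { $this->rows = $rows; }
    public function all(): array { return $this->rows; }
}

// Application logic: decides which headlines matter, knows nothing about markup.
final class HeadlineService
{
    private $store;
    public function __construct(HeadlineStore $store) { $this->store = $store; }

    public function latest(int $limit): array
    {
        $rows = $this->store->all();
        usort($rows, function (array $a, array $b): int {
            return strcmp($b['date'], $a['date']);   // newest first
        });
        return array_slice($rows, 0, $limit);
    }
}

// Presentation logic: one class per output format.
interface HeadlineView
{
    public function render(array $rows): string;
}

final class HtmlView implements HeadlineView
{
    public function render(array $rows): string
    {
        $items = '';
        foreach ($rows as $row) {
            $items .= '<li>' . htmlspecialchars($row['title']) . '</li>';
        }
        return '<ul>' . $items . '</ul>';
    }
}

final class TextView implements HeadlineView
{
    public function render(array $rows): string
    {
        $lines = [];
        foreach ($rows as $row) {
            $lines[] = '* ' . $row['title'];
        }
        return implode("\n", $lines);
    }
}
```

Swapping `InMemoryHeadlineStore` for a class that queries a real database, or `HtmlView` for a WML or PDF view, leaves `HeadlineService` untouched.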
Documenting your code is essential. Anybody who's been programming for over a year has gone back to code he or she's written and thought, "Now what the heck was this supposed to do?" It's even more essential when you write something and wish to distribute it for the benefit of others. You can expect them to grok your code at an even lower rate since they didn't write it the first time.
Luckily, scripted languages like PHP are excellent at parsing text files, including PHP scripts themselves. Using well-defined documentation formats akin to JavaDoc, you can embed documentation in your code inside comments, and use tools like phpDocumentor to extract these documentation blocks and format them as nice, cross-referenced HTML. In fact, writing doc blocks before your code is a good way to think ahead about how you want your classes and methods to work.
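For readers who haven't seen one, a doc block in the JavaDoc-like style looks like the following. The function itself is a made-up example; the `@param`, `@return`, and `@throws` tags are the common ones documentation tools pick up.

```php
<?php
/**
 * Returns the arithmetic mean of a list of numbers.
 *
 * @param float[] $values Non-empty list of numbers.
 *
 * @return float The mean of $values.
 *
 * @throws InvalidArgumentException If $values is empty.
 */
function mean(array $values): float
{
    if (count($values) === 0) {
        throw new InvalidArgumentException('mean() needs at least one value');
    }
    return array_sum($values) / count($values);
}
```

Writing the comment first forces the questions the review mentions: what goes in, what comes out, and what happens on bad input.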
Unit Testing, one of the most digestible dogmas of Extreme Programming, is an awesome way to test your code for logic errors. You build up tiny test cases (using mock objects to isolate the class you're testing) and build as many as you like. Once you do this (PHPUnit and SimpleTest are two rich frameworks), you keep your tests and each time you add features, you run your test to make sure you haven't added bugs as well.
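The idea can be shown without any framework. The sketch below deliberately avoids the PHPUnit and SimpleTest APIs and uses a hand-written stand-in where those frameworks would supply a mock object; `Clock`, `Greeter`, `FixedClock`, and `expectSame` are all hypothetical names.

```php
<?php
// The dependency we want to isolate the class under test from.
interface Clock
{
    public function hour(): int;
}

// The class under test.
final class Greeter
{
    private $clock;
    public function __construct(Clock $clock) { $this->clock = $clock; }

    public function greeting(): string
    {
        return $this->clock->hour() < 12 ? 'Good morning' : 'Good afternoon';
    }
}

// Hand-written stand-in for a mock: always reports the hour it was given.
final class FixedClock implements Clock
{
    private $hour;
    public function __construct(int $hour) { $this->hour = $hour; }
    public function hour(): int { return $this->hour; }
}

// Minimal assertion helper; it throws so a failing case cannot go unnoticed.
function expectSame($expected, $actual, string $label): void
{
    if ($expected !== $actual) {
        throw new RuntimeException("FAILED: $label");
    }
}

// Tiny test cases, kept around and re-run after every change.
function runGreeterTests(): int
{
    expectSame('Good morning', (new Greeter(new FixedClock(9)))->greeting(), 'morning');
    expectSame('Good afternoon', (new Greeter(new FixedClock(15)))->greeting(), 'afternoon');
    expectSame('Good afternoon', (new Greeter(new FixedClock(12)))->greeting(), 'noon boundary');
    return 3;   // number of cases that passed
}
```

A real framework adds test discovery, reporting, and generated mocks, but the cycle is the same: add a feature, run every test, and find out immediately if something old broke.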
Chapter 7: Design Patterns
Design Patterns is one of the modern classics in information technology. After having done OOP for a while, you will inevitably get the feeling of deja vu that you've solved a problem before. Not so concretely as "I need a database abstraction layer," or "I need a templating system," but as in "I need a way to create objects without specifying exactly what class they belong to," or "I'm tired of writing so many if statements." Design patterns are common object architectures which can be used to solve common (though unique) problems.
Many design patterns are more suited to state-equipped applications with GUIs, but there are plenty to assist the PHP coder. The Factory Method is a pattern through which an object can create other objects of varying classes. So instead of writing mysql_connect everywhere, then having to change every occurrence of that function, you can abstract all database interaction to a class, then instantiate a database connection through a class method of another class: $db = MyApp::getDatabaseConnection(). This is useful when the connection (not just the RDBMS, but the actual database) you want varies depending on whether you are developing, testing, or going live with your application. Factory methods are also a good way to avoid global configuration variables.
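Here is one way the `MyApp::getDatabaseConnection()` call could be backed, as a sketch under stated assumptions: the connection classes, host names, and environment labels are invented for illustration and no real database is contacted.

```php
<?php
// What the rest of the application is allowed to know about a connection.
interface DatabaseConnection
{
    public function describe(): string;
}

// Hypothetical connection to a MySQL server (no real connection is opened here).
final class MysqlConnection implements DatabaseConnection
{
    private $host;
    private $database;
    public function __construct(string $host, string $database)
    {
        $this->host = $host;
        $this->database = $database;
    }
    public function describe(): string
    {
        return "mysql://{$this->host}/{$this->database}";
    }
}

// Hypothetical throwaway store for automated tests.
final class InMemoryConnection implements DatabaseConnection
{
    public function describe(): string
    {
        return 'memory://test';
    }
}

final class MyApp
{
    // Factory method: callers ask for "a connection" and never name a class.
    public static function getDatabaseConnection(string $environment): DatabaseConnection
    {
        switch ($environment) {
            case 'development':
                return new MysqlConnection('localhost', 'app_dev');
            case 'test':
                return new InMemoryConnection();
            case 'live':
                return new MysqlConnection('db.example.com', 'app_live');
            default:
                throw new InvalidArgumentException("Unknown environment: $environment");
        }
    }
}
```

Calling code only ever writes `$db = MyApp::getDatabaseConnection($environment);`, so moving from the development database to the live one, or to a different RDBMS, is a change inside the factory and nowhere else.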
The Iterator Pattern and the Observer Pattern are two others mentioned in this chapter. Iterators are used often in paging through database results. Observers are used to let objects notify other objects of changes in their state. This chapter will make you want to go read the whole Gang-of-Four book if you haven't already.
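Both patterns fit in one short sketch. It uses PHP's built-in `SplSubject`, `SplObserver`, `IteratorAggregate`, and `ArrayIterator`, which postdate the PHP 4 era the book was written in; the `Account` and `BalanceLog` classes are hypothetical.

```php
<?php
// Subject: notifies its observers whenever its state changes.
final class Account implements SplSubject
{
    private $observers = [];
    private $balance = 0;

    public function attach(SplObserver $observer): void
    {
        $this->observers[spl_object_hash($observer)] = $observer;
    }

    public function detach(SplObserver $observer): void
    {
        unset($this->observers[spl_object_hash($observer)]);
    }

    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    public function deposit(int $amount): void
    {
        $this->balance += $amount;
        $this->notify();            // state changed, tell whoever is listening
    }

    public function balance(): int
    {
        return $this->balance;
    }
}

// Observer that records each new balance, and can be looped over (Iterator).
final class BalanceLog implements SplObserver, IteratorAggregate
{
    private $entries = [];

    public function update(SplSubject $subject): void
    {
        if ($subject instanceof Account) {
            $this->entries[] = $subject->balance();
        }
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->entries);
    }
}

// Usage:
// $account = new Account();
// $log = new BalanceLog();
// $account->attach($log);
// $account->deposit(10);
// foreach ($log as $balance) { echo $balance, "\n"; }
```

`Account` never learns what its observers do with the news, and code that loops over `BalanceLog` never learns how the entries are stored.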
My biggest beef with the book is that this wasn't presented earlier on, perhaps at the beginning of Volume II. As a climax, it leaves me flat, wondering how the rest of the volume could have been derived from this very cool concept. But most PHP books conclude with chapters on how to extend PHP on the C level, or giant case studies involving massive code dumps, and I'm often not satisfied with them. This is a nice philosophical note to go out on. And there's something to be said for the argument that books like these aren't written to be read cover-to-cover.
Appendices
The book closes with the same appendices as in Volume I. Since I don't know the URL of my review of that volume, I'll just copy: You can read about which configuration directives you're probably most interested in (the complete list you can get on PHP's web site), some common security breaches, and how to install PEAR, PHP's version of CPAN. My favorite appendix is the "Hosting Provider Checklist," a great reference for evaluating whether kewlhosting.com is going to give you the freedom and support you need to make a great hosted web site.
This volume was informative, well-written, and inspirational in that it made me want to go out and add cool and useful features to my web sites. Check it out if you can.
*Not really (not that I tried or anything), but they've always been a little bit smarter about it. You get my point, though. This did happen on an ESPN.com Page 2 mascot popularity contest, but they noticed through request headers that millions of votes were coming from the same place, and invalidated all those votes.
In real life, Matthew Leingang is Preceptor in Mathematics at Harvard University. He promises to review any book sent to him for free, and sometimes actually does it. Both volumes of The PHP Anthology are available from SitePoint. Slashdot welcomes readers' book reviews; to see your own review here, carefully read the book review guidelines, then visit the submission page.
Re:A word if I may: (Score:2)
Symantic Web (Score:4, Funny)
Re:Symantic Web (Score:5, Funny)
From the slashdot FAQ... (Score:3, Interesting)
Re:From the slashdot FAQ... (Score:5, Interesting)
Version 4.0.3
11-Oct-2000
So what would CmdrTaco say now for PHP5?
Re:From the slashdot FAQ... (Score:1, Funny)
"Whuh? Oh, not now, I'm in the middle of an Anime marathon!"
Re:From the slashdot FAQ... (Score:3, Funny)
Re:From the slashdot FAQ... (Score:2)
Re:From the slashdot FAQ... (Score:2)
Yeah I do, and I will admit that I was looking to start a holy war (I'm stuck at work and bored to hell ;-) ).
To be perfectly honest though I have grown sour towards PHP. I find I put more effort into wrestling to get my logic separate from my content, and mustering data into objects than I do writing good code.
Perl is my language of choice because of this; though I am not simply writing off PHP. I favored it for a while, but now I am weary of it.
Re:From the slashdot FAQ... (Score:1)
mod_perl has advantages and disadvantages, but for what people are using the net for right now, and the cost/time of development, PHP is the way to go as far as I'm concerned.
Re:From the slashdot FAQ... (Score:2, Insightful)
I really don't hate PHP, it's just that I find that the language itself conflicts with the way I want to program. I find Perl to be more fluid in that respect.
Also as far as my own site I did use PHP to make it, and, in fact, it is hosted by phpwebhosting.com. Though the latter is more because they are an excellent host than anything else (seriously, they are sweet). My next revision is going to be in Perl. Growing a site in PHP is too difficult.
I have two other web related projects (the kind that pay money)
Re:From the slashdot FAQ... (Score:2)
Re:From the slashdot FAQ... (Score:3)
I think that "Last Modified" date says we might get a different answer now, PHP has changed a lot since 2000.
Re:From the slashdot FAQ... (Score:2)
Probably would have said the same thing about mysql considering where postgres was on 2000.
socks? (Score:5, Funny)
And you profess to be a geek! Haven't you heard of sandals-and-socks?
Re:socks? (Score:1)
Re:socks? (Score:1)
FredRated
"PMS: when women act like men for a few days a month"
Re:socks? (Score:1)
Re:socks? (Score:1)
Does it have a chapter... (Score:4, Informative)
Does it have a chapter on how to talk to tech support at a hosting service to:
Kidding aside, I have a love-hate relationship with PHP because of having to support applications where I don't have total control over how PHP is installed.
Re:Does it have a chapter... (Score:3, Informative)
Re:That's PHPs fault, not the host. (Score:1)
Optional libraries, as in compiled C libraries, yes they must be installed/activated by the server admin. Only if run as mod_php (which 99% of installs are) though. Run as a cgi and each user can run their own com
Re:Does it have a chapter... (Score:2)
Have you considered one of the user mode linux hosts like JVDS? [jvds.com]
They have BSD as well as Linux, and you're in complete control for about the same price as a space/bandwidth-equivalent shared hosting setup.
The BSD machines obviously don't run um-Linux, but the equivalent mode of BSD that apparently works very well and has been around a long time.
I've been using them for a while to host a bunch of small websites. I don't know how PHP performs in that environment
Re:Does it have a chapter... (Score:2)
Thanks for the info about the hosting companies.
I have a good hosting company that will work with me for system changes, but in the future I will probably go with a virtual host for the reasons you mentioned.
Most of my griping was aimed at being in the situation where a client already has a hosting company and won't switch, and I was forced to use them and work inside their boundaries.
Real-world enterprise applications (Score:1, Interesting)
Re:Real-world enterprise applications (Score:5, Informative)
Look up the "Smarty" templating engine. That should give sufficient power to get your display out of your business logic, for the most part.
Re:Real-world enterprise applications (Score:1, Informative)
If you want templating, use the PHP engine itself!
All my
And still they work for separating business logic from presentation logic.
If you like you could read this link:
http://www.sitepoint.com/article/beyond-te
Re:Real-world enterprise applications (Score:1)
b4n
Re:Real-world enterprise applications (Score:2)
Which Smarty doesn't do, it mixes them right back up again.
Not much use if you want your templates worked on by someone who isn't a programmer, although it certainly looks powerful I fail to see the point. If you want code in the templates just use PHP.
After experimenting around the way I settled on was a 5-line function using file_get_contents() and str_replace(). I fail to see why a template engine needs 3000 lines of code as sm
Re:Real-world enterprise applications (Score:5, Informative)
Any language is easy to make mistakes in. That's because people code it. True, you can block in HTML and PHP in the same document, but only a fool would think that's all PHP can do (or even that PHP somehow gets off on doing that). There are a plethora of PHP templating engines out there. Can't find one you like? Write your own in under 100 lines of code. It's that simple. Saying that's a major reason why PHP isn't enterprise-ready is just plain stupid.
I've coded and deployed applications written in PHP throughout the company I work at, and they perform flawlessly. Maybe that's because I have an ounce of understanding what it takes to be a programmer. I'm not great, yet I can manage to pull it off without any problems.
Then, you go top it off with pure speculation about the ideology behind PHP. Again, this shows you really don't have a clue about PHP.
PHP is a very versatile, very powerful cross-platform tool. You can make shitty web apps in it, but then you can in any language. It's just as powerful/useful/stable as any other language out there.
sheesh.
Re:J2EE (Score:1)
Re:Real-world enterprise applications (Score:5, Interesting)
Look, personally I adore PHP. But I see his point. The templating thing is a red herring, but it's indeed too easy to make oddball mistakes that arise from the weak typing.
Ie $arr[$index] =$value
vs $arr[index] = $value
Both will 'work'. One will also screw up.
It's a similar problem to the one that beset VB, and interestingly VB is almost the new COBOL of the Windows business world (for reasons that really puzzle me, though COBOL and VB do have 'being crap' issues in common).
However VB has the option to make typing explicit. PHP/Zend developers: Pay attention!
Re:Real-world enterprise applications (Score:1, Informative)
vs $arr[index] = $value
Both will 'work'. One will also screw up.
The second won't work, it'll throw an error.
Re:Real-world enterprise applications (Score:1)
define('index',$index);
above it
Re:Real-world enterprise applications (Score:1)
b4n
Re:Real-world enterprise applications (Score:1)
I was just making the code work.....
Re:Real-world enterprise applications (Score:2)
Figure you ain't done one of those 'big jobs' before either.
Re:Real-world enterprise applications (Score:2)
$arr[index]=$value
$arr["index"]=$value
Where you can see that PHP, unless index is a constant, will look for an entry with an index of "index", which understandably could cause problems (even though I've never seen that cause a problem, personally).
That's not a particularly good argument against scrapping an entire language, though. If you've read some of the PHP manual, you don't make those mistakes. And, believ
Re:Real-world enterprise applications (Score:1)
undefined index 'index' at line $line_number_of_error
in any version of PHP from about 4.1 upwards and is shown in error messages. It's not a mistake it's a syntax error as shown in the manuals arrays page [php.net]
How long before... (Score:5, Funny)
Re:How long before... (Score:2)
-Erwos
Re:How long before... (Score:5, Informative)
$_GET["varname"]
$_POST["varname"]
Having register_globals turned on is a security risk as it allows hackers to inject settings into your code or adversely affect your variables without you knowing it.
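A short sketch of the risk: `extract()` on request data behaves much like register_globals, turning every request parameter into a local variable. The flag name `$userIsLoggedIn` and both function names are illustrative only, and `$request` stands in for `$_GET`.

```php
<?php
// Unsafe: request keys become variables, and the flag is never initialised,
// so a visitor can simply supply it in the query string.
function unsafeCheck(array $request, bool $passwordOk): bool
{
    extract($request);                  // same effect as register_globals = On
    if ($passwordOk) {
        $userIsLoggedIn = true;
    }
    return !empty($userIsLoggedIn);
}

// Safer: nothing from the request is imported, and the flag starts out false.
function saferCheck(array $request, bool $passwordOk): bool
{
    $userIsLoggedIn = false;
    if ($passwordOk) {
        $userIsLoggedIn = true;
    }
    return $userIsLoggedIn;
}

// A request such as index.php?userIsLoggedIn=true arrives as:
// $request = ['userIsLoggedIn' => 'true'];
// unsafeCheck($request, false) grants access; saferCheck($request, false) does not.
```

Reading only the specific keys you expect from `$_GET` or `$_POST`, and initialising every variable you rely on, closes this particular hole.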
Re:How long before... (Score:2)
Then you can have it switched off on the rest of the server.
Have a look at the predefined variables [php.net] bit of the PHP manual, it explains quite well how to avoid needing register_globals.
Re:How long before... (Score:2)
Re:How long before... (Score:1, Insightful)
That's a horrible attitude. What if you forget to null out a variable? What if you make a typo? You want as much to be constant between runs as possible. You want an attacker's control of variables to be as limited as possible. Making it easier to write secure, c
Re:How long before... (Score:1)
- You inherited/are-using code from someone else
- A hacker thinks of something you didn't
Let's also not forget all the blame that is put on Microsoft for having insecure software. Sure
Re:How long before... (Score:2)
This only works in PHP 4.1.0 and later. $HTTP_GET_VARS and $HTTP_POST_VARS [php.net] will work in all(?) PHP4 versions.
I spent a fair amount of time this morning debugging a simple script that worked on my test server (PHP 4.3.4), but failed on the production server (at a client's hosting company). After a while, I found out that it was PHP 4.0.x... ug
Re:How long before... (Score:1)
Re:How long before... (Score:2)
A querystring, maybe? <a href="foo.php?bar=baz&qxt=fred">This is a link that passes two variables</a>
Maybe that's not what you mean, though.
Re:How long before... (Score:2)
extract($_GET);
and/or
extract($_POST);
and continue as before.
And it works for cookies, but I'm too lazy to see if it's 'cookie' or 'cookies'.
This might totally negate the security stuff* but it's handy and you don't have to mess with PHP's settings (i.e., php.ini) which may not be possible on a server you don't own.
* at the very least: if you're expecting data from a POST, someone can't overwrite it by changing the
Re:How long before... (Score:1)
Re:How long before... (Score:2)
Just use $_POST["var"] and $_GET["var"], as the PHP manual has been saying to do for years (at least 4). Using extract($_POST) is just as dangerous as using register_globals. Say your script does something like:
if ($userIsLoggedIn) { doSecretStuff(); }
Then all someone has to do is visit
http://yoursite.com/index.php?userIsLoggedIn=true
and they've potentially cracked your site. Without register globals or e
Re:How long before... (Score:1)
Re:MLB All-star voting (Score:3, Funny)
Just wondering... (Score:1)
must.. hold.. tongue.. (Score:1, Funny)
Typical PHP story (Score:4, Funny)
offtopic 8
fuecks jokes 6
obvious flames 5
flamebites 6
agrees with flame 1
trolls 2
funnies 2
explanation of the funnies 1
refunny the funny 2
php related 3
Perl is better 2
misc 1
Re:Typical PHP story (Score:1)
why I prefer Python 10
Re:Typical PHP story (Score:3, Funny)
Advanced PHP Programming? (Score:5, Interesting)