XML descriptors in PHP frameworks – considered harmful

No, I am not a seasoned PHP programmer and I do not intend to become one. But we do live in a harsh economy where all IT projects are worth considering, hence my occasional incursions into the world of PHP-driven websites.

I am not new to PHP either, but – coming from a Java world – I immediately felt the need for a serious MVC framework.

Nobody wants to reinvent the wheel each time a new website is built. Just type the obvious “PHP MVC framework” into Google and the results pages will be dominated by four open-source projects:

- PHPMVC is probably the oldest project and implements a Model 2 front controller.

- Ambivalence declares itself as a simple port of the Java Maverick project.

- Eocene is a “simple and easy to use OO web development framework for PHP and ASP.NET”, implementing MVC and front controller.

- Phrame is a Texas Tech University project released as LGPL, heavily inspired by Struts.

The choice is not easy. There are no examples of industrial-quality sites built with any of these frameworks.

(Some may say there are no examples of industrial-quality sites built with PHP at all, but let's ignore these nasty people for now.)

There are no serious comparisons of the four frameworks, either feature-wise or performance-wise.

In the tradition of open-source projects, the documentation is rather scarce and examples are “helloworld”-isms.

Yes, I am a bloody bastard for pointing out these aspects – since the authors are not paid to release these projects – and perhaps I could contribute some documentation myself. However, when under an aggressive schedule I feel it's easier to write my own framework than to understand other people's code and document it thoroughly.

However, I have a nice hint for you. The first three frameworks use XML files for controller initialization (call it “sitemap”, “descriptor” or otherwise, it's just a plain dumb XML file). So you can safely ignore them in a production environment.

Why? The “controller” is nothing more than a glorified state machine. The succession of states and transitions (or “actions” or whatever) has to be persisted somewhere. XML is probably a nice choice for Java frameworks, where the files are parsed once and the application server keeps a pool of controller objects in memory.
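To make the “glorified state machine” remark concrete, here is a minimal sketch in plain PHP (7 or later). The state names, the event names and the nextState() helper are all invented for illustration; none of the four frameworks is claimed to work exactly like this.

```php
<?php
// Hypothetical sitemap: state => [event => next state].
// All names below are made up for the sake of the example.
$sitemap = [
    'login'    => ['success' => 'home',     'failure' => 'login'],
    'home'     => ['edit'    => 'editForm', 'logout'  => 'login'],
    'editForm' => ['save'    => 'home',     'cancel'  => 'home'],
];

/**
 * Return the state reached from $state when $event occurs.
 * Unknown states or events leave the current state unchanged.
 */
function nextState(array $sitemap, string $state, string $event): string
{
    return $sitemap[$state][$event] ?? $state;
}

echo nextState($sitemap, 'login', 'success'), "\n"; // home
echo nextState($sitemap, 'home', 'edit'), "\n";     // editForm
echo nextState($sitemap, 'home', 'bogus'), "\n";    // home
```

Whatever the storage format, this table of states and transitions is the thing that has to be available on every request.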

But PHP is stateless between requests: every page view starts from scratch. The only way of keeping state is via the filesystem or a database, usually keyed on an ad-hoc generated unique id which is kept in the session cookie. Moreover, although serialize() can store arrays and objects, a complex object such as the controller cannot be persisted easily (resources cannot be serialized, and every class definition must be loaded before unserialize() is called), so it ends up being retrieved from XML and fully rebuilt. Unlike in Java appservers, objects cannot be shared between multiple sessions, so pooling is not an option.

Thus, in PHP, the XML approach is strongly discouraged, since it means that the XML files are parsed for each page that is viewed on the site. Although PHP's parser is James Clark's Expat, one of the fastest parsers right now (written in C), note that the parsed document must still be walked in order to create the controller object (which becomes more and more complex as the site grows). This is heavy overhead, no matter how you look at it.
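For a feel of the work involved, here is a rough sketch of rebuilding a transition table from a descriptor with PHP's Expat-based xml extension (it assumes that extension is enabled). The descriptor layout, the element and attribute names and the buildSitemapFromXml() function are hypothetical and do not follow any particular framework's schema.

```php
<?php
// Hypothetical descriptor, invented for illustration only.
$descriptor = <<<XML
<sitemap>
  <state name="login">
    <transition on="success" to="home"/>
    <transition on="failure" to="login"/>
  </state>
  <state name="home">
    <transition on="logout" to="login"/>
  </state>
</sitemap>
XML;

/**
 * Parse the descriptor and rebuild the transition table.
 * In an XML-driven setup, work of this kind is repeated on every page view.
 */
function buildSitemapFromXml(string $xml): array
{
    $sitemap = [];
    $current = null;

    $parser = xml_parser_create();
    // Keep element and attribute names in their original case.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Start-tag handler: open a state or record a transition.
        function ($parser, string $name, array $attrs) use (&$sitemap, &$current) {
            if ($name === 'state') {
                $current = $attrs['name'];
                $sitemap[$current] = [];
            } elseif ($name === 'transition' && $current !== null) {
                $sitemap[$current][$attrs['on']] = $attrs['to'];
            }
        },
        // End-tag handler: leave the current state.
        function ($parser, string $name) use (&$current) {
            if ($name === 'state') {
                $current = null;
            }
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $sitemap;
}

$sitemap = buildSitemapFromXml($descriptor);
echo $sitemap['login']['success'], "\n"; // home
```

The parse itself is fast; the point is that the handlers and the rebuilt structure grow with the site, and all of it runs again for the next request.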

There are a few commonly cited reasons for needing XML in a web framework, but they do NOT apply to PHP apps. Myth quicklist:

  • it's “human-readable”. Come on, PHP is stored in ASCII-readable files, and even if you use Zend products to compile and encrypt your code, why on earth would you allow readability and modification of the controller state machine on the deployment site?
  • easier to modify than code. This is probably true for Java and complex frameworks, but PHP is significantly simpler than Java.
  • automatically generated from code by tools such as XDoclet or via an IDE. Only if you're writing it in Java, because PHP does not have such tools.

This means that the only serious candidate (among those considered here) for a PHP MVC framework is Phrame, which stores the sitemap as a multi-dimensional hashmap. Thus, you should either consider Phrame or, for small sites (fewer than 50 screens), you'll be better off writing your own mini-framework, with a state machine implemented as a hashed array of arrays and some session persistence in the database. I chose to serialize and persist an array containing primitive variables, using PHPSESSID as the primary key in order to retrieve and unserialize the array, all coupled with a simple “aging” mechanism for those users with the nasty habit of leaving the site without logging out first.
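The persistence part can be sketched roughly as follows. This is not my production code: the SessionStore class and its method names are hypothetical, and a plain PHP array stands in for the database table so that the example runs on its own. In a real page the key would come from session_id() and the timestamp from time().

```php
<?php
/**
 * Minimal session store sketch. The $rows array stands in for a database
 * table keyed by session id (an assumption made to keep this self-contained).
 */
class SessionStore
{
    private $rows = [];   // session id => ['data' => string, 'touched' => int]
    private $maxAge;      // lifetime in seconds

    public function __construct(int $maxAgeSeconds)
    {
        $this->maxAge = $maxAgeSeconds;
    }

    /** Serialize the state array and store it under the session id. */
    public function save(string $sessionId, array $state, int $now): void
    {
        $this->rows[$sessionId] = ['data' => serialize($state), 'touched' => $now];
    }

    /** Return the unserialized state, or null if missing or expired. */
    public function load(string $sessionId, int $now): ?array
    {
        $this->purge($now);
        if (!isset($this->rows[$sessionId])) {
            return null;
        }
        return unserialize($this->rows[$sessionId]['data']);
    }

    /** "Aging": drop rows that were not touched within the allowed lifetime. */
    public function purge(int $now): void
    {
        foreach ($this->rows as $id => $row) {
            if ($now - $row['touched'] > $this->maxAge) {
                unset($this->rows[$id]);
            }
        }
    }
}

$store = new SessionStore(1800); // 30-minute lifetime
$store->save('abc123', ['state' => 'home', 'user' => 42], 1000);

var_dump($store->load('abc123', 1500) !== null); // bool(true): still fresh
var_dump($store->load('abc123', 5000));          // NULL: aged out
```

Passing the clock in as a parameter is only a convenience for the example; it makes the aging rule easy to check without waiting half an hour.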

Finally, a last word of advice: use PEAR! This often overlooked library of PHP classes includes a few man-years of quality work. You'll get a generic database connection layer (PEAR DB) along with automatic generation of model classes mapped on the database schema (DB_DataObject), a plethora of HTML tools (tables, forms, menus) and some templating systems to choose from. All in a nice, easy to install and upgrade package.

Don't put a heavy burden on your upgrade cycle by using heterogeneous packages downloaded from different sites on the web, just use PEAR.

Or simply ignore the PHP offer and wait patiently for your next Java project. Vacations are almost over.