PHP, Zend Framework and Other Crazy Stuff
Archive for September, 2009
The Mysteries Of Asynchronous Processing With PHP – Part 2: Making Zend Framework Applications CLI Accessible
Sep 29th
In Part 1 of this series, we started an exploration of the concept of Asynchronous Processing as it applies to PHP. We covered the benefits it offers, the basic implementation directions often applied, and how to identify and separate tasks from the main application so they can be made subject to asynchronous processing. It is highly recommended that you read Part 1 before continuing with Part 2 so you can follow what I’m up to here.
UPDATE: Modified the bootstrap class and script based ZF runner to reflect some changes needed to support Part 3 of this series. These primarily allow for improved control over command line options.
With the theory heavy portion of the series out of the way, we can begin to explore the various implementation possibilities. In Part 3, we will examine implementing Asynchronous Processing using a child process, i.e. a separate PHP process we create from our application during a request. We’ll analyse this implementation option before introducing some source code so we may understand its advantages and disadvantages.
While, technically, this series is not Zend Framework specific since the same principles can be applied to any PHP application, I’ll be using the Zend Framework in examples of asynchronous processing from an application. As a result, Part 2 is a tangential detour into how to make a Zend Framework based application accessible from the command line before we delve into examples using this in future parts of the series. If you are not a Zend Framework user, I’m sure you can find relevant material online for your own preferred framework though the ZF pieces may still have some usefulness in understanding the approach from an MVC perspective.
Surprising to some, the Zend Framework is indeed usable from the command line…with some massaging. I’ve already noted that using a full application framework for a background task comes at a cost, since you are loading a lot of code, not all of which may be strictly necessary. Unless you are willing to invest in a custom framework specifically for such uses, however, your framework of choice is probably the simplest option.
I’m not going to describe setting up a basic application with the Zend Framework; however, you can do so by following the base application created in my book, Zend Framework: Surviving The Deep End (which is free, online in HTML form, and now duly advertised to you). The relevant chapters are Chapter 5 and Chapter 6. If you want to get started quickly, you can download the example application for the book (in progress) from http://github.com/padraic/ZFBlog
Unfortunately, the ZF is not immediately accessible from the command line. Although it offers classes like Zend_Console_Getopt and Zend_Controller_Response_Cli, the remaining pieces are mysteriously (and conspicuously) missing. They are not difficult to add however, especially if you are using Zend_Application to fuel your bootstrapping.
Adding Custom CLI Support Classes
There are two very obvious problems with calling a Zend Framework application from the command line. First, there is no Request class supporting command line options (though there is a Zend_Controller_Request_Simple). Secondly, the Front Controller always attempts to route the request, and all of the standard Routers assume you will use an HTTP request. This HTTP focus results in an Exception when routing occurs.
To improve this situation, we will implement two very simple custom classes: ZFExt_Controller_Request_Cli and ZFExt_Controller_Router_Cli.
ZFExt_Controller_Request_Cli very simply accepts an instance of Zend_Console_Getopt and attempts to locate a module, controller and action name from the command line options it exposes. If they exist, these are used to set the relevant module, controller and action names for the request (doing this manually negates the need for routing). Here’s the class, stored (if using the example app) at /library/ZFExt/Controller/Request/Cli.php:
[geshi lang=php]
require_once 'Zend/Controller/Request/Abstract.php';
class ZFExt_Controller_Request_Cli extends Zend_Controller_Request_Abstract
{
protected $_getopt = null;
public function __construct(Zend_Console_Getopt $getopt)
{
$this->_getopt = $getopt;
$getopt->parse();
if ($getopt->{$this->getModuleKey()}) {
$this->setModuleName($getopt->{$this->getModuleKey()});
}
if ($getopt->{$this->getControllerKey()}) {
$this->setControllerName($getopt->{$this->getControllerKey()});
}
if ($getopt->{$this->getActionKey()}) {
$this->setActionName($getopt->{$this->getActionKey()});
}
}
public function getCliOptions()
{
return $this->_getopt;
}
}[/geshi]
ZFExt_Controller_Router_Cli is basically a “dumb” router. It implements Zend_Controller_Router_Interface but all of its methods are blank. Since our CLI access does not need to be routed, we’re effectively just plugging the requirement for a Router object with something which is designed to do absolutely…nothing. Here’s the class, stored (if using the example app) at /library/ZFExt/Controller/Router/Cli.php:
[geshi lang=php]
class ZFExt_Controller_Router_Cli implements Zend_Controller_Router_Interface
{
public function route(Zend_Controller_Request_Abstract $dispatcher){}
public function assemble($userParams, $name = null, $reset = false, $encode = true){}
public function getFrontController(){}
public function setFrontController(Zend_Controller_Front $controller){}
public function setParam($name, $value){}
public function setParams(array $params){}
public function getParam($name){}
public function getParams(){}
public function clearParams($name = null){}
}[/geshi]
Putting these new classes to use will require manually adding them to the Front Controller we'll use in our application bootstrap. For CLI use, I've elected to implement a new bootstrap which is very similar to the one implemented for the example app at /library/ZFExt/Bootstrap.php. The CLI bootstrap below is stored to /library/ZFExt/BootstrapCli.php:
[geshi lang=php]
class ZFExt_BootstrapCli extends Zend_Application_Bootstrap_Bootstrap
{
protected $_getopt = null;
protected $_getOptRules = array(
'environment|e-w' => 'Application environment switch (optional)',
'module|m-w' => 'Module name (optional)',
'controller|c=w' => 'Controller name (required)',
'action|a=w' => 'Action name (required)'
);
protected function _initView()
{
// displaces View Resource class to prevent execution
}
protected function _initCliFrontController()
{
$this->bootstrap('FrontController');
$front = $this->getResource('FrontController');
$getopt = new Zend_Console_Getopt($this->getOptionRules(),
$this->_isolateMvcArgs());
$request = new ZFExt_Controller_Request_Cli($getopt);
$front->setResponse(new Zend_Controller_Response_Cli)
->setRequest($request)
->setRouter(new ZFExt_Controller_Router_Cli)
->setParam('noViewRenderer', true);
}
// CLI specific methods for option management
public function setGetOpt(Zend_Console_Getopt $getopt)
{
$this->_getopt = $getopt;
}
public function getGetOpt()
{
if (is_null($this->_getopt)) {
$this->_getopt = new Zend_Console_Getopt($this->getOptionRules());
}
return $this->_getopt;
}
public function addOptionRules(array $rules)
{
$this->_getOptRules = $this->_getOptRules + $rules;
}
public function getOptionRules()
{
return $this->_getOptRules;
}
// get MVC related args only (allows later uses of Getopt class
// to be configured for cli arguments)
protected function _isolateMvcArgs()
{
$options = array($_SERVER['argv'][0]);
foreach ($_SERVER['argv'] as $key => $value) {
if (in_array($value, array(
'--action', '-a', '--controller', '-c', '--module', '-m', '--environment', '-e'
))) {
$options[] = $value;
$options[] = $_SERVER['argv'][$key+1];
}
}
return $options;
}
}[/geshi]
This new bootstrap class performs two important functions. First, it sets up the application’s Front Controller to use our full set of CLI helper classes including the custom ones we added. Secondly, it allows for the setting of a command line option parser, an instance of Zend_Console_Getopt. The default used within the bootstrap class has a limited set of options, so we could set a replacement parser with an expanded set of command line options available. Unfortunately, we may not simply add new options and reparse due to the limitations of Zend_Console_Getopt but substitution will work just fine for most needs.
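To make the argument isolation a little more concrete, here is a standalone sketch of the same filtering idea, outside of the bootstrap class. The function name isolate_mvc_args() and the sample arguments (including the --verbose option) are invented purely for illustration. The logic mirrors _isolateMvcArgs() above, with one assumed addition: it skips an option name that arrives as the very last argument with no value after it.
[geshi lang=php]
<?php
// Hypothetical standalone version of the bootstrap's _isolateMvcArgs() logic.
function isolate_mvc_args(array $argv)
{
    $mvcFlags = array(
        '--action', '-a', '--controller', '-c',
        '--module', '-m', '--environment', '-e',
    );
    // keep the script name in the first slot, as the bootstrap does
    $options = array($argv[0]);
    foreach ($argv as $key => $value) {
        // copy an MVC option together with the value that follows it
        if (in_array($value, $mvcFlags, true) && isset($argv[$key + 1])) {
            $options[] = $value;
            $options[] = $argv[$key + 1];
        }
    }
    return $options;
}

// "--verbose" stands in for a task specific option; it is filtered out
$args = array('zfrun.php', '-c', 'task', '--verbose', '-a', 'echo');
print_r(isolate_mvc_args($args));[/geshi]
Only the script name and the MVC related pairs survive, which is what leaves any remaining arguments free to be parsed later against a different set of rules.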
Adding A Calling Script
All that remains to enable CLI access is to add a calling script to run the application. We’ll start by adding a PHP file at /scripts/zfrun.php. This will look very similar to a Zend Framework index.php file which uses Zend_Application:
[geshi lang=php]
if (!defined('APPLICATION_PATH')) {
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
}
if (!defined('APPLICATION_ROOT')) {
define('APPLICATION_ROOT', realpath(dirname(__FILE__) . '/..'));
}
set_include_path(
APPLICATION_ROOT . '/library' . PATH_SEPARATOR
. APPLICATION_ROOT . '/vendor' . PATH_SEPARATOR
. get_include_path()
);
require_once 'Zend/Loader/Autoloader.php';
$autoloader = Zend_Loader_Autoloader::getInstance();
$autoloader->setDefaultAutoloader(create_function('$class',
"include str_replace('_', '/', \$class) . '.php';"
));
// check for app environment setting
$i = array_search('-e', $_SERVER['argv']);
if (!$i) {
$i = array_search('--environment', $_SERVER['argv']);
}
if ($i) {
define('APPLICATION_ENV', $_SERVER['argv'][$i+1]);
}
if (!defined('APPLICATION_ENV')) {
if (getenv('APPLICATION_ENV')) {
$env = getenv('APPLICATION_ENV');
} else {
$env = 'production';
}
define('APPLICATION_ENV', $env);
}
$application = new Zend_Application(
APPLICATION_ENV,
APPLICATION_ROOT . '/config/cli.ini'
);
);
$application->bootstrap()->run();[/geshi]
That wasn’t so bad. The script itself merely sets up the typical constants and include path needed for Zend_Application. We also have a block which checks the command line arguments for an application environment option, falling back to the APPLICATION_ENV environment variable and finally to “production”. The rules needed to parse the remaining command line options live in the bootstrap class (ZFExt_BootstrapCli), where addOptionRules() offers a means of appending additional rules as needed by varying tasks. All that is left for the script to do is bootstrap and run the application.
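The environment check in the script follows a simple order of precedence: a command line option wins, then the APPLICATION_ENV environment variable, then a default of “production”. The sketch below restates that order as a small standalone function so it can be tried in isolation. The function name resolve_environment() and its parameters are my own invention for this illustration and are not part of the Zend Framework.
[geshi lang=php]
<?php
// Hypothetical helper: resolves the environment name using the same
// precedence as zfrun.php (CLI option, environment variable, default).
function resolve_environment(array $argv, $envValue, $default = 'production')
{
    foreach (array('-e', '--environment') as $flag) {
        $i = array_search($flag, $argv, true);
        // the option must be present and followed by a value
        if ($i !== false && isset($argv[$i + 1])) {
            return $argv[$i + 1];
        }
    }
    // getenv() returns false when the variable is not set
    if (is_string($envValue) && $envValue !== '') {
        return $envValue;
    }
    return $default;
}

$withFlag    = array('zfrun.php', '-c', 'task', '-e', 'development');
$withoutFlag = array('zfrun.php', '-c', 'task');
echo resolve_environment($withFlag, 'staging'), "\n";
echo resolve_environment($withoutFlag, 'staging'), "\n";
echo resolve_environment($withoutFlag, false), "\n";[/geshi]
In the real script the second argument would simply be getenv('APPLICATION_ENV').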
The final piece of this jigsaw is adding the configuration file, cli.ini, passed to Zend_Application. This is a cut down version of the original application.ini used by the example app stored to /config/cli.ini:
[geshi lang=php][production]
; PHP INI Settings
phpSettings.display_startup_errors = 0
phpSettings.display_errors = 0
; Bootstrap Location
bootstrap.path = APPLICATION_ROOT "/library/ZFExt/BootstrapCli.php"
bootstrap.class = "ZFExt_BootstrapCli"
; Standard Resource Options
resources.frontController.controllerDirectory = APPLICATION_PATH "/controllers"
resources.frontController.moduleDirectory = APPLICATION_PATH "/modules"
; Module Options (Required For Mysterious Reasons)
resources.modules[] =
; Autoloader Options
autoloaderNamespaces[] = "ZFExt_"
[staging : production]
[testing : production]
phpSettings.display_startup_errors = 1
phpSettings.display_errors = 1
resources.frontController.throwExceptions = 1
[development : production]
phpSettings.display_startup_errors = 1
phpSettings.display_errors = 1
resources.frontController.throwExceptions = 1[/geshi]
The main difference from the original application.ini is the removal of any settings for a View. We won’t be rendering any templates for our CLI access. Otherwise, you can retain any other settings for database access, etc. This could also be added as a separate section to application.ini; however, I decided a separate CLI settings file made it a bit simpler to follow and allows setting the usual application environment based sections.
Adding CLI tasks to ZF Applications
We’ll start by adding a TaskController to the application. The name is largely irrelevant so don’t decide you must put all tasks into the same controller! You may also use controllers within a module should they require their own specific tasks or command line needs.
The new controller is added at /application/controllers/TaskController.php:
[geshi lang=php]
class TaskController extends Zend_Controller_Action
{
public function init()
{
if (!$this->getRequest() instanceof ZFExt_Controller_Request_Cli) {
exit('TaskController may only be accessed from the command line');
}
}
public function echoAction()
{
echo 'Hello, World!', "\n";
exit(0);
}
}[/geshi]
While this is a very simple example, echoing a message, the task itself could be as complicated as you wish. We’ve also added a quick check to ensure this controller cannot be accessed from a normal HTTP request – having publicly available tasks is not a good idea after all.
Using the CLI access from the command line
Use of our newly added CLI access to this Zend Application is very simple. There are four command line options defined. Here’s an example which calls the new task and sets the application environment (used in our configuration) to “development”. Note that if absent, the environment defaults to “production”.
php zfrun.php -c task -a echo -e development
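Which is equivalent to the long form, with each value passed as a separate argument so that the bootstrap’s argument isolation and the script’s environment check can match the option names exactly:
php zfrun.php --controller task --action echo --environment development
Using either, once you’ve navigated to the application’s /scripts directory, should echo the message we added to the task.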
Conclusion
In the second part of our look at Asynchronous Processing we’ve investigated how to enable CLI access to a Zend Framework application. In the future, this will allow us to delegate tasks asynchronously using command line calls and framework based tasks.
In Part 3, we’ll return to the Asynchronous Processing topic and put this work to use in explaining a very common implementation strategy for asynchronous tasks.
The Mysteries Of Asynchronous Processing With PHP – Part 1: Asynchronous Benefits, Task Identification and Implementation Methods
Sep 27th
Imagine a world where clients will give up on receiving responses from your application in mere seconds, where failed emails will give rise to complaints and lost business, where there exist tasks that must be performed regularly regardless of how many requests your application receives. This is not a fantasy world, it’s reality. In the real world your application must be responsive, reliable and capable of recovery from errors. These are obvious needs but all too often applications fail to realise them. Sometimes, developers even fail to realise they should even be concerned about them.
To offer an opening real-world example, I’ll borrow from a recent discussion I had concerning the Pubsubhubbub Protocol. If you are unfamiliar with Pubsubhubbub (PuSH), it’s a protocol which implements a publish-subscribe model where the publishers of RSS and Atom feeds can “push” updates to a group of Subscribers. The pushing is handled by an intermediary called a Hub which is pinged by the Publisher when they update a feed, and which then distributes the update to many Subscribers using a Callback URL they each have declared.
In that discussion, the original poster was having a problem. Whenever a Hub sent his Subscriber implementation an update, it seemed to do it repetitively for some mysterious reason. Eventually, the problem was identified. The Hub implements a five second timeout. If, after five seconds, the update request was not completed because the Subscriber failed to send a valid response, it was assumed to have failed. The Hub would then attempt it again, and again, until finally its configured number of retries was used up.
Why was the five second timeout being exceeded by the Subscriber? What was taking it so long in returning a response and finishing the request? You see, the Subscriber was not simply acknowledging the receipt of an update as demanded by the protocol, it was actually processing the entire update for its own use including a number of potentially expensive database operations before it completed the request. This was taking more than five seconds.
Here’s the problem in a nutshell. The Subscriber was performing work that had absolutely nothing to do with returning a response to the Hub and it was having an impact on the time it took to complete the request. The Hub couldn’t care less about the Subscriber’s processing, it was expecting a quick confirmation that the update was received. Instead, the Subscriber was effectively making it wait while it did something completely unrelated to that response. Using Asynchronous Processing, the Subscriber should have offloaded the feed processing elsewhere leaving it free to quickly respond to the Hub.
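As a rough sketch of what that offloading might look like, the callback below does nothing except store the raw update somewhere durable and report success. A separate process would pick the file up later and do the expensive work. The function name, the spool directory and the file naming scheme are all assumptions of mine for illustration; they are not taken from the Pubsubhubbub protocol or from the Subscriber implementation discussed above, and 204 is used simply as an example of a success status.
[geshi lang=php]
<?php
// Hypothetical subscriber callback: spool the update, do NOT process it here.
function accept_hub_update($rawBody, $spoolDir)
{
    if (!is_dir($spoolDir) && !mkdir($spoolDir, 0700, true)) {
        return 500; // could not prepare storage
    }
    // one file per update; uniqid() keeps the names distinct
    $file = $spoolDir . '/' . uniqid('update_', true) . '.xml';
    if (file_put_contents($file, $rawBody, LOCK_EX) === false) {
        return 500; // could not store the update
    }
    return 204; // acknowledged; processing happens elsewhere, later
}

$spool  = sys_get_temp_dir() . '/push_spool_example';
$status = accept_hub_update('<feed><entry>New post</entry></feed>', $spool);
echo 'Status to send back to the Hub: ', $status, "\n";[/geshi]
The request now costs one file write rather than a series of database operations, which is the whole point: the response no longer waits on work the Hub does not care about.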
What is Asynchronous Processing?
Asynchronous processing is a method of performing tasks outside the loop of the current request. Basically, you offload the task to another process, leaving the process serving the request free to respond quickly and without delay. Of course, not all tasks are caused by a request. Some can be performed without a request trigger, like some forms of maintenance or log parsing.
Implementing asynchronous processing can take a few directions:
1. A parent process can spawn a child process to complete a task in the background, allowing the parent process to continue uninterrupted.
2. You could add tasks to a Job Queue (or even Message Queue) relying on a background daemon or scheduled process to perform batch processing of outstanding tasks in the queue.
3. You could simply have a scheduled standalone task without the queue, and which is performed regardless of what requests are received.
There are, I’m sure, many more variations. Most readers will recognise at least one of these (hint: cron). Once you understand the nature of asynchronous processing you can find many uses for it in the most unlikely of places.
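To give the first direction listed above some shape, here is a minimal sketch of how a parent process might build the command used to launch a child PHP process in the background. It assumes a Unix-like shell and a php binary on the PATH. The helper name build_background_command() and the script path are hypothetical, and the exec() call is left commented out so the example only prints the command.
[geshi lang=php]
<?php
// Hypothetical helper: builds a command line which runs a PHP script in the
// background. Assumes a Unix-like shell and a "php" binary on the PATH.
function build_background_command($script, array $args = array())
{
    $parts = array('php', '-f', escapeshellarg($script));
    if ($args) {
        $parts[] = '--'; // everything after this is passed to the script
        foreach ($args as $arg) {
            $parts[] = escapeshellarg($arg);
        }
    }
    // discard output and detach with "&" so exec() can return immediately
    return implode(' ', $parts) . ' > /dev/null 2>&1 &';
}

$command = build_background_command('/path/to/task.php', array('42'));
echo $command, "\n";
// exec($command); // uncomment to actually spawn the child process[/geshi]
Redirecting the output matters here: without it, exec() waits for the child to finish, and the parent gains nothing.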
What Problems Does Asynchronous Processing Solve?
Our example demonstrates that resource intensive tasks can be detrimental to responsiveness, so much so that they become detrimental in turn to the client, whether that is a machine applying a configured timeout and being forced into retrying the same request over and over, or an actual person who has to stare at a blank page as the seconds tick by.
Resource intensive tasks are not the only ones worth applying asynchronous processing to, though they are likely the most obvious given their impact on clients. Most tasks worth offloading can be grouped into categories:
1. Tasks which are resource intensive, i.e. needing a lot of CPU cycles or memory to complete which will add to server load and delay client responses.
2. Tasks which are time consuming but not necessarily resource intensive. These may include database operations, HTTP requests, the use of external web services, and other operations which can suffer delays from network latency or external problems out of our control.
3. Tasks which must be completed regardless of errors. For example, sending emails like signup confirmations or order confirmations. If a first attempt fails (for whatever reason), they may need to be attempted many times before either succeeding or being reported or logged for attention. Obviously, attempting these just once within a request cycle is prone to error – if it fails during the request, will it ever be attempted again? What if your mail server is offline for an extended period?
4. Tasks not triggered by requests. If it needs to be performed, but is not triggered by a HTTP request, then it probably needs to be scheduled or manually added to a job queue somewhere.
If you can categorise any task in your application within those loose categories, then you have identified a potential candidate for asynchronous processing. If such tasks are presently performed during an application request, you just need to pass one additional test – the completion of the task should not be required in order to return a response. Sending emails, for example, can be done in the background and will not affect the response since it has no impact on any dynamic data passed to a view or template.
Implementing Asynchronous Processing: Task Identification, Separation and Reusability
So, we’ve worked through the thought process and theory of asynchronous processing. Before we run off and implement some examples, we first need a task! Once it’s identified, we then need to separate it from the application so it can be processed as an independent unit of work. To add to this, we should also make sure it’s reusable, essentially returning to our Object Oriented basics. The task should be implemented as a class, or set of classes, so we can execute it with different parameters as easily as possible. This may not have been its original structure. For example, it may simply have been a big procedural script hiding out in an application controller somewhere (very very common), or even the application’s service layer.
Let’s stick with a prior example, our Pubsubhubbub Subscriber. We’ll assume, for now, that the most appropriate method of asynchronous processing relies on spawning a background PHP process to operate on the feed update, leaving the parent process free to return a response quickly. The task to be made subject to background processing is therefore anything to do with processing the feed update. We can show both alternatives in a simple diagram.

Now that the task is identified, it needs to be separated. This involves taking all steps that the task performs and adding them to an isolated script, effectively a PHP file executed from the command line using the “php -f” command. This does not mean that the task must be procedural! It should remain as object oriented as possible. Here’s a sample PHP file showing a simple task and demonstrating how it’s called from a script.
[geshi lang=php]
myTask::perform();
class myTask {
public static function perform()
{
echo "Performing a task...";
}
}[/geshi]
Simple really. Once the perform() method is called, you can use Object Oriented Programming as usual.
One final piece to remember is that tasks should be reusable. You may start by calling a task in a separate child process, but it may later be migrated to a Job Queue or a schedule. The task needs to be agnostic as to its calling method. This means that it should be capable of accepting configuration/parameters from any source. In many cases, you'd simply wrap the task in a supporting framework. Besides configuration options, that framework takes care of enabling autoloading, bootstrapping required dependencies, etc. In fact, each task would have something akin to a bootstrapping process, just like the one the main application relies on from whatever framework it depends on.
In a sense therefore, we're comparing tasks to actions on a controller. They are very similar.
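Here is a minimal sketch of a task which is agnostic as to its calling method: it takes a plain array of options and does not care whether that array came from command line arguments, a job queue record or anywhere else. The class name FeedUpdateTask and its two options are invented for this illustration.
[geshi lang=php]
<?php
// Hypothetical task: the class and option names are invented for illustration.
class FeedUpdateTask
{
    protected $_options = array('feedUrl' => null, 'maxEntries' => 10);

    public function __construct(array $options = array())
    {
        // merge caller supplied options over the defaults, ignoring unknown keys
        $this->_options = array_merge(
            $this->_options,
            array_intersect_key($options, $this->_options)
        );
    }

    public function perform()
    {
        return sprintf(
            'Processing up to %d entries from %s',
            $this->_options['maxEntries'],
            $this->_options['feedUrl']
        );
    }
}

// The same task, configured from two different sources:
$fromCli   = new FeedUpdateTask(array('feedUrl' => 'http://example.com/feed'));
$fromQueue = new FeedUpdateTask(json_decode(
    '{"feedUrl":"http://example.com/atom","maxEntries":5}', true
));
echo $fromCli->perform(), "\n";
echo $fromQueue->perform(), "\n";[/geshi]
Because the task only ever sees an array, moving it from a child process to a queue later means changing the caller, not the task.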
Somewhat related to reusability is another concept of breaking down tasks themselves into their most relevant components. For example, let's say your task is described as follows:
When a User’s registration details are stored, attempt to send them an activation email up to five times before delegating any subsequent attempts to a job queue.
To explain the task, activation emails are time sensitive. A user will likely register, and immediately check their email. They may even refresh their inbox a few times. Because it’s time sensitive, we may start by using a child process spawned from the parent to attempt the emailing immediately. After five attempts, the child process aborts the task and perhaps marks it for future processing by a scheduled scripted job queue (activation emails are important enough that we should keep trying to send them until continued failures prove a bigger problem exists).
At first, we might be tempted to add a Task which loops over an email attempt five times. Wrong! The looping is a separate task component. The actual email attempt is the core component. It’s that core component we want to make reusable. The looping may instead be implemented by a Task Manager which will attempt the email task five times. Okay, that might be too simple an example, but it makes the point. The looping and the emailing can be thought of as separate components. In another situation, perhaps the task does two unrelated things. There again, we can break the apparent task into two separately reusable tasks. Just keep thinking in terms of OOP and you won’t go wrong.
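The split between the core task and the looping can be sketched in a few lines. Both class names below are hypothetical, and the EmailTask only simulates a mail server which starts accepting mail on a chosen attempt; no email is actually sent.
[geshi lang=php]
<?php
// Hypothetical names throughout: neither class comes from a real library.
class EmailTask
{
    public $attempts = 0;
    protected $_succeedOn;

    public function __construct($succeedOn)
    {
        // simulate a mail server which only accepts the Nth attempt
        $this->_succeedOn = $succeedOn;
    }

    // the core, reusable component: one single attempt
    public function perform()
    {
        $this->attempts++;
        return $this->attempts >= $this->_succeedOn;
    }
}

class TaskManager
{
    // run the task until it reports success or the attempts are used up
    public function attempt($task, $maxAttempts = 5)
    {
        for ($i = 0; $i < $maxAttempts; $i++) {
            if ($task->perform()) {
                return true;
            }
        }
        return false; // the caller may now hand the task to a job queue
    }
}

$manager = new TaskManager();
$flaky   = new EmailTask(3);  // succeeds on the third attempt
$down    = new EmailTask(99); // never succeeds within five attempts
var_dump($manager->attempt($flaky));
var_dump($manager->attempt($down));[/geshi]
The EmailTask knows nothing about retries, so the same class can be run once by a scheduled job or five times by a child process without modification.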
Conclusion
In this first part of my series on Asynchronous Processing with PHP, we’ve covered a lot of theory concerning why such processing is needed, how it could be implemented, and how to think of tasks in terms of being separate and reusable. So I guess I’ll let you turn that around in your head for a day or two before I hit you with Part 2.
The main message is important. Asynchronous Processing is one of those fundamental areas of knowledge any programmer, even in PHP, needs to know about. It’s been my experience that developers often see it as some arcane craft practiced by a handful of hardcore PHP developers. This is completely untrue. Asynchronous Processing is actually very easy to understand, and very easy to implement, as we’ll see in the next part of this mini-series where we’ll look at an example using the Zend Framework.
Zend Framework Monthly Bug Hunt Starts Today – C’mon, Join In!
Sep 17th
As Matthew announced during the week on the mailing lists, Zend are sponsoring a two-day Bug Hunt every month starting today. And there will be prizes for those who solve lots of issues! Here’s Matthew’s email:
Greetings, one and all!
I’ve alluded several times in the past month to having a plan for
helping manage our ever-growing bug list in the issue tracker. We’re now
ready to roll out phase one of this plan, and we need *you*!

Starting this month, we will be sponsoring two bug hunt days monthly, on
the third Thursday and Friday of the month. That’s this upcoming
Thursday and Friday, 17-18 September 2009.

During those days, the Zend team -- myself, Ralph, and Alex -- will be
in #zftalk.dev on Freenode for our entire work day (Ralph and myself are
in the United States, Alex is based in Russia; figure out the timezones
yourself (-: ). We will be triaging bugs ourselves, but, more
importantly, we will be there to help facilitate *you*, our contributors
and users, in resolving issue reports.

As an incentive, each month, we will ship a Zend Framework t-shirt to
the individual that assists in the most issue resolutions during the bug
hunt days, whether via patches or direct commits. Quarterly, we will
evaluate overall contributions, including documentation, bug fixes, and
newly contributed components, and award a developer with their choice of
a Zend Studio license or Zend Framework Certification voucher. (Caveat:
one t-shirt per person per year, and one license/voucher per person per
year, folks!)

For those interested in participating in the bug hunt days, the rules
are simple: have a signed CLA on file, and resolve issues in the
tracker.

If you have not yet signed a CLA and want to participate, you can get a
copy of the form here:

http://framework.zend.com/cla
Sign it and return it (you can email it, fax it, or send it via post);
if you send it via post, you’ll need to wait for confirmation that
we’ve received it before we can accept code contributions from you.

Now, when it comes to the issue tracker, you’ll need to determine if the
issue:

* is simply the reporter misunderstanding or misusing code OR
* is a request for a new feature OR
* is a reproducible issue

In the first case, comment on it and indicate the correct usage, and ask
the component maintainer or somebody from Zend to review your response
and mark the issue as resolved. In the second case, please try and focus
on issue reports instead of feature requests during the bug hunt days.

That brings us to the final case, reproducible issue reports. With
these, you’ll need to do the following:

* Capture the reproduce case as a unit test
* Resolve the issue in such a way as to maintain backwards
compatibility with existing usage. (In other words, don’t change the
signature of a method unless the signature is what is actually
broken.)

From there, you then have two options:
* If you already have commit access, commit the test and fix to the
repository, and either resolve the issue or ask somebody from Zend to
review and resolve. Don’t forget to merge your changes to the 1.9
release branch!
* If you do not have commit rights, create a patch with the unit test
and fix, and attach the patch to the issue. Ask the maintainer or
somebody from Zend to review and apply the patch.

If you need help creating the unit test or patch file, hop onto the
#zftalk.dev IRC channel and ask for help.

How should you choose issues to work on? Answer the following questions,
and you should be able to hop right in:

* What components do you have expertise in?
* What components are you interested in learning more about?
* What issues have a high number of voters or watchers?

Bug hunting should be fun, so pick components and issues you’re
interested in. Ask questions on IRC if you don’t understand how
something works.

So, spread the word, and come prepared this week to help make the
framework even better! I look forward to seeing you on IRC this week!
The Zend Framework has quite a large number of issues that need to be resolved/closed/marked invalid. People often have this fuzzy idea that somehow all three members of Zend’s framework team will miraculously resolve them for us in between developing components, fixing the documentation, handling releases, offering support, and…lots of stuff. This idea needs to die a quick death – the Zend Framework is an open source project so the people responsible for fixing its issues are the community.
So I truly encourage everyone to get involved. If you are not up to spending the next two days fixing issues, then strongly consider a more limited approach. Pick just one or two issues to fix. Surely that is a manageable and not overly time consuming commitment?
I’ll be around for most of today, and probably part of tomorrow so I look forward to seeing the resolution count reach staggering new heights. Everyone involved may convene in the #zftalk.dev IRC channel on Freenode.net where Matthew and Ralph will be available to offer support to any Bug Hunter needing a query answered or advice on how to proceed with an issue.
Just a short note on using the Issue Tracker: It’s hosted on a server which for one reason or another performs quite poorly at times. If it seems to be taking ages to load, please be patient – it will…eventually.
Self-Contained Reusable Zend Framework Modules With Standardised Configurators
Sep 13th
It was during last week, while writing out a draft chapter for Zend Framework: Surviving The Deep End, that I found myself hitting a conceptual wall. If you are familiar with Zend Framework, you likely understand the concept of a Module in some detail. A Module is, in theory, a reusable collection of controllers, views and other classes which is packaged in its own directory for simpler copying or separate treatment in a version control system like git or subversion.
The problem I had lay in demonstrating this fabled reusability. The more I tried to, the more I found myself throwing out cautions, warnings and advice on what to avoid doing. When it came to using Zend_Application, the trend continued since Zend_Application (a great component otherwise!) is just badly documented and explained. So off went another section just to try and explain its often confusing terminology. If you read the source code it all makes sense, but if you don’t, the disconnect between the explanations and a user’s expectations is obvious.
Reusability Rules
Zend Framework developers have, for better or worse, been ignoring the potential of modules for an interminably long time. It’s not that big of a surprise given the focus of the framework has always been to present a use-at-will architecture which relies on loose coupling and independent components. Tight integration through overarching features (which don’t break the framework’s impressive orthogonality) like a command line tool or initialisation tools has long been neglected until very recently. Zend Framework 1.8 saw the long needed introduction of Zend_Application which offers standardised bootstrapping. Zend_Tool is another ongoing effort on the command line side.
The most typical example of a module in the literature is also the worst: an administration backend. It’s a logical module since it’s a completely separate system to the frontend, but it’s the worst example because it is so very rarely reusable. Not every logical separation is reusable – logical separation and reusability are independent concepts. You could equally have a logical module which is itself composed of several reusable modules and one non-reusable module. By definition, an administration backend is closely tied to CRUD operations against the application’s domain model (at least to start with). Since each application’s domain model will be different, its administration backend will be too.
A far better example of a reusable module is something much narrower and more focused. Consider a module dealing with User Management, or Paypal IPN integration, or implementing a blog aggregator. These are each common needs which, depending on the application, may require little change from implementation to implementation. Drop them in, configure them, integrate them, and you can have them working with few issues. Unfortunately, we keep focusing our module efforts on obviously non-reusable things like administration backends. Losing sight of the potential reuse of smaller subsystems leads us to develop them over and over again without even noticing this as a problem.
For the Zend Framework, embracing reusable modules would be a big win. Rather than having developers re-implement commonly used web application systems, it would encourage the distribution of third-party modules which would benefit from open source licensing and feedback. Imagine your next application requiring a minimal blog or integration with Paypal IPN and finding a third party module which does the trick so you can save some development time.
Achieving Reusability
When we discuss achieving reusability there are several factors and features covered when it comes to modules:
1. They are separated into their own parent directory.
2. They can apply specific configuration when accessed.
3. They require no special integration work.
4. Their classes are automatically available to the host application.
5. They are not required to contain controllers or views.
It’s not an exhaustive list. Items 1, 4 and 5 are already a reality. Zend Framework modules do live in a module directory; using Zend_Application and some conventions, their classes are autoloaded on demand; and they are not required to contain controllers and views. A module may exist which merely offers models, helpers and some default forms.
So our path to reusable modules is hampered by items 2 and 3. Modules currently don’t have on-access configuration unless we impose it through various means. This flows into integration work which is commonly needed to achieve this in the first place.
The Layout Example: Integration Through Front Controller Plugins
A simple example, taking our example administration backend (an “admin” module) is that of switching layouts. Suppose our main application uses a professional design but our administration backend uses a very simple minimal one. How do we switch layouts when the admin module is accessed so the correct layout template is applied?
An initial expectation might be to try this from our application.ini file (if using Zend_Application) using:
[geshi lang=css]; Default Module
resources.layout.layout = "default"
resources.layout.layoutPath = APPLICATION_PATH "/views/layouts"
; Admin Module
admin.resources.layout.layout = "default"
admin.resources.layout.layoutPath = APPLICATION_PATH "/modules/admin/views/layouts"[/geshi]
Ah, module configuration! This is a very common first attempt since the expectation is that a module framed configuration will kick in only for that module.
Alas, this will not work even if it looks blatantly obvious that it should (expectations again). Module configuration here is used during the bootstrapping process which occurs before a request is routed, i.e. we can’t know what module the request relates to yet because our routes are not yet applied. So any module configuration of this type is actually applied to the same resources as the previous set of settings, i.e. module configuration overwrites the main resource configuration. The example above replaces the layout and layout path across the entire application with those of the admin module. Visiting any module, including the default module, will show the last configured layout being applied no matter what module prefixing you use.
What possible good is having this confusing module configuration then? Well, it’s useful for passing custom options to your module’s bootstrap class. Beyond that, I can’t think of many other use cases. You could, for example, use it to register module-hosted plugins, classes, etc. but that’s just as easily done without the module name prefixed to the option. Note, this is my own ignorance speaking – I haven’t seen any detailed examples using this.
In the meantime, how do we ensure the layout is only switched if the module is accessed? The above configuration won’t work, obviously. Well, we first need to know what module is being dispatched to, so it must be done after routing has taken place. The most obvious location for our switching logic is therefore a front controller plugin which implements the preDispatch() method (i.e. it’s executed just before any controller is called, giving us an opportunity to re-configure some resources like Zend_Layout).
Here’s an example plugin for this. It’s the simplest possible version – I’ve seen some examples which forget that Zend_Layout already offers a plugin we can subclass to keep things simple.
[geshi lang=php]
class ZFExt_Controller_Plugin_LayoutSwitcher
    extends Zend_Layout_Controller_Plugin_Layout
{
    public function preDispatch(Zend_Controller_Request_Abstract $request)
    {
        $this->getLayout()->setLayoutPath(
            Zend_Controller_Front::getInstance()->getModuleDirectory(
                $request->getModuleName()
            ) . '/views/layouts'
        );
        $this->getLayout()->setLayout('default');
    }
}[/geshi]
It’s not a perfect class – every single module must follow the convention of using the same layout path and layout name. We could also add some logic to skip the default module since this would configure it twice for no reason. But it works! When we access the “admin” module, the layout path will be set to /application/modules/admin/views/layouts and the layout template used will be default.phtml. The default module’s path will likewise reflect its original configuration.
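As a rough sketch of the “skip the default module” idea, a variant of the plugin might return early when the routed module is the default one. The class name ZFExt_Controller_Plugin_ModuleLayoutSwitcher is hypothetical and used only to keep it apart from the plugin above; the rest of this article continues to use the original LayoutSwitcher.

[geshi lang=php]
// Hypothetical variant: leaves the default module's layout settings untouched
class ZFExt_Controller_Plugin_ModuleLayoutSwitcher
    extends Zend_Layout_Controller_Plugin_Layout
{
    public function preDispatch(Zend_Controller_Request_Abstract $request)
    {
        $front = Zend_Controller_Front::getInstance();
        $moduleName = $request->getModuleName();
        // The default module was already configured during bootstrapping
        if ($moduleName === null || $moduleName == $front->getDefaultModule()) {
            return;
        }
        // Same convention as before: <module directory>/views/layouts/default.phtml
        $this->getLayout()->setLayoutPath(
            $front->getModuleDirectory($moduleName) . '/views/layouts'
        );
        $this->getLayout()->setLayout('default');
    }
}[/geshi]

One caveat with this shortcut: if a request is forwarded from the admin module to the default module within the same dispatch loop, the admin layout path would stay in place, so the unconditional version above is the safer of the two.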
To get this working, let’s add a new layout resource option so our custom plugin replaces the default one from Zend_Layout:
[geshi lang=css]; Default Module
resources.layout.layout = "default"
resources.layout.layoutPath = APPLICATION_PATH "/views/layouts"
resources.layout.pluginClass = "ZFExt_Controller_Plugin_LayoutSwitcher"[/geshi]
This does work, by the way. Add the following test alongside a directory _modules containing a readable subdirectory _modules/admin and it will pass.
[geshi lang=php]
class ZFExt_Controller_Plugin_LayoutSwitcherTest extends PHPUnit_Framework_TestCase
{
    protected $plugin = null;
    protected $request = null;

    public function setUp()
    {
        // Register the fixture directory so the "admin" module can be resolved
        Zend_Controller_Front::getInstance()->addModuleDirectory(dirname(__FILE__) . '/_modules');
        $this->plugin = new ZFExt_Controller_Plugin_LayoutSwitcher(
            new Zend_Layout
        );
        $this->request = new Zend_Controller_Request_Http;
    }

    public function tearDown()
    {
        // The front controller is a singleton, so reset it between tests
        Zend_Controller_Front::getInstance()->resetInstance();
    }

    public function testSwitchesLayoutNameIfAdminModuleDispatched()
    {
        $this->request->setModuleName('admin');
        $this->plugin->preDispatch($this->request);
        $this->assertEquals('default', $this->plugin->getLayout()->getLayout());
    }

    public function testSwitchesLayoutPathIfAdminModuleDispatched()
    {
        $this->request->setModuleName('admin');
        $this->plugin->preDispatch($this->request);
        $this->assertEquals(dirname(__FILE__) . '/_modules/admin/views/layouts',
            $this->plugin->getLayout()->getLayoutPath());
    }
}[/geshi]
How about something different? What if our main application uses HTML 5 and our administration backend uses XHTML 1.0 Transitional? Damn, we need another plugin. Worse, this time we can’t reduce it to a convention since a doctype can be anything and we have no way of predicting it. We could set it on the module layout, but layouts are rendered last – it would still not be applied to page level templates or partials. Forms would be messed up, for example. Same goes for the character encoding of our views (messed up escaping).
So slap in another plugin to handle doctype switching, and another to handle encoding changes. Why not add another just for fun so we can handle connecting to a module’s shared database? Then there’s the case where… Alright…enough of that. The point is a simple one. We are adding custom plugins all over the place to integrate modules into our application. These plugins will not be reusable, will require editing for different modules, and will need to be rewritten between applications. We need something more structured.
Integration: Modular Pre-Dispatch Configuration
As we can see, integration efforts are tricky. Relying on custom plugins and trying to wrestle the bootstrap system into submission are a lot of trouble to go through. Zend_Application and bootstrapping may not offer us a good solution for integration, but they do give us the roadmap.
Zend_Application defines bootstrap classes which are used to initialise resources before routing takes place. Keeping it simple, we need to reconfigure resources after routing but before dispatching occurs. We may also need to initialise different resources if they are used by a module, but not the main application. What we need is something like bootstrapping that occurs after routing. After a bit of thought, we might come to the conclusion that the current Resource classes of Zend_Application could live parallel to counterparts which exist not to initialise a Resource, but to modify a pre-initialised Resource by resetting its configuration. These counterparts, which mirror Resources, are what I term Configurators (maybe not the best name).
Take a Configurator class for Layouts as an example:
[geshi lang=php]
class ZFExt_Application_Module_Configurator_Layout
    extends Zend_Application_Resource_ResourceAbstract
{
    public function init()
    {
        $layout = $this->getBootstrap()->getResource('Layout');
        $layout->setOptions($this->getOptions());
    }
}[/geshi]
Our Configurator actually extends from Zend_Application_Resource_ResourceAbstract demonstrating its close relationship to a Resource. However, it does not create and initialise a Resource – it merely modifies the existing one by injecting a new configuration sourced from a module.
Where does this replacement configuration come from? I’ve decided to use a simple convention. If you want a module to impose its own configuration when, and only when, it is accessed then create that configuration in a file called module.ini located at /application/modules/admin/configs/module.ini. The configuration file could be any supported format, but I’ve used the INI format for simplicity. This would look like (using the typical environmental groups):
[geshi lang=css][production]
; Standard Resource Options
resources.layout.layout = "default"
resources.layout.layoutPath = APPLICATION_PATH "/modules/admin/views/layouts"
[staging : production]
[testing : production]
[development : production][/geshi]
So, we have the Configurator class, and the configuration file it will use. Let’s bind these together. We’ll start by putting in place a class whose role is to use a collection of options loaded from module.ini to instantiate and run a set of Configurator classes.
[geshi lang=php]
class ZFExt_Application_Module_Configurator
{
    protected $_bootstrap = null;
    protected $_config = null;

    public function __construct(Zend_Application_Bootstrap_Bootstrapper $bootstrap,
        Zend_Config $config)
    {
        $this->_bootstrap = $bootstrap;
        $this->_config = $config;
    }

    public function run()
    {
        // Each key under "resources" names a Resource to reconfigure
        $resources = array_keys($this->_config->resources->toArray());
        foreach ($resources as $resourceName) {
            $options = $this->_config->resources->$resourceName;
            // Convention: "layout" maps to ZFExt_Application_Module_Configurator_Layout
            $configuratorClass = 'ZFExt_Application_Module_Configurator_' . ucfirst($resourceName);
            $configurator = new $configuratorClass($options);
            $configurator->setBootstrap($this->_bootstrap);
            $configurator->init();
        }
    }
}[/geshi]
As you can see, it is very simple. It takes a configuration, detects what Resources it applies to, instantiates relevant Configurators and executes them. It could be improved a lot by allowing for custom Resources and other such customisations but for now the basics will do nicely.
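To illustrate how the naming convention extends beyond layouts, here is a rough sketch of a second Configurator covering the doctype and encoding problem raised earlier. It is not part of the proof of concept: it assumes the View resource has already been bootstrapped by the main application, and the option names “doctype” and “encoding” are my own invention for this example.

[geshi lang=php]
// Hypothetical Configurator for the View resource (illustration only)
class ZFExt_Application_Module_Configurator_View
    extends Zend_Application_Resource_ResourceAbstract
{
    public function init()
    {
        // Fetch the Zend_View instance created during normal bootstrapping
        $view = $this->getBootstrap()->getResource('View');
        $options = $this->getOptions();
        if (isset($options['encoding'])) {
            // Encoding used by the view when escaping output
            $view->setEncoding($options['encoding']);
        }
        if (isset($options['doctype'])) {
            // The doctype view helper accepts names like "XHTML1_TRANSITIONAL"
            $view->doctype($options['doctype']);
        }
    }
}[/geshi]

With this class in place, a module.ini containing a line such as resources.view.doctype = "XHTML1_TRANSITIONAL" would be mapped by the naming convention to ZFExt_Application_Module_Configurator_View and applied only when that module is dispatched.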
Earlier, we mentioned that the application only becomes aware of the current module when the request is routed. Therefore, to get this working we need to trigger the Configurators after routing (or prior to request dispatching). We also need to check if the current module has a module.ini file and also ensure we skip over the default module (our main application space might be reusable so this is an arguable point and probably should be allowed for).
We’ll accomplish this using a front controller plugin:
[geshi lang=php]
class ZFExt_Controller_Plugin_ModuleConfigurator
    extends Zend_Controller_Plugin_Abstract
{
    public function preDispatch(Zend_Controller_Request_Abstract $request)
    {
        $front = Zend_Controller_Front::getInstance();
        $bootstrap = $front->getParam('bootstrap');
        $moduleName = $request->getModuleName();
        // The default module keeps the configuration from application.ini
        if ($moduleName == $front->getDefaultModule()) {
            return;
        }
        $moduleDirectory = $front->getModuleDirectory($moduleName);
        $configPath = $moduleDirectory . '/configs/module.ini';
        // Modules without a module.ini are simply left alone
        if (file_exists($configPath)) {
            if (!is_readable($configPath)) {
                throw new Exception('module.ini not readable for module "' . $moduleName . '"');
            }
            $config = new Zend_Config_Ini($configPath, $bootstrap->getEnvironment());
            $configurator = new ZFExt_Application_Module_Configurator(
                $bootstrap, $config
            );
            $configurator->run();
        }
    }
}[/geshi]
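The plugin still has to be registered with the front controller before it can do anything. One way to do that, sketched here on the assumption that your application uses a standard Bootstrap class and that the ZFExt_ namespace is autoloadable (for example via autoloaderNamespaces[] = "ZFExt_" in application.ini), is an _init method. The method name _initModuleConfigurator is my own choice.

[geshi lang=php]
class Bootstrap extends Zend_Application_Bootstrap_Bootstrap
{
    // Hypothetical resource method; any _init* name would do
    protected function _initModuleConfigurator()
    {
        // Make sure the front controller exists before registering plugins
        $this->bootstrap('FrontController');
        $front = $this->getResource('FrontController');
        $front->registerPlugin(new ZFExt_Controller_Plugin_ModuleConfigurator());
    }
}[/geshi]

registerPlugin() also accepts an optional stack index as a second argument, which may be one way to control where this plugin runs relative to other plugins.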
If you’re still with me, and can piece this story together, you achieve a workflow as follows for the admin module when accessed from any URI like http://example.com/admin. I’ve skipped steps where not relevant.
1. Normal bootstrapping is completed with the layout being initially set using the application.ini options.
2. The request is routed. The module name “admin” is set internally.
3. ZFExt_Controller_Plugin_ModuleConfigurator::preDispatch() is called before dispatching commences (ensuring it runs ahead of other plugins is something that can be addressed in the future).
4. The plugin detects /application/modules/admin/configs/module.ini and loads it as a Zend_Config instance.
5. The plugin instantiates ZFExt_Application_Module_Configurator, passes it the configuration and original bootstrap, and calls the new object’s run() method.
6. The Module Configurator assesses the configuration for resource names. For each resource detected, it instantiates a Resource Configurator like ZFExt_Application_Module_Configurator_Layout.
7. The Resource Configurator is executed and applies the new configuration to the existing Layout Resource thus overwriting the original configuration.
8. Dispatching occurs – the admin module’s requested action is rendered with the correct admin layout.
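If it helps to see the workflow above stripped of the framework, here is a plain PHP simulation of the same idea. It uses no Zend Framework classes at all; the arrays stand in for the bootstrapped resources and for each module’s module.ini, and the function name applyModuleConfig() is purely hypothetical.

[geshi lang=php]
<?php
// Plain-PHP simulation only; all names and paths are illustrative assumptions.

// State of the shared resources once normal bootstrapping has completed
$resources = array(
    'layout' => array(
        'layout'     => 'default',
        'layoutPath' => '/application/views/layouts',
    ),
);

// Stand-in for each module's configs/module.ini, keyed by module name
$moduleConfigs = array(
    'admin' => array(
        'layout' => array(
            'layoutPath' => '/application/modules/admin/views/layouts',
        ),
    ),
);

// After routing: overlay the routed module's options on existing resources
function applyModuleConfig(array $resources, array $moduleConfigs,
    $moduleName, $defaultModule = 'default')
{
    // Skip the default module and modules without their own configuration
    if ($moduleName === $defaultModule || !isset($moduleConfigs[$moduleName])) {
        return $resources;
    }
    foreach ($moduleConfigs[$moduleName] as $resourceName => $options) {
        if (!isset($resources[$resourceName])) {
            continue; // only reconfigure resources that were bootstrapped
        }
        // Module options overwrite the originals; untouched keys survive
        $resources[$resourceName] = array_merge($resources[$resourceName], $options);
    }
    return $resources;
}

$admin   = applyModuleConfig($resources, $moduleConfigs, 'admin');
$default = applyModuleConfig($resources, $moduleConfigs, 'default');

echo $admin['layout']['layoutPath'], "\n";
echo $default['layout']['layoutPath'], "\n";
[/geshi]

The key point is the timing: the overlay happens per request, once the module name is known, rather than once during bootstrapping where the last configuration applied wins for everyone.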
By itself, this seems like a lot of trouble to go through – except what if it becomes a Zend Framework feature? All of a sudden, countless custom plugins will meet their death and be replaced by a simple configuration file!
Conclusion
The goal of this article was to highlight the problems of achieving reusable modules and implement, as a proof of concept, at least part of the solution with an eye to encouraging greater discussion of where to go from here. I, for one, would love to see this included in the Zend Framework so we can get over the trend of relying on custom plugins and evolve towards a more standardised means of configuring modules.
If you’ve enjoyed the article please do add a comment and make suggestions on what could be improved or added as a feature. If I get enough positive feedback I’ll move this into a formal proposal (preferably with a partner or two). If you’re interested in collaborating on a proposal addressing this let me know!
Zend Framework Book: Surviving The Deep End – Chapter 10 Released! Zend_View, Zend_Layout, HTML 5 and YUI CSS
Sep 8th
It’s been a busy few weeks but I’ve finished and released Chapter 10 of the Zend Framework: Surviving The Deep End book!
The new chapter explores setting up the example blog application’s web design using Zend_View and Zend_Layout. I also spend some time exploring HTML 5, the future standards update for HTML. I decided to employ HTML 5 mainly because it’s rather fun and interesting to get it working today and learn more about where it may impact web applications and web design in the future. The web design itself makes use of the Yahoo! User Interface Library’s CSS components.
I’ve also gotten around to sorting the book’s source code availability. This is explained in the book’s introduction with detailed instructions. The source code is being stored on a git repository hosted by Github.com at http://github.com/padraic/ZFBlog. The specific source code for a chapter is tagged as “Chapter-XX” where XX is the chapter number. This should save everyone from copying the source from the book directly if all they want is the final set.
Finally, let me reiterate that feedback is truly welcome! You can add comments at the end of every single chapter or, if very specific, to any paragraph. The intent of this detailed commenting system is to gather the ideas, criticisms and corrections readers come up with so the final edition (before Zend Framework 2.0 hits!) is as useful, correct and worth reading as possible. So do leave comments or ask questions at the end of chapters!
Zend Framework: Surviving The Deep End is a free online book about the Zend Framework licensed under a Creative Commons Attribution-Non-Commercial-No Derivative Works 3.0 Unported License.
