PHP, Zend Framework and Other Crazy Stuff
Archive for November, 2008
Zend_Feed_Reader: Approved for Combat!
Nov 17th
I was notified earlier by my fellow Zend Framework contributor, Jurriën Stutterheim, that Zend_Feed_Reader has been approved for further development in the Standard Incubator. We’re going to target a rapid fire ZF 1.8 release assuming all goes as planned, Murphy’s Law notwithstanding.
To recap – Zend_Feed_Reader is a wholesale replacement for Zend_Feed (the parts of it relating to reading RSS/Atom). Its goals are to offer a consistent unified interface, regardless of feed type or version (yes, even RSS 1.0), which can intelligently interpret feed data and select alternative data points for missing data. I could preach for a long time about why this is needed, but the proposal is the best source since it expands on the expected features and explains the need more clearly than a short summary here can:
Zend_Feed_Reader – Pádraic Brady & Jurriën Stutterheim
Thanks especially to Jurriën who stepped in to partner with me on this proposal, and who kept it alive during my protracted absences over the Summer!
Zend Framework: Surviving The Deep End
Nov 17th
Zend Framework: Surviving The Deep End
A Zend Framework Example Blog Application From Start To Finish
People have approached me over the months curious about why I don’t just write a Zend Framework themed book, get with a publisher, and make mountains of money (or at least a modest hill). After all, most know I have had offers from at least one publisher. Writing a book would provide a great deal of publicity for my own skills, highlight my authority on the subject, increase my ego to dangerous proportions and give my stature in the community and with potential clients a boost. The benefits go beyond mere cash.
I usually start by explaining that writing a book on contract is no easy project. A book takes time, patience, incredible dedication and an all out attack on your preconceptions. No matter how “expert” you’d like to believe you are, writing a book will definitely blow that out of the water. In my own words, an expert is someone who consults the manual less, and knows where to find everything they don’t know more efficiently.
I know from experience in writing the Zend Framework Example Blog Application Tutorial that explaining the Zend Framework is no easy task. It’s impossible to do without plenty of forethought and care, and usually a detailed reading of the manual in conjunction with the actual source code (both what you intend writing, and the Zend Framework source itself). Even then you rewrite everything at least twice, and tweak it a dozen ways after. Why? Because that’s the only way to escape one’s dependence on ingrained experience and IDE auto completion which easily leads the “expert” to skip over the obvious and take knowledge for granted – not an advisable trait in any reference work.
This, for me, created an arena in which I felt compelled to utilise an old device: a single comprehensive example which tied together every facet of knowledge I wanted to explore. It had to be simple, creative, problematic and real – all at the same time. That’s where a “blog” came to mind. It can be as simple or as complex as you want it to be, which made it a highly flexible example, and everyone (plus their dog) knows how a blog works in theory. It also coincided with the web application framework equivalent of “Hello, World!”. The fact I really wanted to write my own customisable blog was mere icing.
Unfortunately the whole initial project had one minor issue – I quickly realised the book I wanted to write was not the book a publisher wanted to print. I went from a loose index of about 15-20 chapters covering up to 400 pages in print, to getting a rearranged edited version for less than 200 pages sent back to me by an editor with an abridged content index. I was then asked to sign the dotted line. I declined. I’ve heard from them since but the initial experience was sufficient to make me wary of any more surprises. I had figured out what I wanted to write by then, and didn’t find the sudden pushback satisfactory. I’m stubborn that way. I tend to rant at length when irritated (as a few would remember).
Here’s why not to write a Zend Framework book then: when the book you must write is not the book you want to write! Writing a 200 page book on the Zend Framework can be done – just not by me. All my readers understand my writing style by now – I explore topics in detail with a dash of sometimes geeky humour and most importantly, I write at length. Miles of length. Cal Evans, formerly over at Zend Devzone, knows how my idea of a 2000 word article usually ends up as 4000+ – half of it smilies. I’m a verbose personality who finds it repugnant to summarise knowledge to the point it loses its heart or only gets half the job done (and heaven knows there’s enough of that on the market without me adding to it). From that perspective, a short book would be the ultimate regret, something I wouldn’t likely find pride in, and something that I would always feel could have been a lot better. Since I’m not driven by the possibility of a major injection of cash to my bank account (seriously, few one-off books are worth writing just for cash) it was an easy decision.
Which brings us full circle to the reason my blog has crashed…twice…this summer. Since the book idea wasn’t working out, the only solution worth considering was the one which provided the greatest level of personal satisfaction – a multi part Zend Framework tutorial on my blog!
Of course I kept quiet about the fact that the series was a shortened version of what I’d write in a book. I wanted to see how the community reacted first – and react it did. Enough to crash the damn blog twice and swarm me with emails when the series was down.
It’s weird how things progress once they build up momentum. But with all the fun this series has been, it’s near time I made the book idea a reality in some form which is more useful to readers than hard to locate blog posts, and carries a sense of solid permanency. With that notion in mind, I’ve done bits and pieces over the Summer. The tutorial has been transferred to Docbook XML ready for further editing, and an automated process for converting this to HTML and PDF formats has been put in place (I’m currently midway through a tutorial on that whole PHP driven process on the blog). A strategy of editing, correcting, updating and expanding (where necessary) the original articles is underway as a “Revised” series to start next month. All such Revised versions will arrive in multiple formats – Blog, HTML and PDF – as free Creative Commons licensed downloads alongside a donation driven model for raising cash…for my next Macbook Pro upgrade (and maybe my Porsche?) if nothing else. Most useful perhaps, this all means I can roll out updates and corrections with ease (no massive lead time for printing).
So yes, I am going the self publishing vanity route. Let the ego swelling commence! I’m a persistent bugger and hellbent on writing this book if only to appease my appetite for writing something worthwhile and delighting in tracking its success (or epic failure) over time. I’m weird that way. I program weird stuff in my free time because it’s an outlet for my creativity, not because I profit from it. I’ve spent way more on hosting than my weirdo blog and mini-apps/libraries would ever refund, and that’s how I like it.
As for the inevitable spiel over why you should consider downloading this new in-progress book once available (and maybe parting with some hard earned pennies in these uncertain economic times), I leave the blogged series as the ultimate testament. It pretty much captures the direction of this book. Instead of bogging you down with detailed parameter lists (there’s a reference manual online for that) or treating you like the world’s biggest dummy (I’m sure any 200 page book will cover that) I wanted to provide a book which looked at the bigger picture – building a real application and solving the problems any real application may pose. And I wanted to do it at whatever length I needed to get the point across (with smilies – obviously!) while still being an interesting read.
Zend Framework: Surviving The Deep End will kick off during December. The title is a humorous reference to what exists after you know your Zend Framework basics, i.e. how to use the basics and reference manual to achieve something tangible. Like a Pet Store – if only that were not already exhausted by those folk at BluePrints using J2EE…
Writing Professional Looking Documentation With Docbook, PHP, Phing and Apache FOP: Part 1: Getting Started
Nov 12th
Introduction
Documentation. The word elicits a mix of fear and depression in even the most hardened programmer. For many it’s a hard slog through endless boredom which occurs throughout, or at the end of, the development process. Documentation is never the easiest task. Good documentation takes time, patience, lots of questions about the subject matter (no matter how familiar you think you are with it, you can be assured you have some misunderstandings), and a degree of ability in condensing knowledge to a form people can instantly connect with.
But even when you get it done there’s the question of how to distribute it! A popular choice is HTML – it’s portable since everyone has a browser, and as web developers we’re all familiar with the syntax. Another common choice is plain text since “someone else” can always transfer it to another format down the line. Some people even believe it’s entertaining to rely solely on inline source code comments, trusting the skills of the user to decipher their personalised coding style, thought process, and intent.
This article series proposes using Docbook XML as the ultimate source format for all documentation. The difference between most formats and Docbook is that Docbook can be used to generate numerous final formats. That flexibility and the quality of its output go a long way in explaining Docbook’s popularity among documentation authors. If you doubt its capabilities, bear in mind there are publishers who have adopted Docbook!
The series was written to introduce programmers to a PHP oriented publishing process which uses Docbook XML as the basis for generating professional looking HTML and PDF output. I say PHP oriented, because the Unix “toolchain” commonly associated with Docbook XML has been replaced almost entirely by PHP. This is useful because with PHP’s power at your disposal writing various filters to handle stuff like PHP source code highlighting is extremely simple.
Meet The Ingredients!
Docbook XML
The Docbook standard seems to have a reputation for being complex. This is an outright misconception with little foundation – the format is broad in that there are hundreds of possible tags, but shallow in that outside the tag count the rest is very straightforward. Docbook is a simple XML format where a tiny subset of the standard syntax is sufficient for 99% of your requirements. It’s as simple as plain old HTML and there are several excellent editors so you’re not stuck editing XML by hand (which should be avoided since it’s…painful). However I do suggest setting up the shell book (see manual.xml further on) by hand and using an editor for individual chapters/appendices since it makes life easier than putting up with one giant file!
The downside is that you have to understand at least the basic tags and be aware that, as with all XML, elements must nest correctly, and every Docbook XML file is required to validate against the standard. XML editors rely on tag knowledge and nesting consistency – they save time because you are not writing the tags and worrying about validation, appearance and other hand editing pains.
Beyond that it’s a simple exercise in learning by doing which I leave to the reader. You’ll see some samples later on to give you a feel for the basic syntax. You can read a crash course over on http://opensource.bureau-cornavin.com/crash-course/en/introduction.html but the full reference manual is a great deal bigger and more comprehensive. There also exist several useful examples in the PHP community including the Zend Framework Reference Manual and PHPUnit’s Pocket Guide, to name a few, which you can check out from their respective version control repositories.
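To give a feel for how shallow the basic syntax is, here is a minimal sketch of a Docbook book loaded with PHP’s DOM extension. The book itself (its titles, ids and the single chapter) is invented purely for illustration and is not taken from any real manual; it only shows the handful of tags you will meet most often.

```php
<?php
// A hypothetical, minimal Docbook book. Titles and ids are made up for illustration.
$docbook = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<book id="example-book">
  <title>Example Manual</title>
  <chapter id="introduction">
    <title>Introduction</title>
    <para>Docbook describes structure, not appearance.</para>
    <programlisting role="php"><![CDATA[<?php echo 'Hello, World!';]]></programlisting>
  </chapter>
</book>
XML;

// Load the source with the DOM extension and poke around with XPath.
$dom = new DOMDocument();
$dom->loadXML($docbook);
$xpath = new DOMXPath($dom);

// Collect every chapter title in document order.
$chapterTitles = array();
foreach ($xpath->query('/book/chapter/title') as $title) {
    $chapterTitles[] = $title->nodeValue;
}

// CDATA keeps source code readable: no manual escaping of < and > needed.
$listing = $xpath->query('//programlisting')->item(0)->nodeValue;

echo implode(', ', $chapterTitles), "\n";
echo $listing, "\n";
```

Book, chapter, title, para and programlisting already cover a surprising amount of a typical technical manual.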
The reason Docbook has gotten so popular for technical manuals, reference books, and even shorter articles and some magazines is that Docbook is agnostic to the final distributable format. From any Docbook source you can generate HTML, XHTML, RTF, CHM (Microsoft Help), PostScript, TeX and FO formatted creations among others. FO is itself an intermediary XML syntax which is easily converted into PDF form using an FO Processor (like Apache FOP which we’ll meet later). With all these target formats available from a single source document it’s easy to see that Docbook affords you flexibility. Why write RTF or HTML, when Docbook also gives you these, and more, with minimal fuss?
Did I mention that the only required tools for Docbook processing all happen to be free and open source?
PHP 5
Transforming Docbook XML into other formats requires a toolchain. The standard setup is to install the Docbook DTDs, the Docbook XSL stylesheets (which instruct the transformation of the source Docbook XML into varied formats), and a collection of GNU tools on Linux like xsltproc. This is sometimes referred to as the “new way” since it’s reasonably uncomplicated compared to what went before in the Linux world. Reasonably is in the eye of the beholder however.
PHP5 comes with the DOM and XSL Extensions and these both come with all the functions necessary to drive a completely PHP driven toolchain for Docbook. All one needs is the toolchain programmed in PHP so it can be reused. Luckily, Phing (a project build system based on Apache Ant) includes pre-written tasks and filters which serve this purpose admirably.
I should note these were contributed for the benefit of all PHP inclined Docbook users by Bill Karwin, the former Zend Framework maestro. Thank you Bill!
The other facet of a PHP toolchain is that it enables PHP programmers to write custom Phing tasks and filters which can assist in customising output. We’ll see my PHP source highlighting examples later.
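As a rough illustration of what the DOM and XSL extensions do underneath such a toolchain, here is a minimal sketch of an XSLT transformation in plain PHP. The helper name transformXml() is hypothetical, and the tiny inline stylesheet is only a stand-in: a real run would load the Docbook XSL stylesheets from wherever you installed them, which is well beyond a few lines.

```php
<?php
// Hypothetical helper: applies an XSLT stylesheet to an XML source string.
function transformXml($sourceXml, $stylesheetXml)
{
    $source = new DOMDocument();
    $source->loadXML($sourceXml);

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($stylesheetXml);

    // XSLTProcessor comes from PHP's XSL extension.
    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);

    return $processor->transformToXml($source);
}

// An invented Docbook-style fragment, for illustration only.
$docbook = '<article><title>Hello Docbook</title><para>First paragraph.</para></article>';

// Stand-in stylesheet: NOT the Docbook XSL, just enough to show the mechanics.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="no"/>
  <xsl:template match="/article">
    <html><body><xsl:apply-templates/></body></html>
  </xsl:template>
  <xsl:template match="title"><h1><xsl:value-of select="."/></h1></xsl:template>
  <xsl:template match="para"><p><xsl:value-of select="."/></p></xsl:template>
</xsl:stylesheet>
XSL;

$html = transformXml($docbook, $xsl);
echo $html;
```

Phing’s tasks wrap this same kind of call so you never have to script it by hand for each run.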
Phing
Phing is one of those understated libraries eclipsed by the likes of Apache Ant or Ruby Rake in their respective languages. Phing’s raison d’etre is to allow PHP programmers to describe repetitive tasks in an XML syntax so they can be automatically re-run from a command-line whenever you want. For example, whenever I want to generate documentation from Docbook, I simply issue the command “phing docs” in a console!
Phing is written in PHP5, and that’s the main reason I use it. You can create custom tasks and filters in PHP without fiddling with Java or embarking on an exploration of bash scripting (if you normally prefer makefiles). I use several custom tasks, which are basically PHP classes imported into Phing, to fine tune the whole process. Phing takes the pain out of applying your PHP knowledge to the task of generating and manipulating Docbook XML and any of its intermediary or final formats.
The other side of Phing is simple automation. Rather than bang on a console for two minutes, I can encode a Docbook run in Phing’s XML syntax and let it automatically carry out all the tasks I defined, in the order I defined them. Two minutes over countless runs does add up and Phing removes that annoyance from my programming life. It excels at automating highly repetitive tasks.
Apache FOP
XSLT processing cannot, obviously, directly generate PDF. To get PDF documents it’s necessary to have an intermediary format called XSL Formatting Objects (XSL-FO) created by an XSLT processor from the Docbook sources. This intermediary format can then itself be transformed in a second stage by a suitable XSL-FO processor into PDF, PostScript and a few other lesser used formats.
Apache FOP is chosen as the FO Processor for a few reasons. It’s easy to set up and use. You can easily configure it to embed custom (or base 14) fonts into PDF. It’s written in Java, accessible from the command line and runs on Windows XP/Vista with little effort. Oh, and it’s free. Thou shalt not pay for an XSL-FO processor! No matter how easy the advertising promises to make it!
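Since FOP is driven from the command line, the second stage (XSL-FO to PDF) can be triggered from PHP with a single shell command. The sketch below only builds that command; the helper name buildFopCommand() and the build/ file paths are hypothetical, and it assumes a fop executable is on your path.

```php
<?php
// Hypothetical helper: builds the shell command for the XSL-FO to PDF stage.
// Assumes Apache FOP's "fop" launcher is reachable on the PATH.
function buildFopCommand($foFile, $pdfFile, $fopBinary = 'fop')
{
    return sprintf(
        '%s -fo %s -pdf %s',
        escapeshellcmd($fopBinary),
        escapeshellarg($foFile),   // input: the intermediary XSL-FO file
        escapeshellarg($pdfFile)   // output: the PDF to write
    );
}

// Illustrative paths only.
$command = buildFopCommand('build/manual.fo', 'build/manual.pdf');
echo $command, "\n";

// To actually run it you could use something like:
// exec($command, $output, $exitCode);
```

Escaping the arguments matters once file names contain spaces, which is common enough on Windows.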
Installing All This Crap Without Going Insane
If your head is spinning from the deluge of information, take a break by engaging in some menial installation tasks.
PHP 5
I’m going to assume you can safely install PHP 5 without me holding your hand. Just make sure you also include PEAR! If you need to install PEAR separately consult the documentation at http://pear.php.net.
Phing
Installing Phing is done from the command line using PEAR. Visit Phing’s place on the web where the User Manual exists if you intend to attempt a non-PEAR install. Here are the usual steps needed:
pear channel-discover pear.phing.info
pear install phing/phing
You can also install one Phing task I release to my own PEAR channel to handle PHP source highlighting in HTML output.
pear channel-discover pear.phpspec.org
pear install phpspec/PhpDocbookHighlighterTask
I actually use two custom Tasks. The first highlights PHP code in HTML/XHTML documentation generated from Docbook and is found on the PHPSpec PEAR channel. The second is in the ZFBlog subversion repository at http://svn.astrumfutura.org/zfblog/branches/phing/PhpFoHighlighterTask.php and should be copied to the PEAR/phing/tasks/ext/ directory of your system. This task deals with highlighting PHP source code in XSL-FO output so that in PDF content it is properly highlighted (for reference this code, like the HTML version, is licensed under a New BSD License unless otherwise stated).
Here’s a copy to examine – it uses a variation of the PHP Highlighting script for HTML rewritten to apply to XSL-FO using PHP DOM:
[geshi lang=php]<?php
require_once 'phing/Task.php';

class PhpFoHighlighterTask extends Task
{
    private $_file = null;

    public function setFile($file)
    {
        $this->_file = $file;
    }

    public function init()
    {}

    public function main()
    {
        $this->_highlightFile($this->_file);
        $this->log('PHP in XSL-FO highlighted');
    }

    private function _highlightFile($file)
    {
        $dom = new DOMDocument();
        $dom->load($file);
        $xpath = new DOMXPath($dom);
        // Only blocks flagged for this task are touched
        $elements = $xpath->query("//fo:block[@phing='phpfohighlightertask']");
        foreach ($elements as $block) {
            self::_highlightBlock($block, $dom);
            $block->removeAttribute('phing');
        }
        $dom->save($file);
    }

    private static function _highlightBlock($block, $fo)
    {
        // Turn escaped entities back into literal characters
        $toHighlight = str_replace(
            array('&gt;', '&lt;', '&amp;', '&quot;'),
            array('>', '<', '&', '"'),
            $block->nodeValue
        );
        // This basically prevents highlighting of non
        // HTML, XML and PHP source code. Note: All PHP to
        // be highlighted this way must have <?php at the top
        // (The two compared literals were lost from the published listing:
        // '<?php' follows from the note above, '<!DOCTYPE' is an assumption.)
        if (substr($toHighlight, 0, 5) !== '<?php'
        && substr($toHighlight, 0, 9) !== '<!DOCTYPE'
        && !preg_match("/^<[^>]*>/", $toHighlight)) {
            return;
        }
        // Why manually highlight when it's built into PHP!
        // edit php.ini or add config to change colours
        $code = highlight_string($toHighlight, true);
        // Strip the <code> wrapper and HTML-only markup so the result loads as XML
        $code = str_replace(
            array('<code>', '</code>', '&nbsp;', '<br />', "\r"),
            array('', '', ' ', "\n", "\n"),
            $code
        );
        $code = preg_replace("!\n\n\n+!", "\n\n", $code);
        $code = trim($code);
        $dom = new DOMDocument();
        $dom->loadXML($code);
        $xpath = new DOMXPath($dom);
        $parentSpan = $xpath->query('/span')->item(0);
        // The style attribute looks like "color: #000000"; grab the hex code
        $style = $parentSpan->getAttributeNode('style')->value;
        $colour = substr($style, 7, 7);
        $content = $parentSpan->nodeValue;
        $inlineParent = $fo->createElement('fo:inline');
        $inlineParent->setAttribute('color', $colour);
        $nodes = $xpath->query('/span/node()');
        foreach ($nodes as $node) {
            if ($node->nodeType == XML_ELEMENT_NODE) {
                self::_appendInlineChild($node, $inlineParent, $fo);
            } else {
                $child = $fo->importNode($node, true);
                $inlineParent->appendChild($child);
            }
        }
        // Side effect of XSL-FO complexity is the odd blank monospace box
        // This strips them out – sort of a workaround. Means this code could
        // be improved a bit so stripping is not needed to start with!
        if ($inlineParent->firstChild !== null
        && preg_match("/^\s+$/", $inlineParent->firstChild->textContent)) {
            $inlineParent->removeChild($inlineParent->firstChild);
        }
        // Remove via firstChild: deleting inside a foreach over the live
        // childNodes list skips every other node
        while ($block->firstChild) {
            $block->removeChild($block->firstChild);
        }
        $block->appendChild($inlineParent);
    }

    private static function _appendInlineChild($span, $inlineParent, $fo)
    {
        $style = $span->getAttributeNode('style')->value;
        $colour = substr($style, 7, 7);
        $content = $span->nodeValue;
        $inlineChild = $fo->createElement('fo:inline', $content);
        $inlineChild->setAttribute('color', $colour);
        $inlineParent->appendChild($inlineChild);
    }
}[/geshi]
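The substr($style, 7, 7) calls in the task are the least obvious part, so here is a small standalone sketch of that one step. The sample markup is a fixed, hand-written string that merely resembles the nested span elements the task parses; actual highlight_string() output depends on your PHP version and php.ini colours, so treat the colour values as illustrative.

```php
<?php
// Hand-written sample resembling highlighter output (illustrative colours only).
$sample = '<span style="color: #000000">'
        . '<span style="color: #0000BB">&lt;?php </span>'
        . '<span style="color: #007700">echo </span>'
        . '</span>';

$dom = new DOMDocument();
$dom->loadXML($sample);
$xpath = new DOMXPath($dom);

$colours = array();
foreach ($xpath->query('/span/span') as $span) {
    // The attribute reads "color: #RRGGBB": skip the 7 characters of
    // "color: " and take the 7 characters of the hex code.
    $style = $span->getAttribute('style');
    $colours[] = substr($style, 7, 7);
}

echo implode(' ', $colours), "\n";
```

Each extracted value is what the task copies onto the color attribute of an fo:inline element. The fixed offsets only hold while the style attribute keeps exactly this shape, which is worth remembering if the output format ever changes.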
With Phing installed – our PHP environment is complete. Let’s now grab the remaining elements.
An Example Zend Framework Blog Application Tutorial (Revised)
Nov 5th
The Zend Framework has been with us for quite some time but exploratory articles which deliver a full example of how to use the Framework without getting completely buried in the detail remain thin on the ground. To add to the landscape, last April I wrote Part 1 of an “Example Zend Framework Blog Application Tutorial” and the series continued for 9 Parts.
A lot happens in a short time! The Zend Framework has itself added more features, corrected bugs, and programmers have started to wield the framework in greater numbers. This “Revised” series is an update to the original series which temporarily ceased over the Summer months. In these revised articles I’ll continue the tradition of detailed examinations of specific topics within the scope of a single real life Blog Application, which I will be deploying once completed, with the addition of corrections, better flowing steps and additional pointers.
No one-person article is capable of being perfect. Over the months people have made dozens of comments by blog and email and these will also be incorporated along with several intermediary editing entries. The result should be a fantastic introduction to the Zend Framework which gets down and dirty with a real application but keeps the overwhelming details at a distance. The point is to see a real application being developed in simple consumable steps without repeating the entire reference manual.
The revised series will also kick off a more formalised publishing process. As new Parts are unveiled, transformed PDF and HTML versions will come online (at an undisclosed future location) for download. You will be able to print off copies for future reference outside of the current blog format. Unlike the blog, these are more malleable to subsequent updates and corrections. Manning’s MEAP in other words – without the upfront cost (though donations to my server costs will be appreciated).
The PDF/HTML publishing process has been set up and automated for quite a while. Some of you may remember the test PDF release of Part 9 to see how the format struck readers. I’ll talk more about that process in a future article because it’s nice to show that professional looking documentation isn’t remotely difficult and just takes a little bit of patience to get your environment set up for it.
Expect Part 1 (Revised) in the very near future!
I had a lot of internal debate over what to do with the tutorial. The summer months made it clear that hosting the series was beyond the capabilities of a shared host, and I later discovered even my Slicehost account was running into swap space. To balance my complete devotion to free access to the series with the resources needed to keep it online, I’ll be adopting a mixed donation/advertising model. The blog will always (emphatically!) remain ad free but the PDF/HTML versions will have a prominent donations button and at least one advertisement column (ed: not in the actual PDF file – just its hosting page). The resulting funds should cover the hosting costs quite handily.
More details soon. It’s been a long time coming.
