<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">

<channel>
	<title>Pádraic Brady &#187; PHP Security</title>
	<atom:link href="http://blog.astrumfutura.com/category/PHP-Security/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.astrumfutura.com</link>
	<description>PHP, Zend Framework and Other Crazy Stuff</description>
	<lastBuildDate>Thu, 12 Apr 2012 17:33:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com"/><creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/</creativeCommons:license>		<item>
		<title>A Hitchhiker&#8217;s Guide to Cross-Site Scripting (XSS) in PHP (Part 1): How Not To Use Htmlspecialchars() For Output Escaping</title>
		<link>http://blog.astrumfutura.com/2012/03/a-hitchhikers-guide-to-cross-site-scripting-xss-in-php-part-1-how-not-to-use-htmlspecialchars-for-output-escaping/</link>
		<comments>http://blog.astrumfutura.com/2012/03/a-hitchhikers-guide-to-cross-site-scripting-xss-in-php-part-1-how-not-to-use-htmlspecialchars-for-output-escaping/#comments</comments>
		<pubDate>Mon, 12 Mar 2012 20:49:36 +0000</pubDate>
		<dc:creator>padraic</dc:creator>
				<category><![CDATA[PHP General]]></category>
		<category><![CDATA[PHP Security]]></category>
		<category><![CDATA[Zend Framework]]></category>
		<category><![CDATA[Character encodings in HTML]]></category>
		<category><![CDATA[Cross-Site Scripting]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[xss]]></category>

		<guid isPermaLink="false">http://blog.astrumfutura.com/?p=723</guid>
		<description><![CDATA[In recent weeks, I consulted with the second most intelligent species on the planet: Dolphins. Dolphins are renowned across the known Universe for their awesome programming skills. After all, it was they who developed such insightful works as &#8220;Evolution By Example&#8221;, &#8220;Dude! We Wrote The Laws Of Physics!&#8221;, and &#8220;How Many Humans Does It Take]]></description>
			<content:encoded><![CDATA[
<div class="wp-caption alignright" style="width: 250px"><a href="http://www.flickr.com/photos/57704929@N00/3528646651" target="_blank"><img class="zemanta-img-inserted zemanta-img-configured" title="Nu wordt het wel heel simpel om XSS zwakheden ..." src="http://farm3.static.flickr.com/2194/3528646651_a16d9053e1_m.jpg" alt="Nu wordt het wel heel simpel om XSS zwakheden ..." width="240" height="139" /></a><p class="wp-caption-text">(Photo credit: bertboerland)</p></div>
<p>In recent weeks, I consulted with the second most intelligent species on the planet: Dolphins. Dolphins are renowned across the known Universe for their awesome programming skills. After all, it was they who developed such insightful works as &#8220;Evolution By Example&#8221;, &#8220;Dude! We Wrote The Laws Of Physics!&#8221;, and &#8220;How Many Humans Does It Take To Screw Up A Planet?&#8221;. The answer to the last will be published on 01/01/2013 after the experiment is shut down and sent to a landfill site assuming the Supreme Spaghetti Monster signs off on the permit.</p>
<p>Dolphins think we are really dumb and theorise that this level of stupidity has one obvious cause: self-imposed ignorance. We are, after all, only the third most intelligent species on Earth and appear to have aspirations to lower our IQ just a bit more.</p>
<p>While it&#8217;s no harm poking fun at ourselves, in PHP we do have a serious problem. <a class="zem_slink" title="Cross-site scripting" rel="wikipedia" href="http://en.wikipedia.org/wiki/Cross-site_scripting" target="_blank">Cross-Site Scripting</a> (XSS) remains one of the most significant classes of security problems afflicting PHP applications. Despite years of education, community awareness and the development of frameworks which can offer a huge boost in consistent practices &#8211; things are not getting any better.</p>
<p>So, I finally figured out what the core problem is: PHP programmers are completely clueless about XSS. It&#8217;s that simple. Instead of going out and studying the topic, we blindly follow some preferred herd of people offering advice with heartfelt conviction despite the fact that they are probably just as ignorant as the rest of us. Does that sound like the behaviour of something which allegedly evolved into an intelligent species? The result is a mix of ignorance and stagnant knowledge that leaves PHP in an unenviable position beset by wrongheaded zealots.</p>
<p>To get the ball rolling, this two-part article series is a tour of how NOT to use <a href="http://ie2.php.net/manual/en/function.htmlspecialchars.php" target="_blank">the htmlspecialchars() function</a> that is typically press-ganged into service as PHP&#8217;s universal output escaper. By offering an example-based guide, I hope to illustrate just how many ways a prospective attacker using XSS can exploit this function&#8217;s misuse to pull off a successful attack. The examples were written for PHP 5.3, so 5.4 users may need to imagine they still have 5.3 installed and/or lodge an official complaint with somebody who looks like they keep a complaints box handy (your local fast food restaurant is a good start).</p>
<p>This example-led approach has another motive. Simple examples can be translated into unit tests. Ideally, many of the current crop of frameworks can use this article as a guide to what their unit tests should be looking for. This also makes it far easier for everyday programmers to consume the article and run around the place, drunk with ungodly power, identifying issues in the libraries, frameworks and other projects that they rely on.</p>
<p>To help us on the path of enlightenment before it&#8217;s too late (I&#8217;d lodge an appeal with the Supreme Spaghetti Monster but apparently the Mayans already tried and failed), I also invite other PHP programmers to blog about a security topic over the next month or two. Give programmers one last chance to get it right before the Planet is demolished by the Vogon destructor fleet. Just pick a topic that drives you up the walls in defiance of gravity and spend an hour writing something useful and (optionally) expletive filled. Every little bit helps.</p>
<h1>What Is Htmlspecialchars()?</h1>
<p>According to many programmers from Earth, htmlspecialchars() is a function used to escape output to prevent XSS. This is, however, a completely wrong definition. The function was actually co-opted by programmers to combat XSS because it was either that or write slow userland functions and wait for the internals developers to get around to creating a speedier C alternative, when the full moon coincided with the right planetary alignment in another 314 years. The actual definition (along with a half-hearted self-doubting nod to preventing XSS) is as follows:</p>
<blockquote><p>Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings. This function returns a string with some of these conversions made; the translations made are those most useful for everyday web programming. If you require all HTML character entities to be translated, use htmlentities() instead. This function is useful in preventing user-supplied text from containing HTML markup, such as in a message board or guest book application.</p></blockquote>
<p>Note that this hints at, but does not explicitly use, the terms Cross-Site Scripting, XSS or even Security. Then again, it does refer to guest book applications so it was probably written in 1790 by the Dolphin who created PHP v86 and who then got around to backporting version 1.0 for Humans in the late 20th Century out of extreme pity for our reliance on CGI. No, not the let&#8217;s take an action movie and turn it into a plotless eyesore with computer generated fake stuff style CGI &#8211; though memories of both are comparably bad.</p>
<p>Does this make htmlspecialchars() terrible at preventing XSS? No. As part of a comprehensive well-understood strategy to prevent XSS, the function is very useful. However, in PHP it is frequently overused, misused, abused, confused and&#8230;. Darn it, ran out of rhyming words again. Suffice it to say that a good description of htmlspecialchars() is that it&#8217;s an unsuitable tool for preventing XSS that has slowly evolved into a better suited tool over the years. I keep telling myself that, at least.</p>
<p>The function, htmlspecialchars(), accepts four parameters. Here is its function prototype as of PHP 5.4:</p>
<pre>string htmlspecialchars ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = 'UTF-8' [, bool $double_encode = true ]]] )</pre>
<p>The first parameter accepts a string whose special HTML characters will be converted to HTML entities. The second accepts one or more flags; it defaults to ENT_COMPAT (does not convert single quotes to entities) but should be set to ENT_QUOTES (does convert single quotes to entities). In PHP 5.4 you can include another flag, ENT_SUBSTITUTE, which is not a bad idea for UTF-8, i.e. ENT_QUOTES | ENT_SUBSTITUTE. You can pretend that all the other constants don&#8217;t exist. The third parameter accepts a string indicating the <a class="zem_slink" title="Character encoding" rel="wikipedia" href="http://en.wikipedia.org/wiki/Character_encoding" target="_blank">character encoding</a> of the string being processed and defaults to ISO-8859-1 for PHP 5.3, and UTF-8 for PHP 5.4. Don&#8217;t ever set the fourth parameter to FALSE when escaping (TRUE, the default, re-encodes any entities already present in the input) unless your filtering logic was written by an Über Dolphin &#8211; always keep filtering and escaping separate from each other to avoid confusing the two and then having to pointlessly argue why your way is better in defiance of all logic.</p>
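<p>To make the fourth parameter less abstract, here is a minimal sketch of how $double_encode changes the result. The sample input is invented for illustration; the flags and encoding are passed explicitly so the outcome does not depend on the PHP version&#8217;s defaults.</p>

```php
<?php
// Illustrative input: two existing entities, a raw ampersand and a raw tag.
$input = '&lt;b&gt; & <i>';

// $double_encode = true (the default): every ampersand is escaped,
// including the ones that begin entities already present in the input.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', true);

// $double_encode = false: existing entities pass through untouched.
$lenient = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', false);

echo $strict, "\n";  // &amp;lt;b&amp;gt; &amp; &lt;i&gt;
echo $lenient, "\n"; // &lt;b&gt; &amp; &lt;i&gt;
```

<p>With FALSE, whatever entities a user supplies are trusted to be correct as they stand, which is a filtering decision smuggled into an escaping call.</p>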
<p>The function, if correctly configured using this super simple article for guidance, will now convert the following characters to entities: &lt;, &gt;, ', &quot; and &amp;. These characters make sense to escape since they are used to construct HTML tags, delineate attribute values or reference HTML entities &#8211; none of which we want users to be able to do!</p>
<p>If you want some very good advice before your brain implodes from too much reading, a good way to potentially make yourself vulnerable to XSS is to not explicitly set the first two optional parameters ($flags and $encoding) to an appropriate value. In fact, if you see htmlspecialchars() missing either of those two parameters in someone&#8217;s source code, you should request that they fix it or, at the very least, curse their name and pray for the Supreme Spaghetti Monster to label them as biohazardous waste in need of emergency disposal.</p>
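<p>One way to avoid forgetting those two parameters is to wrap the call once and use the wrapper everywhere. What follows is only a sketch: the name escape_html() is made up for this illustration, and it assumes PHP 5.4 or later because of ENT_SUBSTITUTE (drop that flag on 5.3).</p>

```php
<?php
// Hypothetical helper (the name is an assumption, not from any library).
// It pins both optional parameters so no call site can omit them.
function escape_html($string)
{
    // ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE (PHP 5.4+)
    // replaces invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo escape_html("<a href='x'>Tom & \"Jerry\"</a>"), "\n";
// &lt;a href=&#039;x&#039;&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;
```

<p>A single wrapper also gives reviewers one place to check, rather than every htmlspecialchars() call in the codebase.</p>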
<p>Now, let&#8217;s get down to overloading your brain with information. I&#8217;m told that this part is like being sucked into the <a class="zem_slink" title="Technology in The Hitchhiker's Guide to the Galaxy" rel="wikipedia" href="http://en.wikipedia.org/wiki/Technology_in_The_Hitchhiker%27s_Guide_to_the_Galaxy" target="_blank">Total Perspective Vortex</a> machine on Frogstar World B.</p>
<h1>To Quote Or Not To Quote. How Is That A Question?</h1>
<p>As it turns out, HTML is not simply a popular markup language, it is a popular markup language designed by a bureaucratic species of transdimensional beings seeking to drive Humanity insane by inventing the most impossible-to-secure markup language known in 172 Universes which is then interpreted by &#8220;browsers&#8221; written by Mice to test the patience of security professionals and keep the really intelligent Humans distracted from the truth of their soon-to-end existence as they search out ever more ludicrous examples of parsing weirdness. Excuse me, I held my breath writing that and need to fetch my Oxygen tank&#8230;</p>
<p>Consider the following example. If you want to see whether the examples work without copying and pasting, you can clone all of them from my ominously titled <a href="https://github.com/padraic/xss" target="_blank">xss repository on Github</a> into a webroot somewhere to read or execute them.</p>
<div id="wpshdo_1" class="wp-synhighlighter-outer"><div id="wpshdt_1" class="wp-synhighlighter-expanded"><table border="0" width="100%"><tr><td align="left" width="80%"><a name="#codesyntax_1"></a><a id="wpshat_1" class="wp-synhighlighter-title" href="#codesyntax_1"  onClick="javascript:wpsh_toggleBlock(1)" title="Click to show/hide code block">Single Quoted Attributes</a></td><td align="right"><a href="#codesyntax_1" onClick="javascript:wpsh_code(1)" title="Show code only"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/code.png" /></a>&nbsp;<a href="#codesyntax_1" onClick="javascript:wpsh_print(1)" title="Print code"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/printer.png" /></a>&nbsp;<a href="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/About.html" target="_blank" title="Show plugin information"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/info.gif" /></a>&nbsp;</td></tr></table></div><div id="wpshdi_1" class="wp-synhighlighter-inner" style="display: block;"><div class="php" style="font-family:monospace;"><ol><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; font-weight: bold;">&lt;?php</span> <a href="http://www.php.net/header"><span style="color: #990000;">header</span></a><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'Content-Type: text/html; charset=UTF-8'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #000000; font-weight: bold;">?&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; 
margin:0; padding:0; background:none; vertical-align:top;">&lt;!DOCTYPE html&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; font-weight: bold;">&lt;?php</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$input</span> <span style="color: #339933;">=</span> <span style="color: #0000cc; font-style: italic;">&lt;&lt;&lt;INPUT</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #0000cc; font-style: italic;">' onmouseover='alert(/Meow!/);</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #0000cc; font-style: italic;">INPUT</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #009933; font-style: italic;">/**</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #009933; font-style: italic;">&nbsp;* NOTE: This is equivalent to using htmlspecialchars($input, ENT_COMPAT)</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #009933; font-style: italic;">&nbsp;*/</span></pre></li><li 
style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <a href="http://www.php.net/htmlspecialchars"><span style="color: #990000;">htmlspecialchars</span></a><span style="color: #009900;">&#40;</span><span style="color: #000088;">$input</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; font-weight: bold;">?&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;html&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;head&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;title&gt;Single Quoted Attribute&lt;/title&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=UTF-8&quot;&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/head&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; 
vertical-align:top;">&lt;body&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;div&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">        &lt;span title='<span style="color: #000000; font-weight: bold;">&lt;?php</span> <span style="color: #b1b100;">echo</span> <span style="color: #000088;">$output</span> <span style="color: #000000; font-weight: bold;">?&gt;</span>'&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">            What's that latin placeholder text again?</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">        &lt;/span&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;/div&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/body&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/html&gt;</pre></li></ol></div></div></div>
<p>If you run the example from a browser and pass your mouse pointer over the text, you will get a popup saying &#8220;/Meow!/&#8221;. Granted, this is hardly the most impressive XSS ever but remember that the Javascript executed could be a lot more ingenious and damaging. The reason you see alert() used everywhere in XSS examples is to prove that Javascript was executable &#8211; a real attacker will hardly advertise his success like this.</p>
<p>In this case, the htmlspecialchars() function call omits the second parameter which defaults to using the ENT_COMPAT flag. With this setting, the function does not convert single quotes to entities, allowing us to inject an unescaped single quote (to close the title attribute value) and another to start a new attribute and value which will be closed by the final single quote used in the template.</p>
<p>We can fix this problem in one of two ways:</p>
<p>1. Use double quotes which will prevent user input from breaking out of the HTML attribute value context using single quotes; or</p>
<p>2. Set the second parameter to htmlspecialchars() to use the ENT_QUOTES flag which will escape any single quotes a user tries to inject.</p>
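<p>Both fixes can be checked side by side with a few lines of PHP. This is a sketch using the same payload as the example above; ENT_COMPAT is passed explicitly so the comparison does not depend on which default your PHP version ships with.</p>

```php
<?php
// Same payload as the single quoted attribute example.
$input = "' onmouseover='alert(/Meow!/);";

$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8'); // single quotes untouched
$quotes = htmlspecialchars($input, ENT_QUOTES, 'UTF-8'); // single quotes escaped

// Vulnerable: ENT_COMPAT output inside a single quoted attribute.
echo "<span title='", $compat, "'>\n";
// <span title='' onmouseover='alert(/Meow!/);'>

// Fix 2: ENT_QUOTES output inside a single quoted attribute.
echo "<span title='", $quotes, "'>\n";
// <span title='&#039; onmouseover=&#039;alert(/Meow!/);'>

// Fix 1: ENT_COMPAT output inside a double quoted attribute.
echo '<span title="', $compat, '">', "\n";
// <span title="' onmouseover='alert(/Meow!/);">
```

<p>In the first line of output the browser sees an empty title followed by a brand new onmouseover attribute; in the other two the payload stays inside the title value.</p>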
<p>The moral of the story can be made even clearer by another example. In this case we use another perfectly validating means of delineating attribute values in HTML5 &#8211; we just don&#8217;t bother using quotes at all!</p>
<div id="wpshdo_2" class="wp-synhighlighter-outer"><div id="wpshdt_2" class="wp-synhighlighter-expanded"><table border="0" width="100%"><tr><td align="left" width="80%"><a name="#codesyntax_2"></a><a id="wpshat_2" class="wp-synhighlighter-title" href="#codesyntax_2"  onClick="javascript:wpsh_toggleBlock(2)" title="Click to show/hide code block">Quoteless Attributes</a></td><td align="right"><a href="#codesyntax_2" onClick="javascript:wpsh_code(2)" title="Show code only"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/code.png" /></a>&nbsp;<a href="#codesyntax_2" onClick="javascript:wpsh_print(2)" title="Print code"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/printer.png" /></a>&nbsp;<a href="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/About.html" target="_blank" title="Show plugin information"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/info.gif" /></a>&nbsp;</td></tr></table></div><div id="wpshdi_2" class="wp-synhighlighter-inner" style="display: block;"><div class="php" style="font-family:monospace;"><ol><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; font-weight: bold;">&lt;?php</span> <a href="http://www.php.net/header"><span style="color: #990000;">header</span></a><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'Content-Type: text/html; charset=UTF-8'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #000000; font-weight: bold;">?&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; 
padding:0; background:none; vertical-align:top;">&lt;!DOCTYPE html&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; font-weight: bold;">&lt;?php</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$input</span> <span style="color: #339933;">=</span> <span style="color: #0000cc; font-style: italic;">&lt;&lt;&lt;INPUT</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #0000cc; font-style: italic;">faketitle onmouseover=alert(/Meow!/);</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #0000cc; font-style: italic;">INPUT</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <a href="http://www.php.net/htmlspecialchars"><span style="color: #990000;">htmlspecialchars</span></a><span style="color: #009900;">&#40;</span><span style="color: #000088;">$input</span><span style="color: #339933;">,</span> <span style="color: #009900; font-weight: bold;">ENT_QUOTES</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; 
vertical-align:top;"><span style="color: #000000; font-weight: bold;">?&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;html&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;head&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;title&gt;Quoteless Attribute&lt;/title&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=UTF-8&quot;&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/head&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;body&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;div&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">        &lt;span title=<span style="color: #000000; font-weight: bold;">&lt;?php</span> <span style="color: #b1b100;">echo</span> <span style="color: #000088;">$output</span> <span style="color: #000000; font-weight: bold;">?&gt;</span>&gt;</pre></li><li style="font-weight: normal; 
vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">            What's that latin placeholder text again?</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">        &lt;/span&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;/div&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/body&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/html&gt;</pre></li></ol></div></div></div>
<p>Without quotes delineating the attribute value, any space character (including any character a browser might interpret as a space &#8211; there are a lot!) allows the user to inject new attributes and values. As the example above shows, converting all quotes to entities is pointless if there are no quotes to start with! Our escaping doesn&#8217;t convert spaces or other space-interpreted characters into entities at all.</p>
<p>By now, you should see the obvious. All HTML attribute values MUST be quoted, and preferably DOUBLE quoted, in any scenario where you suspect untrusted input will be injected into an attribute value, or where htmlspecialchars() calls do not set the second parameter to use ENT_QUOTES. Believe it or not, using single quotes or no quotes remains popular and is perfectly valid under the new HTML5 spec. Some people even celebrate this new insanity. Keep an eye on any designers who look a bit wild eyed or spend too much time smiling while staring into empty space.</p>
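<p>If you would rather not trust every template author to remember the quotes, the quoting can be moved into code. The helper below is a hypothetical sketch (html_attr() is an invented name) that assumes the attribute name is hard-coded by the developer and only the value is untrusted.</p>

```php
<?php
// Hypothetical helper: always emits a double quoted attribute.
// $name is assumed to be a trusted, hard-coded attribute name.
function html_attr($name, $value)
{
    $escaped = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    return $name . '="' . $escaped . '"';
}

// Same payload as the quoteless attribute example.
$input = 'faketitle onmouseover=alert(/Meow!/);';

// The payload contains none of the five special characters,
// so escaping alone returns it unchanged.
echo htmlspecialchars($input, ENT_QUOTES, 'UTF-8') === $input
    ? "unchanged\n"
    : "changed\n";
// unchanged

echo '<span ', html_attr('title', $input), ">\n";
// <span title="faketitle onmouseover=alert(/Meow!/);">
```

<p>The escaping did nothing to the payload; it is the surrounding double quotes that keep the space from ending the attribute value.</p>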
<h1>Excuse Me, Sir, But Someone Ate My Quotes</h1>
<p>One of the great mysteries in escaping output is a common myth known as the Great ASCII Delusion (GAD). Those under the influence of this delusion, besides hearing voices in their head, have arrived at a belief that many character encodings are equivalent for the purposes of escaping those characters which have a special meaning for HTML, e.g. ISO-8859-1 and UTF-8. Alas, this is untrue because the Mice created something called Internet Explorer 6 &#8211; a thoroughly shameful (but still commonly used) browser which corporations across the Planet continue to insist on using because buying new computers and upgrading operating systems just to use some fancy new Microsoft Office version is seen as a waste of shareholder funds.</p>
<p>Internet Explorer 6 is the bad boy of the XSS world since it&#8217;s vulnerable to ridiculous exploits no decent modern browser would dare associate with. Even Netscape would probably spit on it from beyond the grave. For instance, have a go with the following example using IE6 and PHP 5.3. If you need a testing version of all IE browsers since IE 5.5, you can download IETester from http://www.my-debugbar.com/ietester/index_all.php and use it from Windows. Try hard, I know Windows is bad and the new Tablet makeover for Windows 8 makes you feel ill, but it&#8217;s important to see these examples in action.</p>
<pre>&lt;?php header('Content-Type: text/html; charset=UTF-8'); ?&gt;
&lt;!DOCTYPE html&gt;
&lt;?php
/**
 * You could also substitute \xC0 or any other affected byte value
 * from 192 (0xC0) upwards
 */
$input1 = 'fakeimage'.chr(192);
$input2 = &lt;&lt;&lt;INPUT2
onerror=alert(/Meow!/)//
INPUT2;
// No third parameter: the encoding falls back to the PHP default
$output1 = htmlspecialchars($input1, ENT_QUOTES);
$output2 = htmlspecialchars($input2, ENT_QUOTES);
?&gt;
&lt;html&gt;
&lt;head&gt;
    &lt;title&gt;Swallowed Quotes&lt;/title&gt;
    &lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=UTF-8&quot;&gt;
&lt;/head&gt;
&lt;body&gt;
    &lt;div&gt;
        &lt;img src=&quot;http://example.com/images/&lt;?php echo $output1 ?&gt;&quot;
        title=&quot;&lt;?php echo $output2 ?&gt;&quot;&gt;
    &lt;/div&gt;
&lt;/body&gt;
&lt;/html&gt;</pre>
<p>With the above example, something very weird happens. Placing the byte value 192 (0xC0, which is well outside the ASCII range) just before a double quote in a document being interpreted as UTF-8 results in the double quote&#8230;vanishing in IE6. Seriously, it&#8217;s there but not there. Obviously the Mice are behind it &#8211; no Human could possibly defy Physics like this!</p>
<p>This allows an attacker to once again break out of the HTML attribute they can inject values into. Combine it with a second free text injection point nearby, which the browser will concatenate onto the broken-out attribute value of the first, and you get an effective XSS combo attack.</p>
<p>This IE6 quirk even bypasses the call to htmlspecialchars() which, as explained above, defaults to the ISO-8859-1 character encoding in PHP 5.3 and earlier. If the Great ASCII Delusion were not a fabrication of someone&#8217;s imaginative wishful thinking, this should not be possible. Not to be too harsh though, this weirdness is due primarily to a bug in IE6&#8217;s treatment of the various character encodings, where you can trick the browser into thinking something like \xC0 (in hex) is the start of a multi-byte character, thus swallowing the next ASCII character (the double quote).</p>
<p>To fix the above weirdness, you must make sure that escaping is done using the same character encoding that the document is being served as. The above HTML document is identifying itself as being UTF-8 but the default htmlspecialchars() encoding is ISO-8859-1 in PHP 5.3 &#8211; there&#8217;s obviously something not agreeing there! This brings us to the absolutely perfect use (well, almost) of htmlspecialchars(), the golden rule, the Word of The Supreme Spaghetti Monster, the bringer of frustration to XSS attackers:</p>
<p>Always set the third parameter to htmlspecialchars(), set it correctly, and make sure your document is never served with a mismatched or invalid character encoding! Don&#8217;t expect some theoretically perfect world to magically appear &#8211; browsers are filthily efficient at doing weird things you don&#8217;t expect.</p>
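<p>As a rough illustration of that rule, here is a minimal sketch of an escaping wrapper. The function name escape_html() and the choice of UTF-8 are my own assumptions for the example; the only point is that the flags and the encoding are always stated explicitly and match the charset the document is served with.</p>
<pre>&lt;?php
// Hypothetical helper: the name and the fixed UTF-8 encoding are assumptions.
function escape_html($value)
{
    // Pass the flags and the encoding explicitly instead of relying on defaults.
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// The declared document encoding must match the one used for escaping.
header('Content-Type: text/html; charset=UTF-8');
$name = isset($_GET['name']) ? $_GET['name'] : '';
echo escape_html($name);</pre>
<p>Routing every piece of output through one wrapper like this means there is exactly one place where the encoding can be wrong, rather than one per template.</p>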
<p>I suppose I have to mention that most versions of IE have similar issues with other character encodings such as BIG5 and Shift-JIS. You can test your IE versions using http://ha.ckers.org/weird/variable-width-encoding.cgi to see what characters can be used across different character encodings. Believe it or not, these character encodings are actually still being used and, for some strange reason, people from China and Japan do use PHP.</p>
<p>If you want to be completely paranoid, you can either check the input for invalid UTF-8 (Drupal and HTMLPurifier have reusable functions/classes for this), and/or run it through a conversion function which should theoretically filter out the naughty bits:</p>
<pre>$input = mb_convert_encoding($input, 'UTF-8', 'UTF-8');</pre>
<p>This is probably a good idea for older PHP versions (released in 2010 or earlier), but recent PHP versions have specifically improved htmlspecialchars() to disallow invalid characters such as the above (if you set the right character encoding!). You should be aware, though, that htmlspecialchars() may still return an empty string on certain malformed input and, since PHP 5.4, will not issue any warnings about this.</p>
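<p>For the paranoid route, something along these lines might do. It is only a sketch: the function name and the decision to throw exceptions are assumptions of mine, and it relies on mb_check_encoding() from the mbstring extension. It rejects invalid UTF-8 up front and also treats an empty result for non-empty input as a failure, rather than silently printing nothing.</p>
<pre>&lt;?php
// Hypothetical helper: the name and the exception types are assumptions.
function escape_utf8_strict($value)
{
    $value = (string) $value;
    // Reject byte sequences that are not valid UTF-8 (needs ext/mbstring).
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $escaped = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    // An empty result for non-empty input means the input was rejected.
    if ($escaped === '' &amp;&amp; $value !== '') {
        throw new RuntimeException('Escaping failed on malformed input');
    }
    return $escaped;
}</pre>
<p>Whether you throw, log, or substitute a placeholder is an application decision; the sketch just makes the failure visible instead of leaving it to chance.</p>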
<h1>I Broke It! I Broke It!</h1>
<p>Before you think htmlspecialchars() is getting off lightly, there is one minor quibble. We&#8217;ll keep picking on Internet Explorer 6 for the rest of this article since it&#8217;s so easy to exploit.</p>
<pre>&lt;?php header('Content-Type: text/html; charset=UTF-8'); ?&gt;
&lt;!DOCTYPE html&gt;
&lt;?php
$input1 = 'fakeimage'.&quot;\xC0&quot;;
$input2 = &lt;&lt;&lt;INPUT2
onerror=alert(/Meow!/)//
INPUT2;
/**
 * If you think PHP 5.4 will save you - empty strings make it guess the encoding
 * or use the default_charset value from php.ini. You sure everyone on the whole
 * planet uses UTF-8? Under 5.3 - empty strings === default encoding.
 */
$encoding = ''; // from outside source or unvalidated variable
$output1 = htmlspecialchars($input1, ENT_QUOTES, $encoding);
$output2 = htmlspecialchars($input2, ENT_QUOTES, $encoding);
?&gt;
&lt;html&gt;
&lt;head&gt;
    &lt;title&gt;Swallowed Quotes&lt;/title&gt;
    &lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=UTF-8&quot;&gt;
&lt;/head&gt;
&lt;body&gt;
    &lt;div&gt;
        &lt;img src=&quot;http://example.com/images/&lt;?php echo $output1 ?&gt;&quot;
        title=&quot;&lt;?php echo $output2 ?&gt;&quot;&gt;
    &lt;/div&gt;
&lt;/body&gt;
&lt;/html&gt;</pre>
<p>Setting the third $encoding parameter of htmlspecialchars() to an empty string in PHP 5.4 will set the encoding to be auto-detected, grabbed from the php.ini value of default_charset, or guessed from the current locale (in that order). Be very careful under PHP 5.4 NEVER to let this happen. Don&#8217;t leave your escaping parameters to chance.</p>
<p>Use empty() or strlen(), for example, to spot this issue if accepting encodings from another source or variable that might allow for empty strings. Again, this behaviour is very secure and there&#8217;s nothing wrong with it whatsoever. Oh, who am I kidding&#8230; This is the dumbest parameter behaviour ever invented. NULL means use the default encoding; blank string means play a guessing game. Even Vogon poetry pales in comparison to such nonsense. One slip and an empty parameter string can rip apart this house of cards because who knows which character encoding will be used.</p>
<p>Oooh, I wonder what this does under PHP 5.3&#8230; Yes, er, don&#8217;t allow blank encoding parameter strings under PHP 5.3 either. Setting an empty string in PHP 5.3 is interpreted as setting the default character encoding, i.e. ISO-8859-1, instead of triggering the expected warning about an unsupported encoding.</p>
<p>So, be careful kids. When setting the encoding for htmlspecialchars() do a safety check to make sure it&#8217;s not an empty string you are passing in. Keep it predictable and consistent.</p>
<p>There&#8217;s also one other curious behaviour when using htmlspecialchars().</p>
<pre>&lt;?php header('Content-Type: text/html; charset=UTF-8'); ?&gt;
&lt;!DOCTYPE html&gt;
&lt;?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
$input1 = 'fakeimage'.&quot;\xC0&quot;;
$input2 = &lt;&lt;&lt;INPUT2
onerror=alert(/Meow!/)//
INPUT2;
/**
 * Invalid encoding makes htmlspecialchars() throw a warning but it continues
 * the current operation anyway using the default encoding even if the default
 * is an unsafe choice for the application. Don't allow invalid encodings!
 */
$encoding = 'invalid-encoding'; // from outside source or unvalidated variable
$output1 = htmlspecialchars($input1, ENT_QUOTES, $encoding);
$output2 = htmlspecialchars($input2, ENT_QUOTES, $encoding);
?&gt;
&lt;html&gt;
&lt;head&gt;
    &lt;title&gt;Swallowed Quotes&lt;/title&gt;
    &lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=UTF-8&quot;&gt;
&lt;/head&gt;
&lt;body&gt;
    &lt;div&gt;
        &lt;img src=&quot;http://example.com/images/&lt;?php echo $output1 ?&gt;&quot;
        title=&quot;&lt;?php echo $output2 ?&gt;&quot;&gt;
    &lt;/div&gt;
&lt;/body&gt;
&lt;/html&gt;</pre>
<p>When you set an invalid character encoding (as opposed to the empty string of doom), htmlspecialchars() will issue a Warning level error&#8230;and continue merrily on its way by reinstating the default encoding. In a production scenario, you will likely have display_errors disabled and this warning will be logged and possibly ignored by some users. If this makes it through, setting an invalid character encoding, whether by a deliberate user value or simple programmer error, may create an exploitable scenario.</p>
<p>So, make sure you also validate the character encoding. Don&#8217;t just leave it up to htmlspecialchars() since it allows the continued execution of the application. Arguably this should be a fatal error since a bad encoding is itself a security problem.</p>
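<p>As a minimal sketch of that validation, assuming a hypothetical whitelist constant and wrapper function (neither is part of PHP), you could refuse unknown encodings before htmlspecialchars() ever sees them:</p>

```php
<?php
// Hypothetical whitelist: only encodings we know htmlspecialchars() supports
// and that we are prepared to serve.
const ALLOWED_ENCODINGS = ['UTF-8', 'ISO-8859-1'];

/**
 * Escape a string for HTML output, refusing any encoding that is not on the
 * whitelist instead of letting htmlspecialchars() fall back to a default.
 */
function escapeHtml(string $value, string $encoding = 'UTF-8'): string
{
    if (!in_array($encoding, ALLOWED_ENCODINGS, true)) {
        // Halt loudly: a bad encoding is treated as a security problem.
        throw new InvalidArgumentException('Unsupported character encoding');
    }
    return htmlspecialchars($value, ENT_QUOTES, $encoding);
}

echo escapeHtml('"Meow!"'), "\n";
```

<p>Throwing an exception here is a design choice rather than the only option; the point is simply that execution stops instead of carrying on with an encoding nobody asked for.</p>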
<p>Seriously, this function is like handing a box of matches to a Human and telling them there&#8217;s a rainforest nearby that&#8217;s essential to all life on Earth&#8230;</p>
<h1>Internet Explorer: Master Of Supporting Stupid Character Encodings</h1>
<p>Internet Explorer is unique in the Universe. Designed by Mice to be the dumbest, most frustrating, most stubbornly non-upgradeable piece of crap ever, it does things that make XSS far easier. The terrible part is that IE is popular with corporations and businesses using commodity hardware imported from whichever country currently has the lowest paid PC assemblers on Earth. One would think they&#8217;d like a more secure browser to protect their money making endeavours.</p>
<p>It&#8217;s no wonder that Dolphins had to think long and hard before deciding we were marginally smarter than the average cat. Cats, coincidentally, strenuously deny this claim having spent thousands of years demonstrating a lack of Human intelligence by showing how easy it is to make Humans cater to their every need&#8230;for free. Even their main rivals, Dogs, are expected to do useful work like herding sheep, chasing cars, digging holes, barking at strangers, and keeping bill bearing postal workers at bay.</p>
<p>All versions of Internet Explorer support a troublesome character encoding called UTF-7 which, oddly enough, is not supported by htmlspecialchars(). You can probably see where this is going. How do you escape a character encoding that your escaper doesn&#8217;t even support? Easy, you can&#8217;t. JUST DON&#8217;T USE UTF-7! EVER! UTF-7 has the distinction of definitely not being ASCII compatible &#8211; it encodes angle brackets (used to open and close HTML tags) very differently so they are never detected by filters or escapers relying on other character encodings.</p>
<p>Unfortunately, some applications do allow users to cherry pick an encoding. It&#8217;s not uncommon in international websites (e.g. Google which had this problem). Here&#8217;s an example of what not to do:</p>
<div id="wpshdo_6" class="wp-synhighlighter-outer"><div id="wpshdt_6" class="wp-synhighlighter-expanded"><table border="0" width="100%"><tr><td align="left" width="80%"><a name="#codesyntax_6"></a><a id="wpshat_6" class="wp-synhighlighter-title" href="#codesyntax_6"  onClick="javascript:wpsh_toggleBlock(6)" title="Click to show/hide code block">Source code</a></td><td align="right"><a href="#codesyntax_6" onClick="javascript:wpsh_code(6)" title="Show code only"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/code.png" /></a>&nbsp;<a href="#codesyntax_6" onClick="javascript:wpsh_print(6)" title="Print code"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/printer.png" /></a>&nbsp;<a href="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/About.html" target="_blank" title="Show plugin information"><img border="0" style="border: 0 none" src="http://blog.astrumfutura.com/wp-content/plugins/wp-synhighlight/themes/default/images/info.gif" /></a>&nbsp;</td></tr></table></div><div id="wpshdi_6" class="wp-synhighlighter-inner" style="display: block;"><div class="php" style="font-family:monospace;"><ol><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; font-weight: bold;">&lt;?php</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$input1</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'UTF-7'</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; 
background:none; vertical-align:top;"><span style="color: #000088;">$input2</span> <span style="color: #339933;">=</span> <span style="color: #0000cc; font-style: italic;">&lt;&lt;&lt;INPUT2</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #0000cc; font-style: italic;">&lt;script&gt;alert(/Meow!/)//&lt;/script&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #0000cc; font-style: italic;">INPUT2</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$input2</span> <span style="color: #339933;">=</span> <a href="http://www.php.net/mb_convert_encoding"><span style="color: #990000;">mb_convert_encoding</span></a><span style="color: #009900;">&#40;</span><span style="color: #000088;">$input2</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'UTF-7'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'UTF-8'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$output1</span> <span style="color: #339933;">=</span> <a href="http://www.php.net/htmlspecialchars"><span style="color: #990000;">htmlspecialchars</span></a><span style="color: #009900;">&#40;</span><span style="color: #000088;">$input1</span><span style="color: #339933;">,</span> <span style="color: #009900; font-weight: 
bold;">ENT_QUOTES</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'UTF-8'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000088;">$output2</span> <span style="color: #339933;">=</span> <a href="http://www.php.net/htmlspecialchars"><span style="color: #990000;">htmlspecialchars</span></a><span style="color: #009900;">&#40;</span><span style="color: #000088;">$input2</span><span style="color: #339933;">,</span> <span style="color: #009900; font-weight: bold;">ENT_QUOTES</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'UTF-8'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&nbsp;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><a href="http://www.php.net/header"><span style="color: #990000;">header</span></a><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'Content-Type: text/html; charset='</span><span style="color: #339933;">.</span><a href="http://www.php.net/trim"><span style="color: #990000;">trim</span></a><span style="color: #009900;">&#40;</span><span style="color: #000088;">$output1</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;"><span style="color: #000000; 
font-weight: bold;">?&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;!DOCTYPE html&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;html&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;head&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;title&gt;Mismatched Encoding&lt;/title&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;meta http-equiv=&quot;Content-Type&quot;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    content=&quot;text/html; charset=<span style="color: #000000; font-weight: bold;">&lt;?php</span> <span style="color: #b1b100;">echo</span> <span style="color: #000088;">$output1</span> <span style="color: #000000; font-weight: bold;">?&gt;</span>&quot;&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/head&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;body&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; 
padding:0; background:none; vertical-align:top;">    &lt;div&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">        <span style="color: #000000; font-weight: bold;">&lt;?php</span> <span style="color: #b1b100;">echo</span> <span style="color: #000088;">$output2</span> <span style="color: #000000; font-weight: bold;">?&gt;</span></pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">    &lt;/div&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/body&gt;</pre></li><li style="font-weight: normal; vertical-align:top;"><pre style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">&lt;/html&gt;</pre></li></ol></div></div></div>
<p>This works in all IE versions. The problem here is that we&#8217;re letting the user set the character encoding without validating it against a safe whitelist of encodings that we can actually escape. This also works even when you plead with the Supreme Spaghetti Monster and try passing UTF-7 to htmlspecialchars() since the function simply issues a warning and reinstates its ISO-8859-1 or UTF-8 default before continuing on its merry way to making you vulnerable to XSS. Yes, very secure behaviour there&#8230;</p>
<p>Note: putting the @ symbol in front of htmlspecialchars() to hide these warning errors during development is not considered an act worthy of an intelligent species. Don&#8217;t let the cats win!</p>
<p>Now, you might think that this would be the end of it, but there&#8217;s one other problem afflicting older browsers (fixed as of Internet Explorer 9). In certain scenarios you can trick the browser into rendering pages as UTF-7 even when you can&#8217;t set the page&#8217;s character encoding. This is due to a bug in how some browser versions guess the character encoding when it&#8217;s absent (i.e. not set in a header or meta tag, or set incorrectly, e.g. UTF-8 is valid; UTF8 is NOT).</p>
<p>To pull off this exploit, you first need to plant some UTF-7 text which is persisted across requests, e.g. a blog comment. Since we can&#8217;t escape UTF-7 in PHP, the persisted text will contain some UTF-7 encoded XSS code. Just in case you&#8217;re smart and thinking that mbstring functions might help detect UTF-7 &#8211; they won&#8217;t. mbstring will detect UTF-7 as UTF-8, or UTF-8 as UTF-7, depending on the detection order given to mb_detect_encoding(). After that it&#8217;s a long-winded story of using iframes to trick a browser into rendering the innocent looking UTF-7 strings on your webpages as UTF-7.</p>
<p>Where escaping fails, some common sense should win out. Just make sure all the responses you serve have a header that sets the appropriate character encoding for the content (use a valid encoding name such as UTF-8, not a malformed one such as UTF8). In HTML, use the relevant meta tag to indicate the content&#8217;s character encoding as a backup should the header be somehow omitted.</p>
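<p>A rough sketch of that advice, assuming a hypothetical charset map and helper function, might resolve any user-selected charset against a whitelist and always send an explicit header:</p>

```php
<?php
// Hypothetical map from user-facing choices to exact, valid charset names.
$allowedCharsets = ['utf-8' => 'UTF-8', 'latin1' => 'ISO-8859-1'];

/**
 * Resolve a user-supplied charset choice against the whitelist, falling back
 * to UTF-8 for anything unrecognised (which includes UTF-7).
 */
function resolveCharset(string $requested, array $allowed): string
{
    $key = strtolower(trim($requested));
    return $allowed[$key] ?? 'UTF-8';
}

// Only accept a plain string from the query string; anything else becomes ''.
$requested = isset($_GET['charset']) && is_string($_GET['charset']) ? $_GET['charset'] : '';
$charset   = resolveCharset($requested, $allowedCharsets);

// Always send an explicit, valid charset so the browser never has to guess.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . $charset);
}

// Escape with the same charset that was just declared to the browser.
$safeTitle = htmlspecialchars('Matched Encoding', ENT_QUOTES, $charset);
```

<p>The same $charset value would then be echoed into the meta tag, so the header, the meta tag, and the escaper can never disagree.</p>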
<h1>Conclusion</h1>
<p>Htmlspecialchars() as a function for escaping output has its limitations. If you&#8217;re unaware of these and wish to persist in using it incorrectly, you should expect to be burned. No, seriously, there really is an incinerator for those labelled as biohazardous waste over in Alpha Centauri.</p>
<p>I get the feeling I&#8217;ve written enough for you today. I&#8217;m very sorry for the 0.006% of you that Vogon studies indicate are now sitting at their desks drooling all over their keyboards from encroaching insanity. If you&#8217;re worried about joining the 0.006%, please submit the correct form in triplicate, completed in capitals using a blue ball-point pen, to your local Alpha Centauri Medical Facility where the friendly Vogon staff will give you a free brain scan and determine whether the incinerator next door needs more fuel.</p>
<p>So, what next? In Part 2, we continue our voyage into madness with more examples using htmlspecialchars() though in another direction this time. In the meantime, you have a lot of examples (aka ammunition) and there are a lot of applications/frameworks/libraries (targets). I figure the rest is obvious.</p>
<p>See you for Part 2!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.astrumfutura.com/2012/03/a-hitchhikers-guide-to-cross-site-scripting-xss-in-php-part-1-how-not-to-use-htmlspecialchars-for-output-escaping/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>Mockery 0.7.2 Released (And On Packagist.org!)</title>
		<link>http://blog.astrumfutura.com/2012/01/mockery-0-7-2-released-and-on-packagist-org/</link>
		<comments>http://blog.astrumfutura.com/2012/01/mockery-0-7-2-released-and-on-packagist-org/#comments</comments>
		<pubDate>Wed, 25 Jan 2012 11:07:07 +0000</pubDate>
		<dc:creator>padraic</dc:creator>
				<category><![CDATA[PHP General]]></category>
		<category><![CDATA[PHP Security]]></category>
		<category><![CDATA[Zend Framework]]></category>
		<category><![CDATA[mockery]]></category>

		<guid isPermaLink="false">http://blog.astrumfutura.com/?p=700</guid>
		<description><![CDATA[Mockery is a simple yet flexible PHP mock object framework for use in unit testing with PHPUnit, PHPSpec or any other testing framework. Its core goal is to offer a framework for creating test doubles like mock objects through the use of a simple and succinct API capable of clearly defining all possible object operations]]></description>
			<content:encoded><![CDATA[
<p><a href="https://github.com/padraic/mockery">Mockery</a> is a simple yet flexible PHP mock object framework for use in unit testing with <a class="zem_slink" title="PHPUnit" rel="homepage" href="http://www.phpunit.de">PHPUnit</a>, <a href="http://www.phpspec.net/">PHPSpec</a> or any other testing framework. Its core goal is to offer a framework for creating test doubles like mock objects through the use of a simple and succinct API capable of clearly defining all possible object operations and interactions using a human readable <a class="zem_slink" title="Domain-specific language" rel="wikipedia" href="http://en.wikipedia.org/wiki/Domain-specific_language">Domain Specific Language</a> (DSL). Designed as a drop-in alternative to PHPUnit&#8217;s <a href="https://github.com/sebastianbergmann/phpunit-mock-objects">phpunit-mock-objects</a> library, Mockery is easy to integrate with PHPUnit and can happily operate alongside phpunit-mock-objects.</p>
<p>Today, I am pleased to announce the release of Mockery 0.7.2, a maintenance release fixing a small number of bugs and annoyances. A special thanks to all those who forked the GitHub project and submitted pull requests! Leaving a developer with hardly any work to do other than a quick test and merge is always appreciated! You can install or upgrade to the new version from the <a href="http://pear.survivethedeepend.com">survivethedeepend.com PEAR channel</a>.</p>
<p>Another piece of news is that Mockery is now available on <a href="http://packagist.org/packages/mockery/mockery">Packagist.org</a> for users of <a href="http://packagist.org/about-composer">Composer</a>. Composer is a tool to help you manage your own projects&#8217; or libraries&#8217; dependencies and it can handle and mix dependencies from Composer compatible repositories like <a href="http://packagist.org">Packagist.org</a>, any git repository using tags, and any PEAR channel. I do this of my own free will and not because Luis Cordova and Benjamin Eberlei are standing behind me with pitchforks <img src='http://blog.astrumfutura.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>The more pertinent fixes include:</p>
<ol>
<li>Fixed a problem in resolving method chains which abuse the <a title="Law of Demeter" rel="wikipedia" href="http://en.wikipedia.org/wiki/Law_of_Demeter">Law of Demeter</a> (thanks to the wizardly Robert Basic).</li>
<li>Fixed unexpected static calls to an alias mock which were causing fatal errors (thanks to Luis Cordova).</li>
<li>Fixed a crash present since PHP 5.3.6 due to a referenced $this variable entering a closure (thanks to Martin Sadovy).</li>
<li>Added support for PHP_CodeCoverage 1.1 whose filter class is no longer a singleton (thanks to Matthew Vivian).</li>
<li>Added non-halting exception handling (for Mockery exceptions) to the PHPUnit TestListener (thanks to Adrian Slade).</li>
<li>Added boolean $prepend (defaults to FALSE) parameter to  \Mockery\Loader::register() to allow for registering Mockery&#8217;s  autoloader to the top of the autoloader stack even after other  autoloaders have been registered (thanks to Hermann Kosselowski).</li>
<li>Updated documentation/tests for the release of Hamcrest 1.0.0 several  days ago (thanks to me, me, me &#8211; who finally got to do something nobody  else had a pull request for!).</li>
<li>Added new \Mockery::self() static method to make retrieving the current  mock object simpler and more readable while setting expectations without  the need to refer back to past variable assignments.</li>
</ol>
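<p>To illustrate what registering an autoloader &#8220;to the top of the stack&#8221; means, here is a small sketch using PHP&#8217;s own spl_autoload_register(), whose third argument prepends the loader. That Mockery&#8217;s $prepend parameter maps onto this argument is an assumption on my part, and the class name below is purely hypothetical:</p>

```php
<?php
$order = [];

// Registered first, so it would normally be consulted first.
spl_autoload_register(function ($class) use (&$order) {
    $order[] = 'first';
});

// Registered later, but with $prepend = true (third argument) it jumps the queue.
spl_autoload_register(function ($class) use (&$order) {
    $order[] = 'prepended';
}, true, true);

// Asking for an undefined (hypothetical) class runs every autoloader in stack order.
class_exists('Hypothetical\\Missing\\ClassName');

// The prepended loader is recorded before the one registered first.
print_r($order);
```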
<p>Users should also note that <a href="http://code.google.com/p/hamcrest/downloads/list">Hamcrest 1.0.0</a>, which includes a small filename change (hamcrest.php was capitalised to Hamcrest.php), was released several days ago. If you use Hamcrest matchers with Mockery, you should ensure that both libraries are updated on your system.</p>
<p>As always, please report any bugs or potential improvements to the GitHub issue tracker using the relevant label or, even more appreciated, send me a pull request.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.astrumfutura.com/2012/01/mockery-0-7-2-released-and-on-packagist-org/feed/</wfw:commentRss>
		<slash:comments>29</slash:comments>
		</item>
		<item>
		<title>Storing Session Data In Cookies: Problems And Security Concerns To Be Aware Of</title>
		<link>http://blog.astrumfutura.com/2012/01/storing-session-data-in-cookies-problems-and-security-concerns-to-be-aware-of/</link>
		<comments>http://blog.astrumfutura.com/2012/01/storing-session-data-in-cookies-problems-and-security-concerns-to-be-aware-of/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 21:20:12 +0000</pubDate>
		<dc:creator>padraic</dc:creator>
				<category><![CDATA[PHP General]]></category>
		<category><![CDATA[PHP Security]]></category>
		<category><![CDATA[Zend Framework]]></category>

		<guid isPermaLink="false">http://blog.astrumfutura.com/?p=688</guid>
		<description><![CDATA[Back from my extended leave of absence, I&#8217;ll re-open the dusty cobwebbed depths of this blog to echo the sentiments of Paul Reinheimer in his recent article &#8220;Cookies don&#8217;t replace Sessions&#8221;. The topic is actually an old one since Ruby On Rails has adopted the strategy of storing application session data in cookies by default]]></description>
			<content:encoded><![CDATA[
<div class="wp-caption alignright" style="width: 310px"><a href="http://commons.wikipedia.org/wiki/File:ChocolateChipSmile.jpg"><img class="zemanta-img-inserted zemanta-img-configured" title="English: Peanut butter cookie with a chocolate..." src="http://upload.wikimedia.org/wikipedia/commons/thumb/e/ee/ChocolateChipSmile.jpg/300px-ChocolateChipSmile.jpg" alt="English: Peanut butter cookie with a chocolate..." width="300" height="301" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
<p>Back from my extended leave of absence, I&#8217;ll re-open the dusty cobwebbed depths of this blog to echo the sentiments of Paul Reinheimer in his recent article &#8220;<a href="http://blog.preinheimer.com/index.php?/archives/373-Cookies-dont-replace-Sessions.html">Cookies don&#8217;t replace Sessions</a>&#8221;. The topic is actually an old one since Ruby On Rails has adopted the strategy of storing application session data in cookies by default (take note, performance hounds). The purpose of storing sessions in userland cookies, rather than the conventional &#8220;stick-it-on-the-filesystem/database&#8221; approach used by many apps, is performance and a little obfuscation. Cookie data can be read faster than hitting the filesystem/database, plus it has the dubious ability to disguise the programming language behind the session. Really though, PHP is assumed to be on all web servers so hiding its existence is a bit like trying to hide an elephant in a zoo. Hide it all you want &#8211; we still know there has to be one in there!</p>
<p>In exchange for speeding up session reading, storing session data in cookies has some fairly uncomfortable costs.</p>
<p>Now, developers are not unaware of the problems of storing potentially sensitive application data in plain text files on the user&#8217;s PC, which users (or the hacker currently fiddling with the user&#8217;s PC) can manipulate, copy, and mangle to their heart&#8217;s content. It&#8217;s dangerous depending on just how much you rely on session data to drive other security rules or restrictions on business logic within the application. Technically, the reliance placed on sessions should be close to nothing &#8211; session data should drive the application towards other storage solutions for the really essential stuff and just stay around as a minimal identifier/stash of basic ID info. Such minimal information can be dumped, corrupted, or overwritten with the only cost being to perhaps require a user to login again when that happens. Stuffing a bank balance into a session, on the other hand, is one (very exaggerated!) example of the kind of data you should be shot for relying on a session for.</p>
<p>Programmers being programmers, it&#8217;s not rare to see sessions become a more intrinsically important storage location than they should be. In those cases, being able to manipulate the session data can become a problem and may give rise to exploitation scenarios where tampering with the stored data leads to some benefit for the manipulator. Obviously we want to make sure that that can&#8217;t happen even in scenarios where programmers may be a bit loose with where they store data. We don&#8217;t build frameworks and libraries for Gurus, we build them for all programmers &#8211; even the sometimes ignorant and undertrained ones. Cookie-stored session data is often coupled with the ability to encrypt that data. However&#8230;</p>
<p>As Paul Reinheimer remarks <a href="http://blog.preinheimer.com/index.php?/archives/373-Cookies-dont-replace-Sessions.html">in his article</a>, &#8220;Encryption is often viewed as a panacea for security problems, you sprinkle a little encryption dust around, and your problems dissolve&#8221;. This is an absolute truth in programming &#8211; programmers often view encryption as a solution without regard for one teeny tiny problem. If you encrypt a set of data for any purpose, even though it&#8217;s encrypted, the user (or the hacker hacking the user&#8217;s account) still has the data in some usable form!</p>
<p>With perfectly intact data, and even though it&#8217;s hidden by encryption, that data can be recycled simply by copying it to another machine. Depending on the data that is stored (which admittedly may require the hacker/user to figure out by doing actual work like finding your open source app on GitHub or breaking a developer&#8217;s fingers until they spill the beans), you can restore past data just by copying over a backup of a prior cookie or repeat a past transaction by continually reusing the original cookie it required. Paul offers a few trivial examples in his article.</p>
<p>Such reuse of data is known as a <a class="zem_slink" title="Replay attack" rel="wikipedia" href="http://en.wikipedia.org/wiki/Replay_attack">replay attack</a>: a scenario where even encrypted data can be reused again and again to produce a positive result, all without any need whatsoever to break the encryption. The antidote to this vulnerability is to ensure that all data sets are unique and can be used only once, i.e. you include a single-use nonce (some generated set of characters or bits) in the data which is updated whenever that data is used. This continually forces the update of the relevant digital HMAC signature and/or encryption result (even for the exact same data otherwise) in order to prevent any reuse of old data in a replay attack. Once a nonce is used, it&#8217;s discarded, and the old data can no longer be accepted by your application. Of course, the downside is that since the nonce must be single-use, you need to keep track of all <a class="zem_slink" title="Cryptographic nonce" rel="wikipedia" href="http://en.wikipedia.org/wiki/Cryptographic_nonce">nonces</a> to ensure they are not accidentally used again. You will need a database, possibly using a nonce-included timestamp as a time limit so your storage requirements aren&#8217;t completely insane, which obviously means that just using the traditional database storage for sessions in the first place would have been a much better and simpler choice.</p>
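<p>The nonce idea can be sketched roughly as follows. The function names, the in-memory $usedNonces array (standing in for the database), and the throwaway key are all assumptions made for illustration, and the sketch only signs the data with an HMAC rather than encrypting it:</p>

```php
<?php
// Hypothetical server-side secret and nonce store; a real application would
// load the key from configuration and keep used nonces in a database.
$secretKey  = random_bytes(32);
$usedNonces = [];

/** Pack the data with a fresh nonce and sign both with HMAC-SHA256. */
function issueCookieValue(array $data, string $key): string
{
    $payload = base64_encode(json_encode([
        'nonce' => bin2hex(random_bytes(16)),
        'data'  => $data,
    ]));
    return $payload . '.' . hash_hmac('sha256', $payload, $key);
}

/** Return the data if the signature is valid and the nonce is unused, else null. */
function acceptCookieValue(string $value, string $key, array &$usedNonces): ?array
{
    $parts = explode('.', $value, 2);
    if (count($parts) !== 2) {
        return null; // malformed value
    }
    list($payload, $mac) = $parts;
    if (!hash_equals(hash_hmac('sha256', $payload, $key), $mac)) {
        return null; // tampered with or forged
    }
    $decoded = json_decode(base64_decode($payload), true);
    if (!is_array($decoded) || !isset($decoded['nonce'], $decoded['data'])) {
        return null; // unexpected structure
    }
    $nonce = $decoded['nonce'];
    if (isset($usedNonces[$nonce])) {
        return null; // nonce already spent: this is a replay
    }
    $usedNonces[$nonce] = true; // spend the nonce
    return $decoded['data'];
}

$cookie = issueCookieValue(['user_id' => 42], $secretKey);
var_dump(acceptCookieValue($cookie, $secretKey, $usedNonces) !== null); // first use is accepted
var_dump(acceptCookieValue($cookie, $secretKey, $usedNonces) !== null); // the replay is rejected
```

<p>Note that every accepted value has to be followed by issuing a fresh cookie with a new nonce, and that the nonce store is exactly the server-side storage the cookie approach was meant to avoid.</p>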
<p>So, in summary, encryption prevents the reading of data but it does not prevent the reuse of existing data. For that to be prevented you need a nonce implementation. And, due to the complexity of using and tracking nonces, practically no cookie stored session solutions will actually offer nonce support because it would eliminate their speed advantage. Which means they are susceptible to replay attacks, which means they are dangerous tools to be swinging around blindly, which means that the old local session storage strategies are still far superior from a security perspective, which all means that you should avoid cookie stores like the damned plague and stick to the old, traditional but secure session storage strategies we already have unless you a) are crazy or b) trust your colleagues (and yourself) not to screw it up.</p>
<p>Even without the security concerns, there is also another less critical downside to storing sessions in cookies which is that cookies have a storage limit of around 4KB. No other storage solution for session data should have that problem but you need to be aware of it anyway as using encryption may push you there sooner than the base data size might suggest (encrypted data size is usually larger than the original data). While noting this, you should never really hit that limit unless you are storing data there that you likely shouldn&#8217;t be anyway!</p>
<p>So, cookie based session storage: It&#8217;s very fast but lethally insecure if you store the wrong type of data. If you&#8217;re going to use it, make sure you keep a tight rein on what data is being stored.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.astrumfutura.com/2012/01/storing-session-data-in-cookies-problems-and-security-concerns-to-be-aware-of/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>Zend Framework 2.0: Dependency Injection (Part 1)</title>
		<link>http://blog.astrumfutura.com/2011/10/zend-framework-2-0-dependency-injection-part-1/</link>
		<comments>http://blog.astrumfutura.com/2011/10/zend-framework-2-0-dependency-injection-part-1/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 14:05:17 +0000</pubDate>
		<dc:creator>padraic</dc:creator>
				<category><![CDATA[PHP General]]></category>
		<category><![CDATA[PHP Security]]></category>
		<category><![CDATA[Zend Framework]]></category>
		<category><![CDATA[dependency injection]]></category>
		<category><![CDATA[di]]></category>
		<category><![CDATA[dic]]></category>
		<category><![CDATA[pimple]]></category>
		<category><![CDATA[zend\di]]></category>

		<guid isPermaLink="false">http://blog.astrumfutura.com/?p=627</guid>
		<description><![CDATA[If you&#8217;ve been watching the PHP weather vane (we call it Twitter for short), you may have noticed a shift in Symfony and Zend Framework. Version 2.0 of both web application frameworks feature Dependency Injection Containers (DICs) as the primary means of creating the objects (and even Controllers) your application will use. This is an]]></description>
			<content:encoded><![CDATA[
<div class="zemanta-img" style="margin: 1em; display: block;">
<div class="wp-caption alignright" style="width: 247px"><a href="http://commons.wikipedia.org/wiki/File:Codex_Gigas_devil.jpg"><img title="Illustration of the devil, page 577. Legend ha..." src="http://upload.wikimedia.org/wikipedia/commons/2/27/Codex_Gigas_devil.jpg" alt="Illustration of the devil, page 577. Legend ha..." width="237" height="398" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
</div>
<p>If you&#8217;ve been watching the PHP weather vane (we call it Twitter for short), you may have noticed a shift in Symfony and Zend Framework. Version 2.0 of both web application frameworks features Dependency Injection Containers (DICs) as the primary means of creating the objects (and even Controllers) your application will use. This is an interesting shift in a programming language that often stubbornly evaded adopting DICs to any great extent. In this mini-series of articles, I&#8217;ll take a look at the marvellous world of Dependency Injection as we run up to an examination of Zend Framework 2.0&#8217;s Zend\Di component in the next part.</p>
<h2>What is Dependency Injection (DI)?</h2>
<p>The short answer to this question is that Dependency Injection is a design pattern where, instead of dependent objects creating their dependencies internally, they instead define setters, constructor parameters or public properties which allow a user to &#8220;inject&#8221; dependencies from the outside into the dependent object and where such dependencies adhere to an expected interface.</p>
<p>If the definition sounds familiar, it&#8217;s because Dependency Injection is an obvious design pattern. As a programmer who knows how to use PHPUnit, you probably use the pattern every time you open an editor. So let&#8217;s quickly look at why the pattern is both obvious and ubiquitous.</p>
<p>Imagine a class implementation called Leprechaun. In writing the class, we realise we have a dependency on another class called PotOfGold. A naïve implementation would start out very simply with the Leprechaun object creating an instance of PotOfGold for use.</p>
<p>If you think this through, you may notice the problems. What if we want our Leprechaun to instead have a PotOfRareEarthElementsFromChina? What if we need to replace PotOfGold with a mock object during unit testing? What if another user locates a bug in PotOfGold and needs to replace it without editing the original class (since it&#8217;s under 3rd party version control)?</p>
<p>The answer to all these questions is to allow external parties to inject dependencies instead of relying on the object to create them internally. Based on our ridiculous example from above, we would define a setter called setPot(), and allow it to accept any object which implements a new Pot interface. Using an interface merely ensures the dependency that is set obeys some interface the dependent object is expecting.</p>
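<p>The Leprechaun example might be sketched as follows, assuming PHP 7.1 or later. Only Leprechaun, PotOfGold, PotOfRareEarthElementsFromChina, Pot and setPot() come from the text above. The <code>contents()</code> and <code>boast()</code> methods are invented purely for illustration.</p>

```php
<?php
// The interface every injectable pot must obey.
interface Pot
{
    public function contents(): string;
}

class PotOfGold implements Pot
{
    public function contents(): string
    {
        return 'gold';
    }
}

class PotOfRareEarthElementsFromChina implements Pot
{
    public function contents(): string
    {
        return 'rare earth elements';
    }
}

class Leprechaun
{
    private $pot;

    // The dependency is injected from outside instead of being created here.
    public function setPot(Pot $pot): void
    {
        $this->pot = $pot;
    }

    public function boast(): string
    {
        if ($this->pot === null) {
            throw new LogicException('No pot has been injected.');
        }

        return 'My pot holds ' . $this->pot->contents();
    }
}

$leprechaun = new Leprechaun();
$leprechaun->setPot(new PotOfGold());
echo $leprechaun->boast(), PHP_EOL;

// Swapping the dependency requires no change to Leprechaun itself.
$leprechaun->setPot(new PotOfRareEarthElementsFromChina());
echo $leprechaun->boast(), PHP_EOL;
```

<p>A test double is injected in exactly the same way: any object implementing Pot will do, which is what makes the dependent class testable.</p>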
<p>That, in a nutshell, is why Dependency Injection is obvious. It&#8217;s a simple shuffling of creational responsibilities from within an object to some external agent which makes the dependent object more flexible, testable and amenable to the wisdom that Composition is preferred over Inheritance (i.e. injecting objects beats monkey patching!).</p>
<h2>Some External Agent</h2>
<p>In applying Dependency Injection, we eventually reach a state where all objects in a system are created by a mysterious external agent. What is this entity?</p>
<p>One possible candidate is whatever passes for a Controller in your framework based application. In Zend Framework, this would be an instance of Zend_Controller_Action. Our Controller, in this instance, would define an action method which would perform a necessary application task and create all the objects needed to perform that task. This makes a lot of immediate sense to programmers since allowing you to write Controllers with as little fuss as possible is a fundamental goal of any framework.</p>
<p>However, Controllers are objects! If you had a NewsletterController defining an emailAction method, you might expect that creating an instance of Zend_Mail inside that action is obvious (which it is). Think again! In Dependency Injection parlance, your Controller is a dependent object and an instance of Zend_Mail is one of its dependencies. This is no different from our Leprechaun example. If we create the Zend_Mail instance inside the Controller we get the same irritatingly stubborn question. How do we replace the Zend_Mail instance with an alternative, test double or monkey patched version containing an emergency bug fix?</p>
<p>Controllers, alas, are not the external agent we&#8217;re looking for to create objects. And yes, you really should be testing your Controllers <img src='http://blog.astrumfutura.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>The next entity a level above Controllers can be loosely termed the Bootstrap. In Zend Framework 1, this started out as a relatively simple script to do just enough that you could start the FrontController and dispatch a request. In other words, Zend Framework traditionally did not offer a final external agent as needed for Dependency Injection. It left it to individual users to create something of their own or, as became inevitable, to just create objects in the Controllers themselves.</p>
<p>More recent Zend Framework versions offer Zend_Application, a method of bootstrapping that allows users to define Resources, i.e. a method or class which creates an object (and injects its dependencies) and returns it on demand when it is needed by a Controller. This was the first consistent approach to handling object creation in ZF, which effectively involved defining any number of Factory classes or methods in one location and passing the managing object (the Bootstrap) around the application wherever specific objects needed to be retrieved. In effect, this was a Dependency Injection Container. So, surprise, users of Zend Framework already have a DIC. An even lesser surprise: Zend Framework 2.0 will be no different.</p>
<h2>Dependency Injection Containers Are The Devil</h2>
<p>The concept of a Dependency Injection Container (DIC) is to act as a programmable object assembler. You take your DIC, tell it how to construct objects (including how to construct and inject their dependencies), pass the DIC to wherever it&#8217;s needed, and eventually ask it to create an object it knows about. This is not rocket science. DICs are simple animals to understand. However, the devilish suspicion that PHP developers have for DICs is rooted not in what they do but in how they do it and whether they make a developer&#8217;s life easier.</p>
<p>There&#8217;s a widely known belief that the Ruby language doesn&#8217;t need a DIC. I&#8217;ll use Ruby as an example because it has a few features PHP programmers can salivate over (like how it uses a new method for classes vs PHP&#8217;s new keyword, making class substitutions stupidly easy). One investigator of Dependency Injection from the Ruby world is Jamis Buck. For Ruby he wrote two DICs: Copland (a port of Java&#8217;s HiveMind) and Needle (it&#8217;s like Pimple on steroids which…defeats the purpose). After fighting Ruby for a few years, he finally gave up on trying to write a Ruby DIC and documented his thoughts on his blog in &#8220;<a href="http://weblog.jamisbuck.org/2008/11/9/legos-play-doh-and-programming">LEGOs, Play-Doh, and Programming</a>&#8221;.</p>
<p>The core lesson from the article holds true even in PHP &#8211; by and large, complex DICs are a complete waste of time in most scenarios. Indeed, if you ever use a DIC and discover it requires just as many (if not more) lines of DIC code and configuration as it would to do the same thing in plain old PHP, you should start asking where the fabulous benefits have vanished to because it&#8217;s not delaying the onset of cramped finger muscles as advertised.</p>
<p>Most PHP developers understand this instinctively. Unlike Jamis, most PHP programmers probably won&#8217;t have a strong Java background. As a programming group, we&#8217;re less inclined to assume we need a special DIC blessed by the PHP Gods so we fall back to whatever strikes us as a simpler solution.</p>
<p>But here&#8217;s the rub &#8211; the simplest solution is itself a DIC.</p>
<p>In referring to Dependency Injection Containers as the devil, cursing their name, and blaming them as Java imports designed to make life more complex than needed, it&#8217;s easy to lose sight of the fact that such criticism is about the implementation of DICs and not their actual function. There is nothing wrong with having object assemblers &#8211; we use them all the time and call them Service Locators, or Factory Classes, or Zend_Application (Resources), or any of a dozen terms slightly different and probably not entirely accurate. Most of the time we&#8217;re trying to create a DIC without being aware of the term.</p>
<h2>Needles and Pimples (It&#8217;s Not What You Imagine)</h2>
<p>Jamis Buck hit the nail on the head back in 2004 with the creation of his Needle DIC for Ruby. Instead of creating something inspired by Java that relied on static configuration and too many features, he realised that Ruby excelled (as does PHP to a growing degree &#8211; thank Closures) in expressing logic through a Domain Specific Language (DSL). The result was a DIC captured by a simple DSL &#8211; well, until he went and overcomplicated it (read his article).</p>
<p>You can see the exact same fundamental simplicity that a DIC is capable of in PHP. It&#8217;s a small so-tiny-you-won&#8217;t-believe-it DIC called <a href="http://pimple.sensiolabs.org/">Pimple</a>. Try calling that complex, hard, stupid or any other adjective you might instinctively think of when faced with the term &#8220;Dependency Injection&#8221;.</p>
<p>The core of Pimple is that you define object creations as closures. This immediately resolves a few traditional DIC problems. There&#8217;s no static configuration, you hand code all creation logic exactly once, and objects are named services you can recall and inject into other objects from your closure bodies. It basically takes everything you&#8217;d do in creating objects by hand and captures it all in one container. Other than the fact I hate arrays (my version uses object properties instead &#8211; it&#8217;s 50 lines; nobody was killed during its 5 minute development period), Pimple is like Dependency Injection itself &#8211; so blindingly obvious you may kick yourself.</p>
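<p>As a rough illustration of the closure idea, here is a hypothetical container that uses object properties. It is not Pimple&#8217;s actual API, nor the 50 line class mentioned above. TinyContainer, Mailer and Newsletter are invented names, and caching each created object so that a service is shared is a simplifying assumption of this sketch.</p>

```php
<?php
// Hypothetical closure based container; services are defined as properties.
class TinyContainer
{
    private $definitions = [];
    private $instances = [];

    // $container->name = function ($c) { ... } registers a creation closure.
    public function __set(string $name, $definition): void
    {
        if (!$definition instanceof Closure) {
            throw new InvalidArgumentException("Service $name must be a closure.");
        }
        $this->definitions[$name] = $definition;
        unset($this->instances[$name]); // drop any object built from an old definition
    }

    // $container->name runs the closure once and reuses the result afterwards.
    public function __get(string $name)
    {
        if (!isset($this->definitions[$name])) {
            throw new InvalidArgumentException("Unknown service: $name");
        }
        if (!isset($this->instances[$name])) {
            $this->instances[$name] = ($this->definitions[$name])($this);
        }

        return $this->instances[$name];
    }
}

// Two illustrative classes where one depends on the other.
class Mailer
{
    public $transport;

    public function __construct(string $transport)
    {
        $this->transport = $transport;
    }
}

class Newsletter
{
    public $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }
}

$container = new TinyContainer();
$container->mailer = function (TinyContainer $c) {
    return new Mailer('smtp');
};
$container->newsletter = function (TinyContainer $c) {
    return new Newsletter($c->mailer); // dependencies are pulled from the container
};

$newsletter = $container->newsletter;
var_dump($newsletter->mailer === $container->mailer);
```

<p>All creation logic is written exactly once, in ordinary PHP, and replacing the mailer definition is enough to change what every dependent service receives.</p>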
<p>Pimple proves that DICs are not the devil &#8211; they can be incredibly simple and useful tools if you can tame the urge to complicate their implementation.</p>
<h2>Then There Were Frameworks</h2>
<p>As you can probably see, making a strong case for DICs is not hard. Dependency Injection is obvious and omnipresent in PHP. Dependency Injection Containers can be a simple 50 line class you can write over a coffee break. The going gets tough when the simple notions we desperately want to cling to meet the complexity of PHP&#8217;s now standard tool: the application framework.</p>
<h2>Frameworks: Not Written By Monkeys</h2>
<p>As we&#8217;ve already covered, Zend Framework 1 addressed the external agent problem in Dependency Injection by creating Zend_Application. As Zend Framework 2.0 moves towards beta, it also needs a Dependency Injection Container to do similar heavy lifting. This time around, we called a spade a spade and the O&#8217;Phinney/Schindler hive mind wrote Zend\Di\DependencyInjector.</p>
<p>The DICs used by Symfony and Zend Framework are not like Pimple. Symfony&#8217;s DIC is driven by static configuration (preferably YAML for brevity). Zend Framework 2.0&#8217;s DIC is driven by a PHP API (no static configuration). Both have their own set of performance boosting measures to minimise any overhead in using a more complex DIC.</p>
<p>In the next part of this mini-series, we&#8217;ll take a deeper look at Zend\Di and see how it fares compared to Pimple or Symfony 2. In the meantime, I hope I&#8217;ve busted a few apprehensions you might have about using a DIC <img src='http://blog.astrumfutura.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.astrumfutura.com/2011/10/zend-framework-2-0-dependency-injection-part-1/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Zend Framework Contributors Mailing-List Summary; Edition #2 (July 2011)</title>
		<link>http://blog.astrumfutura.com/2011/08/zend-framework-contributors-mailing-list-summary-edition-2-july-2011/</link>
		<comments>http://blog.astrumfutura.com/2011/08/zend-framework-contributors-mailing-list-summary-edition-2-july-2011/#comments</comments>
		<pubDate>Wed, 24 Aug 2011 13:08:16 +0000</pubDate>
		<dc:creator>padraic</dc:creator>
				<category><![CDATA[PHP General]]></category>
		<category><![CDATA[PHP Security]]></category>
		<category><![CDATA[Zend Framework]]></category>
		<category><![CDATA[ZF-Summary]]></category>

		<guid isPermaLink="false">http://blog.astrumfutura.com/?p=622</guid>
		<description><![CDATA[It&#8217;s been a busy month in Zend Framework land which I&#8217;ll blog about shortly so, after a few weeks of delay, here&#8217;s the July 2011 Summary of the zf-contributor&#8217;s mailing list. ZF2 Feedback Late June kicked off with this topic from Robert Basic with a set of notes on his experiences in getting started with]]></description>
			<content:encoded><![CDATA[
<div class="zemanta-img" style="margin: 1em; display: block;">
<div class="wp-caption alignright" style="width: 310px"><a href="http://commons.wikipedia.org/wiki/File:ZendFramework-Logo.png"><img title="Zend Framework logo." src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bf/ZendFramework-Logo.png/300px-ZendFramework-Logo.png" alt="Zend Framework logo." width="300" height="79" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
</div>
<p>It&#8217;s been a busy month in Zend Framework land which I&#8217;ll blog about shortly so, after a few weeks of delay, here&#8217;s the July 2011 Summary of the zf-contributor&#8217;s mailing list.</p>
<h3>ZF2 Feedback</h3>
<p>Late June kicked off with this topic from Robert Basic with a set of notes on his experiences in getting started with ZF2 by migrating a ZF1 application. Adam Lundrigan noted, correctly, that a lot of &#8220;bleeding edge&#8221; code is not included in the main repository at this time and is distributed across contributor Github forks. He also raised the suggestion for a ZF2 Status Page. Derek Miranda voiced his agreement with Adam. Robert also agreed noting the difficulty in assessing the state of components.</p>
<p>Summary: ZF2 is scattered across multiple forks &#8211; be prepared to rely on notes such as Robert&#8217;s if jumping in at the deep end.</p>
<h3>Creating a 1.11.9 Hotfix Release</h3>
<p>A short note from Matthew Weier O&#8217;Phinney announced that a 1.11.9 hotfix release would be made to fix a number of backwards compatibility breaks introduced in 1.11.8. Issue tickets involved were ZF-11548, ZF-11550, ZF-10991 and ZF-10725.</p>
<p>Summary: It&#8217;s a maintenance release. It fixes stuff.</p>
<h3>Zend\Http and MVC Developments</h3>
<p>Ralph Schindler presented a document outlining a requirement list and the overall architecture of classes and interfaces for Zend\Http, noting work would commence on a prototype once any outstanding items suggested were cleared. Rob Zienart commented that the document indicated interfaces for Zend\Http Client and Server components and mentioned they needed proposals. Matthew responded that Zend\Http&#8217;s Server would deal with classes extending Zend\Service\Abstract such as SOAP and AMF but would not comprise an HTTP Server given it was covered by PHP 5.4. Anthony Shireman asked whether there were any Zend\Http Server plans or whether it was a &#8220;time will tell&#8221; situation. Matthew confirmed that that was the case given PHP 5.4 would include an HTTP Server and ZF2 could piggyback on that implementation in offering a development server environment.</p>
<p>Summary: HTTP work continues. We&#8217;ll need it to communicate with all those big tubes connecting PCs.</p>
<h3>[Proposal] ActiveRecord Proposal</h3>
<p>Artur Bodera raised the proposal and offered to implement an ActiveRecord solution noting its benefits compared to Zend\Db. The proposal was published at <a href="http://framework.zend.com/wiki/display/ZFDEV2/ActiveRecord+-+Arthur+Bodera">http://framework.zend.com/wiki/display/ZFDEV2/ActiveRecord+-+Arthur+Bodera</a> with a working branch at <a href="https://github.com/Thinkscape/zf2/branches/ActiveRecord">https://github.com/Thinkscape/zf2/branches/ActiveRecord</a>.</p>
<p>Nicolas Bérard-Nault asked why it was necessary to reinvent the wheel instead of integrating with other existing and mature implementations. Artur responded that other solutions did not integrate with Zend Framework, noting his proposal is built on Zend\Db from ZF2, and he wondered what the point of Zend\Db\Table was otherwise in the face of Doctrine or Propel. Peter Kokx responded to note that Zend\Db\Table implements the Table Data and <a class="zem_slink" title="Row Data Gateway" rel="wikipedia" href="http://en.wikipedia.org/wiki/Row_Data_Gateway">Row Data Gateway</a> patterns as distinct from ActiveRecord and that users shouldn&#8217;t interpret MVC as referring solely to ActiveRecord. Artur conceded that this was a good point but pressed his point that ActiveRecord was one tool which did not impose on any others available to Zend Framework using Zend\Db. Tomáš Fejfar voiced his support for adding ActiveRecord, noting its value in simple use cases to get things done fast.</p>
<p>Ralph Schindler leaped to the rescue by noting that ActiveRecord is indeed planned for ZF2 and noting the significant work done to date on Zend\Db in his own feature branch. Artur Bodera welcomed the progress stating he would migrate his ActiveRecord solution over to the improved Zend\Db once complete.</p>
<p>Summary: We&#8217;re getting an ActiveRecord implementation for ZF2.</p>
<h3>ZF2 Docbook Sources Converted to <a class="zem_slink" title="DocBook" rel="wikipedia" href="http://en.wikipedia.org/wiki/DocBook">DocBook</a> 5</h3>
<p>Another short note from Matthew Weier O&#8217;Phinney informed the community that ZF2&#8217;s docbook formatted manual files had been migrated to Docbook 5. The community silently admired the completion of this task (nobody responded but I assume they silently admired all the same!). Matthew noted the README for manual generation would be updated and that Docbook 5 made certain tasks a lot easier.</p>
<p>Summary: ZF2 Manual will be written in Docbook 5, those using a visual XML editor may celebrate.</p>
<p>ZF2 Zend\Mail: To strip/validate or not to strip/validate (email addresses)</p>
<h3>Status of the Test Suite (ZF2)</h3>
<p>Keith Pope asked after the status of the Test Suite mentioning that phpunit.xml was mostly commented out, Zend\Di was not using the @group annotation for the test runner, and TestConfiguration.php was nearing 800 lines. He suggested that the configuration be spread into a conf.d setup (i.e. each configuration segment split into a separate file and all combined at runtime). Matthew responded noting the ease with which ZF2 tests could be run by passing the necessary directory to phpunit from the main /tests directory, and noted configuration may be pushed into phpunit.xml instead of the current PHP file. While expressing an interest in a conf.d setup, Matthew noted this would depend on support in PHPUnit.</p>
<p>Summary: Ignore runtests.sh and just use the stock phpunit commands for ZF2.</p>
<h3>Serious Question about Mcrypt</h3>
<p>Artur Ejsmont observed that the mcrypt filter calls srand() with a limited range of potential seeds thus suggesting it would impact on the security of the filter. Enrico Zimuel replied that the srand() is only used in limited circumstances (where a better source of randomness is not available) and that it&#8217;s not a serious problem since the encryption security is not wholly based on the initialisation vector (IV) that uses srand() on some platforms. Nevertheless, he did note that some improvements could be made.</p>
<p>Artur responded with a general query on the efficacy of using srand() and rand() to avoid collisions. Pádraic Brady responded that rand() was particularly bad, noting you could create a collision in a matter of minutes. Pádraic also noted that mt_rand() was far more effective but also not entirely random (as a graph of its output would prove), suggesting that it was advisable to use better random sources such as /dev/random and /dev/urandom where feasible. Enrico also noted the availability of openssl_random_pseudo_bytes().</p>
<p>Summary: Getting random bytes is a tricky business.</p>
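<p>For readers on current PHP, one way to follow that advice is sketched below. It assumes PHP 7.0 or later, where <code>random_bytes()</code> draws from the operating system&#8217;s cryptographically secure source. That function postdates this mailing list thread and was not part of the discussion. The helper name <code>secureToken()</code> is hypothetical.</p>

```php
<?php
// Assumption: PHP 7.0 or later, where random_bytes() is part of the core.

// Returns a hex-encoded token built from $bytes bytes of secure random output.
function secureToken(int $bytes = 32): string
{
    return bin2hex(random_bytes($bytes));
}

// A 16-byte value of the kind needed for an initialisation vector (IV).
$iv = random_bytes(16);

echo strlen($iv), PHP_EOL;   // number of raw bytes generated
echo secureToken(), PHP_EOL; // 64 hex characters, different on every run
```

<p>Unlike rand() and mt_rand(), nothing here depends on a seed chosen by the application, so there is no small seed range for an attacker to search.</p>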
<h3>ZF2 Zend\Code Bugfix</h3>
<p>Nick Belhomme mentioned he had been looking at Zend\Code which is used heavily by Zend\Di. He noted his first impressions that it should work well by being token based but also referred to his opinion that it was quite error prone and the unit tests were not satisfactory.</p>
<p>To explain his case, he used an example of a method signature accepting four type hinted object parameters, noting this could fail to be analysed correctly due to the whitespace in the parameter list (after each comma) not being handled correctly by the ParameterScanner. Nick noted he&#8217;d committed a fix using a short trim function to his own git fork.</p>
<p>Regarding the unit tests, Nick explained why the current unit tests were insufficient in testing parameters and suggested rectifying the test doubles to account for whitespace.</p>
<p>Summary: Zend\Code needs to build up a fuller test suite accounting for different coding styles.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.astrumfutura.com/2011/08/zend-framework-contributors-mailing-list-summary-edition-2-july-2011/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>What is Mutation Testing?</title>
		<link>http://blog.astrumfutura.com/2011/08/what-is-mutation-testing/</link>
		<comments>http://blog.astrumfutura.com/2011/08/what-is-mutation-testing/#comments</comments>
		<pubDate>Tue, 02 Aug 2011 16:35:20 +0000</pubDate>
		<dc:creator>padraic</dc:creator>
				<category><![CDATA[PHP General]]></category>
		<category><![CDATA[PHP Security]]></category>
		<category><![CDATA[Zend Framework]]></category>
		<category><![CDATA[Code Coverage]]></category>
		<category><![CDATA[Mutagenesis]]></category>
		<category><![CDATA[mutation testing]]></category>
		<category><![CDATA[phpunit]]></category>
		<category><![CDATA[Test-driven development]]></category>
		<category><![CDATA[unit testing]]></category>

		<guid isPermaLink="false">http://blog.astrumfutura.com/?p=618</guid>
		<description><![CDATA[Some time ago, in between working on Zend Framework, I booted up a couple of libraries that I really wanted to integrate into my workflow. Recently, I&#8217;ve been putting these through the grindmill so they can be properly released and supported for public consumption across PEAR. Just as Mockery fell out of older work]]></description>
			<content:encoded><![CDATA[
<div class="zemanta-img" style="margin: 1em; display: block;">
<div class="wp-caption alignright" style="width: 310px"><a href="http://en.wikipedia.org/wiki/File:Mutant_Phase.jpg"><img title="The Mutant Phase" src="http://upload.wikimedia.org/wikipedia/en/thumb/6/64/Mutant_Phase.jpg/300px-Mutant_Phase.jpg" alt="The Mutant Phase" width="300" height="297" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
</div>
<p>Some time ago, in between working on Zend Framework, I booted up a couple of libraries that I really wanted to integrate into my workflow. Recently, I&#8217;ve been putting these through the grindmill so they can be properly released and supported for public consumption across PEAR. Just as Mockery fell out of older work on PHPMock, Mutagenesis will fall out of another project called MutateMe. This is a short introductory article on what Mutagenesis will do and why. In other words, what the heck is Mutation Testing?</p>
<p>First, some background.</p>
<p>The most common means of measuring confidence in a test suite is the Code Coverage metric. Code Coverage essentially checks, on a per class basis, how many of the lines of code in the class are executed by a test suite and expresses this as a percentage. For example, a Code Coverage of 85% means 85% of the lines of code in a class were executed and 15% were not. The greater the number of lines of code executed, the more confidence one can presumably have that a test suite is doing its job, i.e. verifying class behaviour, preventing the introduction of bugs, supporting refactoring, and so on.</p>
<p>I have a huge and insurmountable problem with Code Coverage. For starters, my average Code Coverage is closer to 80% than the 90% expected of projects such as Zend Framework. The gap is explained by me not testing what I call &#8220;braindead&#8221; functions, i.e. methods which are either ridiculously simple, where a malfunction would quickly become self-evident, or which are marginalised (on the borders of deprecation). So Code Coverage actually increases the amount of work I need to do for very little gain and a lot of frustration.</p>
<p>Secondly, Code Coverage is easy to spoof or misinterpret. Since it’s a metric measuring the execution of source code, you need only…well…execute the source code. It&#8217;s a simple matter to construct a series of wonderfully useless tests to do just that and obtain a high Code Coverage result &#8211; it&#8217;s done all the time in my experience once someone&#8217;s patience in writing quality unit tests runs out. It is particularly evident in cases where unit tests are written after the source code is completed &#8211; a still too common practice in PHP. The less villainous flipside is that certain nuggets of source code are fundamentally difficult to test. For example, a complex algorithm suffering from poor documentation may make composing a suitable unit test near impossible. The rollout of OAuth was filled with such examples.</p>
<p>This leads into my opinion of Code Coverage. I view the venerable Code Coverage metric as a near pointless exercise. While it may tell you how much source code a test suite exercises, it tells you nothing about the actual quality of those unit tests. They could be good tests, sort-of-good tests or absolutely horrendous tests &#8211; Code Coverage will never tell you either way. I say near pointless because there are precious few alternatives. We need something to give us a reason to trust and have confidence in test suites and Code Coverage is easy to implement and has been a part of PHPUnit since forever. So, by and large, we make do. We measure Code Coverage just to make certain some kind of unit testing was performed.</p>
<p>Is there nothing better?</p>
<p>A good unit test serves a simple purpose. It verifies a behaviour of an object. In PHP, we&#8217;re more likely to verify umpteen million behaviours in a single test (count your assertions!) but we&#8217;ll let that slide. Since a test verifies behaviour, it follows that a test should fail when that behaviour is changed. If a test does not fail when class behaviour is changed, it also follows that the original behaviour was not fully tested, i.e. there is a gaping hole in our test suite whether due to a flawed or missing test that could allow bugs entry into our application. So, to really stick unit tests under a microscope to assess their quality and our confidence in them, we need to introduce changes into the source code under test and see if the unit test suite can or cannot detect them.</p>
<p>This process is known as Mutation Testing. Mutagenesis is a Mutation Testing framework for PHP 5.3+.</p>
<p>Mutation Testing, as you have probably surmised, is not a super-complex activity. You take a set of source code and compile a list of possible &#8220;mutations&#8221; that are likely to break the behaviour of the source code. Then, you apply one mutation to that source to create a &#8220;mutant&#8221;, i.e. a copy of the source code with the mutation change applied. Next, you run the source code&#8217;s test suite against the mutant and see if any tests fail. If a test fails, celebrate &#8211; the mutation was detected so your tests were, in this instance, adequate. If no test fails, curse the Gods &#8211; the mutation was not detected and you&#8217;ll need to figure out whether a new test is needed or an old one modified/corrected. Rinse and repeat the above for each mutation you&#8217;ve compiled.</p>
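<p>The loop described above can be sketched in a few lines. This is an illustrative Python toy, not how Mutagenesis works: the source under test, the list of mutations and both test suites are hypothetical, and mutations are applied by naive text replacement rather than by parsing the code.</p>

```python
SOURCE = '''
def is_adult(age):
    return age >= 18
'''

# Hypothetical mutations: (original text, replacement text).
MUTATIONS = [(">=", ">"), (">=", "=="), ("18", "0")]


def load(source):
    """Compile a copy of the source and return the function under test."""
    namespace = {}
    exec(source, namespace)
    return namespace["is_adult"]


def weak_suite(is_adult):
    assert is_adult(30) is True


def strong_suite(is_adult):
    assert is_adult(30) is True
    assert is_adult(18) is True   # boundary value
    assert is_adult(17) is False


def run_mutation_testing(suite):
    """Apply one mutation at a time and record whether the suite notices."""
    killed, escaped = [], []
    for original, replacement in MUTATIONS:
        mutant = load(SOURCE.replace(original, replacement, 1))
        try:
            suite(mutant)
        except AssertionError:
            killed.append((original, replacement))   # a test failed: celebrate
        else:
            escaped.append((original, replacement))  # no test failed: curse
    return killed, escaped
```

<p>Both suites pass against the unmodified source. Run against the mutants, the weak suite lets two of the three escape, while the suite that also checks the boundary kills all three.</p>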
<p>Mutations are typically quite simple, such as replacing operators, booleans, strings and other scalar values with either an opposing form or a random value. Expressions might also be reversed or driven to zero to give an opposing boolean or zero value. Making such minor changes seems like a minor irritation but behind every serious flaw in an application is one or more smaller contributing errors. If your test cases can detect the potentially contributing errors, then there&#8217;s an excellent chance they would detect the bigger ones anyway. This is known as the Coupling Effect in Mutation Testing.</p>
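<p>As a rough illustration of what &#8220;replacing operators and booleans with an opposing form&#8221; can look like, the sketch below uses Python&#8217;s standard <code>ast</code> module (Python 3.9+ for <code>ast.unparse</code>) to produce one mutated copy of a snippet per mutation site. The <code>SWAPS</code> table and the function names are assumptions made up for this example; a real framework would define a much richer set of mutations.</p>

```python
import ast
import copy

# Hypothetical table of "opposing form" swaps; real tools define many more.
SWAPS = {
    ast.Add: ast.Sub,
    ast.Sub: ast.Add,
    ast.Eq: ast.NotEq,
    ast.NotEq: ast.Eq,
    ast.GtE: ast.Lt,
    ast.Lt: ast.GtE,
}


def mutate(node):
    """Apply one mutation to node in place; return True if it changed."""
    if isinstance(node, ast.BinOp) and type(node.op) in SWAPS:
        node.op = SWAPS[type(node.op)]()
        return True
    if isinstance(node, ast.Compare) and type(node.ops[0]) in SWAPS:
        node.ops[0] = SWAPS[type(node.ops[0])]()
        return True
    if isinstance(node, ast.Constant) and isinstance(node.value, bool):
        node.value = not node.value  # flip TRUE/FALSE literals
        return True
    return False


def generate_mutants(source):
    """Return one mutated copy of the source per mutation site."""
    tree = ast.parse(source)
    mutants = []
    for index, _ in enumerate(ast.walk(tree)):
        clone = copy.deepcopy(tree)
        # The copy has the same shape, so ast.walk visits it in the same order.
        if mutate(list(ast.walk(clone))[index]):
            mutants.append(ast.unparse(clone))
    return mutants
```

<p>Each mutant differs from the original by exactly one small change, which is what lets a surviving mutant point at a specific gap in the tests.</p>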
<p>Some of you will be vaguely aware of Mutation Testing. In terms of implementations, Ruby has heckler, Python has Pester, and Java has Jumbler, Jester and a couple of others. Those who prefer Microsoft&#8217;s technologies can use Nester. There&#8217;s a running rhyme apparent since so much is inspired by the original Jester framework for Java. To my knowledge, Mutagenesis will be the only Mutation Testing framework for PHP (though I sincerely wish I was wrong).</p>
<p>Examining those libraries, you eventually realize a few problems with Mutation Testing which explain its lack of popularity until relatively recently: performance is a concern and Mutation Testing requires a Human Brain to complete the process.</p>
<p>Performance is a concern because each mutation requires a test suite to be executed. Imagine a set of classes from which you extract 100 possible mutations, coupled with a test suite that takes 5 minutes to run. A basic Mutation Testing framework (e.g. Ruby&#8217;s heckler) would therefore take 500 minutes to complete a Mutation Testing session. That&#8217;s 8.3 hours of continuous Mutation Testing. Mutation Testing for Zend Framework would be very interesting <img src='http://blog.astrumfutura.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>Similar to Jumbler for Java, Mutagenesis will utilise a few heuristics (shortcuts) to significantly improve performance without compromising results. We only need one single test to fail in order to rule that a mutation was detected and killed, so we can do a few things to boost performance:</p>
<p>1. Terminate the test suite on first failure/error or exception.<br />
2. Execute test cases in order of execution time ascending (fastest first; slowest last).<br />
3. Prioritise execution of the last test case that detected a mutant, to take advantage of same-class detection.<br />
4. Log which tests detect which mutations, and prioritise those associations in subsequent runs.</p>
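<p>The four heuristics above could be combined roughly as follows. This is a Python sketch under stated assumptions, not Mutagenesis code: the function names are hypothetical, test durations are assumed to be known from an earlier run, and ranking logged killers ahead of the most recent killer is an arbitrary choice the text does not specify.</p>

```python
def order_tests(durations, last_killer=None, known_killers=frozenset()):
    """Return test names in the order a heuristic runner might try them.

    durations maps test name to its last measured run time in seconds.
    """
    def rank(name):
        # False sorts before True, so tests logged as killers of this
        # mutation come first (heuristic 4), then the most recent killer
        # (heuristic 3), then everything else fastest first (heuristic 2).
        return (name not in known_killers, name != last_killer, durations[name])

    return sorted(durations, key=rank)


def first_killing_test(mutant, tests, order):
    """Run tests in the given order and stop at the first failure (heuristic 1).

    tests maps test name to a callable that raises when the test fails.
    Returns the name of the killing test, or None if the mutant escaped.
    """
    for name in order:
        try:
            tests[name](mutant)
        except Exception:
            return name
    return None
```

<p>The name returned by <code>first_killing_test</code> is what would be logged and fed back in as <code>last_killer</code> or <code>known_killers</code> on the next mutant or the next run.</p>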
<p>The effect of the above is to speed up Mutation Testing by a significant degree. The final heuristic ensures that for gradually changing source code and tests, the first Mutation Testing process might take a while but subsequent runs will be significantly faster making them far more usable in a Test-Driven Development setting. Mutation Testing is best served with a healthy dose of efficiency.</p>
<p>The second reason for its lack of popularity is that Mutation Testing can&#8217;t analyse the logic of the source code under test. For example, an expression might accept any integer less than 10 to evaluate to TRUE. If the input from another class were 7, and a mutation were generated to swap this for a 9, then the associated unit test would still pass (the mutation of switching 7 for 9 still allows the &lt;10 expression to evaluate to TRUE). If you recall, if a mutant passes a test suite then we assume either the presence of a flawed test or the lack of a suitable test. Obviously, as the above suggests, this isn&#8217;t always the case. Mutation Testing can and often will report false positives.</p>
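<p>The 7-versus-9 case is easy to reproduce. In this small Python sketch (the names are hypothetical), the mutated input produces exactly the same observable behaviour as the original, so no test could reasonably be expected to fail and a human has to rule the report out.</p>

```python
def is_small(value):
    # The expression from the example: any integer below 10 is accepted.
    return value < 10


ORIGINAL_INPUT = 7
MUTATED_INPUT = 9  # the mutation swaps the literal 7 for a 9


def check_is_small(input_value):
    """The associated unit test, parameterised on the supplied input."""
    assert is_small(input_value) is True
```

<p>Calling <code>check_is_small</code> with either constant passes, so the mutant survives even though the test is sound. A mutation to 10 would have been caught, because it crosses the boundary of the expression.</p>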
<p>Ruling out false positives, coupled with the need to improve test suites to detect more mutations, makes Mutation Testing a source of extra work. Who likes extra work least? Programmers, especially the lazy kind <img src='http://blog.astrumfutura.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>Mutation Testing is not a far-fetched idea. The principles are sound and it beats the pants off Code Coverage when it comes to measuring what confidence we can have in our test suites. It is still hampered, as a methodology, by the lack of good implementations in other programming languages. Mutagenesis, by adopting implementation heuristics from Java&#8217;s Jumbler, should avoid that fate and offer a decent framework in PHP that performs as well as can be expected.</p>
<p>Once it&#8217;s released…of course <img src='http://blog.astrumfutura.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> . Mutagenesis is in development but should see a fresh release in a couple of weeks alongside Mockery. I&#8217;ll be looking forward to seeing how people perceive it. Mutation Testing has zero presence in PHP to date but having something to complement Code Coverage can&#8217;t do any harm!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.astrumfutura.com/2011/08/what-is-mutation-testing/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
	</channel>
</rss>

