<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
<channel>
    <title>Maugrim The Reaper's Blog - PHP Security</title>
    <link>http://blog.astrumfutura.com/</link>
    <description>Pádraic Brady on PHP, PHP Game Development and More</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.1 - http://www.s9y.org/</generator>
    <pubDate>Wed, 14 Jul 2010 22:31:41 GMT</pubDate>

    <image>
        <url>http://blog.astrumfutura.com/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: Maugrim The Reaper's Blog - PHP Security - Pádraic Brady on PHP, PHP Game Development and More</title>
        <link>http://blog.astrumfutura.com/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>HTML Sanitisation Benchmarking With Wibble (ZF Proposal)</title>
    <link>http://blog.astrumfutura.com/archives/430-HTML-Sanitisation-Benchmarking-With-Wibble-ZF-Proposal.html</link>
            <category>PHP General</category>
            <category>PHP Security</category>
            <category>Zend Framework</category>
    
    <comments>http://blog.astrumfutura.com/archives/430-HTML-Sanitisation-Benchmarking-With-Wibble-ZF-Proposal.html#comments</comments>
    <wfw:comment>http://blog.astrumfutura.com/wfwcomment.php?cid=430</wfw:comment>

    <slash:comments>17</slash:comments>
    <wfw:commentRss>http://blog.astrumfutura.com/rss.php?version=2.0&amp;type=comments&amp;cid=430</wfw:commentRss>
    

    <author>nospam@example.com (Pádraic Brady)</author>
    <content:encoded>
    In January of this year, I had the idea of writing an HTML Sanitiser for PHP. Why not? All PHP has is HTMLPurifier and a bunch of random solutions that are about as secure as the average wooden gate. If you think that&#039;s harsh, wait for my next blog post &lt;img src=&quot;http://blog.astrumfutura.com/templates/default/img/emoticons/wink.png&quot; alt=&quot;;-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;. HTMLPurifier is the only secure-by-default HTML Sanitiser in PHP. Fact. But the darn thing is gigantic and slow. That has never stopped me using it (for years), even if I had to do a little funky engineering so I could minimise the performance hit. Other developers, however, have often abandoned HTMLPurifier, falling into the trap of believing that alternative solutions will serve them just as well.&lt;br /&gt;
&lt;br /&gt;
That&#039;s the state of HTML Sanitisation in PHP - pick a big slow library that crushes Cross-Site Scripting and Phishing attacks, or use yet another regular-expression-based sanitiser that a) barely manages a fraction of HTMLPurifier&#039;s features and b) can probably be exploited by any script kiddie working with a stack of data cards. It says an awful lot about security standards among PHP developers that such delusions are so rampant.&lt;br /&gt;
&lt;br /&gt;
In case you haven&#039;t noticed, I&#039;m biased. Sue me.&lt;br /&gt;
&lt;br /&gt;
I have opined since forever that regular expression sanitisers are nothing short of insane. Since the problem with HTMLPurifier is speed and size, I started thinking about ways to build something like HTMLPurifier that was fast, small and almost as feature packed as HTMLPurifier. At first, this sounds like an impossible task. The typical suggestion is to use regular expressions, but I&#039;m not completely insane...yet. Instead I borrowed a concept called a DOM Filter and chucked in a helpful dose of HTML Tidy. The result was &lt;a href=&quot;http://github.com/padraic/wibble&quot;&gt;Wibble&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
Wibble is basically a DOM Filter. It loads up HTML into PHP DOM, applies a set of filters against all nodes in the DOM, passes the output through HTML Tidy, and then hands it back to the user - sanitised and well-formed. It&#039;s almost stupid in its obviousness. Better, this allows Wibble to skip regular expression dependence. It operates far more like HTMLPurifier by relying on a DOM representation (no string parsing to funk around with) partnered with Tidy for cleanup.&lt;br /&gt;
&lt;br /&gt;
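The load, filter and serialise steps described above can be sketched roughly as follows. This is only an illustration of the DOM Filter idea and not Wibble&#039;s actual code: it is written in Python, uses xml.dom.minidom as a stand-in for PHP DOM, leaves out the HTML Tidy pass entirely, and assumes the input is already well-formed markup inside a single wrapper element. The names ALLOWED_TAGS, strip_disallowed and sanitise are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from xml.dom import minidom, Node

# Hypothetical whitelist for illustration; Wibble's real filters are not shown here.
ALLOWED_TAGS = frozenset({"p", "em", "strong"})


def strip_disallowed(node, allowed=ALLOWED_TAGS):
    """Recursively filter the children of node against a tag whitelist."""
    # Iterate over a copy because the loop mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == Node.ELEMENT_NODE:
            strip_disallowed(child, allowed)
            if child.tagName.lower() in allowed:
                # Strict sketch: whitelisted tags keep no attributes at all.
                for name in list(child.attributes.keys()):
                    child.removeAttribute(name)
            else:
                # Unwrap: hoist the filtered children, then drop the tag.
                for grandchild in list(child.childNodes):
                    node.insertBefore(grandchild, child)
                node.removeChild(child)
        elif child.nodeType != Node.TEXT_NODE:
            # Comments, CDATA sections and processing instructions are removed.
            node.removeChild(child)


def sanitise(markup, allowed=ALLOWED_TAGS):
    """Filter well-formed markup and return the wrapper element's inner markup."""
    doc = minidom.parseString(markup)
    strip_disallowed(doc.documentElement, allowed)
    return "".join(child.toxml() for child in doc.documentElement.childNodes)
```
&lt;br /&gt;
Passing an empty whitelist to sanitise unwraps every tag and leaves only the text content, which is the strictest possible configuration of this sketch.&lt;br /&gt;
&lt;br /&gt;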
Of course, there have to be regular expressions somewhere. And whitelists. And other stuff. Wibble is really an amalgamation of borrowed concepts. It&#039;s hard to be too original in HTML Sanitisation because originality is a good way to shoot yourself in the foot (hence regex is EVIL!), so I wasn&#039;t going to spend too long digging my own grave when there is a wealth of sanitisation resources in the programming world. Wibble&#039;s approach borrows elements from Ruby&#039;s loofah, Python&#039;s HTML5Lib, and Java&#039;s AntiSamy. Wibble mixes and matches from the useful design elements each of these offers, serving them up on top of PHP&#039;s DOM and Tidy extensions with its own distinctive twists.&lt;br /&gt;
&lt;br /&gt;
I completed the first Wibble prototype recently. With the prototype at the 90% point, where the remaining 10% is in-depth sanity testing, cleanup and documentation, I figured it was time to see how it compared to some other PHP solutions (&lt;a href=&quot;http://www.htmlpurifier.org&quot;&gt;HTMLPurifier&lt;/a&gt; and &lt;a href=&quot;http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/&quot; &gt;HtmLawed&lt;/a&gt;). I had some fairly conservative performance objectives so the results came as a pleasant surprise.&lt;br /&gt;
&lt;br /&gt;
If you are a benchmark fiend, you can download and independently fiddle with my benchmark process from &lt;a href=&quot;http://github.com/padraic/wibble-benchmarks&quot;&gt;http://github.com/padraic/wibble-benchmarks&lt;/a&gt;. Note that the current benchmark uses a Wibble prototype - there are additional elements that need to be added over time. The benchmark currently uses three sample snippets of HTML: Small (blog comment size), Medium (markup heavy with limited textual content), and Big (markup light with lots of textual content). It operates by filtering each HTML sample 200 times with each benchmarked HTML sanitisation solution. Each iteration includes the instantiation and setup phases of each solution (where relevant) to reflect the most likely real world experience of using sanitisation as a once-off (non-repeating in same request) process. I use PEAR&#039;s Benchmark package to record the aggregate run time per loop of sanitisation tasks. All operations occur within one single PHP process with HTMLPurifier caching enabled (Wibble and HtmLawed do not use caching). Each solution is configured as closely as possible to target total stripping of all HTML from the content.&lt;br /&gt;
&lt;br /&gt;
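The shape of that benchmark loop can be sketched as below. This is a simplified Python illustration of the methodology only, not the code in the wibble-benchmarks repository, which uses PHP and PEAR&#039;s Benchmark package. The function name run_benchmark and its parameters are hypothetical, and each sanitiser is assumed to be a factory returning a callable.&lt;br /&gt;
&lt;br /&gt;
```python
import time


def run_benchmark(sanitiser_factories, samples, iterations=200):
    """Return aggregate wall-clock seconds per (sanitiser, sample) pair."""
    results = {}
    for name, factory in sanitiser_factories.items():
        for label, html in samples.items():
            start = time.perf_counter()
            for _ in range(iterations):
                # Instantiate inside the loop so setup cost is counted on
                # every iteration, mirroring a once-per-request workload.
                sanitiser = factory()
                sanitiser(html)
            # Aggregate time for the whole loop, not a per-call average.
            results[(name, label)] = time.perf_counter() - start
    return results
```
&lt;br /&gt;
Keeping instantiation inside the timed loop is the important design choice here: hoisting it out would favour libraries with expensive setup and would no longer reflect a single sanitisation call per request.&lt;br /&gt;
&lt;br /&gt;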
You can view a sample result at &lt;a href=&quot;http://gist.github.com/468426&quot;&gt;http://gist.github.com/468426&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
The results show that both Wibble and HtmLawed outperform HTMLPurifier by a very wide margin. Wibble underperforms HtmLawed by a variable margin - from twice as slow on small to medium sized input, to four times slower on large inputs with minimal HTML tags. In Wibble&#039;s slowest benchmark, it outperformed HTMLPurifier by a factor of four.&lt;br /&gt;
&lt;br /&gt;
Wibble&#039;s intent is to try and replicate the completeness of HTMLPurifier, so its speed deficit when compared to HtmLawed is expected (when stripping all tags). There is not a lot to be done to improve this specific benchmark result since Wibble does a lot of stuff behind the scenes like encoding normalisation, DOM manipulation and HTML tidying. It also does all three of these things far more consistently and completely than HtmLawed is capable of.&lt;br /&gt;
&lt;br /&gt;
So how does Wibble match up against Big Daddy? Wibble is a prototype, so obviously it still has ground to gain on HTMLPurifier in terms of features. But on the most significant points it only has one specific problem - it&#039;s not HTML 5 ready. Neither DOM nor Tidy supports HTML 5, though you can &quot;pretend&quot; it&#039;s HTML 4.01 (or even XHTML 1.0) for HTML 5 fragments so long as you are aware Tidy will strip unsupported HTML 5 tags and attributes.&lt;br /&gt;
&lt;br /&gt;
The other points are syncing up with HTMLPurifier quite nicely. Wibble will sanitise all HTML by default using strict filters (i.e. by default it strips every tag and only outputs plain text). It handles multiple encodings including conversion if necessary. It outputs standards-compliant (other than HTML 5) HTML or XHTML. It fixes all the usual page breaking stuff like unclosed tags and illegal tag nesting. It is entirely reliant on whitelists and strict validation rather than blacklists and loose reconstructive parsing. It includes minimal regular expression usage (only needed for attribute and CSS validation) based on regular expressions widely used and tested in other languages. While testing will (and must) continue, it has so far proven resistant to XSS and Phishing attacks. This can&#039;t be absolutely assured until sufficient testing has been performed.&lt;br /&gt;
&lt;br /&gt;
Otherwise, it will be interesting to see the final version of Wibble. HTMLPurifier has a tough reputation to follow, but having something which can even up the odds and do it with a pronounced advantage in speed will be really nice. Well, until someone needs to install it on CentOS &lt;img src=&quot;http://blog.astrumfutura.com/templates/default/img/emoticons/wink.png&quot; alt=&quot;;-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;.  
    </content:encoded>
    <dc:creator>P&#225;draic Brady</dc:creator>

    <pubDate>Thu, 08 Jul 2010 20:50:31 +0000</pubDate>
    <guid isPermaLink="false">http://blog.astrumfutura.com/archives/430-guid.html</guid>
    <category>benchmark</category>
<category>php general</category>
<category>php security</category>
<category>wibble</category>
<category>xss</category>
<category>zend framework</category>
<category>zf proposal</category>
<creativeCommons:license>http://creativecommons.org/licenses/by/1.0/</creativeCommons:license>
</item>
<item>
    <title>Zend Framework Community Review Team</title>
    <link>http://blog.astrumfutura.com/archives/429-Zend-Framework-Community-Review-Team.html</link>
            <category>PHP General</category>
            <category>PHP Security</category>
            <category>Zend Framework</category>
    
    <comments>http://blog.astrumfutura.com/archives/429-Zend-Framework-Community-Review-Team.html#comments</comments>
    <wfw:comment>http://blog.astrumfutura.com/wfwcomment.php?cid=429</wfw:comment>

    <slash:comments>5</slash:comments>
    <wfw:commentRss>http://blog.astrumfutura.com/rss.php?version=2.0&amp;type=comments&amp;cid=429</wfw:commentRss>
    

    <author>nospam@example.com (Pádraic Brady)</author>
    <content:encoded>
    For those of you not presently keeping watch on the Contributors mailing list or IRC, a Community Review Team (CR Team) has been established to assist with contributions to the Zend Framework. The role of the team will take a bit of time to settle into and explore, but Matthew Weier O&#039;Phinney summarised it as follows:&lt;br /&gt;
&lt;br /&gt;
&lt;blockquote&gt;- Assist contributors in getting patches and features into existing components.&lt;br /&gt;
  - Act as liaison for contacting a maintainer on behalf of a contributor&lt;br /&gt;
  - If the maintainer refuses to accept a patch, act as an arbiter between the contributor and the maintainer&lt;br /&gt;
  - If the maintainer does not respond after a set period of time, would evaluate and/or apply the patch for the contributor&lt;br /&gt;
  - Would issue pull requests to the Zend team in such instances as the above&lt;br /&gt;
- Identify orphaned components&lt;br /&gt;
  - Would identify when a component is no longer under active maintenance&lt;br /&gt;
  - Solicit volunteers to take over maintenance of orphaned components&lt;br /&gt;
  - Decide when an orphaned component should be marked as such and scheduled for removal (Note: removal can only happen in major revisions)&lt;br /&gt;
- Shepherd new proposals.&lt;br /&gt;
  - Solicit community feedback on proposals&lt;br /&gt;
  - Would put competing proposal authors in touch with each other to work on a unified proposal&lt;br /&gt;
  - Provide feedback on proposals (including initial decision as to whether or not there is enough community interest in including the proposed functionality in the framework)&lt;br /&gt;
  - Would notify the Zend team when a proposal is ready&lt;br /&gt;
  - Would do initial code review on the proposal implementation&lt;br /&gt;
  - Would notify the Zend team when the proposed feature is feature complete and ready to pull into the master branch&lt;/blockquote&gt;&lt;br /&gt;
&lt;br /&gt;
So essentially, the CR Team will have an advisory/liaison role as it pertains to the proposing and maintenance of components. You should note that it will have limited decision capability, and Zend will continue to issue final approval for new proposals. However, if you do have a proposal in the works (or in a queue already), the Team will doubtless soon be looking for you &lt;img src=&quot;http://blog.astrumfutura.com/templates/default/img/emoticons/wink.png&quot; alt=&quot;;-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;.&lt;br /&gt;
&lt;br /&gt;
The purpose of the CR Team is to assist in streamlining the noted areas: proposals, patches, maintenance of orphaned/abandoned components and communications with Zend and component maintainers as needed. Streamlining is a broad term, and while the specifics will be discussed by the team, it will as noted include component reviews (hopefully on an ongoing basis), offering feedback/advice, and blackmailing the community to take over from absent maintainers &lt;img src=&quot;http://blog.astrumfutura.com/templates/default/img/emoticons/wink.png&quot; alt=&quot;;-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;.&lt;br /&gt;
&lt;br /&gt;
The CR Team presently has seven members (IRC nicks in brackets where known):&lt;br /&gt;
&lt;br /&gt;
Pádraic Brady (PadraicB)&lt;br /&gt;
Rob Allen (Akrabat)&lt;br /&gt;
Steven Brown&lt;br /&gt;
Shaun Farrell (farrelley)&lt;br /&gt;
Pieter Kokx (kokx)&lt;br /&gt;
Dolf Schimmel (Freeaqingme)&lt;br /&gt;
Ben Scholzen (DASPRiD)&lt;br /&gt;
&lt;br /&gt;
Most of the names above probably sound horribly familiar (including me!).&lt;br /&gt;
&lt;br /&gt;
The Team comprises a fairly broad segment of active Zend Framework contributors with a variety of backgrounds. Most of the CR Team is on IRC on a daily (or weekly) basis and well known from the mailing lists. It&#039;s expected that the CR Team will serve for a fixed period (to be determined) and then have its membership reopened for review/replacements if needed. That process isn&#039;t defined at the moment but we&#039;ll get to it.&lt;br /&gt;
&lt;br /&gt;
In the meantime, if you have any questions regarding the Community Review Team you can find many of us on IRC (#zftalk.dev on Freenode) and/or Twitter. You may also get our attention via the mailing lists. While it will take us a bit of time to spin up our engines and dig into our roles, you should be aware we are out there and willing to help. If you have any particularly urgent questions about proposals, patches or maintenance, I&#039;m sure the Team will be happy to look at those while we&#039;re gearing up for full operation.
    </content:encoded>
    <dc:creator>P&#225;draic Brady</dc:creator>

    <pubDate>Tue, 08 Jun 2010 19:07:00 +0000</pubDate>
    <guid isPermaLink="false">http://blog.astrumfutura.com/archives/429-guid.html</guid>
    <category>php general</category>
<category>php security</category>
<category>zend framework</category>
<category>zf cr team</category>
<creativeCommons:license>http://creativecommons.org/licenses/by/1.0/</creativeCommons:license>
</item>
<item>
    <title>Mockery 0.6.1 Released</title>
    <link>http://blog.astrumfutura.com/archives/428-Mockery-0.6.1-Released.html</link>
            <category>PHP General</category>
            <category>PHP Security</category>
            <category>Zend Framework</category>
    
    <comments>http://blog.astrumfutura.com/archives/428-Mockery-0.6.1-Released.html#comments</comments>
    <wfw:comment>http://blog.astrumfutura.com/wfwcomment.php?cid=428</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://blog.astrumfutura.com/rss.php?version=2.0&amp;type=comments&amp;cid=428</wfw:commentRss>
    

    <author>nospam@example.com (Pádraic Brady)</author>
    <content:encoded>
    You can read more about Mockery at &lt;a href=&quot;http://blog.astrumfutura.com/archives/427-Mockery-0.6-Released-PHP-Mock-Object-Framework.html&quot; &gt;http://blog.astrumfutura.com/archives/427-Mockery-0.6-Released-PHP-Mock-Object-Framework.html&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
Mockery 0.6.1 includes a functional fix which ensures that classes containing variants of the __call() method, with or without typehinting, are correctly mocked/replaced. I have also downgraded the PHP dependency to 5.3.0 from 5.3.2 by request. Thanks to everyone who so far has offered feedback! Mockery has been downloaded a total of 274 times since its original release. Counting those of you doing it twice or three times on differing machines, that probably means around 100 or more people have installed Mockery (at a guess). Remember we have a mailing list if you wish to ask any in-depth questions, you can report issues or feature requests on Github, and I&#039;m usually somewhere on IRC (and Twitter) in the evening times (GMT).
    </content:encoded>
    <dc:creator>P&#225;draic Brady</dc:creator>

    <pubDate>Wed, 02 Jun 2010 20:10:00 +0000</pubDate>
    <guid isPermaLink="false">http://blog.astrumfutura.com/archives/428-guid.html</guid>
    <category>mock objects</category>
<category>mockery</category>
<category>php general</category>
<category>php security</category>
<category>unit testing</category>
<category>zend framework</category>
<creativeCommons:license>http://creativecommons.org/licenses/by/1.0/</creativeCommons:license>
</item>

</channel>
</rss>