PHP, Zend Framework and Other Crazy Stuff
Zend Framework
CodeIgniter 2.0.2: Cross-Site Scripting (XSS) Fixes And Recommendations
May 10th
As many of my readers know, I have a keen dislike for regular expression based HTML sanitisation. Regular expressions simply do not understand HTML’s nested nature and the numerous possible HTML/CSS standards it must abide by. The result is that far too many developers try to program this understanding (and unfortunately their lack of comprehensive understanding) into home grown sanitisers using as little code and tests as possible. It’s a horrendous and reprehensible practice that has created a large field of so-called sanitisers and XSS cleaners which are riddled with obvious vulnerabilities despite all their sincere and utterly false claims to the contrary. The perception of safety they create is almost always a fantasy. As I’ve said before, this serves only one purpose – to lend support to claims that PHP is insecure. And why disagree given PHP’s prominence on the internet and this continuing refusal by developers to just do the right thing and use a secure solution that really does work?
Since I’ve completed my research into a broad set of these, for now, I’ll close with a final example given its widespread usage, confusing documentation and lack of a clear disclosure to date of security vulnerabilities.
On April 7, EllisLab released CodeIgniter 2.0.2 as a security maintenance release prompted by a report I sent to EllisLab shortly before St. Paddy’s Day (around mid-March). That report indicated the expected response and my own disclosure policy. This blog post is being published in accordance with those. The disclosure to date of the vulnerabilities afflicting previous CodeIgniter versions is mentioned only in the CodeIgniter 2.0.2 news release (from April 7) as follows:
An update to both CodeIgniter Reactor and CodeIgniter Core (v 2.0.1) was released today. This is a security maintenance release and is a recommended update for all sites. The security fix patches a small vulnerability in the cross site scripting filter.
EllisLab’s news release for CodeIgniter 2.0.2 makes mention of “a small vulnerability”. This small vulnerability is mentioned no where else (not even the actual changelog for 2.0.2). In reality, I reported seven distinct vulnerabilities across two classes. These vulnerabilities might allow an attacker to inject arbitrary HTML, CSS or Javascript, i.e. Cross-Site Scripting (XSS) into an application’s output. It would be nice if, in the future, EllisLab aim for more accuracy in their news releases and disclosed both the number and nature of the security vulnerabilities fixed in their release changelogs.
Users of CodeIgniter 2.0.x and 1.7.x are strongly urged to upgrade to CodeIgniter 2.0.2 (or later) as soon as possible to avail of these critical security fixes.
In addition, users are urged to follow some basic steps when writing or updating CodeIgniter applications:
- Escape ALL data being injected into views using PHP’s htmlspecialchars() function, remembering to pass the character encoding being used as the third parameter. A helper function may be useful to keep the typing to a minimum.
- Use HTMLPurifier when you need to sanitise HTML data or user input such as HTML comments, HTML emails, or RSS/Atom content (basically any HTML you do not explicitly generate yourself!).
- Ensure that all HTML pages are served with a valid Content-Type HTTP header and/or a meta tag equivalent which also declares the charset for that page. Note that HTML5 offers a separate charset element for this purpose. This helps prevent character encoding based XSS attacks by informing the browser of the correct character encoding to use.
- Ensure that all views/templates distributed by third parties are likewise reviewed to ensure they utilise proper escaping and XSS sanitisation.
CodeIgniter is one of the most prominent “micro frameworks”, web application frameworks that prosper by offering their users unparalleled simplicity. It is unusual as a framework in that it does not make any reference to standard escaping mechanisms for views/templates such as the PHP htmlspecialchars() function anywhere in its source code, examples or documentation. This may create the unfortunate impression that users should instead filter input using an XSS filter function in the CI_Security class and do nothing on output. Users taking this approach may be particularly at risk from these security vulnerabilities.
My recommendation to the CodeIgniter developers, as documented in my original report, is to deprecate and remove the CI_Security class’ XSS filter. Responsible vendors should never persist in distributing and advocating the use of insecure software. I also urge them to revise their documentation to ensure that best security practice is noted in the area of writing views/templates and offer a shortcut function to an escaping mechanism for HTML output to standardise and ease its use by members.
Mockery 0.7.1 Released – It’s a Test Double Framework
Apr 26th
It’s been up on the PEAR channel for a few days now, but Mockery 0.7.1 has been released. The successor to Mockery 0.6 has been brewing for a few months and represents a feature complete version of Mockery’s Mock Objects implementation with a vastly expanded set of features and more than one bug fix
. There has never been a better time to give it a whirl. Have a look at the README to see what Mockery is capable of and how to start using it in your unit tests today.
The 0.7.1 version is the result of quite a bit of welcome feedback from users. I’d like to thank everyone who tested it, forked it on Github, debugged it, and suggested feature additions and improvements. Feedback makes the open source world go ’round and ’round.
The race to 1.0.0 is far from over. Mockery 0.8 is the target release for a Test Spy implementation. Test Spies turn Mock Object upside down by allowing users to make assertions about the interactions a Test Spy experienced after the fact. This contrasts with Mock Objects which require that all interaction expectations are set ahead of time. Using Test Spies tends to serve some users better by following the typical unit test paradigm of assertions, and by reducing the overall amount of code needed in many cases. Once added, Mockery will become a true Test Double Framework with support for doing things whatever the hell way you want: Fake Objects, Test Stubs, Mock Objects and Test Spies. Knock yourself out.
On a sidenote, I’d like to mention that PHP can benefit a lot from using leaner meaner Test Double Frameworks. I’m happy to say that Mockery is not the only alternative that has emerged. Mike Lively has recently being working on Phake, a PHP Mocking Framework. That means you all have two choices which are entirely independent of traditional testing frameworks (and a lot better than traditional mock object solutions
).
As usual, your feedback is more than welcome. Mockery is on Github and you can find our issue tracker and a handy fork button at the same URI
.
Wishing For A PEAR Channel Aggregator? Yes, Please!
Apr 12th
There are a few things many PHP developers should be familiar with. We should be familiar with PEAR packages. We should be familiar with the PEAR installer. More and more of us actually are getting familiar with running PEAR channels. The problem that some of us have, like me, is that we’re working against an architecture which focuses on the central PEAR repository. It’s the elephant in the room, so to speak, and we spend year after year working around or completely ignoring it.
While I do criticise PEAR through this statement, that would be missing the point. PEAR is a story of two parts – the distribution mechanism using the PEAR installer and PEAR channels, and the centralised package repository. The problem is splitting the two, and I don’t believe the PEAR group (along with the rest of us) have really considered that which is a shame. As Till Klampaeckel recently stated, “PEAR packages are not as easy to use as some code you copy-pasted off the Zend devzone or phpclasses. While I agree, that we should try to make it just as easy, it’s just not one [of] PEAR’s goals right now.”.
That kind of sums up my issues with PEAR as a distribution mechanism – it’s not as easy as it could be and it really ought to be a primary goal. It’s not… PEAR obviously treats its own package repository as an elite citizen. You don’t need to discover its channel, or use a channel prefix, or go consulting Google to find something useful (it has a neat categorised package listing). No other PEAR channel set up independently has those advantages and the extra steps can’t compete with a simple download from a website. To install from another channel, you need to go search Google for a suitable library, check if it even has a PEAR channel, find the channel URI, guess the channel prefix – and then finally install something. Since channel discovery is probably not automatically performed (by default), dependency resolution can lead to more pain. This assumes hosting a PEAR channel is easy (which it is given the recent explosion of them using sane channel hosting tools like Pirum). Maybe that’s our trigger – PEAR channels are easy now. Suddenly, all the libraries I use are, or will be, hosted on a PEAR channel. Even Zend Framework 2.0 is heading to PEAR split by component. It’s fantastic!
Since we seem to like blaming the PEAR Group, and getting that ball kicked back to us, it’s time we did something useful. We’ve spent too much time ignoring PEAR as we grew apart from it with our frameworks, standalone libraries and custom plugin architectures. We’re making life harder for ourselves in doing so. Stuart Herbert has posted a short article to gather requirements for a Pear Channel Aggregator. I strongly suggest that interested PHP programmers drop by and add a comment with some suggestions/feedback. Let’s get this thing moving forward!
Gathering Requirements For A PEAR Channel Aggregator
Once that article went up, we started seeing that people are out there working on the problem. I don’t see any of them as being a final solution, and God forbid we just adopt a half-measure for some limited subset of requirements. It would be far preferable, if not essential, to see a complete solid solution meeting everyone’s varied requirements. There’s no excuse not to. Either we want a robust complete solution or we don’t.
As to the concept of a channel aggregator, I view it as a shift away from PEAR’s focus on a centralised repository to a focus on supporting and enabling an open decentralised system with one (or more) focal points. PEAR’s own channel should just be one among many. A channel aggregator should do away with channel prefixes, channel discovery, compulsory reliance on PEAR packaging/channel hosting (git is too popular to ignore), and support combining the package details from potentially hundreds of channels/git repos into one or more competing aggregators that offer easier lookup, rankings, user feedback, etc (baby steps obviously before we go nuts). Installing any package from any channel or supported git repo should only require hitting up a channel aggregator with its name – no other mucking around. It would become an ecosystem that is impossible to ignore and obvious to utilise for hosting and distribution, not simply of libraries but of components, plugins and anything that can be considered installable. I’m also being careful not to overly point at one aggregator – any decent decentralised system should be capable of supporting multiple nodes with indifference, working in tandem or not.
The other side of the problem is PEAR channel hosting and packaging. Hosting is easy. Setup Pirum and you’re ready to go. Packaging needs a better tool. A script that can accept a simple configuration file, build the package, and optionally include a cryptographic signature (because who are we kidding, we need the system to be secure). While building a robust system, overly relying on traditional PEAR distribution methods while the PEAR community has been slaving away on PEAR2 and Pyrus would be a tragedy. Do it right, or not at all. Ignoring all that work, if it proves useful, makes no sense. At the same time, allowing them to create limitations in the requirements would be a serious error.
Regardless, we really need the PEAR Group’s participation because, frankly, their input and support will make the difference between evolving PEAR distribution (with or without the existing PEAR efforts) with us to meet our needs, or sundering the community into two approaches that probably won’t be compatible.
I would hope that we’re capable of improving the entire infrastructure – not just one piece of it. Hopefully, with the assistance of our PEAR colleagues. And hopefully by avoiding segmentation of the overall community. If you’re working on pieces of the puzzle, I appeal to you to band together. Let’s not let fiefdoms derail what could be a massive boost to publishing code (not just libraries) online.
Gathering Requirements For A PEAR Channel Aggregator
In case you missed it the first time!
Private vs Protected Methods: The Debate That Never Ends
Mar 28th
As a new generation of PHP web application frameworks start establishing beach heads in preparation for an all out war for mindshare, I’ve been contemplating some of the key changes we’re seeing emerge that may gain traction over time. Today, I just thought to share my thoughts about private methods vs protected methods. Something that has impacted on Doctrine 2 and Symfony 2. And every other piece of PHP code since PHP 4 came along.
Doctrine and Symfony have adopted a practice whereby all methods are flagged as private unless there are compelling reasons to the contrary (i.e. they are necessarily part of a public API, or form part of an abstract implementation). There is an allowance for appealing that a method be switched from private to protected on a case by case basis.
I’m not a big fan of private methods. I’m in a state of persistent internal conflict over them, in fact. At this point in time, I am in favour of protected methods over private methods.
On one hand, there’s the traditional idea that private methods must never be used outside the current class to prevent interaction with and dependencies on highly unstable units of behaviour. This is a common need arising out of refactoring where non-public methods may vanish, move classes, get renamed or find themselves saddled with different behaviour. These non-public methods are inherently unstable. There is no getting away from that. While this is perfectly normal during development, it can become an irritating concept downstream if you are depending on such implementation details in your own work.
On the other hand, non-public methods contain the meat of any class – the implementation specifics. Using Inheritance, you can extend classes containing this implementation code at very little cost if they are carried by protected methods. This usually tends to make life easier, promotes source code reuse, allows for bug and security fixes ahead of schedule (assuming there it’s not one of those lost bugs that will never be fixed), dependence on existing implementation facets, etc. This all does, however, carry the risk of creating a lot of brittle code that may break at the next update due to the effects of refactoring.
These two strands of thought have been in combat for a long time in many programming languages. It’s no different in PHP. Many developers are likely under the delusion that protected methods constitute a protected API, i.e. that there is a rule that protected methods must retain backwards compatibility across updates. There is no such rule. There is, however, a form of peer pressure to conform to these delusions. Ask any set of developers about backwards compatibility and there’s probably a fair chance that the majority will insist that it applies to protected methods. Unfortunately, this is in direct violation of the principles behind refactoring. In other words, it’s a delusional belief with no basis in software engineering. Since delusional is a strong term, I’ll refer to it as the misguided status quo. There – that’s surely less offensive.
In my view, this is where PHP developers seem to like throwing themselves off a cliff. Rather than perform refactoring, they would rather preserve backwards compatibility on methods that are not even unit tested. All for a group of people down the line who are unwilling to accept that they took a risk in depending on them. If that sounds a wee bit bitter, it is. I’ve seen people do some crazy weird shit to avoid refactoring or other key changes that would make my life so much easier. Some circumstances push me into creating ridiculous brittle workarounds instead.
So we not only have two varying views, we also have two extreme solutions: mark everything private or actively discourage refactoring even to the extent of inventing stupid mind boggling excuses according to some unwritten rule book maintained by the anonymous hivemind (that’s the PHP one, not the shit crazy hacker one). Me? I’m a fan of the middle path: use protected methods unless otherwise required by design, refactor at will and tell anyone who complains about downstream brittleness that I don’t unit test protected methods and that it is fundamentally necessary to change them at times. Most developers are intelligent enough to understand that they took a risk. We can both have our cake and eat it without undue hostility by being a little more upfront.
So why did I finally decide that private methods are not worth enforcing? Like many similar beliefs, it crept up on me over time. Back in the day, when I was programming in Java as a young kid with no sense of independent thought, I would have taken my blind puritanical righteousness to extremes and pounced on anyone who questioned the party line about private methods. These days, I program in PHP, Ruby, and Python. I try to avoid Perl. None of these have enforceable private methods. Perl doesn’t have private visibility at all without hacks. Ruby and Python have easily bypassed implementations (i.e. they are more of a low fence with a “No Trespassing” sign you can hop over as needed – at the risk of being prosecuted and fined). PHP stands out due to that evil conniving Java influence that is overquoted. Too much in PHP is already “influenced” or “inspired” by a language I dropped like a hot coal over a decade ago. I can’t wait until we get over Java as the PHP deity of choice…
Private methods represent more than a “No Trespassing” sign in PHP/Java. In PHP, they are enforceable in theory, i.e. you just cannot call or invoke them. End of story. In reality, this is hogwash. As our colleagues in Ruby, Python, Perl, and all the other programming languages without private visibility enforcement already know – you cannot impose a practice. There will always be the vocal group that tell you to fuck off while they blissfully inherit classes and reuse private methods anyway. It’s easy to do. All you need is a text editor and the keyboard keys CTRL, C and V. Or Reflection. Or sed and grep. Or git. Private visibility enforcement is an illusion (or delusion?).
Once you reach into the dark corners of your mind and realise that private methods and properties are just another pointless easy-to-overcome obstacle, the real problem remains entirely unchanged and unaddressed. How to manage the expectations of downstream users who can and will extend and reuse your code in ways you never expected?
All the band aids in the world won’t answer that, and my response would be to be clearly honest and open. Educate downstream users about your need for refactoring and then make it clear that they proceed at their own damn risk if relying on non-public methods! Then I, you, and the rest of the PHP community can hack, inherit, and gleefully ignore best practices when it suits us without worrying about private method blockades. Not that that would stop us anyway…
Anyway, there’s my train of thought for today. Private methods are nice in theory but unenforceable in practice. If they really were enforceable, those of us who keep creating excuses to fiddle around with core library and framework code might blow a gasket or three out of frustration if the Nanny State mentality really took hold.
Open Letter to Gareth Heyes: Regex HTML Sanitisation Doesn’t Work
Mar 18th
Dear Gareth Heyes,
I thank you for your response that claims Regex HTML Sanitisation can work.
However, I should clarify that my article, Regex HTML Sanitisation: Off With Its Head!, was written in the context of using Perl regular expressions in PHP to both parse and filter HTML. Your challenge to test HTMLReg was unusual since HTMLReg is written in Javascript, operates as a client side library, and utilises the browser DOM to bypass HTML parsing with regular expressions.
As such, HTMLReg and your article title falls outside the context of my original article. I do, however, applaud the concept of using the browser DOM. While I cannot comment on the efficacy of client side filtering for cross-site scripting (XSS), the use of a DOM is a reliable strategy to bypass parsing problems. A similar approach accounts for the success of HTMLPurifier. Obviously, I do not begrudge some minimal use of regular expressions on pre-parsed normalised input.
This did, however, prompt me to ponder whether such an inapplicable challenge appearing on Planet-PHP undermines my argument anyway by its mere existence and blunt title in a world populated by A.D.D. sufferers. I believed it might and so I found myself determined to crack your Javascript library over a cup of coffee and a biscuit.
The result of this quick examination cannot be publicly reported here as this would be poor reporting practice. Therefore, I will report the resulting security vulnerability by email. You now have six weeks from today’s date in which to release a fixed version of HTMLReg and publicly disclose this vulnerability. I trust you will ensure that all similar or related potential vulnerabilities are also fixed. It would also, optionally, be interesting to see a blog post on the effectiveness of a client side Javascript filter.








