PHP, Zend Framework and Other Crazy Stuff
Posts tagged humbug
Securely Distributing PHARs: Pitfalls and Solutions
Mar 3rd
The PHAR ecosystem has become a separate distribution mechanism for PHP code, distinct from what we usually consider PHP packages via PEAR and Composer. However, PHARs still suffer from all of the same problems, namely the persisting whiff of security weaknesses in how their distribution is designed.
What exactly can go wrong when distributing any sort of PHAR?
- Downloading PHARs from an HTTP URL not protected by TLS.
- Downloading PHARs from an HTTPS URL with TLS verification disabled.
- Downloading PHARs which are unsigned by the authors.
- Downloading any PHAR “installer” unnecessarily.
All of the above introduce an element of risk that the code you receive is not actually the code the author intended to distribute, i.e. it may decide to go do some crazy things that spell bad news when executed. A hacker could mount a Man-In-The-Middle attack on your connection to the PHAR server, or compromise the PHAR server and replace the file, or employ some DNS spoofing trickery to redirect download requests to their server.
I’ve started to distribute a CLI app phar of my own recently for Humbug, so I had to go and solve these problems and make installing, and updating, that phar both simple and secure. Here’s the outline of the solution I’ve arrived at which is quite self-evident.
- Distribute the PHAR over HTTPS
- Enforce TLS Verification
- Sign your PHAR with a private key
- Avoid PHAR Installer scripts
- Manage Self Updates Securely
- Do all of this consistently
Some details and a discussion on each point…
Distribute the PHAR over HTTPS
If you really don’t already have a TLS enabled download location, you can avail yourself of Github.io which supports HTTPS URLs. I’m using this for Humbug‘s development builds. You can also use Github Releases for your project and attach the phars there for new versions. If you do need to host the PHAR on your own server, get a TLS certificate for your domain.
Enforce TLS verification
PHP supports TLS verification out of the box, for the most part. It was disabled by default until PHP 5.6. Enforce it! If a user cannot make a simple request to a simple HTTPS URL, then their server is quite obviously misconfigured. That is not your problem, so don’t make it your problem. You use HTTPS, you enforce TLS, and other programmers should be more than capable of fixing their own stuff. Insecure broken systems are not the lowest common denominator you should be targeting.
Enabling TLS verification for PHP’s stream functions, e.g. file_get_contents(), is basically a disaster waiting to happen because its configuration can be fairly long-winded to get just right. As something of a shim, I’ve created the humbug_get_contents package which has a ready to roll TLS-loving function that can replace file_get_contents() transparently, but only when it detects a PHP version less than 5.6.
PHP 5.6 introduced significant TLS improvements which were enabled by default. In certain areas, it actually exceeds what might be expected from other options, and it’s certainly better than any combination of pre-5.6 options can currently achieve (even humbug_get_contents() can only rate a “Needs Improvement” in its remote tests which is good enough until PHP 5.5 goes EOL).
TLS verification using stream enabled functions does require openssl. Document it as a required dependency.
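To give a feel for what "enforcing TLS" means with the stream functions, here is a minimal sketch. It is not the code from any package mentioned here; the function names tlsContextOptions() and secureGet() are hypothetical, the optional CA bundle path is an assumption, and it uses PHP 7.1+ syntax. The ssl context options themselves (verify_peer, verify_peer_name, peer_name, allow_self_signed, cafile, verify_depth, disable_compression) are standard PHP stream context options.

```php
<?php
/**
 * Build stream context options that enforce TLS peer verification.
 * $caFile is a hypothetical path to a CA bundle; it is optional because
 * PHP 5.6+ can usually locate the system bundle itself.
 */
function tlsContextOptions(string $host, ?string $caFile = null): array
{
    $ssl = [
        'verify_peer'         => true,  // validate the certificate chain
        'verify_peer_name'    => true,  // certificate must match the host
        'peer_name'           => $host,
        'allow_self_signed'   => false,
        'verify_depth'        => 7,
        'disable_compression' => true,  // avoid TLS compression attacks
    ];
    if ($caFile !== null) {
        $ssl['cafile'] = $caFile;
    }
    return ['ssl' => $ssl];
}

/**
 * Fetch a URL, refusing anything that is not HTTPS and failing loudly
 * instead of silently falling back to an unverified connection.
 */
function secureGet(string $url): string
{
    $parts = parse_url($url);
    if (!is_array($parts) || ($parts['scheme'] ?? '') !== 'https' || empty($parts['host'])) {
        throw new InvalidArgumentException('Only https:// URLs are accepted');
    }
    if (!extension_loaded('openssl')) {
        throw new RuntimeException('The openssl extension is required');
    }
    $context = stream_context_create(tlsContextOptions($parts['host']));
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException('Download failed or TLS verification was refused');
    }
    return $body;
}
```

The important design choice is that there is no "retry without verification" branch: if the check fails, the download fails.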
Sign your PHAR with a private key
All PHARs should be signed using RSA private keys through openssl. Run the following commands in bash to create a 2048 bit key and use a strong password when prompted.
    openssl genrsa -des3 -out phar-private.pem 2048
    openssl rsa -in phar-private.pem -outform PEM -pubout -out phar-public.pem
You just created phar-private.pem and phar-public.pem files. The private key is encrypted (hence the password) and the public key is unencrypted. Keep the private key safely offline and do not lose it. If you’re worried about softcopy backups, you can print a hardcopy and put that in a secure location. Do not strip the password from the private key unless you want to reset it!
If your phar file were called foo.phar, you would need to distribute the public key alongside it as foo.phar.pubkey. When someone tries to use a PHAR that’s signed, that’s how PHP locates the public key. For obvious reasons, the public key cannot be part of the PHAR itself.
Never distribute the private key. It shouldn’t even be anywhere near a remote server. It’s an offline key.
Need to automate development versions of PHARs? I’ll talk about that another day, but it’s doable by using your offline key to sign some metadata that authorises a second private key (on a server without password) to sign PHARs as a delegate. It’s not covered in this article, and any use of these approaches has management weaknesses (e.g. revoking stolen or abused keys is not straightforward). Let’s stick to the basics for now.
The public key should also be prominently available online. If I download a copy, there should be a few options to manually verify that the key is correct, e.g. manual and README on different domains. I have not done this for Humbug yet – it’s a simple but easy to forget way to advertise your genuine key.
The act of signing a PHAR requires setting a signature algorithm, assuming the PHP manual ever lets you. PHP’s documentation for PHARs is often outdated and incomplete. The few steps needed are, given a PHAR foo.phar represented by $phar in this PHP snippet:
    /**
     * Get private key contents as $privateKey using openssl. Prompt for the
     * password from a script; do not include it in any configuration.
     */
    $phar->setSignatureAlgorithm(\Phar::OPENSSL, $privateKey);
    copy('/path/to/phar-public.pem', '/path/to/foo.phar.pubkey');
    /** wrap up any last PHAR stuff */
There are other PHAR signature types, but these do not use keys, and so they are not performing the same function as key based signing. If the code above is scary, the box library can do the signing and PHAR compiling for you (you can git clone it).
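For anyone who wants the signing steps spelled out a little further, here is a hedged sketch of how the encrypted key might be unlocked and applied. The function names loadPrivateKey() and signPhar() and all paths are hypothetical; the openssl_pkey_get_private(), openssl_pkey_export() and Phar::setSignatureAlgorithm() calls are standard PHP. It assumes PHP 7.1+ and that phar.readonly is disabled in php.ini, since PHP will otherwise refuse to modify a PHAR.

```php
<?php
/**
 * Decrypt a password-protected PEM private key and return it as an
 * unencrypted PEM string, which is what setSignatureAlgorithm() expects.
 * The decrypted key only ever lives in memory.
 */
function loadPrivateKey(string $encryptedPem, string $passphrase): string
{
    $key = openssl_pkey_get_private($encryptedPem, $passphrase);
    if ($key === false) {
        throw new RuntimeException('Could not read the private key (wrong password?)');
    }
    $plainPem = '';
    if (!openssl_pkey_export($key, $plainPem)) {
        throw new RuntimeException('Could not export the private key');
    }
    return $plainPem;
}

/**
 * Sign an already-built PHAR and place the public key next to it as
 * <name>.phar.pubkey. Requires phar.readonly=0.
 */
function signPhar(string $pharPath, string $privateKeyPem, string $publicKeyPath): void
{
    $phar = new Phar($pharPath);
    $phar->setSignatureAlgorithm(Phar::OPENSSL, $privateKeyPem);
    if (!copy($publicKeyPath, $pharPath . '.pubkey')) {
        throw new RuntimeException('Could not copy the public key');
    }
}
```

A build script would read phar-private.pem, prompt for the password on the command line, pass both to loadPrivateKey(), and hand the result to signPhar().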
Avoid PHAR Installer scripts
In the race to make users happy, some PHARs come with an unsigned installation routine packaged as a PHP file which is downloaded and passed directly to the PHP interpreter on the command line. This is usually completely unnecessary. As PHARs are self-contained files, they require no “installation”. Such installers verify if a system meets the necessary dependencies and settings, and then initiate the download (often with TLS disabled unnoticed).
Document your dependencies, and your needed PHP INI settings. Let users check if their systems support your PHAR’s documented requirements. Then provide them with the URLs to the PHAR and pubkey files to download using wget, curl or their browser.
This misconception of installers being required is reinforced if the URLs to your actual PHAR and pubkey (the only two files I need on my filesystem) are not provided without having to open the installer and read code.
Manage Self-Updates securely
This is the process of checking for new versions, downloading them, and replacing the old version. All of the above rules from earlier still apply with a few key twists (such as the irony that a self-update command is suspiciously like an installer!). Luckily, you downloaded a signed PHAR over a TLS protected HTTPS URL (fingers crossed), so you have slightly more trust that it won’t rampage through your system compared to a PHP script piped to PHP from an unprotected URL.
Whenever you download a PHAR that is expected to be genuine and properly signed, you always face a few tasks. It’s generally easier when done manually when getting it the first time because a) you don’t have the public key already so you must choose whether or not to trust it (Trust On First Use - TOFU), b) you should be requesting two specific URLs manually and not relying on potentially egregious installer scripts, and c) this means you should have better than even odds of getting the code you want if the self-updating routine is decently designed.
In other words, you trust the existing PHAR, and you can enforce that trust across all future updates by reusing the existing public key.
For this I’m in the process of writing the phar-updater package. Right now, it supports a basic SHA-1 synchronisation strategy where the PHAR self-updates when a remote SHA-1 version file updates, thus indicating a new PHAR build was released. It then downloads the new PHAR, runs validation routines to ensure it’s genuine (e.g. was signed by the same private key as the original), and then replaces the original PHAR in-situ. Of course, it also enforces TLS.
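The flow described above can be sketched roughly as follows. This is not phar-updater's actual code; the function names are hypothetical, the layout of the version file (a SHA-1 hash at the start of the file) is an assumption, and the sketch relies on PHP verifying an OpenSSL signature against the adjacent .pubkey file when a PHAR is opened. It uses PHP 7.1+ syntax.

```php
<?php
/**
 * Extract the 40-character SHA-1 from the body of a remote version file.
 * The "hash at the start of the file" layout is an assumption.
 */
function parseRemoteVersion(string $versionFileBody): string
{
    if (!preg_match('/^[0-9a-f]{40}/i', trim($versionFileBody), $m)) {
        throw new RuntimeException('Version file does not start with a SHA-1 hash');
    }
    return strtolower($m[0]);
}

/** A new build exists when the remote hash differs from the local one. */
function updateAvailable(string $localSha1, string $remoteSha1): bool
{
    return !hash_equals(strtolower($localSha1), strtolower($remoteSha1));
}

/**
 * Validate a downloaded PHAR against the public key we already trust,
 * then swap it in. $newPhar must end in ".phar" for PHP to open it.
 */
function replaceIfGenuine(string $newPhar, string $currentPhar): void
{
    // Reuse the existing key; never accept a key shipped with the update.
    $tempKey = $newPhar . '.pubkey';
    if (!copy($currentPhar . '.pubkey', $tempKey)) {
        throw new RuntimeException('Trusted public key is missing');
    }
    try {
        $phar = new Phar($newPhar); // should throw if the signature does not verify
        $signature = $phar->getSignature();
        unset($phar); // release the file before moving it
        if ($signature === false || strpos($signature['hash_type'], 'OpenSSL') !== 0) {
            throw new RuntimeException('Update is not signed with an OpenSSL key');
        }
        if (!rename($newPhar, $currentPhar)) {
            throw new RuntimeException('Could not replace the current PHAR');
        }
    } finally {
        @unlink($tempKey);
    }
}
```

The point worth copying is the first step of replaceIfGenuine(): the update is checked against the key that came with the PHAR you already trust, not against anything downloaded alongside the new build.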
Do all of the above consistently
I’ve just written a big article for two relatively simple to implement things: TLS enforcement and PHAR signing with RSA private keys, all with self-updating support if needed. The outcome, to a user, is that they end up with two files instead of one and a nice self-update option. This is not an outrageous outcome of introducing proper security on PHAR downloads. Go forth and do it for all PHARs. Help create an environment where distributing and installing code in secure ways is the normal expected thing to do.
Those who would prefer a GPG approach can run with that also. There is nothing to prevent distributing current.phar.asc and current.phar.pubkey.asc files for that purpose. However, GPG is a manual checking process whereas using RSA keys can be automated and is performed for all PHAR uses as standard. Neither approach excludes the other.
Lies, Damned Lies and Code Coverage: Towards Mutation Testing
Jan 14th
I spent the vast majority of 2014 not contributing to open source, so I kicked off 2015 by making Humbug available on Github.
About Humbug
Humbug is a Mutation Testing framework for PHP. Essentially, it injects deliberate defects into your source code, designed to emulate programmer errors, and then checks whether your unit tests notice. If they notice, good. If they don’t notice, bad. All quite straightforward. Humbug will log which defects were not noticed by your unit tests, complete with diffs, and provide some basic metric scores so that you can fuel your Github badge mania someday.
You can try out Humbug today, though it remains a work in progress (PHPUnit only) and certain combinations of code, tests and moon phase may result in “issues”. Do try it though, it’s polishing up very nicely and I’m looking forward to a stable release. The readme has more information.
This article, however, is mostly reserved to explain why I wrote Humbug, and why Mutation Testing is so badly needed in PHP. There are a few closing words on Mutation Testing performance, which has traditionally been a concern impeding its adoption.
Code Coverage
Code Coverage is a measure of how many lines of code a unit test suite has executed. Not tested. Executed. If lines of code are not executed then, it logically follows, they were not tested by the test suite. This might be bad. There are probably tests missing. Off to the editor with you!
This distinction between testing and executing is all important, and something I feel that we’ve lost sight of in PHP when we’re busy decorating our Github pages with nice green 100% badges and talking about imposing 100% Code Coverage.
Let’s imagine that you have 100% Code Coverage. That’s actually a lie. More specifically, you actually have 100% Line Coverage. PHP_CodeCoverage and XDebug are incapable, at this time, of measuring Statement Coverage, Branch Coverage, and Condition Coverage. Your 100% score is only 25% of the story. Let’s call it 10% because I’m mean and there are other forms of Coverage that I have not mentioned.
Your Code Coverage is now 10%.
You know, I think I was too generous, and I’ve arbitrarily assigned a score. This simply will not do and it’s unfair to those who, in reality, might have 11% Code Coverage. We’ll have to take a more scientific approach.
Your Code Coverage is now 0% pending scientific research and peer review.
We can rephrase all of the above as follows: Line Coverage is an indicator of where source code was definitely not executed by any test. It does not indicate that a line was tested, or even fully exercised, merely that something on that line was executed at least once.
Taking this at face value, we can invent a problem to provide more illumination on how Line Code Coverage can mislead us:
    if ($i >= 5) {
        // do something
    }
The above is a condition where there are three possible outcomes. $i will either be greater than 5, equal to 5 or less than 5. Two of these possibilities will evaluate the expression to true, the other to false. This suggests that we need 3 tests – one for each of the outcomes. We also need to be very specific. What if an error changes the 5 to a 6? Testing if 10 passes would be a bad test. What if it were changed to a 4? Then not testing with values of 4 or 5 would make for bad tests. It’s not all random integers we want in such tests – their selection should be deliberately targeting the boundary of the condition so as to avoid writing overly positive tests that are unlikely to ever fail.
Writing just one test that executes the above line will still leave us two tests short of where we should be. How do we know when those two tests are missing? Line Code Coverage will give us a 100% score for writing between 0 and 33% of the expected effective tests.
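To make the gap concrete, here is a small sketch. The describe() function is hypothetical and only exists to wrap the condition from the example; plain assert() calls stand in for real unit tests.

```php
<?php
/** Hypothetical function wrapping the condition from the example. */
function describe(int $i): string
{
    $label = 'small';
    if ($i >= 5) {
        $label = 'large';
    }
    return $label;
}

// One positive test. Every line above is executed, so Line Coverage
// reports 100%, yet nothing here pins down where the boundary is.
assert(describe(10) === 'large');

// Boundary tests, one per outcome of the comparison.
assert(describe(4) === 'small'); // less than 5
assert(describe(5) === 'large'); // equal to 5
assert(describe(6) === 'large'); // greater than 5
```

Both versions of the suite produce the same coverage report. Only the second would fail if the 5 became a 4 or a 6, or if the >= became a >.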
Dave Marshall recently wrote about Code Coverage with another real life example.
Line Code Coverage in PHP is simply not fit for our purposes. Being the sole possible Code Coverage type in PHP at present does not excuse it from being a misleading, inflated, and overly trusted metric that is easily fooled by writing bad tests and relying on coincidental execution.
The more insidious problem is that relying solely on Code Coverage as a measure of test quality, which is what we often end up doing, is attempting to automate an intellectual task. You can’t simply run a magic report and leave your brain at home. Your brain is very much required when assessing test suite effectiveness.
Measuring Unit Test Quality
Above, I made a distinction between code that was executed and code that was tested. Code coverage is an assertion that code is executed. It’s entirely possible to attain 100% Code Coverage, yet test absolutely nothing at all. The probable methods of achieving this are through tests which make no assertions, positive tests with long odds of failure, and coincidental execution by tests not specifically targeting the line (see PHPUnit’s @covers annotation).
This needn’t be intentional! It’s quite easy to overlook tests and that’s why we use Code Coverage to help us identify missing tests. We just can’t rely on it as our sole means of ensuring test existence. Better Code Coverage would help us find a lot more missing tests, but it’s still solely a measure of execution.
So, given a test suite with 100% Line Coverage, how can we examine the test suite and arrive at any conclusion as to its quality and effectiveness in preventing regressions?
This is where Mutation Testing shines.
Mutation Testing
Imagine our original example:
    if ($i >= 5) {
        // do something
    }
During Mutation Testing, Humbug would introduce three subtle defects, i.e. mutations. It would mutate the “>=” to each of “>” and “<”. It would also mutate the “5” to “6”. Depending on the nature of the code block, this should result in unexpected behaviour that your unit tests, if written well, should have assertions against. Occasionally, a mutation is equivalent to the original statement (e.g. perhaps $i is hardcoded to >5 and it’s not actually settable from a test) but we would expect the false positive rate to be minimal.
For each mutation, noting that only one is applied at a time, we run the relevant unit tests. If a defect causes a test failure, error or a timeout (infinite loops may occur infrequently with a mutation) then we can assert that this particular defect is tested. If the tests all pass, we can assert that this defect was not tested and we can log it for investigation. A new test would now be needed to cover that defect unless, of course, it’s a provable false positive.
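The process can be imitated by hand in a few lines. This is only an illustration of the idea, not how Humbug works internally: the three mutants are written out as closures, a "test suite" is reduced to a map of input to expected result, and escapedMutants() is a hypothetical helper.

```php
<?php
// The three mutations of "$i >= 5" described above, one closure each.
$mutants = [
    '>= becomes >' => function (int $i): bool { return $i > 5; },
    '>= becomes <' => function (int $i): bool { return $i < 5; },
    '5 becomes 6'  => function (int $i): bool { return $i >= 6; },
];

/**
 * Run a tiny test suite (input => expected result) against each mutant,
 * one mutant at a time, and return the names of mutants that escaped.
 */
function escapedMutants(array $mutants, array $cases): array
{
    $escaped = [];
    foreach ($mutants as $name => $mutant) {
        $detected = false;
        foreach ($cases as $input => $expected) {
            if ($mutant($input) !== $expected) {
                $detected = true; // a test failed, so the defect was noticed
                break;
            }
        }
        if (!$detected) {
            $escaped[] = $name; // all tests passed: log it for investigation
        }
    }
    return $escaped;
}

$weakSuite     = [10 => true];                        // one positive test
$boundarySuite = [4 => false, 5 => true, 6 => true];  // targets the boundary

print_r(escapedMutants($mutants, $weakSuite));     // two mutants escape
print_r(escapedMutants($mutants, $boundarySuite)); // none escape
```

With only the value 10 tested, the ">" and "6" mutants go unnoticed. The boundary suite catches all three.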
We are no longer playing games with execution statistics. We’re actually measuring the effectiveness of a test suite, and improving its effectiveness over time. The provided scores, taken with a double pinch of salt, assist in gauging how bad or good defect detection is by calculating the ratio of detected mutations to the total generated and the total covered by tests (yes, we contrast to Code Coverage). The logs are the more valuable output, offering diffs for each undetected mutation. These can be examined (by an actual living entity) to see where new tests might be needed.
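As a rough sketch of the two ratios mentioned above, assuming nothing about how Humbug names or rounds them (mutationScores() and its array keys are hypothetical, and the numbers in the usage note are made up for illustration):

```php
<?php
/**
 * Express detected mutations as a percentage of all generated mutations
 * and of the mutations that fall on lines covered by at least one test.
 *
 * $detected  mutations that caused a test failure, error or timeout
 * $generated all mutations generated
 * $covered   mutations on lines executed by at least one test
 */
function mutationScores(int $detected, int $generated, int $covered): array
{
    return [
        'of_generated' => $generated > 0 ? round(100 * $detected / $generated, 1) : 0.0,
        'of_covered'   => $covered > 0 ? round(100 * $detected / $covered, 1) : 0.0,
    ];
}
```

For example, mutationScores(60, 100, 80) gives 60.0 for the generated ratio and 75.0 for the covered ratio. The difference between the two is the contrast with Code Coverage: mutations on lines no test executes can never be detected.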
Your Code Coverage metrics essentially tell me nothing about the effectiveness of your unit test suite. They only tell me that your unit tests executed stuff. Your Mutation Testing scores, on the other hand, give me some ballpark estimates on the real effectiveness of those same tests.
Performance
I can’t sign off without mentioning Mutation Testing performance.
Traditionally, Mutation Testing has been ridiculously slow, often running the entire test suite for every single mutation. On one library this morning, I generated close to 1000 mutations. The test suite typically took 5 seconds to run. Doing the math is close to crazy: 1000 full runs at 5 seconds each is well over an hour. The solution implemented by Humbug was to take something I criticised (ahem, Code Coverage) and use its data to only run tests which execute the mutated line. It takes around 2 minutes for Mutation Testing of that library. In another example, a library with ~5000 tests running in 3 minutes took around 12 minutes to mutation test (~1.5k mutations were generated).
I expect to improve on that even more and enable specific class targeting as a future feature. It would be even faster if we had improved Code Coverage in PHP. And, as always, your mileage will most definitely vary: performance is influenced by the mutation count and the performance of both code and tests. Slow tests, in particular, while ordered to run last, may have a significant impact.
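The selection idea can be sketched like this. The flat "file:line" map, the file and test names, and testsForMutation() are all assumptions made for the example; real coverage data is shaped differently and this is not Humbug's implementation.

```php
<?php
/**
 * Given coverage data mapping "file:line" to the tests that executed
 * that line, return only the tests worth running for one mutation.
 */
function testsForMutation(array $coverage, string $file, int $line): array
{
    $key = $file . ':' . $line;
    if (!isset($coverage[$key])) {
        return []; // no test executes this line, so no test can detect the mutation
    }
    return array_values(array_unique($coverage[$key]));
}

// Hypothetical coverage data for a small library.
$coverage = [
    'src/Limit.php:12' => ['LimitTest::testBoundary', 'LimitTest::testAbove'],
    'src/Limit.php:20' => ['LimitTest::testReset'],
];

print_r(testsForMutation($coverage, 'src/Limit.php', 12));
```

A mutation on line 12 then costs two test runs instead of the whole suite, and a mutation on an uncovered line costs none at all.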
Tools like Humbug are no longer restricted to academic papers.
I bring this up because performance is clearly one huge reason why Mutation Testing hasn’t already become commonplace despite its very obvious benefits. You won’t be mutation testing all the time, but running it occasionally for your entire test suite, or at least a few times for each new testable class, is quite reasonable and within current reach. Implementing filters and other focus aids would allow for even more dynamic and regular usage alongside your testing framework to keep feedback regular and fast.
I’ll blog more specifically about Humbug in time as development rolls on.