TLS/SSL Security In PHP: Avoiding The Lowest Common Insecure Denominator Trap

A few weeks back I wrote a piece about updating PHARs in-situ, what we’ve taken to calling “self-updating”. In that article, I touched on making Transport Layer Security (TLS, formerly SSL) enforcement one of the central objectives of a self-updating process. In several other discussions, I started using the phrase “Lowest Common Insecure Denominator” as a label for when a process, which should be subject to TLS verification, has that verification omitted or disabled to serve a category of user with poorly configured PHP installations.

This is not a novel or even TLS-only concept. All that the phrase means is that, to maximise users and minimise friction, programmers will be forever motivated to do away with security features that a significant minority cannot support by default. In the case of PHP users on Windows, this may include not having openssl or curl installed. Without either of these options, TLS verification in PHP becomes impossible without looking outside PHP (e.g. locally available system commands).

The problem is that while programming to the Lowest Common Denominator is fine for many things, doing so to the point of maintaining active security vulnerabilities is not. Let’s take the simple example of Composer. It’s an incredible tool, used by most PHP programmers I know, but it can’t perform TLS verification worth a damn despite operating primarily over HTTPS URLs. On Reddit, another tool was just announced which relies on Composer to update application modules. It inherits the same vulnerability by depending on Composer in a live server setting. So too will other Composer-dependent tools, merely by inheriting from or reusing its download classes. In time, you end up with people seeking refuge in authority: Composer does this, and look, everyone and their pet hamster still uses it!

There’s A Topic In Here Somewhere

Much as I did around writing phar-updater and, more importantly, documenting the reasoning behind a tool that enforces TLS and supports openssl signing as a first-class citizen, I’d like to drill down into the specifics of how to approach this problem. It’s not an insurmountable one assuming you accept some basic ideas:

  1. You should never knowingly distribute insecure code.
  2. You should accept responsibility for reported vulnerabilities.
  3. You should make every effort to fix vulnerabilities within a reasonable time.
  4. You should responsibly disclose vulnerabilities and fixes to the public.

These four ideas are self-explanatory as the guiding principles that any good security policy is founded upon. When you violate them, you earn general mistrust and reputational damage when your users either figure out that violations occurred, or that those violations contributed towards the worst case scenario: getting hacked and all the ugly outcomes that follow. You only need to go on Reddit and other news sites to find that Magento’s reputation is currently being ripped to shreds over failing to uphold these principles recently.

So, given something like an application where the expectation is that everyone will install it, whether it be on Ubuntu, Windows, or Terminator-X45, how does one go about implementing TLS verification as securely as possible without being overly burdensome on programmers? Is it even possible?

Step 1: Implement TLS Verification

In keeping with those four ideas from earlier, the first course of action is to just implement TLS verification and get a handle on the consequences. Foisting a security vulnerability onto all members without their consent is irresponsible programming and should never be tolerated by the community.

It’s essential to reiterate that Insufficient Transport Layer Protection is a security vulnerability, making it possible for attackers to perform Man-In-The-Middle (MITM) attacks (this applies to intranets as much as on the internet). If it’s located and reported, it falls under your published security policy (if any) and the four central principles expressed earlier. There is a reason why URLs on the internet are prefixed with HTTPS.
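To make "implement TLS verification" a little more concrete, here is a minimal sketch using PHP's stream context options. The option names are PHP's own `ssl` context options; the helper name `buildTlsContextOptions()`, the chain depth of 7 and the example URL are my assumptions for illustration. Bear in mind that PHP only started verifying peers by default with 5.6, so on older versions these options have to be set explicitly.

```php
<?php
// Sketch only: buildTlsContextOptions() is a hypothetical helper,
// not part of PHP, Composer or phar-updater.

/**
 * Build stream context options that enforce TLS peer verification.
 *
 * @param string|null $caFile Optional path to a CA bundle; null relies on
 *                            the defaults PHP/OpenSSL were configured with.
 * @return array
 */
function buildTlsContextOptions($caFile = null)
{
    $ssl = array(
        'verify_peer'         => true,  // check the chain against trusted CAs
        'verify_peer_name'    => true,  // check the certificate matches the host (PHP 5.6+)
        'allow_self_signed'   => false, // reject self-signed certificates
        'verify_depth'        => 7,     // assumed limit on certificate chain length
        'disable_compression' => true,  // avoid TLS compression (CRIME attack)
    );
    if ($caFile !== null) {
        $ssl['cafile'] = $caFile;
    }

    return array('ssl' => $ssl);
}

$context = stream_context_create(buildTlsContextOptions());
// A verified download would then look something like this (placeholder URL):
// $body = file_get_contents('https://example.com/file', false, $context);
```

With options like these, a failed verification surfaces as a failed request rather than a silently accepted connection, which is exactly the consequence the next step sets out to examine.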

We like to think of PHP as the programming language of the web, yet we continue to struggle and fight against a 20-year-old protocol that underpins the security of users on that web.

Step 2: Identify The Consequences

All programmers spend a decent amount of time problem solving, and that should come into play now. For most people, there will be no unwanted consequences. They will have openssl/curl installed, and their operating system will have the necessary Certificate Authority (CA) certificates available. The only errors that they will experience will be the usual HTTP fare, and infrequent SSL errors for misconfigured remote servers (not the local system). Attempted MITM attacks will also, very obviously, generate errors assuming the TLS implementation and its dependencies are sound.

The less desirable consequences will then make themselves known. The most commonly quoted one is that Windows users will encounter errors because their local PHP does not have openssl or curl enabled. Other common issues are errors around locating the CA certificates necessary when verifying the remote server’s SSL certificate.

You now have two choices in implementing TLS. Enforce it or disable it.

First of all, you can’t just disable it because it would then put you in the position of deliberately introducing a security vulnerability. Leaving it enabled is not the end of the world. Just because some users have a poorly configured PHP, it does not immediately follow that all users should have their security compromised by default.

The more logical approach is to assess the local system prior to making remote requests. Those who pass muster will be fully secured, and those who don’t? We’ll get to them…
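Assessing the local system can be as simple as a handful of checks before the first request goes out. The following is a rough sketch, assuming a hypothetical helper called `assessTlsSupport()`; the PHP functions it calls are real, but `openssl_get_cert_locations()` only exists from PHP 5.6, hence the guard.

```php
<?php
// Sketch only: assessTlsSupport() is a hypothetical helper name.

/**
 * Report which TLS-related prerequisites the local PHP installation meets.
 *
 * @return array Map of check name to boolean result.
 */
function assessTlsSupport()
{
    $report = array(
        'openssl'   => extension_loaded('openssl'),
        'curl'      => extension_loaded('curl'),
        'ca_bundle' => false,
    );

    // openssl_get_cert_locations() reports where OpenSSL looks for
    // CA certificates by default (available from PHP 5.6).
    if ($report['openssl'] && function_exists('openssl_get_cert_locations')) {
        $locations = openssl_get_cert_locations();
        $file = $locations['default_cert_file'];
        $dir = $locations['default_cert_dir'];
        $report['ca_bundle'] = ($file !== '' && is_readable($file))
            || ($dir !== '' && is_dir($dir));
    }

    return $report;
}

$report = assessTlsSupport();
// Without at least one of the two extensions, verification is off the table.
$canAttemptTls = $report['openssl'] || $report['curl'];
```

A missing default CA location is not necessarily fatal, since an application can ship or point to its own bundle, but it is worth knowing about before a request fails with a cryptic certificate error.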

Step 3: Document Solutions & Stand Over Your Dependencies

TLS verification in PHP requires openssl or curl (or both depending on the application dependencies). Short of falling back to secure local system options on the command line, this is an unavoidable part of programming in PHP. So, when users don’t have the requisite extensions, and you have no other fallback to hand, you should simply start by telling them this.

When it comes to missing extensions, the solution is generally just a minor php.ini edit away. On Windows, it’s often just adding or uncommenting a line like “extension=php_openssl.dll”. Don’t only give users an “openssl not installed”, or worse, a puzzling one liner lifted from a PHP error message. Provide some information as to what is missing, why it is required, and a link as to where to find help.

This brings us to documentation. Most of the dependency issues have very simple solutions, editing a line or two in php.ini, or installing/downloading a CA certificate pack. Those extension DLL files are normally available regardless of how you get PHP. You can summarise the common solutions on your website or wiki and include the link in any error or feedback messages within the application output.
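As a sketch of what a more helpful message might look like, the snippet below builds one from the pieces described above: what is missing, why it is needed, the php.ini line to look for, and a link to help. The function name, the wording and the help URL are all placeholders of mine, and the non-Windows php.ini line is only a guess at a typical setup.

```php
<?php
// Sketch only: missingExtensionMessage() and the help URL are placeholders.

/**
 * Build an explanatory message for a missing PHP extension.
 *
 * @param string $extension Extension name, e.g. "openssl".
 * @param string $helpUrl   Link to your own troubleshooting page.
 * @param bool   $isWindows Whether to suggest the Windows DLL line.
 * @return string
 */
function missingExtensionMessage($extension, $helpUrl, $isWindows)
{
    $iniLine = $isWindows
        ? 'extension=php_' . $extension . '.dll'
        : 'extension=' . $extension . '.so';
    // php_ini_loaded_file() returns false when no php.ini is loaded.
    $iniFile = php_ini_loaded_file() ?: 'your php.ini';

    return implode(PHP_EOL, array(
        'The PHP "' . $extension . '" extension is not enabled.',
        'It is required to verify the TLS certificate of the remote server.',
        'To enable it, add or uncomment this line in ' . $iniFile . ':',
        '    ' . $iniLine,
        'More help: ' . $helpUrl,
    ));
}

$isWindows = strtoupper(substr(PHP_OS, 0, 3)) === 'WIN';
if (!extension_loaded('openssl')) {
    echo missingExtensionMessage('openssl', 'https://example.com/help/tls', $isWindows), PHP_EOL;
}
```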

Step 4: Let Loose the Big Red Box of Doom?

To this point, you’ve avoided introducing a security vulnerability and have done your best to enable users to fix their dependency and configuration issues. They still want to use your application despite not following your recommendations. Perhaps it’s time to make it possible for users to shoot themselves in the foot without your assistance?

In a CLI application, for example, you can create a new “--disable-tls” flag which disables TLS protections when set. Whenever it is used, a very obvious, very unavoidable and very red box is displayed, informing the user that TLS protections have been disabled for the current command.

Text along the lines of “The end of the world is nigh!” would probably be too much. Mentioning that they are now vulnerable to Man-In-The-Middle attacks would simply be fact.
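For what it's worth, the red box itself is only a few lines of code. This is a minimal sketch, assuming a terminal that understands ANSI colour codes; the function names and the exact wording of the warning are mine, not taken from any existing tool.

```php
<?php
// Sketch only: function names are hypothetical; the flag is the one proposed above.

// True when the user explicitly asked for TLS protections to be disabled.
function tlsDisabledRequested(array $args)
{
    return in_array('--disable-tls', $args, true);
}

// Render a message as a padded block of bold white text on a red background.
function renderWarningBox($message)
{
    $red = "\033[1;37;41m";
    $reset = "\033[0m";
    $blank = str_repeat(' ', strlen($message) + 4);

    $out = '';
    foreach (array($blank, '  ' . $message . '  ', $blank) as $line) {
        $out .= $red . $line . $reset . PHP_EOL;
    }

    return $out;
}

$args = isset($argv) ? $argv : array();
if (tlsDisabledRequested($args)) {
    // STDERR is available when running under the CLI.
    fwrite(STDERR, renderWarningBox(
        'WARNING: TLS verification is DISABLED for this command. '
        . 'You are vulnerable to Man-In-The-Middle attacks.'
    ));
}
```

The important design point is that the flag only affects the current command. The next run is back to enforcing TLS unless the user asks again.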

Conclusion

At its crux, this article is as much about user consent as anything else. Distributing code which simply turns off all TLS protection (or indeed, any security protection) without a user’s knowledge is irresponsible. Enforcing it but allowing for users to opt-out of that protection after seeing an informative warning is not quite as dire. Some security professionals will still be unhappy with the idea of ignorant users just wanting all the errors to go away and being able to achieve that, but this is actually an approach that exists in modern browsers whenever an invalid SSL certificate or TLS error is experienced.

At some point, users need to take on a bit of the responsibility for their own protection.

Doubtlessly, this would inconvenience some users. Extra settings or CLI flags might not be immediately obvious, digging through a php.ini to find those extension lines is an inconvenience, and red warning boxes everywhere can be annoying, but the alternatives as they popularly exist in PHP today need to go extinct. This idea of disabling security by default, of programming to the Lowest Common Insecure Denominator, without anyone’s consent or knowledge is neither responsible nor sustainable.

As Gandalf would say, “Keep it secret, keep it safe”. Just putting it on a table in plain view for the Sackville-Bagginses to steal is neither!

Introduction to Humbug: A Mutation Testing Framework for PHP

On 1 January 2015, I first pushed Humbug onto GitHub and three months later it is reaching a state where I can prep for the release of 1.0.0. Release early, release often! I haven’t publicised it a lot, so this is my first Humbug-specific blog post since I started writing code on December 21st, with a season-appropriate name becoming my chosen namespace.

https://github.com/padraic/humbug

About Humbug

Humbug is a Mutation Testing framework intended to measure the true effectiveness of test suites and provide sufficient information to allow for their improvement.

You may already be familiar with the concept. In Mutation Testing, defects which emulate simple programmer errors are introduced into source code (your canonical code is untouched) and the relevant unit tests are run to see if they notice the defect. The more defects that are noticed, the more effective the test suite is presumed to be. The methodology relies on the theory that a quantity of relatively simple defects, either in isolation or combined, provide as much useful information as would a series of more complex defects.
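If the idea is new to you, a toy example may help. Everything below is invented for illustration: a one-line function, a hand-written mutant of the kind a mutation tool might generate (a boundary change from `>=` to `>`), and two "tests" written as plain closures rather than PHPUnit test cases.

```php
<?php
// Sketch only: an invented function and a hand-written mutant.

// Original code under test.
$original = function ($age) {
    return $age >= 18;
};

// Mutant: a simple boundary defect (>= becomes >).
$mutant = function ($age) {
    return $age > 18;
};

// A weak test: it executes the code, so the line counts as covered,
// but it never probes the boundary value.
$weakTest = function ($isAdult) {
    return $isAdult(30) === true && $isAdult(5) === false;
};

// A stronger test: it also checks the boundary.
$strongTest = function ($isAdult) use ($weakTest) {
    return $weakTest($isAdult) && $isAdult(18) === true;
};

var_dump($weakTest($original));   // true: passes against the real code
var_dump($weakTest($mutant));     // true: still passes, so the mutant escaped
var_dump($strongTest($original)); // true: passes against the real code
var_dump($strongTest($mutant));   // false: the test fails, so the mutant is killed
```

Both tests cover the same line, yet only one of them notices the defect. That difference is what a mutation score is trying to capture.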

You can find a comprehensive (and growing) list of the types of defects Humbug creates here: https://github.com/padraic/humbug#mutators

The traditional tool in PHP for measuring test suite quality is Code Coverage. Whereas Code Coverage measures execution statistics (without regard for how unit tests are written), Humbug provides a reliable and conservative assessment of how effective your test suites are at fulfilling their objective: the detection of regressions, and an accurate and complete description of implemented behaviour.

The differences between Code Coverage and Mutation Testing can be quite obvious. On a library I run Humbug on frequently, the Code Coverage is 65%. After running Humbug for several minutes to its conclusion, the resulting Mutation Score Indicator (MSI) is 47%, a quite stark difference of 18 percentage points.

Why the discrepancy? Since Code Coverage only cares about what lines you execute, it’s blissfully unaware of any other essential information: the content of a line of code, logical branches and paths, the likely errors that might arise, whether the unit tests were written poorly or well. It ignores all of these factors which are essential to assessing the real effectiveness of a test suite.

As a result, Code Coverage’s importance as a test quality metric is overestimated. Merely executing lines of code is not a good indicator of test suite quality, and it really only informs you of what parts of your code are definitely not tested. It’s possible to reach a 100% Code Coverage score with the most horrible unit tests imaginable. There are other limitations, such as Code Coverage in PHP generally being Line based and not measuring statement or branch coverage.

I don’t want to give the impression that a 0-100 score is the full extent of Humbug’s purpose: it also produces detailed logs of defects (with sufficient information to replicate them outside Humbug) which go undetected by a test suite, allowing you to write new targeted tests that better document the actually implemented behaviour to support refactoring and prevent unnoticeable regressions.

Hmm, I should probably explain how to use it now…

Installing Humbug

Humbug requires PHP 5.4 and only works, for now, with PHPUnit. I’ll be looking into phpspec/behat support in the near future.

Humbug is available on Packagist as `humbug/humbug` to install globally via Composer, or you can clone it and run a composer install, but the simplest way to get it is to just download the PHAR. Given I’m a security freak, the PHAR is cryptographically signed (hence the additional public key download) and delivered over HTTPS. You can move or rename these, so long as both files are kept together.

wget https://padraic.github.io/humbug/downloads/humbug.phar
wget https://padraic.github.io/humbug/downloads/humbug.phar.pubkey

If you wish to make humbug.phar directly executable:

chmod +x humbug.phar

The PHAR is self-updating using the following command:

./humbug.phar self-update

I manually update the central PHAR as new functionality or bug fixes are added. This will track the development version in the run up to 1.0.0. Thereafter, I expect there to be a choice between stable versions or development versions when updating your PHAR copy.

Once you have Humbug somewhere, you’ll need a guinea pig. Assessing Humbug’s performance (a lot more on that later) on a huge repository of code is probably not the greatest idea ever. Pick something more moderate, where you can get used to the size vs performance ratio, and navigate to its base directory. From there:

./humbug.phar configure

Follow the steps presented as questions to generate a configuration file. The main information needed is the location(s) of the source code being tested, the directory from which to run tests (if not the base directory), the timeout to apply to any one test run (defaults to 10s, and is used to kill infinite loops arising from certain mutations) and any directories which should be excluded from mutation testing within the source code directories you chose (e.g. Tests if under the src hierarchy). This will generate a humbug.json.dist configuration file. You may also write it manually; it’s relatively simple.

{
    "timeout": 10,
    "source": {
        "directories": [
            "src"
        ]
    },
    "logs": {
        "text": "humbuglog.txt",
        "json": "humbuglog.json"
    }
}

Running Humbug

Mutation Testing itself is just the default command, so now run:

./humbug.phar

Humbug operates in several stages. The first is to run the test suite normally to ensure that it’s in a passing state. At this stage, Humbug also collects data on the tests: execution times, code coverage and the junit log. This data is utilised later for optimisation purposes to ensure we can eliminate tests where they don’t exercise the specific line of code being mutated and also execute the fastest tests first. As a result, it’s essential that all tests are currently passing – not only for the data, but because unavoidable failures would play havoc with the mutation testing results. If they don’t pass, Humbug will terminate the process and show an extract of the TAP formatted output indicating the failing test.

Note: Humbug is still a relatively young framework, so there are also edge cases where it will terminate even on a passing test suite. We’ll gradually deal with those cases over time.

The second stage analyses the source code, breaking it into tokens which are fed into a queue of Mutation Operators (aka Mutators). These Mutator objects check every token to see if they are capable of applying a specific mutation at that point in the source code, returning a simple boolean as confirmation either way.
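To give a feel for what that looks like, here is a small sketch built on PHP's own `token_get_all()`. The `GreaterThanMutator` class and its two methods are hypothetical and much simpler than anything in Humbug; they only illustrate the "check each token, answer with a boolean" idea.

```php
<?php
// Sketch only: GreaterThanMutator is a hypothetical class,
// not Humbug's own code or interface.

class GreaterThanMutator
{
    // Returns true when this mutator could apply its mutation to the token.
    public static function mutates($token)
    {
        // token_get_all() returns single-character operators as plain strings.
        return $token === '>';
    }

    // The replacement token: ">" becomes ">=".
    public static function getMutation()
    {
        return '>=';
    }
}

$source = '<?php return $a > $b && $c > 0;';
$tokens = token_get_all($source);

// Collect every position where the mutator says it can act.
$mutableIndexes = array();
foreach ($tokens as $index => $token) {
    if (GreaterThanMutator::mutates($token)) {
        $mutableIndexes[] = $index;
    }
}

// Build one mutant by swapping the first candidate and re-joining the tokens.
$mutated = $tokens;
$mutated[$mutableIndexes[0]] = GreaterThanMutator::getMutation();
$mutantSource = '';
foreach ($mutated as $token) {
    // Array tokens carry their text at index 1; string tokens are the text.
    $mutantSource .= is_array($token) ? $token[1] : $token;
}
// Only the first comparison in $mutantSource now uses ">=".
```

Each candidate position yields its own mutant, which is why even a modest code base quickly produces hundreds of mutations.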

In the third stage, all of the information gathered to date is sent to the assembly line (also known as the God Method that gets an F on Scrutinizer). In the example I mentioned earlier, over 650 mutations are generated. In this third stage, we set up configuration files and a StreamHandler (which intercepts includes) for a separate process. In this separate process, the StreamHandler intercepts the inclusion of the original file, and replaces it with a mutated form (the mutant containing the current mutation). The test suite is then run within this separate process to see how it responds to the mutant’s presence.

All of the optimisations also kick in so that only relevant tests are executed for a given mutation. The PHPUnit output is then reported back to the main Humbug process to assess and collect the result. Progress is rendered as a series of dots and letters until we have finally iterated across all of the available mutations. In this case, 653 mutations, including 653 PHPUnit runs, take a total of around 3 minutes on my local VPS. Given the code coverage of 65%, it would be over 5 minutes if the unit tests were more complete.

The fourth and final stage is rendering a result summary and writing any requested logs. This includes the now simple calculation of the Mutation Score Indicator, the primary metric referred to in Mutation Testing.

Here’s a sample of the resulting command line output:

 _  _            _
| || |_  _ _ __ | |__ _  _ __ _
| __ | || | '  \| '_ \ || / _` |
|_||_|\_,_|_|_|_|_.__/\_,_\__, |
                          |___/
Humbug version 1.0-dev

Humbug running test suite to generate logs and code coverage data...

  361 [==========================================================] 28 secs

Humbug has completed the initial test run successfully.
Tests: 361 Line Coverage: 64.86%

Humbug is analysing source files...

Mutation Testing is commencing on 78 files...
(.: killed, M: escaped, S: uncovered, E: fatal error, T: timed out)

.....M.M..EMMMMMSSSSMMMMMSMMMMMSSSE.ESSSSSSSSSSSSSSSSSM..M.. |   60 ( 7/78)
...MM.ES..SSSSSSSSSS...MMM.MEMME.SSSS.............SSMMSSSSM. |  120 (12/78)
M.M.M...TT.M...T.MM....S.....SSS..M..SMMSM.......T...M...... |  180 (17/78)
MM...M...ESSSEM..MMM.M.MM...SSS.SS.M.SMMMMMMM..SMMMMS....... |  240 (24/78)
.........SMMMSMMMM.MM..M.SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS |  300 (26/78)
SSSSSSSSM..E....S......SS......M.SS..S..M...SSSSSSSS....MEM. |  360 (37/78)
.M....MM..SM..S..SSSSSSSS.EM.S.E.M............M.....M.SM.M.M |  420 (45/78)
..M....MMS...MMSSS................M.....EME....SEMS...SSSSSS |  480 (52/78)
SSSSS.EMSSSSM..M.MMMM...SSE.....MMM.M..MM..MSSSSSSSSSSSSSSSS |  540 (60/78)
SSS....SSSSSSSSMM.SSS..........S..M..MSSMS.SSSSSSSSSSSSSSSSS |  600 (68/78)
......E...M..........SM.....M..MMMMM.MMMMMSSSSSSSM.SS

653 mutations were generated:
     283 mutants were killed
     218 mutants were not covered by tests
     130 covered mutants were not detected
      18 fatal errors were encountered
       4 time outs were encountered

Out of 435 test covered mutations, 70% were detected.
Out of 653 total mutations, 47% were detected.
Out of 653 total mutations, 67% were covered by tests.

Remember that some mutants will inevitably be harmless (i.e. false positives).

Humbug results are being logged as JSON to: log.json
Humbug results are being logged as TEXT to: log.txt

Interpreting Humbug

You might recognise the summary results as being the example I explained earlier. Code coverage is 65% but the Mutation Score Indicator (MSI) is 47% (though it needs to be emphasised in output). I didn’t quite explain what the MSI is, so I’ll do so now.

A Mutation Score (MS) is a simple calculation of Detected Mutations as a percentage of Total Mutations, i.e. the more defects a test suite detects, the better its score. There is however a certain flexibility as to what constitutes both of these values.

Humbug doesn’t generate every single mutation possible. Humbug also doesn’t eliminate false positive results which may arise from Mutant Equivalents, i.e. when a generated defect behaves identically to the original source code. Some of this will make its way into later Humbug versions as time (and PHP’s cooperation) allows. Certain other things like eliminating Equivalents can actually be quite hard and resource/time intensive. If Mutation Testing needs to perform well, this simply can’t be tolerated.

Rather than make a false claim, Humbug therefore reports the more nebulous and conservative Mutation Score Indicator (which is par for the course in the real world). It indicates what your actual Mutation Score might be, but it’s not definitive. An MSI of 47% may be slightly understated as a result of this uncertainty. It also means that a perfect score of 100% is very likely unobtainable except in very simple straightforward cases.
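The three percentages in the sample output can be reproduced from the five raw counts. The sketch below assumes that killed mutants, fatal errors and timeouts all count as "detected"; that is an inference from the sample figures adding up, not something lifted from Humbug's source, and the function name is my own.

```php
<?php
// Sketch only: counting kills, fatal errors and timeouts as "detected"
// is an assumption inferred from the sample output.

function mutationScores($killed, $uncovered, $escaped, $errors, $timeouts)
{
    $total = $killed + $uncovered + $escaped + $errors + $timeouts;
    $detected = $killed + $errors + $timeouts;
    $covered = $total - $uncovered;

    // Guard against dividing by zero when nothing was generated or covered.
    $percent = function ($part, $whole) {
        return $whole > 0 ? (int) round(100 * $part / $whole) : 0;
    };

    return array(
        'total'             => $total,
        'msi'               => $percent($detected, $total),   // detected / all mutations
        'covered_msi'       => $percent($detected, $covered), // detected / covered mutations
        'mutation_coverage' => $percent($covered, $total),    // covered / all mutations
    );
}

// Counts taken from the sample run shown earlier.
$scores = mutationScores(283, 218, 130, 18, 4);
// total 653, msi 47, covered_msi 70, mutation_coverage 67
```

Seen this way, the gap between 70% and 47% is entirely down to the 218 mutants that no test ever executed.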

My Kingdom For A Diff

The logs generated by Humbug are essentially a collection of file name, line number, mutator type and a diff demonstrating how a specific mutation is applied, all categorised by the result type. The purpose is to allow you to review mutations which were covered by tests but went undetected, apply them as needed, and write tests allowing you to detect the defect represented by the mutation should it or a related behavioural regression ever occur.

The logs can be generated as human-readable text, and in JSON format to be consumed by other services or your own creations. We also optionally generate a number of JSON logs which cache results and other information necessary to perform Incremental Analysis (IA), which is an experimental feature mentioned in the next section.

Performance

In a nutshell, performance is the reason why Mutation Testing is not regularly used in any programming language. Running your entire test suite for potentially thousands of possible defects can indeed be extremely slow, so focus has remained on ad-hoc manual reviews and code coverage to assess test suite quality.

Humbug, like some other Mutation Testing frameworks, pursues performance optimisations even where it may have a small cost to accuracy. In our prior example, we generated 653 mutations with a test suite that normally takes 29 seconds on average. That would suggest a runtime of roughly 5.26 hours. In reality, Humbug completes the mutation testing in around 5 minutes. Your mileage may vary.
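The back-of-the-envelope arithmetic behind that estimate is simply "one full test suite run per mutation". Spelled out, using only the figures quoted above:

```php
<?php
// Naive cost: run the whole suite once for every mutation.
$mutations = 653;
$suiteSeconds = 29;

$naiveSeconds = $mutations * $suiteSeconds;   // 18937 seconds
$naiveHours = round($naiveSeconds / 3600, 2); // 5.26 hours

// Against an actual run of around 5 minutes (300 seconds).
$speedUp = (int) round($naiveSeconds / 300);  // roughly 63 times faster
```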

One significant optimisation is that Humbug uses Code Coverage for its actual purpose: assessing what lines of code are tested, and what tests actually exercise those lines. This eliminates “run the whole test suite per defect” since we can select only relevant tests and then order by their execution times to run the fastest relevant tests as a priority. It also means that we can skip running any tests where a mutated line’s Code Coverage is zero. There are other smaller tweaks (both to speed and memory utilisation) and running faster tests first eliminates most of the costs of slow tests.
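In outline, that selection step might look something like the following. This is a simplified sketch, not Humbug's actual code: the function name, the `file:line` keys and the shape of the two input arrays are assumptions made for the example.

```php
<?php
// Sketch only: selectTests() and its data layout are hypothetical.

/**
 * Pick the tests that exercise a mutated line, fastest first.
 *
 * @param array  $coverage Map of "file:line" to a list of covering test names.
 * @param array  $times    Map of test name to execution time in seconds.
 * @param string $file     File containing the mutation.
 * @param int    $line     Line number of the mutation.
 * @return array Test names ordered by ascending execution time.
 */
function selectTests(array $coverage, array $times, $file, $line)
{
    $key = $file . ':' . $line;
    if (empty($coverage[$key])) {
        return array(); // uncovered line: nothing to run, skip the mutant
    }

    $selected = array();
    foreach ($coverage[$key] as $test) {
        // Tests with no recorded time are pushed to the back of the queue.
        $selected[$test] = isset($times[$test]) ? $times[$test] : PHP_INT_MAX;
    }
    asort($selected); // sort by time, keeping test names as keys

    return array_keys($selected);
}

$coverage = array(
    'src/Foo.php:10' => array('FooTest::testSlow', 'FooTest::testFast'),
);
$times = array('FooTest::testSlow' => 2.5, 'FooTest::testFast' => 0.01);

$queue = selectTests($coverage, $times, 'src/Foo.php', 10);
// $queue is array('FooTest::testFast', 'FooTest::testSlow')
```

Since a mutant is dealt with as soon as any one test fails, running the cheapest candidates first means the slow tests often never need to run at all.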

Additional undiscovered optimisations (micro or otherwise) may yet be possible, although the more obvious targets are understandably the focus at this point. Certain other optimisations such as parallel processes and more fine grained test selection are dependent on the test suite being sufficiently well designed (which is relatively rare), so these have gone unimplemented for the moment as we chase optimisations applicable to the broadest base possible.

Incremental Analysis

Incremental Analysis (IA) is an experimental feature, still in progress, to incorporate caching into Humbug. The principle is to cache results, incrementally updating them as the source and test code change, to eliminate the upfront cost of iterating across all possible mutations when unnecessary.
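One way to picture this kind of caching, and I stress this is an illustration rather than a description of how Humbug does it, is to fingerprint a source file together with the tests that cover it and only redo the work when the fingerprint changes. All names below are hypothetical.

```php
<?php
// Sketch only: not Humbug's implementation; all names are hypothetical.

/**
 * Fingerprint a set of files given as a map of path to contents.
 */
function fingerprint(array $contents)
{
    ksort($contents); // stable order regardless of how the files were listed
    return hash('sha256', serialize($contents));
}

/**
 * Decide whether mutation results for a source file must be regenerated.
 *
 * @param array  $cache       Map of source file path to its last fingerprint.
 * @param string $sourceFile  Path of the source file being considered.
 * @param string $current     Fingerprint of the file and its tests right now.
 */
function needsRerun(array $cache, $sourceFile, $current)
{
    return !isset($cache[$sourceFile]) || $cache[$sourceFile] !== $current;
}

$before = fingerprint(array(
    'src/Foo.php'       => '<?php class Foo {}',
    'tests/FooTest.php' => '<?php class FooTest {}',
));
$cache = array('src/Foo.php' => $before);

// Nothing changed: cached results can be reused.
$unchanged = needsRerun($cache, 'src/Foo.php', $before); // false

// The test file changed: results for src/Foo.php must be regenerated.
$after = fingerprint(array(
    'src/Foo.php'       => '<?php class Foo {}',
    'tests/FooTest.php' => '<?php class FooTest { function testIt() {} }',
));
$changed = needsRerun($cache, 'src/Foo.php', $after); // true
```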

While it won’t be stable for an initial release, IA promises to bring run times down to the point where Humbug is more usable with larger projects, and capable of being used more frequently in general.

In Closing

Where now? Humbug 1.0.0 is intended to get the ball rolling on Mutation Testing with PHPUnit, allow for some experimentation with Incremental Analysis, and start attracting issues for the inevitable bugs and various PHPUnit setups that exist out in the wild.

Other than the inevitable bug fixes and basic maintenance, the next obvious target is phpspec and behat, taking on the issue of how diverse we’ve become in describing behaviour instead of merely verifying all things.

This is largely a question of increasing Humbug’s detection net to what people actually use in real life. In a setup with behat, phpspec and PHPUnit, some mutants may escape a PHPUnit oriented approach by design. Humbug will need to take all of the various tools into account in those cases.

In the meantime, I hope Humbug terrifies your unit tests ;).
