Unit Testing as a practice is like any other – there are good practices, and bad practices. Two of the worst practices are overloading tests with assertions, and writing lazy or shallow tests.

Before we recount the dire consequences of these practices, it’s worth knowing why they are so attractive and not immediately perceived as being bad. In short, every test you write requires that you set up the test environment, create a scenario for possible failure, add an assertion, and then ensure the source code makes that assertion pass. This requires code, sometimes a lot of code. Adding multiple assertions to each test minimises the work needed to write tests, since each extra assertion reuses the existing setup instead of adding new code to clutter your test classes. It can also help to tackle multiple but related results in the same test.

So long as you know the assertions will pass, this can make writing unit tests quite a bit faster. Unfortunately, a preoccupation with minimising test code also encourages developers to keep tests so simple that they never dig into whether a test actually accomplishes its objective, often because that objective has never been documented.

These considerations lead to tests which may be similar to (using PHPUnit):

[geshi lang=php]class GameTest extends PHPUnit_Framework_TestCase
{
    public function testScoreIsZeroWithNoScoring()
    {
        $game = new Game;
        $this->assertEquals(0, $game->score);
        $this->assertEquals(0, $game->scoreTotal);
        $game->score(1);
        $this->assertEquals(1, $game->score);
        $this->assertEquals(1, $game->scoreTotal);
    }
}[/geshi]

Here we have a simple test with four assertions. The first two check that $score and $scoreTotal are zero after the object is initialised and before any score is assigned. The last two revisit both properties after a score has occurred. To the naked eye, the test is easy to understand. Here’s a class which will pass the above test (if you spot the problem after seeing the class, you should give yourself a treat ;) ).

[geshi lang=php]class Game
{

    public $score = 0;

    public $scoreTotal = 0;

    public function score($score)
    {
        $this->score = $score;
        $this->scoreTotal += $score;
    }

}[/geshi]

Back to those dire consequences. Consider the test output after I introduce a failure by initialising $score to 1 in the class.

PHPUnit 3.3.14 by Sebastian Bergmann.

F

Time: 0 seconds

There was 1 failure:

1) testScoreIsZeroWithNoScoring(GameTest)
Failed asserting that <integer:1> matches expected value <integer:0>.
D:\projects\tinker\GameTest.php:11

FAILURES!
Tests: 1, Assertions: 1, Failures: 1.

Multiple assertions unfortunately have side effects. The most obvious is that it takes only one failed assertion for the entire test to fail with it. PHPUnit stops at the first failure regardless of whether the other assertions in the same test would have passed (hence the last line of the results shows that PHPUnit executed only the first assertion and ignored the others), which leaves you blind as to whether the error affects the other assertions. This creates a maintenance nightmare: for every test that fails, you’re never certain what else should have failed! You only get one part of the puzzle to work with.
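
The halt-on-first-failure behaviour can be sketched in plain PHP without PHPUnit. This is only an illustration under assumptions: check_equals(), AssertionFailed and BrokenGame are hypothetical names made up for this sketch, and the exception simply mimics the way a failed assertion aborts the rest of a test method.

[geshi lang=php]<?php
// Hypothetical stand-in for a framework assertion failure (not PHPUnit's own class).
class AssertionFailed extends Exception {}

// Hypothetical helper: records every check it performs, then throws on a mismatch.
function check_equals($expected, $actual, array &$log)
{
    $log[] = "checked $expected against $actual";
    if ($expected !== $actual) {
        throw new AssertionFailed(
            "Failed asserting that $actual matches expected value $expected."
        );
    }
}

// The Game class with the deliberately wrong starting value.
class BrokenGame
{
    public $score = 1;
    public $scoreTotal = 0;
}

$log = array();
$message = null;
$game = new BrokenGame;

try {
    check_equals(0, $game->score, $log);      // fails and throws here
    check_equals(0, $game->scoreTotal, $log); // never reached
} catch (AssertionFailed $e) {
    $message = $e->getMessage();
}

// Only the first check was ever performed.
echo count($log), " check(s) ran\n";
echo $message, "\n";
[/geshi]

The second check would have passed, but nothing in the output can tell you that, which is exactly the blind spot described above.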

Unit Tests should be specific. In fact, as a general rule, there should be only one assertion per test method. A failed test should immediately tell you where the problem is and which assertions fail, and offer at least a minimal description of the failed behaviour (typically the test name should be sufficiently descriptive). If it doesn’t, its utility is severely reduced. You end up doing the same detective work needed in the absence of unit tests, which makes those unit tests less beneficial: you rob the maintainer of instantaneous, specific feedback and force them to edit tests, often rewriting them to be more specific, and/or employ typical debugging approaches to locate the problem in the source code.

Another impact is that multiple assertions are often a sign that the tests were written after development, or without attention to the behaviour of the class. This increases the risk that the tests are not only confusing when they fail, but also incomplete. Truly paying attention to the role of behaviour discourages multiple assertions and promotes specificity.

There’s a side story here about the foolishness of believing that code coverage is an absolute measure of the effectiveness of a unit testing suite. It’s not. It’s only one metric to assist in that measurement, and not a very reliable one at that. It’s entirely possible to gain 90% or even 100% code coverage without writing tests that cover even a quarter of the expected behaviour of the class. Code coverage measures how many lines of the source code are actually executed. It doesn’t tell you if they were executed enough times, in the right order, or if the tests were even appropriate to start with.

This is the problem some people will have noted from before (give yourself a treat!). The class obviously has more behaviour than the original passing test seemed aware of. Despite this, guess what the original test had as a code coverage metric? 100% :)

Here’s how the test should have been written, code coverage and multiple assertions be damned. Your tests are only complete when you are absolutely certain they cover all of the class’s behaviour, and not a second sooner.

[geshi lang=php]class GameTest extends PHPUnit_Framework_TestCase
{

    public function testStartingLastScoreIsZero()
    {
        $game = new Game;
        $this->assertEquals(0, $game->score);
    }

    public function testStartingTotalScoreIsZero()
    {
        $game = new Game;
        $this->assertEquals(0, $game->scoreTotal);
    }

    public function testLastScoreOnlyStoresLastScoreRewarded()
    {
        $game = new Game;
        $game->score(2);
        $game->score(5);
        $this->assertEquals(5, $game->score);
    }

    public function testTotalScoreAccumulatesRewardedScores()
    {
        $game = new Game;
        $game->score(1);
        $game->score(2);
        $this->assertEquals(3, $game->scoreTotal);
    }

}[/geshi]

The first test class was very obviously smaller and simpler. Not only is it hampered by multiple confusing assertions, but its simplicity also indicates a lack of good test design: by reusing one setup instead of building a specific scenario for each behaviour (i.e. seeking a fail before editing code to pass), it ends up as a test suite that passes but doesn’t quite verify everything. Unfortunately, the two go hand in hand in my experience. To make the first test pass and the later, more specific tests fail, use the following version of the Game class.

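[geshi lang=php]class Game
{

    public $score = 0;

    public $scoreTotal = 0;

    public function score($score)
    {
        // Deliberate bug: the two operators are swapped.
        $this->score += $score;      // accumulates instead of storing the last score
        $this->scoreTotal = $score;  // stores the last score instead of accumulating
    }

}[/geshi]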

Here we’ve simply assumed someone got confused and mixed up the purpose of the two properties in the score() method, so the += operator has moved to the wrong line. Now how many times has that happened to you? ;) If you run the original, simpler test stuffed with multiple assertions, it will show everything working as intended (as an aside, this is one example of where Mutation Testing could have picked up a problem which the original test couldn’t detect).

PHPUnit 3.3.14 by Sebastian Bergmann.

.

Time: 0 seconds

OK (1 test, 4 assertions)

The later, more detailed and specific tests tell a different story:

PHPUnit 3.3.14 by Sebastian Bergmann.

..FF

Time: 0 seconds

There were 2 failures:

1) testLastScoreOnlyStoresLastScoreRewarded(GameTest)
Failed asserting that <integer:7> matches expected value <integer:5>.
D:\projects\tinker\GameTest.php:38

2) testTotalScoreAccumulatesRewardedScores(GameTest)
Failed asserting that <integer:2> matches expected value <integer:3>.
D:\projects\tinker\GameTest.php:46

FAILURES!
Tests: 4, Assertions: 4, Failures: 2.
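
The same comparison can be reproduced without PHPUnit as a rough sketch. Everything here beyond the two versions of the class is an assumption for illustration: the classes are renamed GoodGame and MutantGame so both can live in one script, and shallow_checks_pass() and specific_checks_pass() are hypothetical helpers that replay the checks from the first test and from the two behaviour-focused tests.

[geshi lang=php]<?php
// The correct class from earlier, renamed for this sketch.
class GoodGame
{
    public $score = 0;
    public $scoreTotal = 0;

    public function score($score)
    {
        $this->score = $score;
        $this->scoreTotal += $score;
    }
}

// The buggy version with the operators swapped.
class MutantGame
{
    public $score = 0;
    public $scoreTotal = 0;

    public function score($score)
    {
        $this->score += $score;
        $this->scoreTotal = $score;
    }
}

// Hypothetical helper: the four checks of the original multi-assertion test.
function shallow_checks_pass($game)
{
    $startsAtZero = ($game->score === 0) && ($game->scoreTotal === 0);
    $game->score(1);
    return $startsAtZero && ($game->score === 1) && ($game->scoreTotal === 1);
}

// Hypothetical helper: the two behaviour checks that score more than once.
function specific_checks_pass($class)
{
    $a = new $class;
    $a->score(2);
    $a->score(5);

    $b = new $class;
    $b->score(1);
    $b->score(2);

    return ($a->score === 5) && ($b->scoreTotal === 3);
}

$results = array(
    'good shallow'    => shallow_checks_pass(new GoodGame),
    'mutant shallow'  => shallow_checks_pass(new MutantGame),
    'good specific'   => specific_checks_pass('GoodGame'),
    'mutant specific' => specific_checks_pass('MutantGame'),
);

var_dump($results);
[/geshi]

Only the 'mutant specific' entry comes back false. A single call to score(1) from a zero starting state cannot distinguish = from +=, which is why the shallow checks let the bug through.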

Now imagine someone has written hundreds of tests in the manner of my first example. Can you imagine the world of hurt anyone attempting to refactor and maintain the underlying source code is facing? The countless tests they’ll need to debug, rewrite, and expand? The tears of frustration? The hair pulling? The talking to your reflection because of a psychotic break?

Don’t do that the next time you write unit tests :) . Remember – be as specific as possible about what the class should do, and you will quickly realise that you only need one assertion per test. Sure, it means writing additional test code – but at least you’re now writing tests that truly work!
