Exploring Test-Driven Development with QuickCheck

I've been wondering how property-based testing tools, such as QuickCheck, affect the Test-Driven Development process and the designs that the process drives out.

When TDD was introduced in the late 1990s, it was described in terms of example-based testing tools, such as the xUnit family of frameworks. Since then, property-based testing tools, such as QuickCheck, have become available for all major languages (and many minor ones). But you wouldn't think so if you read most of what is written about TDD & BDD. I wonder why this is. Are we TDD practitioners stuck in an XP-shaped rut? Do we dismiss anything that was not invented by a Smalltalker? Or are property-based testing tools just not as well suited to the TDD process as example-based testing tools?

I started trying to answer these questions by doing Keith Braithwaite's excellent TDD As If You Meant It exercise again. As before, I used Python and pytest, and added pytest-quickcheck for property-based testing.

In the exercise, the challenge at the start is to find the smallest thing you can say about the problem domain (scoring TicTacToe) and avoid introducing any concepts not driven out by the tests. The very first, smallest test I wrote was the same as when doing example-based testing, and so didn't need QuickCheck at all.

def test_initially_there_is_no_winner():
    winner = "Nobody"
    assert winner == "Nobody"

Not very interesting if I want to explore my use of QuickCheck.

The next test is that there is no winner after one move. Before I can write that I need to be able to find the winner, given some moves. I use the Extract Method refactoring to create a winner() function and then add a moves parameter to it. I also extract a constant, Nobody, rather than using a string literal everywhere.

Nobody = "Nobody"

def winner(moves):
    return Nobody

def test_initially_there_is_no_winner():
    assert winner([]) == Nobody

Now I can use QuickCheck to define that all games of a single move have no winner.

I found that the Python QuickCheck implementation I was using didn't let me easily describe what a move is, so, after a brief digression to write my own implementation (which can be found here), I wrote the following additional test:

rows = from_range(3)
cols = from_range(3)
moves = tuples(rows, cols)

@forall
def test_after_one_move_there_is_no_winner(move: moves):
    assert winner([move]) == Nobody
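For context, here is a minimal sketch of the machinery behind those tests. It assumes generators are zero-argument callables that produce random values; the helper names match the tests above, but the internals shown here are only one plausible way to build it, not the actual implementation:

```python
import random

def from_range(n):
    # generator of random integers in the range [0, n)
    return lambda: random.randrange(n)

def tuples(*generators):
    # generator of tuples, drawing one element from each generator
    return lambda: tuple(g() for g in generators)

def forall(test, trials=100):
    # decorator: run the test many times, generating a fresh value for
    # each parameter annotated with a generator
    def run_trials():
        for _ in range(trials):
            test(**{name: gen() for name, gen in test.__annotations__.items()})
    return run_trials
```

With these definitions, `@forall` turns the annotated test into a function that exercises the property against many randomly generated inputs.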

After that it's easy to generalise those two tests to all sequences of fewer than five moves (at five moves, the first player has placed three marks and may have completed a line and won the game):

rows = from_range(3)
cols = from_range(3)
moves = tuples(rows, cols)

def games(move_counts):
    return sequences(lengths=move_counts, elements=unique(moves))

@forall
def test_until_it_is_possible_to_create_a_line_there_is_no_winner(game: games(move_counts=from_range(5))):
    assert winner(game) == Nobody

I like the way QuickCheck tests read. The definitions of the generators and the property-based test capture knowledge about the domain in a very clear, declarative way. It is closer to a true specification than an example-based unit test.

However, these properties haven't yet driven any development! They are all satisfied by the dummy implementation I wrote at the very start. I need to test the scenarios where the winner() function returns the winner of a complete game. For example, something like:

players = choices([FirstPlayer, SecondPlayer])

games_with_a_winner = ...

@forall
def test_reports_winner(game: games_with_a_winner):
    assert winner(game.moves) == game.winner

One way to write the games_with_a_winner generator is to generate random games and filter out those that do not have a winner. However, the filter would have to determine the winner of the game, and that's the very thing I'm testing. To test-drive code that determines the winner of a game the test has to determine the winner of a game. Circular reasoning!
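To make the circularity concrete, here's a sketch of that filter-based generator. The `random_games` and `has_winning_line` names are hypothetical — `random_games` shuffles all nine cells into a complete game, and `has_winning_line` is the line-detection logic the filter is forced to contain:

```python
import random

# all eight winning lines on a 3x3 board, as sets of (row, col) cells
LINES = (
    [{(r, c) for c in range(3)} for r in range(3)]                  # rows
    + [{(r, c) for r in range(3)} for c in range(3)]                # columns
    + [{(i, i) for i in range(3)}, {(i, 2 - i) for i in range(3)}]  # diagonals
)

def random_games():
    # hypothetical generator of complete games: all nine cells in random order
    cells = [(r, c) for r in range(3) for c in range(3)]
    random.shuffle(cells)
    return cells

def has_winning_line(marks):
    # ...but this is exactly the logic that winner() is supposed to be
    # test-driven into implementing
    return any(line <= set(marks) for line in LINES)

def games_with_a_winner():
    # keep generating random games, discarding any without a winner...
    while True:
        game = random_games()
        # even-indexed moves belong to the first player, odd to the second
        if has_winning_line(game[0::2]) or has_winning_line(game[1::2]):
            return game
```

The filter works, but only because the test code now contains its own untested implementation of winner detection.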

Another way is to generate a random winning line for a random player and fill in the rest of the board with random, non-winning moves. To do that correctly, the generator would have to filter out any random filler moves that would create another winning line. Which means the generator would have to detect winning lines. Circular reasoning again.

Is this really a problem? The rules of the TDD As If You Meant It exercise only allow one to create production code by moving existing code out of the tests. If the generator used to exercise the property in tests contains an implementation of the winner function, I can move it into production in the refactoring step and "hey presto!" I'll have driven development from the tests. Except I don't think I really would have. I'll have developed the generator within my test code without ever testing it, let alone test-driving it. And, after the move into production code, the tests will use the code they are testing to generate valid test data. Circular reasoning yet again.

And that's where I'm currently stuck.

I'm probably taking the wrong approach. I'm used to starting with the success cases in TDD and working outside-in. It looks like in this case it's too big a step to define properties at the very start of development.

So where can QuickCheck fit into the TDD process? And how does it affect the designs that TDD guides into existence?

I believe the paper The Practice of Theories: Adding "For-all" Statements to "There-Exists" Tests by David Saff & Marat Boshernitsan contains the solution. The paper describes how Saff & Boshernitsan use JUnit's Theory mechanism to do property-based testing, albeit without random test data generation. They start with examples & then use the algebraic properties of the class being tested, in this case a Dollar class to represent USD monetary amounts, to generalise example-based tests to theories. They add operations to the class to make it easier to write theories that express universal relationships between the operations of the class.

For example, they notice duplication between the test and implementation of the times operation:

public class DollarTest {
    ...

    @Theory public void multiplyAnyAmountByInteger(int amount, int multiplier) {
        assertEquals(amount * multiplier,
            new Dollar(amount).times(multiplier).getAmount());
    }
}

public class Dollar {
    ...

    public Dollar times(int multiplier) {
        return new Dollar(this.amount * multiplier);
    }
}

They then add a divideBy method to Dollar and change the theory to remove the duplicated logic in the test:

@Theory public void multiplyIsInverseOfDivide(int amount, int multiplier) {
    assertEquals(amount,
        new Dollar(amount).times(multiplier).divideBy(multiplier).getAmount());
}

This has some odd effects. The theories drive out functionality in the production code under test that is not required by the rest of the production code. Because the property-based tests define invariants, they do not show you how to use the class to achieve anything, only how to achieve nothing! And the theory doesn't constrain what times and divideBy actually do: it would be satisfied if times really performed addition and divideBy really performed subtraction!
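To see just how little the theory constrains, here is a deliberately wrong sketch (translated to Python for consistency with the earlier code; the names are mine) in which times adds and divide_by subtracts, yet the inverse property holds for every input:

```python
class BogusDollar:
    def __init__(self, amount):
        self.amount = amount

    def times(self, multiplier):
        # deliberately wrong: addition, not multiplication
        return BogusDollar(self.amount + multiplier)

    def divide_by(self, divisor):
        # deliberately wrong: subtraction, not division
        return BogusDollar(self.amount - divisor)

    def get_amount(self):
        return self.amount

def multiply_is_inverse_of_divide(amount, multiplier):
    # holds for all inputs, because (amount + m) - m == amount
    return BogusDollar(amount).times(multiplier).divide_by(multiplier).get_amount() == amount
```

Every run of the theory passes against this implementation, even though times(4) on five dollars yields nine dollars, not twenty.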

So it appears that property-based testing is useful for describing non-logical axioms of a domain model. If used in a test-driven process, property-based testing should push the design to be more compositional and algebraic, so that we can more easily define non-logical axioms about it. But we still need example-based tests to verify and demonstrate what an operation actually does.
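That division of labour can be as simple as pairing each property with one concrete example. A sketch (again in Python, with names of my own choosing): the property expresses the universal relationship between the operations, and a single example pins down what times actually does, which the property alone cannot:

```python
class Dollar:
    def __init__(self, amount):
        self.amount = amount

    def times(self, multiplier):
        return Dollar(self.amount * multiplier)

    def divide_by(self, divisor):
        return Dollar(self.amount // divisor)

    def get_amount(self):
        return self.amount

# the property: times and divide_by are inverses (for non-zero divisors)
def test_multiply_is_inverse_of_divide():
    for amount in range(-20, 20):
        for multiplier in range(1, 10):
            assert Dollar(amount).times(multiplier).divide_by(multiplier).get_amount() == amount

# the example: demonstrates what times actually does
def test_times_multiplies_the_amount():
    assert Dollar(5).times(4).get_amount() == 20
```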

But I'm still unsure as to why it's so hard to apply to Keith's Tic Tac Toe exercise.

If you're interested in exploring property-based testing and TDD, Keith Braithwaite and I have submitted a technical workshop on Property Based TDD As If You Meant It at XP Day London 2012. Bring a laptop and join in!

Copyright © 2012 Nat Pryce. Posted 2012-11-05.
