Property-Based TDD at XP Day London 2012
Keith Braithwaite and I ran the Property-Based TDD As If You Meant It exercise as a workshop at XP Day London 2012.
I'd hoped that everyone in the session would have previously experienced Keith's original TDD As If You Meant It workshop, so we could focus on the different ways that Example-Based and Property-Based Testing affected how we test-drive code. On the day only about half of the attendees had. This wasn't too bad. We paired those who hadn't with those who had, and there were enough people for everyone to use a language they preferred, or at least were fluent in.
The languages used were, in order of popularity, Ruby, Python, C#, Clojure, Java. I expect that Python was so high up the list because people knew they could use my QuickCheck library (now called factcheck) and get me to help them get up to speed with it. If so, that tactic worked. Unfortunately, nobody chose to do the exercise in Haskell or ML.
The participants carried on with the exercise further than I did but ran into much the same problems.
At the end of the session we spent a few minutes gathering thoughts and feedback (unfortunately we overran the time slot and so we didn't have as much discussion as I'd have liked). The following is my recollection of what people found, based on the rough notes I recorded on a flip chart.
There's a blurry line between examples and properties. When starting out with the smallest possible test you can think of, the property might be expressible as an example-based test. For example: a game with no moves is still in progress and it is player 1's turn to play next. There's only one game with no moves. As you extend the behaviour in small steps, you introduce behaviour about which for-all statements can be made. For example: after any one move, the game is still in progress and it is player 2's turn to play next. As you work, more of the tests become for-all statements, and fewer remain as there-exists statements.
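To make the shift from there-exists to for-all concrete, here is a minimal sketch in Python using only the standard library. The Game class, its method names, and the choice of integers 0 to 8 as moves are all assumptions made for illustration; they are not the code written in the workshop, and the sketch does not use factcheck's API.

```python
import random


class Game:
    """Hypothetical stand-in for the game under test (not the workshop's code)."""

    def __init__(self, moves=()):
        self.moves = tuple(moves)

    def play(self, move):
        # Return a new game with one more move recorded.
        return Game(self.moves + (move,))

    def in_progress(self):
        # Simplest implementation that satisfies the two tests below.
        return True

    def next_player(self):
        # Players alternate, starting with player 1.
        return 1 if len(self.moves) % 2 == 0 else 2


def test_new_game():
    # There-exists: there is only one game with no moves, so one example suffices.
    game = Game()
    assert game.in_progress()
    assert game.next_player() == 1


def test_after_any_one_move(trials=100, seed=0):
    # For-all: the statement should hold whichever single move is played.
    rng = random.Random(seed)
    for _ in range(trials):
        move = rng.randrange(9)  # assumed move space: integers 0..8
        game = Game().play(move)
        assert game.in_progress()
        assert game.next_player() == 2


test_new_game()
test_after_any_one_move()
```

The first test has nothing to vary, so it reads like an ordinary example. The second has the same shape but quantifies over the move, which is where a generator first becomes necessary.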
The generators end up duplicating production logic. Are they valid tests if they rely on the code they're testing?
As the behaviour becomes more complicated, it becomes harder to write the properties and the generators that generate arbitrary input values. They can become more complicated than the production code that they are testing and need to be engineered as rigorously as the production code. They end up needing their own tests.
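As a rough illustration of a generator needing its own tests, the sketch below checks a generator's output independently of any production code. The generator, the nine-cell move space, and the rule that a cell cannot be played twice are all hypothetical; they stand in for whatever domain rules a real generator would have to respect.

```python
import random


def legal_move_sequences(rng, cells=9):
    """Hypothetical generator: a random-length sequence of distinct cells."""
    length = rng.randint(0, cells)
    return rng.sample(range(cells), length)


def check_generator(trials=500, seed=0):
    """Test the generator itself and report which sequence lengths it produced."""
    rng = random.Random(seed)
    lengths_seen = set()
    for _ in range(trials):
        moves = legal_move_sequences(rng)
        # Validity: no cell is played twice, and every cell is on the board.
        assert len(set(moves)) == len(moves)
        assert all(0 <= m < 9 for m in moves)
        lengths_seen.add(len(moves))
    return lengths_seen


lengths = check_generator()
```

Returning the lengths seen allows a further check on distribution, for example that the generator produces both short and long sequences rather than clustering around one size.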
This works best when the complexity is in the test code, not the generators: for example, when one can compose test generators from simple generators and predicates to feed values into more complex test code.
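One way to picture that kind of composition is the sketch below, written with the Python standard library only. The combinator names (integers, such_that, lists_of, for_all) are invented for this illustration and are not factcheck's API; each generator is simply a function from a random source to a value.

```python
import random


def integers(lo, hi):
    """Simple generator: an integer between lo and hi inclusive."""
    return lambda rng: rng.randint(lo, hi)


def such_that(gen, predicate, max_tries=1000):
    """Restrict a generator to values satisfying a predicate (rejection sampling)."""
    def generate(rng):
        for _ in range(max_tries):
            value = gen(rng)
            if predicate(value):
                return value
        raise RuntimeError("predicate rejected every candidate")
    return generate


def lists_of(gen, max_len):
    """Build a list generator from an element generator."""
    def generate(rng):
        return [gen(rng) for _ in range(rng.randint(0, max_len))]
    return generate


def for_all(gen, prop, trials=200, seed=0):
    """Check that prop holds for many generated values."""
    rng = random.Random(seed)
    for _ in range(trials):
        value = gen(rng)
        assert prop(value), f"falsified by {value!r}"


# Compose small pieces; the interesting logic stays in the property.
even = such_that(integers(0, 100), lambda n: n % 2 == 0)
for_all(lists_of(even, 10), lambda xs: sum(xs) % 2 == 0)
```

Each combinator is small enough to trust by inspection, so the generators stay simple even as the property under test grows. The cost of filtering with a predicate is that a very restrictive predicate will rarely be satisfied, which is why the sketch gives up after a fixed number of tries.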
Arbitrary inputs for one function can often be defined in terms of the properties of other functions that the tested function depends on. Could a QuickCheck library "reverse" a property to create a generator of values that meet that property?
The technique would be easier to use in a language that understands algebra: for example, one that has an algebraic type system and forces you to model the domain as an algebra.