Thursday, November 15, 2007

Testable code: it's the structure, stupid!

The topic of writing testable code is a quagmire of opinions. See, for example, Learning From Sudoku Solvers, which argues that Test-Driven Design is not always great, and it's a fantastic example.

I have found, however, that test-driven design helps to generate code that is not necessarily clearer, but better factored. I ran across a blog post from Phil Haack:

Writing Testable Code Is About Managing Complexity: "the real benefit of testable code is how it helps handle the software development’s biggest problem since time immemorial, managing complexity."

This post really cuts to the heart of why Test-Driven Design works, and why it can also utterly fail. Phil's main arguments are that generating tests gives you a canary in the coal mine for bugs, and that it indirectly helps manage complexity. The bug-finding point is absolutely true; that is the main attraction for most people.

But Phil doesn't quite get the second point right. Test-driven design absolutely helps to directly manage complexity. Why? The answer is factorization. Jeff uses the term "separating concerns" to talk about how writing tests helps to simplify code. That's a start, but not quite the whole story.

Unit tests work best on atoms of a program: a single algorithm that completes a specified task. Firefox is not an algorithm. The MySQL package is not an algorithm. GCC is not an algorithm. And none of them have a single, unified unit test. This is pretty logical, seeing that each program is built of parts.

But the extension here is that in order to write unit tests, you must understand what the parts of the program are. If a part of a program is very well unit-testable, then it is an atom, or a factor, or whatever you want to call it: it cannot be broken into smaller parts without destroying the semantics of what it does. This goes all the way back to undergraduate-level algorithms and Hoare logic: Precondition, Instructions, Postcondition. The test simply has to set up the precondition, call the function, and check the postcondition.
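As a minimal sketch of that precondition / instructions / postcondition shape, here is a test for a small atom. The function insert_sorted and its test are hypothetical names invented for this illustration; they are not from any of the posts mentioned above.

```python
import bisect


def insert_sorted(items, value):
    """Return a new sorted list containing value.

    Precondition: items is already sorted.
    (Hypothetical example function, used only for illustration.)
    """
    result = list(items)
    bisect.insort(result, value)
    return result


def test_insert_sorted():
    # Precondition: build an input that is known to be sorted.
    items = [1, 3, 7]
    assert items == sorted(items)

    # Instructions: call the unit under test.
    result = insert_sorted(items, 5)

    # Postcondition: still sorted, contains the value, one element longer.
    assert result == sorted(result)
    assert 5 in result
    assert len(result) == len(items) + 1


test_insert_sorted()
```

The test has exactly three steps because the function does exactly one thing; if the test needed several unrelated setups, that would hint that the function is not really an atom.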

Let's say you are writing a program to parse a record, see if it is in an existing data set (say, a database), and then insert it or update it to the known data set. How do you write unit tests? To figure out the unit tests to write, you must first understand the factors.

I trivially see three factors: testing the existence in the data set, the insert operation, and the update operation. The "upsert" operation, that is, the decision whether to insert or update and then doing the appropriate action, is not an atom in this program. It should be covered in an integration test.
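One way to picture the three factors is as three small functions, each with its own unit test. This is only a sketch: it assumes the data set is a plain Python dict keyed by a record id, and the names exists, insert, and update, as well as the choice to raise KeyError when a precondition is violated, are assumptions made for the example.

```python
def exists(data_set, key):
    """Factor 1: is a record with this key already in the data set?"""
    return key in data_set


def insert(data_set, key, record):
    """Factor 2: add a new record. Precondition: key is absent."""
    if key in data_set:
        raise KeyError(f"record {key!r} already exists")
    data_set[key] = record


def update(data_set, key, record):
    """Factor 3: replace an existing record. Precondition: key is present."""
    if key not in data_set:
        raise KeyError(f"record {key!r} does not exist")
    data_set[key] = record


def test_exists():
    assert exists({"a": {"name": "Ada"}}, "a")
    assert not exists({}, "a")


def test_insert():
    data_set = {}
    insert(data_set, "a", {"name": "Ada"})
    assert data_set == {"a": {"name": "Ada"}}


def test_update():
    data_set = {"a": {"name": "Ada"}}
    update(data_set, "a", {"name": "Ada Lovelace"})
    assert data_set == {"a": {"name": "Ada Lovelace"}}


test_exists()
test_insert()
test_update()
```

Each test touches one function only, which is what makes these unit tests rather than integration tests.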

Let's think about that for a minute: why should upsert be an integration test? Consider a trivial set of two tests:

1. Upsert a record which does not exist (the record should be inserted)
2. Upsert a record which already exists in the set (the record should be updated)

These tests can be broken apart without changing the semantics of the upsert. We know that the upsert operation follows two code paths, and we can run these two tests against those code paths to ensure their respective operation.
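The two tests above might look like the following sketch. RecordStore is a hypothetical in-memory stand-in for the database, and its calls list exists only so the tests can confirm which code path upsert took; none of these names come from a real library.

```python
class RecordStore:
    """Hypothetical in-memory data set standing in for a database."""

    def __init__(self):
        self.rows = {}
        self.calls = []  # records which write operation ran, for the tests

    def exists(self, key):
        return key in self.rows

    def insert(self, key, record):
        self.calls.append("insert")
        self.rows[key] = record

    def update(self, key, record):
        self.calls.append("update")
        self.rows[key] = record


def upsert(store, key, record):
    """Decide between the two atoms, then delegate to one of them."""
    if store.exists(key):
        store.update(key, record)
    else:
        store.insert(key, record)


def test_upsert_inserts_missing_record():
    store = RecordStore()
    upsert(store, "a", {"name": "Ada"})
    assert store.calls == ["insert"]
    assert store.rows["a"] == {"name": "Ada"}


def test_upsert_updates_existing_record():
    store = RecordStore()
    store.rows["a"] = {"name": "Ada"}  # arrange: the record already exists
    upsert(store, "a", {"name": "Ada Lovelace"})
    assert store.calls == ["update"]
    assert store.rows["a"] == {"name": "Ada Lovelace"}


test_upsert_inserts_missing_record()
test_upsert_updates_existing_record()
```

Note that upsert contains no logic of its own beyond the branch; everything else is delegated, which is why each test maps onto exactly one code path.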

The fact that the upsert is then covered as a small integration test signifies that it is also a (larger) factor (maybe a molecule instead of an atom?). The point is that the unit tests (and integration tests) fall clearly along borders within the design of the code. Functions are partitions of functionality of a program, and tests are written along the borders of those partitions.

The question about which comes first, however, still leaves a lot of room for argument. Should you write the tests first as a way of understanding the partitions in the code, or should you write the code first, making smart decisions about how to split functionality and write tests later?

I don't think there is an answer. Writing the tests beforehand might lock you into a design that is not great, but simply diving headfirst into the code might prevent you from seeing what the overall design should be, in terms of keeping logical functions separate.