ArticleS. DavidChelimsky.
KeepingInfectionInCheck

Keeping Infection in Check.


There is an irony about the phrase "Test-infected". It was intended to be a good thing, but we're talking about INFECTION. INFECTIONS ARE BAD. Aren't they? Here's one dictionary definition: "Invasion by and multiplication of pathogenic microorganisms in a bodily part or tissue, which may produce subsequent tissue injury and progress." Yuck. We don't want this in our bodies. Why do we want it in our systems?

I remember some time ago that people were talking about a certain type of infection and there was some motion towards changing the language about it from infection to imbalance. It's because the particular organism is something that our body actually needs, but in managed amounts. The organism, of course, has a different idea, so there's a constant struggle between the organism trying to infect and the body keeping it in check. And when the body does what it's supposed to do, balance is achieved.

One of the practices that evolved out of test-infection is the "one test case per one system class" model. This makes perfect sense if you are testing existing code. However, if you are test-driving, it makes little sense. The only way to achieve it is to constantly restructure your tests as you restructure your code. "test/code/refactor" becomes "test/code/refactor-the-code/refactor-the-tests". That is extra work. It makes code harder to refactor (higher stability due to more clients), and it is often not that useful (more on this later).

I've been exploring the concept of behavior driven development, and that is certainly coloring my perspective. Here is one of the many lessons I've learned. If you really adhere to the rule of only adding code to get a failing test to pass, then there's no need to refactor your tests to this "one test case to one system class" deal. Everything is tested, by definition. Most of the time, that's all the granularity you need.

Recognizing this frees you up to organize the tests as what we keep saying they are (yet rarely organize them as such): executable specifications. AND you get to refactor mercilessly without having to change your tests. Test-infected code requires changing tests to change structure.
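To make "executable specification" concrete, here is a minimal sketch; the Cart class and the test names are invented for illustration and don't come from any real project. The test class specifies behavior through the public interface, so the production code behind it can be split or restructured without touching the spec:

```python
import unittest

# Hypothetical production code: a tiny shopping cart used only to
# illustrate the idea.
class Cart:
    def __init__(self):
        self._items = []

    def add(self, name, price):
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)

# The tests are organized around behavior, not around one test class
# per production class. Cart could later be split into Cart, LineItem,
# and PriceCalculator without changing a line of this specification.
class CartSpecification(unittest.TestCase):
    def test_an_empty_cart_totals_zero(self):
        self.assertEqual(Cart().total(), 0)

    def test_the_total_is_the_sum_of_item_prices(self):
        cart = Cart()
        cart.add("apple", 3)
        cart.add("pear", 4)
        self.assertEqual(cart.total(), 7)

if __name__ == "__main__":
    unittest.main()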

A bunch of us were discussing this yesterday, and Micah Martin pointed out the flip side: the further your tests are from the code being tested, the more landscape there is that could be hiding a bug. So if you only have FitNesse tests, for example, and no unit tests, development might go more quickly, but when you finally do have to chase down some bug it's going to be much more difficult. So in that respect, more granular testing is very useful. But there is a balance to be sought.

I think that balance is easy enough to define. Tim Ottinger said something to this effect when we were discussing this (paraphrasing): you want the fewest tests necessary to clearly define the system and provide effective fault isolation.

Now the question is: how? More on that soon.

!commentForm
 Fri, 10 Feb 2006 10:04:57, Paul Pagel, a good night sleep is all the tests need to sweat out the fever
Tim Ottinger said "testing implementation offers reasons to stay wrong."

This articulates well a problem with testing at a very granular level. When you want to refactor a design, it can become extremely painful, offering an incentive not to change. However, when doing these large refactorings, I generally test drive them at the class/granular/implementation level. Test driving these things can be very useful, as this is how I write my best code. However, I only want to keep around the tests which test behavior, which should test the entire unit. As far as tracking down a bug, I can test drive a bug fix if it falls through the cracks in my behavior tests. Bugs in test driven systems, in my experience, have shown themselves in these behavior tests. The bug fix will need to be test-driven anyway.

I don't think it is a matter of test-driving so much as test committing. Test-driven code is generally written well, and the behavior tests tell its story. Granular tests are useful during workflow, but painful when revisited during a refactoring. What about refactoring tests in terms of behavior at the end of a story?
 Fri, 10 Feb 2006 11:23:02, Tim Ottinger, "Fluidity Is The Key" (apologies to Alistair MacLean)
The thing that bothers me is that our goal is to maintain fluidity in development, which requires contradictory things. We need to have very granular tests because we need fault isolation in order to move quickly, but we recognize that tests are reasons to stay wrong. So it seems that we need a minimal and sufficient set of tests in order to move quickly: sufficient to isolate errors and provide close air support as we refactor, and also minimal because we don't need a lot of mass to screw up our inertia.

I think that the adage about the ideal process being the least process that's sufficient may also apply to tests. You don't want to have insufficient coverage, but you don't want overlap and you don't want tests that cover the structure of your solution instead of the behavior of your solution.

But TDD means that we sometimes write tests in order to allow us to write the next line of code, the next method or the next class, and those tests may be toxic in the long run (say, when we want to change the structure). It probably means that we need to throw more tests away, and do it more often.

Also, test refactoring can lead to some pretty tenuous and stringy dependencies or call trees. It is sometimes more scary to change the tests than the code. That probably means we need to denormalize, or "unfactor" some of the code to bring the content of the setup back into local focus. This is a different take on the same forces, no? We have to balance locality and visibility against duplication. Again, we want enough intelligent, guided refactoring (OAOO) to keep the tests *manageable*. And sometimes these tests have to be thrown out, or the refactored bits pulled back inline. If the goal is fluidity and productivity, then even some of our principle-driven changes may have to be sacrificed.

I'm loving the "infection" analogy, BTW.
 Fri, 10 Feb 2006 11:34:20, Tom Rossen, XP Hyperfundamentalist, What about just refactoring tests?
Paul Pagel said "What about refactoring tests in terms [of] behavior at the end of a story?"

[Disclaimer - I'm writing this in the middle of an attempt to clean up a humongously complex test - pairing with David, as a matter of fact.]

What about just plain refactoring the tests? Among other problems, this test class had two different ways of doing the same kind of setup (RhinoMocks and hand mocks). It is (at the moment I'm writing this) 1287 lines long and has 91 test methods.

If tests really were first class citizens (remember, Uncle Bob says they should be), this would never have happened. No "production" class would ever be allowed to reach this level of humongosity.

Treating tests as first class citizens is not easy. One obvious reason: we don't have tests for tests, so we can't TDD them. But we do know what to do when something smells like last year's gorgonzola.

 Fri, 10 Feb 2006 13:56:44, Paul Pagel, test me, map me, but don't abuse me!
Refactoring a test is dangerous. Changing actual code inside a test can lead to changing the meaning of the test. I am very hesitant to go around looking at tests with refactorings in mind. However, when there are simple changes I can make to increase the quality of the tests, such as removing code duplication or making them more readable, I will jump at the chance. I will not implement a template method, decorator, or other design pattern in my tests. The tests should not be misdirecting; they are meant to have an easy mapping to production code.

Tests being first class citizens, to me, means they can access the code base as any other client can. This means writing accessible code, even when the tests are the only client of that access point, and using test patterns to increase the testability of the code (see Michael Feathers' Working Effectively with Legacy Code). The sheer size of a class does not in itself disprove that the tests are first class; however, it is not without smell.
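A minimal sketch of the kind of safe test cleanup being discussed here; the Account class and its tests are hypothetical, invented only to show duplication removal that leaves each test's meaning plainly visible:

```python
import unittest

# Hypothetical production code, invented for illustration only.
class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

class AccountTest(unittest.TestCase):
    # The duplicated "open an account with funds" setup is pulled into
    # one plainly named helper. Each test still reads straight through:
    # no template methods or decorators hiding what is being exercised.
    def funded_account(self):
        account = Account()
        account.deposit(100)
        return account

    def test_withdrawal_reduces_the_balance(self):
        account = self.funded_account()
        account.withdraw(30)
        self.assertEqual(account.balance, 70)

    def test_overdraft_is_rejected(self):
        account = self.funded_account()
        with self.assertRaises(ValueError):
            account.withdraw(200)

if __name__ == "__main__":
    unittest.main()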
 Fri, 10 Feb 2006 14:15:26, Tom Rossen, Refactoring tests considered harmful?
Paul said, "Refactoring a test is dangerous. Changing actual code inside a test can lead to changing the meaning of the test."

Not sure what "actual code" is, but if you change the "meaning", it's not refactoring. Refactoring changes implementation without affecting behavior/meaning/functionality.

The refactorer's hypercritical oath: first do no harm.
 Fri, 10 Feb 2006 15:38:14, Paul Pagel, ...don't remove my appendix!!!
I mean that when you refactor the production code, you have the unit tests as a safety net, so you can go skydiving without a worry. There is always a reset button in case your parachute doesn't open. However, if you change the tests, you must do it in a surgical room with a scalpel, because if you accidentally change what a test tests, you lessen code coverage and degrade the integrity of the system. The mistakes could start internal bleeding in the system that you don't even notice.
 Fri, 10 Feb 2006 15:47:48, Tom Rossen, We're in violent agreement then
Agreed. That's what I meant about treating tests as first-class citizens being hard. But we really need to do it.
 Sat, 11 Feb 2006 11:08:00, JakeB, Testing the tests?
What if refactoring a test is approached in the same way as refactoring production code? Under TDD doctrine, you wouldn't refactor production code unless you have unit tests covering it. So, the most logical extension would be to enter recursive territory and have a unit test for your test that you wish to refactor.

It may sound stupid, but what would you actually put in a test (call it testB) that tests a test (call it testA)? TestB can check that:
1) your testA guarantees a certain test coverage
2) your testA makes certain assertions, and/or a certain number of assertions

Then you can happily refactor your testA, in the knowledge that if you break it (i.e. reduce coverage, remove or even add assertions) it will be flagged up by your testB.
 Fri, 17 Feb 2006 00:17:33, Ryan Platte, Finding an appropriate distance from which to test
I just delivered a project in which I tested first almost exclusively through providing test inputs and expected outputs. The tests didn't hold my hand through the OO design decisions as much, so I had to stay even more vigilant for code smells and push myself to refactor, write lower-level tests, and comment when debugging sessions did occur. But doing that a few times quickly got me back to a nice flow, and the tests were easy for the customer to verify, since they were written in the environment we designed together just for his team's daily work.

I began with more lower-level tests and quickly found I just didn't need them for this project whose inputs and outputs were so clearly defined. I'd long been wanting to try a "looser" TDD sometime, and I hope I can apply this technique again. I felt more involved and less likely to paint myself into a corner by testing implementation instead of results.

Maybe the key is to find the broadest parts of the system that I can imagine a high-cohesion API for, and start testing from there. And maybe, by extension, I can just keep zooming out till the whole system gets a nice API like that, then drill down...wow, I'm learning TDD all over again, from a different angle.
 Mon, 27 Feb 2006 08:18:13, Bill Wake, I'm pro-fluidity too:)
>One of the practices that evolved out of test-infection is the "one test case
>per one system class" model.

I'd say that approach pre-dates TDD. If you look at how Kent Beck works, he seems to work more "one test class per unique test setup" so several test classes may relate to one system class.
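To make that concrete, here is a minimal sketch of the "one test class per unique test setup" idea; the Stack class and fixture names are invented for illustration. Several test classes relate to one production class, each owning the setup its tests share:

```python
import unittest

# Hypothetical production code, used only to illustrate the pattern.
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def empty(self):
        return not self._items

# Two fixtures for one production class: each test class is defined by
# the setup its tests need, rather than by mirroring Stack itself.
class EmptyStackTest(unittest.TestCase):
    def setUp(self):
        self.stack = Stack()

    def test_reports_empty(self):
        self.assertTrue(self.stack.empty())

class PushedStackTest(unittest.TestCase):
    def setUp(self):
        self.stack = Stack()
        self.stack.push("x")

    def test_is_not_empty(self):
        self.assertFalse(self.stack.empty())

    def test_pop_returns_last_pushed(self):
        self.assertEqual(self.stack.pop(), "x")

if __name__ == "__main__":
    unittest.main()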

Like you, I like tests that approach the "executable specification" ideal. I have a couple of frustrations with the mock-based approach. The minor one is that the tools are a bit painful (e.g., where method names are strings). The other is that it seems to make tests a bit snoopy about the internals of the implementation, thus making the tests more resistant to change. (I don't yet have an opinion on whether this is inherent to the mock approach, but it's true in a fair bit of the mock code I see.)