Thursday, July 26, 2007

Moist Tests: Production Code Needs to be DRY, But Tests Don't

I've spoken a bit about this before when discussing our standard_params notation. I think we've taken the DRY principle too far. And I think this happens a lot in the software industry. We latch onto a good principle, and our meticulous (some would say anal) natures lead us to want to apply it everywhere. And we do it with DRY. Don't Repeat Yourself. Removing duplication in code removes bugs. Re-use promotes efficient development. It's a fantastic principle, first outlined by Dave Thomas and Andy Hunt in The Pragmatic Programmer. It's probably still underused in most code bases. But one area where I see it abused is in tests. Tests don't need to by DRY, they can be Moist*. Take these two tests for example:


def test_formatted_telephone_number
telephone_number = TelephoneNumber.new(:area_code => "123", :telephone_number_base => "5556811")
assert_equal "(123) 555-6811", telephone_number.formatted
end

def test_unformatted_telephone_number
telephone_number = TelephoneNumber.new(:area_code => "123", :telephone_number_base => "5556811")
assert_equal "1235556811", telephone_number.unformatted
end


DRY would see the duplicate creation of the same TelephoneNumber object, and say "Extract Method" to remove the duplication. Or perhaps we could stick an instance variable in our setup, and assign the TelephoneNumber object to it there:



def setup
@telephone_number = TelephoneNumber.new(:area_code => "123", :telephone_number_base => "5556811")
end

def test_unformatted_telephone_number
assert_equal "1235556811", @telephone_number.unformatted
end



And if this wasn't a test (if it was actual code that will be executed in production), I would definitely agree. If you remove the duplication, you only have one place to change the code when that time arises, and bugs are less likely to result. But in tests it is different. When a test fails, I want to fix it quickly. And when I have to search around a large test class for extracted methods, or instance variables that may be modified throughout the class, this costs me time. And I'm willing to accept the risk of the duplication causing a bug (after all, it is only test code) if I reduce the time wasted.

So for me, private methods in test are a smell. So too are setup methods. The only time I use them is when I can get the abstraction such that I very very rarely have to look at what is happening beneath the abstraction. A good example is rails' Controller tests. They have a setup which initializes the controller, and a request and response object. I think I can count on one hand the number of times that I've had to care about these variables, so in that case, it's ok.

*I believe zak came up with this term, though I'm not sure. It may have been Jay Fields. Either way, I'm certainly not taking credit.

7 comments:

obfuscated said...

Hey, nice post, i totally agree with you. I've seen people abuse the DRY principle, will point them to you blog entry the next time.

dale said...

DRY is not just about reducing bugs. It is also about reducing maintenance and refactoring costs.

I agree that your approach makes reading tests much simpler.

I've tested this approach on a bigish project (8 devs - 6 months). Test maintenance did become a serious issue towards the end.

Have you followed this on large projects? did it scale for you?

I believe there is a balance here. If you have setup code that isn't specifically relevant to the piece of functionality under test I think you can factor this out. But not necessarily into a private method. Pushing this stuff out sideways into a sensible testing domain model will provide more reuseable code.


Just my 2 pence.

Shane Harvie said...

Hi Dale, On my current project we have 6 devs and have been working for a year - we don't have a huge problem with test maintenance. If we keep to a convention we can make changes pretty quickly using regular expression search/replace, but my regex skills have had to improve a bit, I must say. I really like your last comment about taking code out that isn't specifically relevant to the piece of functionality. This is really important. It's really hard to get the abstraction right (ie choosing exactly what to take out, and what to leave in). Not many developers seem to understand and/or care about the subtleties of abstraction. I'm interested in this concept and trying to put together something on it. Thanks for your comments...

sarnacke said...

I'd go a step further. It's not just about fixing broken tests... in Agile, we talk a lot about how our tests are our documentation. When I want to understand how a class works, I'll often read the tests first. When I'm trying to get a handle on a new codebase, it's frustrating in the extreme if I have to continually jump out to setup methods and object mothers to see exactly what the important parts of my test objects are. (This is even more frustrating if your team uses an underpowered "IDE" (read as: "text editor") like TextMate where you can't easily jump to method definitions.) I don't mind having a generic "default" object set up somewhere else, as long as every test modifies the pertinent fields within the test body itself so that I can read them.

Patrick Farley said...

*Multiple Occurrences of Identical or Similar Text (MOIST)

Bernd Eckenfels said...
This comment has been removed by the author.
Bernd Eckenfels said...

You need to differentiate between test-cases and support code.

Tests tend to have repeating support code (especially if the Testing Framework used does not support the technology needed, like creating sample documents, threading, mocks or stuff.

It is not a good idea to have 5 lines of DOM Parser code in multiple places when all you need is a instance of a fixed XML fragment.

It is much better to have a private "createDomFromString()" or "initialiseMockServer(p1, p2, p3) or "validateDomElement(dom, key, value)" than to repeat that code.

In the end the tests get more human readable as well.

Gruss
Bernd
--
http://itblog.eckenfels.net