ArticleS. DavidChelimsky.
SpecOrganization

Spec Organization

One of the things that really interests me about the Behaviour Driven Development discussion is the effect it has on how you organize your specs. I've been advocating allowing duplication in specifications in the name of clarity for a while. I believe that clarity trumps DRY-ness in specs, but that doesn't mean I don't want them DRY. Think of it in agile manifesto format: "We prefer clear specs over well factored specs", which is to say that while there is value in DRY specs, there is more value in clarity.

The reason that clarity is important is that specs serve as documentation for future development. If you want to understand how to use a class, look at the tests. Right?

Thinking of the specs as specs (i.e. documentation) has led me to organize the specs for a Stack like this:

# Disclaimer - this is an example, not a recommendation
An empty stack
- should be empty
- should not be full
- should add pushed item to the top of the stack
- should complain on peek
- should complain on pop

An almost empty stack (with one element)
- should not be empty
- should not be full
- should add pushed item to the top of the stack
- should return the top element on peek
- should not remove the top element on peek
- should return the top element on pop
- should remove the top element on pop

An almost full stack (with one element less than capacity)
- should not be empty
- should not be full
- should add pushed item to the top of the stack
- should return the top element on peek
- should not remove the top element on peek
- should return the top element on pop
- should remove the top element on pop

A full stack
- should be full
- should not be empty
- should complain on push
- should return the top element on peek
- should not remove the top element on peek
- should return the top element on pop
- should remove the top element on pop

This particular grouping of contexts clearly expresses the boundaries, and how the Stack should behave on and around them. There are, however, a few problems.

Notice that a lot of the same specifications are applied to all of the non-empty stacks. This duplication bugs most people, including me, but in my crusade to challenge those who would gum up the specs in the name of DRY-ness, I've allowed for it in spite of my own inner reservation. Crusades are dangerous things.

Besides the duplication, we're not really addressing all of the interesting things about the boundaries. What happens when an almost full stack receives a push? It should then be full, right?
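A minimal sketch of the kind of bounded stack these contexts describe might look like the following. The class, its error names, and the capacity argument are assumptions made for illustration; the article doesn't show an implementation. The last few lines exercise the boundary the contexts above never touch: an almost full stack receiving a push.

```ruby
# Hypothetical bounded stack. The class, the error names, and the default
# capacity are assumptions for illustration only.
class Stack
  class EmptyError < StandardError; end
  class FullError < StandardError; end

  def initialize(capacity = 10)
    @capacity = capacity
    @items = []
  end

  def empty?
    @items.empty?
  end

  def full?
    @items.size == @capacity
  end

  # Adds an item to the top; complains if the stack is already full.
  # Returns self so pushes can be chained.
  def push(item)
    raise FullError, "stack is full" if full?
    @items.push(item)
    self
  end

  # Returns the top item without removing it; complains if the stack is empty.
  def peek
    raise EmptyError, "stack is empty" if empty?
    @items.last
  end

  # Returns the top item and removes it; complains if the stack is empty.
  def pop
    raise EmptyError, "stack is empty" if empty?
    @items.pop
  end
end

# An almost full stack (one item less than capacity) receives a push.
almost_full = Stack.new(3)
almost_full.push(:a).push(:b)
almost_full.push(:c)
puts almost_full.full?   # true
```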

Many of the reactions to this duplication have been requests for rspec to better support things like nested and/or reusable contexts and specs. The committers are generally opposed to this, because BDD and rspec exist, in part, as a reaction to problems that come up repeatedly in the implementation of TDD. Many of those problems are related to lack of clarity in tests, and much of that lack of clarity is due to overuse of indirection and reusable test components. This push to "enhance" rspec, and my part in resisting it, have combined to distract me from thinking about a better solution: not in what rspec supports, but in how to organize the specs.

Thinking about this a bit more, and considering equivalence groups and boundaries, I was playing around w/ the structure during a class and came up with this:

# Disclaimer - this is also an example, not a recommendation - but closer to a recommendation than the example above!
A stack (in general)
- should add to the top when sent 'push'
- should return the top item when sent 'peek'
- should NOT remove the top item when sent 'peek'
- should return the top item when sent 'pop'
- should remove the top item when sent 'pop'

An empty stack
- should be empty
- should no longer be empty after 'push'
- should complain when sent 'peek'
- should complain when sent 'pop'

An almost empty stack (with one item)
- should not be empty
- should remain not empty after 'peek'
- should become empty after 'pop'

An almost full stack (with one item less than capacity)
- should not be full
- should become full when sent 'push'

A full stack
- should be full
- should remain full after 'peek'
- should no longer be full after 'pop'
- should complain on 'push'

I think this is a much better example! The generic "A stack" context describes the normal case for how a stack behaves, while the other contexts deal specifically with how a stack at or near a boundary behaves differently from the normal case. This is a far clearer specification, and it eliminates most of the duplication. Win/Win.
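To show that this organization can be executed as written, here is a rough, self-contained sketch in plain Ruby. It is not rspec: the context helper, complains?, stack_with, the stand-in Stack class, and the capacity of 5 are all assumptions made so the example runs on its own. Each check runs against a fresh stack, and the "in general" context uses a stack that is neither empty nor full.

```ruby
# Stand-in stack so the sketch runs on its own (an assumption, not the article's code).
class Stack
  def initialize(capacity)
    @capacity, @items = capacity, []
  end
  def empty?; @items.empty?; end
  def full?;  @items.size == @capacity; end
  def push(item)
    raise "stack is full" if full?
    @items.push(item)
  end
  def peek
    raise "stack is empty" if empty?
    @items.last
  end
  def pop
    raise "stack is empty" if empty?
    @items.pop
  end
end

CAPACITY = 5
RESULTS = []

# Builds a stack holding the items 1..count, so the top item is count.
def stack_with(count)
  stack = Stack.new(CAPACITY)
  (1..count).each { |n| stack.push(n) }
  stack
end

# True when the block raises, i.e. the stack "complains".
def complains?
  yield
  false
rescue StandardError
  true
end

# Hypothetical mini-harness: runs every check against a fresh stack of the given size.
def context(name, size, checks)
  checks.each do |description, check|
    passed = begin
      check.call(stack_with(size)) ? true : false
    rescue StandardError
      false
    end
    RESULTS << [name, description, passed]
  end
end

context("A stack (in general)", 2, {
  "should add to the top when sent 'push'"         => lambda { |s| s.push(3); s.peek == 3 },
  "should return the top item when sent 'peek'"    => lambda { |s| s.peek == 2 },
  "should NOT remove the top item when sent 'peek'" => lambda { |s| s.peek; s.peek == 2 },
  "should return the top item when sent 'pop'"     => lambda { |s| s.pop == 2 },
  "should remove the top item when sent 'pop'"     => lambda { |s| s.pop; s.peek == 1 }
})

context("An empty stack", 0, {
  "should be empty"                        => lambda { |s| s.empty? },
  "should no longer be empty after 'push'" => lambda { |s| s.push(1); !s.empty? },
  "should complain when sent 'peek'"       => lambda { |s| complains? { s.peek } },
  "should complain when sent 'pop'"        => lambda { |s| complains? { s.pop } }
})

context("An almost empty stack (with one item)", 1, {
  "should not be empty"                  => lambda { |s| !s.empty? },
  "should remain not empty after 'peek'" => lambda { |s| s.peek; !s.empty? },
  "should become empty after 'pop'"      => lambda { |s| s.pop; s.empty? }
})

context("An almost full stack (with one item less than capacity)", CAPACITY - 1, {
  "should not be full"                  => lambda { |s| !s.full? },
  "should become full when sent 'push'" => lambda { |s| s.push(:last); s.full? }
})

context("A full stack", CAPACITY, {
  "should be full"                      => lambda { |s| s.full? },
  "should remain full after 'peek'"     => lambda { |s| s.peek; s.full? },
  "should no longer be full after 'pop'" => lambda { |s| s.pop; !s.full? },
  "should complain on 'push'"           => lambda { |s| complains? { s.push(:extra) } }
})

# Print the results in the same outline form used above.
RESULTS.group_by(&:first).each do |name, rows|
  puts name
  rows.each { |_, description, passed| puts "- #{description}#{passed ? '' : ' (FAILED)'}" }
  puts
end
```

Note that nothing here runs the "in general" checks against the boundary stacks; the sketch only mirrors the outline as it stands.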

 Tue, 1 Aug 2006 09:39:32, Ged Byrne, Have you considered DbC?
For me this example shows the effectiveness of Design by Contract. Perhaps rspec could benefit from implementing its ideas?

With DbC, as implemented in Eiffel, you can specify the stack with all the advantages of multiple inheritance and executability.

Consider the definition of stack in Eiffel, as found here:

http://www.gobosoft.com/eiffel/gobo/structure/flatshort/ds_stack.html

In Eiffel the word 'put' is used instead of 'push'. The spec for the put feature is:

put (v: G)
        Push v on stack.
        (From DS_DISPENSER.)
    require
        extendible: extendible (1)
    deferred
    ensure
        one_more: count = old count + 1
        pushed: item = v

Notice that it is an inherited feature. The same spec is shared with the FIFO queue as well, so you don't repeat yourself there either: http://www.gobosoft.com/eiffel/gobo/structure/flatshort/ds_queue.html

Stack adds one additional post condition: pushed: item = v

The item feature is the Eiffel equivalent to 'peek', so this line has the same meaning as 'should add pushed item to the top of the stack'.

It looks to me like Eiffel could provide a rich seam of ideas that could be mined to improve rspec.
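For readers who don't know Eiffel, the contract above could be approximated in plain Ruby along these lines. This is only an illustrative sketch: ContractedStack, ContractViolation, and extendible? are made-up names, the capacity argument is an assumption, and none of it is part of rspec or the Gobo library. The precondition is checked before the push, and the two postconditions (one_more and pushed) are checked after it.

```ruby
# Hypothetical Ruby approximation of the Eiffel contract for 'put'.
# All names here are assumptions made for illustration.
class ContractViolation < StandardError; end

class ContractedStack
  def initialize(capacity)
    @capacity = capacity
    @items = []
  end

  def count
    @items.size
  end

  # Plays the role of Eiffel's 'item' (i.e. 'peek').
  def item
    @items.last
  end

  # Loosely mirrors 'extendible (1)': is there room for n more items?
  def extendible?(n)
    count + n <= @capacity
  end

  def put(v)
    # require
    raise ContractViolation, "extendible" unless extendible?(1)

    old_count = count
    @items.push(v)

    # ensure
    raise ContractViolation, "one_more" unless count == old_count + 1
    raise ContractViolation, "pushed" unless item == v
    nil
  end
end

stack = ContractedStack.new(1)
stack.put(:a)
begin
  stack.put(:b)   # violates the 'extendible' precondition
rescue ContractViolation => e
  puts "contract violated: #{e.message}"
end
```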

 Mon, 28 Aug 2006 09:25:29, Luke Redpath,
David, it's funny, but I was thinking about this the other day. I've previously used the generic context "A foo" for writing specs for class methods, because obviously there is only one context for a class method, with no state to worry about. But I also like the idea of a generic context holding the specs that apply to "all foos", no matter what the context.
 Mon, 28 Aug 2006 10:59:10, Jay Levitt, First glance
I'm not sure what the above spec means.

In English, it seems to mean that an empty stack, being a stack in general, should add to the top when sent 'push'. However, I don't think you're proposing a change to the rspec DSL, so that wouldn't actually get executed, because in the DSL, the two specs are separate.

If an empty stack does execute the "A stack (in general)" context, then "should remove the top item when sent 'pop'" would give an error; it takes a human to see that "should complain when sent 'pop'" overrides it.

 Mon, 28 Aug 2006 11:16:57, David Chelimsky, re: First Glance
I see your point Jay. Would it make a difference if we called the first context "A non-empty, non-full stack"?

The idea I'm going for here is that this first context is the normal path. When I describe a stack to someone, I don't start with "if it's empty, it does xyz". I start with "A stack is a LIFO collection that responds to the messages 'push', 'pop' and 'peek'. 'push' adds an item to the top of the stack. 'peek' returns the top item without removing it from the stack. 'pop' returns the top item AND removes it from the stack. Now if the stack is empty..."

So the goal is to describe the general behaviour first, and then cover the "interesting" cases. The fact that we don't specify that an empty stack should behave like any other non-full stack in response to 'push' doesn't bother me here.

Thoughts?
 Mon, 28 Aug 2006 16:13:23, Jay Levitt,
I think "A stack (in general)" is fine, as long as I know what I'm, well, expecting. Otherwise, you'll end up listing the inverse of each following specification to define what "in general" means, and that's not DRY.

But it doesn't feel right to skip over those "in general" cases for the specific contexts, since each of them could be edge cases. What if pushing onto a 1-item stack adds to the bottom?

But, then, that's exactly why I like the idea of nested contexts and other reusable constructs. I can see why you wouldn't, but I don't think this type of reorganization is an argument for not needing them - if anything, it seems to show why they're needed.
 Mon, 28 Aug 2006 19:10:38, Uncle Bob, should_conform_to stack
So I would make the following changes:

A stack (in general)
- should add to the top when sent 'push'
- should return the top item when sent 'peek'
- should NOT remove the top item when sent 'peek'
- should return the top item when sent 'pop'
- should remove the top item when sent 'pop'

An empty stack
- should conform to stack
- should be empty
- should no longer be empty after 'push'
- should complain when sent 'peek'
- should complain when sent 'pop'

An almost empty stack (with one item)
- should conform to stack
- should not be empty
- should remain not empty after 'peek'
- should become empty after 'pop'

An almost full stack (with one item less than capacity)
- should conform to stack
- should not be full
- should become full when sent 'push'

A full stack
- should conform to stack
- should be full
- should remain full after 'peek'
- should no longer be full after 'pop'
- should complain on 'push'
 Mon, 28 Aug 2006 21:34:44, David Chelimsky, re: should_conform_to_stack
Someone on the rspec developers list had a similar suggestion, where you could create groups of more general specs that could be applied to specific contexts. If I recall, it would look something like this:

spec_group :non_empty_stack do
  specify "should return top item when sent 'peek'" do
    ...
  end
  ...
end

spec_group :non_full_stack do
  specify "should add item to the top when sent 'push'" do
    ...
  end
  ...
end

context "An empty stack" do
  is :non_full_stack
end

context "An almost empty stack" do
  is :non_empty_stack, :non_full_stack
end

context "An almost full stack" do
  is :non_empty_stack, :non_full_stack
end

context "A full stack" do
  is :non_empty_stack
end


Then the output would be something like:

An empty stack
- should add item to the top when sent 'push'

An almost empty stack
- should return top item when sent 'peek'
- should add item to the top when sent 'push'

An almost full stack
- should return top item when sent 'peek'
- should add item to the top when sent 'push'

A full stack
- should return top item when sent 'peek'


And any additional examples that only apply to a given context could be added directly to that context. Seems like a good idea, but there was (and remains) some concern that this would lead us towards unclear specs w/ lots of indirection - something that we'd like to make very difficult to do in rspec.
 Tue, 29 Aug 2006 09:00:47, Uncle Bob, Indirection and should_satisfy
If you are worried about indirection, you should worry about should_satisfy, since that simply boils down to a function call that can be used to support unlimited indirection. Controlling indirection is not a matter of eliminating it; it's a matter of constraining it into several useful forms. Rather like while, if, and else constrained goto into useful forms for structured programming, and like virtual functions constrained pointers to functions for C++, etc. You can't eliminate indirection, so you have to corral it by laying the fences around the acceptable forms. IMHO.
 Wed, 30 Aug 2006 05:23:12, David Chelimsky, re: Indirection and should_satisfy
You can also write helper methods that can do anything you wish. The indirection concern is more a concern of confusion due to multiple setups. The problem w/ the structure proposed above is what the code ends up looking like when you implement the examples. At this point, we've only got two cases - a non-empty stack and a non-full stack. There are values that need to be present in the spec_group, and they need to either be hard coded or supplied from the contexts that use them. Here's what it might look like if they are hard coded:

spec_group :non_empty_stack do
  specify "should return top item when sent 'peek'" do
    @stack.peek.should_equal 10
  end
end

spec_group :non_full_stack do
  specify "should add item to the top when sent 'push'" do
    @stack.push 11
    @stack.peek.should_equal 11
  end
end

context "An empty stack" do
  is :non_full_stack
  setup do
    @stack = Stack.new
  end
end

context "An almost empty stack" do
  is :non_empty_stack, :non_full_stack
  setup do
    @stack = Stack.new
    @stack.push 10
  end
end

context "An almost full stack" do
  is :non_empty_stack, :non_full_stack
  setup do
    @stack = Stack.new
    (2..10).each { |n| @stack.push n }
  end
end

context "A full stack" do
  is :non_empty_stack
  setup do
    @stack = Stack.new
    (1..10).each { |n| @stack.push n }
  end
end


Note how we "cleverly" make 10 the value at the top of all the non-empty stacks. This, in my view, is the problem. The values needed in the spec_group in order to express expectations put what I view as awkward constraints on the setups in the contexts. Keep in mind that this example is trivial. Imagine what the full stack spec might have to look like here.
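The other option mentioned above, supplying the values from the contexts, could look roughly like the plain-Ruby sketch below. Nothing here is rspec's API: SPEC_GROUPS, run_context, stack_of, the stand-in Stack, and the capacity of 3 are all hypothetical. Each context hands its groups a hash of the values they need, so the top item no longer has to be 10 everywhere.

```ruby
# Stand-in stack so the sketch runs on its own (an assumption for illustration).
class Stack
  def initialize(capacity)
    @capacity, @items = capacity, []
  end
  def full?; @items.size == @capacity; end
  def push(item)
    raise "stack is full" if full?
    @items.push(item)
  end
  def peek
    raise "stack is empty" if @items.empty?
    @items.last
  end
end

# Hypothetical shared groups. Each check receives the context's stack plus
# the values that context supplies, instead of a hard-coded 10 or 11.
SPEC_GROUPS = {
  :non_empty_stack => {
    "should return top item when sent 'peek'" =>
      lambda { |stack, values| stack.peek == values.fetch(:top) }
  },
  :non_full_stack => {
    "should add item to the top when sent 'push'" =>
      lambda { |stack, values|
        stack.push(values.fetch(:new_item))
        stack.peek == values.fetch(:new_item)
      }
  }
}

# Runs every check from the named groups against a fresh stack built by the
# setup block. Returns true when all of them pass.
def run_context(name, groups, values, &setup)
  puts name
  outcomes = groups.flat_map do |group|
    SPEC_GROUPS.fetch(group).map do |description, check|
      passed = check.call(setup.call, values)
      puts "- #{description}#{passed ? '' : ' (FAILED)'}"
      passed
    end
  end
  outcomes.all?
end

# Builds a stack of the given capacity holding the given items, last on top.
def stack_of(capacity, *items)
  stack = Stack.new(capacity)
  items.each { |item| stack.push(item) }
  stack
end

results = [
  run_context("An empty stack", [:non_full_stack],
              { :new_item => "x" }) { stack_of(3) },
  run_context("An almost empty stack", [:non_empty_stack, :non_full_stack],
              { :top => "a", :new_item => "b" }) { stack_of(3, "a") },
  run_context("An almost full stack", [:non_empty_stack, :non_full_stack],
              { :top => 42, :new_item => 43 }) { stack_of(3, 40, 42) },
  run_context("A full stack", [:non_empty_stack],
              { :top => :z }) { stack_of(3, :x, :y, :z) }
]
```

This removes the magic 10, but every context now has to know which keys each group expects, which is the same indirection in a different place.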
 Wed, 30 Aug 2006 10:34:48, Uncle Bob, It's a state machine.
You could think of :empty, :almost_empty, :non_empty, :almost_full, and :full as states in an FSM. An STD (state transition diagram) can define how you get from one state to the next:

[
  { :initial_state => :empty },
  { :current_state => :empty, :transition => lambda { |s| s.push(10) }, :new_state => :almost_empty },
  { :current_state => :almost_empty, :transition => lambda { |s| s.push(11) }, :new_state => :non_empty },
  ...
]

Given this, the magic numbers in your spec groups suddenly make sense, and you can remove all the setups from the contexts. Moreover, the runner can walk the state map and find every path to every state while invoking the specs (which are really just invariants) for each state.

What's interesting about this is that it gives us a way to describe the behavior and the data separately. The STD describes the behavior without checking any data, and the specifications and spec_groups specify the expected data at each stage of the behavior.
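One way to picture the runner walking the state map is the rough sketch below. It assumes a stand-in Stack with a capacity of 4, push and pop transitions in both directions, and a bounded walk (every path of up to a given number of transitions) in place of "every path", since the map has cycles. TRANSITIONS, INVARIANTS, and walk are hypothetical names, not an rspec feature.

```ruby
# Stand-in stack so the sketch runs on its own (an assumption for illustration).
class Stack
  def initialize(capacity)
    @capacity, @items = capacity, []
  end
  def empty?; @items.empty?; end
  def full?;  @items.size == @capacity; end
  def push(item)
    raise "stack is full" if full?
    @items.push(item)
  end
  def peek
    raise "stack is empty" if empty?
    @items.last
  end
  def pop
    raise "stack is empty" if empty?
    @items.pop
  end
end

CAPACITY = 4

# The state map: [current_state, transition, new_state].
TRANSITIONS = [
  [:empty,        lambda { |s| s.push(10) }, :almost_empty],
  [:almost_empty, lambda { |s| s.push(11) }, :non_empty],
  [:non_empty,    lambda { |s| s.push(12) }, :almost_full],
  [:almost_full,  lambda { |s| s.push(13) }, :full],
  [:full,         lambda { |s| s.pop },      :almost_full],
  [:almost_full,  lambda { |s| s.pop },      :non_empty],
  [:non_empty,    lambda { |s| s.pop },      :almost_empty],
  [:almost_empty, lambda { |s| s.pop },      :empty]
]

# The specs for each state, written as invariants over the stack.
INVARIANTS = {
  :empty        => lambda { |s| s.empty? && !s.full? },
  :almost_empty => lambda { |s| !s.empty? && !s.full? && s.peek == 10 },
  :non_empty    => lambda { |s| !s.empty? && !s.full? && s.peek == 11 },
  :almost_full  => lambda { |s| !s.empty? && !s.full? && s.peek == 12 },
  :full         => lambda { |s| !s.empty? && s.full? && s.peek == 13 }
}

# Replays every path of 0..max_steps transitions from :empty on a fresh
# stack and checks the invariant of the state each path ends in.
# Returns [states reached, paths whose final invariant failed].
def walk(max_steps)
  reached = []
  failures = []
  paths = [[]]
  (0..max_steps).each do
    longer_paths = []
    paths.each do |path|
      stack = Stack.new(CAPACITY)
      state = :empty
      path.each do |_from, action, new_state|
        action.call(stack)
        state = new_state
      end
      reached << state
      failures << path.map(&:last) unless INVARIANTS.fetch(state).call(stack)
      TRANSITIONS.each { |t| longer_paths << path + [t] if t.first == state }
    end
    paths = longer_paths
  end
  [reached.uniq, failures]
end

states, failures = walk(6)
puts "states reached: #{states.inspect}"
puts "invariant failures: #{failures.size}"
```

Because every prefix of a path is itself one of the paths walked, each intermediate state gets checked too, and the contexts no longer need their own setups.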
 Tue, 5 Sep 2006 11:09:56, David Chelimsky, re: It's a state machine
The state machine idea is certainly interesting. It does seem to work well in something with such a limited and linear set of states as an ordered collection. Do you think it would still work (i.e. stay clean and clear) with multiple properties that can be in multiple states that aren't necessarily so inherently linear?
 Thu, 5 Oct 2006 00:02:58, Mike H, Some Thoughts and Examples
I was thinking about this very issue today, in the context of thinking about the proper way to set up my specs.

In your very good discussion here (http://rubyforge.org/pipermail/rspec-devel/2006-June/000191.html), all of your nesting has to do with characteristics of the entity, in this case a person.

A person
- who is an American citizen
- and is 17 years old

In this case, the nested example is (arguably) better factored, but it's pretty clear that the un-nested example is clearer.

In the spec I was writing today, the nesting involved doing things to an entity, rather than characteristics of that entity. Let me try to convey what I mean through an example. Here's a non-nested version of part of the stack spec:

context "Empty Stack" do
  setup do
    @stack = Stack.new
  end
  specify "should be empty" do
    @stack.should_be_empty
  end
  specify "should add pushed item to the top of the stack" do
    @stack.push 1
    @stack.peek.should_equal 1
  end
end

The part I don't like is the 2nd spec. Anytime you're performing an action in a specify, it strikes me as a potential code smell.

Here's another non-nested version. I'm not claiming it's right or better.

context "Empty Stack" do
  setup do
    @stack = Stack.new
  end
  specify "should be empty" do
    @stack.should_be_empty
  end
end
context "Stack with element just pushed onto it" do
  setup do
    @stack = Stack.new
    @stack.push 1
  end
  specify "element is at top of stack" do
    @stack.pop.should_equal 1
  end
end


Really, the context for the 2nd test is "Stack with element just pushed onto it." In the first spec, you are taking the empty stack context, transforming into a new context with the push, and then specifying.

We've all seen xUnit test methods that look like this

def test_produce_widget
  @factory.setup_for_widget_production!
  @factory.widget_size = 14
  @factory.some_other_method_to_setup_for_test!
  @factory.some_other_method_to_setup_for_test!
  @factory.some_other_method_to_setup_for_test!
  @factory.some_other_method_to_setup_for_test!
  @factory.some_other_method_to_setup_for_test!
  widget = @factory.produce_widget
  assert_equal Widget, widget.class
  assert_equal :green, widget.color
  assert_equal 14, widget.size
end

All that setup in the test method is an obvious code smell; it belongs in some kind of setup.

To me, that seems different from the stack example in degree, but not in kind.

Here's the nested example of the stack spec

context "Empty Stack" do
  setup do
    @stack = Stack.new
  end
  specify "should be empty" do
    @stack.should_be_empty
  end
  context "Element pushed onto stack" do
    setup do
      @stack.push 1
    end
    specify "element should be at top of the stack" do
      @stack.peek.should_equal 1
    end
  end
end


For this example, that doesn't look better to me. For more complex examples, I'm not sure.

This comment isn't so much about nesting as it is about my trying to figure out the best way to think about my specs.

The way most people would write xUnit tests for a stack, they would make one class, set up an empty stack, and then in each test method modify the stack as appropriate before doing 1 (or probably many) assertions.

For me, the importance of BDD/rSpec isn't a different syntax, etc, but the way in which it makes me think less about exercising the methods in my code and more about specifying behavior. Obviously, that's the whole idea. Where I get confused as to the best way to construct a spec is in those cases where the two alternatives are to either
1. Perform an action in your specify before doing an assertion
2. Create a new context with the action in the setup

When I'm testing if the produced widget is green, the context shouldn't be "Widget Factory." It should be "Widget Factory set to produce green widgets," or to go even further, "Widget produced by green Widget Factory." This is the (to use an overloaded word) context in which I think about nesting.

Rambling comment complete. I'd love to hear your thoughts, even if they are just "I have no idea what you're talking about."

I just added the wiki markup {{{ and }}} to your post so that the formatting works. Thought you'd like to know how it was done. -- Tim Ottinger