ArticleS. DeanWampler.

Extracting Component Usage Documentation from Tests


Consider this section of the "javadocs" for a BankAccount class.

 BankAccount Javadoc
Constructor Detail
public BankAccount(int id)

Method Detail
public void deposit(double amount)
public double getBalance()
public void withdraw(double amount)
public void setBalance(double amount)

The nut job who wrote this code (that would be me...) didn't put in any comments. Is that good or bad? Well, if you want to use this class, the signatures alone don't tell you whether the initial balance is zero, whether overdrafts are allowed, and so on.

Here's the corresponding JUnit test:
 BankAccountTest

package demo;

import junit.framework.TestCase;

public class BankAccountTest extends TestCase {
    private BankAccount ba;

    protected void setUp() throws Exception {
        super.setUp();
        // The javadoc lists only BankAccount(int id); the id value 1 is arbitrary.
        ba = new BankAccount(1);
    }

    // A new account starts with a zero balance.
    public void testCreate() {
        assertEquals(0.0, ba.getBalance());
    }

    public void testOneDeposit() {
        ba.deposit(10.0);
        assertEquals(10.0, ba.getBalance());
    }

    // Deposits accumulate.
    public void testTwoDeposits() {
        ba.deposit(10.0);
        ba.deposit(5.0);
        assertEquals(15.0, ba.getBalance());
    }

    // Withdrawing from an empty account is allowed and leaves a negative balance.
    public void testWithdraw() {
        ba.withdraw(10.0);
        assertEquals(-10.0, ba.getBalance());
    }
}

The tests are an executable specification of the software, and the answers to those questions are here: testCreate shows that the initial balance is zero, and testWithdraw shows that an overdraft is allowed. So why don't we mine the tests when generating our documentation, e.g., when using javadoc for Java code and rdoc for Ruby code? Sure, you could read the tests (assuming you have them for that third-party library), but we all seem to prefer the convenience of formatted documentation.

There are a few tools that translate test method names into readable strings. TestDox does this for JUnit tests, while rbtestdox and rtestdox are ports for Ruby. Rbtestdox generates HTML output while the other two generate plain text.
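To make the idea concrete, here is a minimal sketch of the kind of name translation these tools perform. It is not TestDox's actual code or output format; the class and method names (TestNameFormatter, toSentence, toTitle) are invented for illustration, and the test names are hard-coded rather than discovered by reflection.

```java
import java.util.Arrays;
import java.util.List;

public class TestNameFormatter {

    /** Turns a JUnit 3 style name such as "testTwoDeposits" into "two deposits". */
    public static String toSentence(String methodName) {
        // Drop the conventional "test" prefix if it is present.
        String name = methodName.startsWith("test") ? methodName.substring(4) : methodName;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Each upper-case letter starts a new word (acronyms would be split too).
            if (Character.isUpperCase(c) && out.length() > 0) {
                out.append(' ');
            }
            out.append(Character.toLowerCase(c));
        }
        return out.toString();
    }

    /** Turns "BankAccountTest" into "BankAccount". */
    public static String toTitle(String testClassName) {
        return testClassName.endsWith("Test")
                ? testClassName.substring(0, testClassName.length() - 4)
                : testClassName;
    }

    public static void main(String[] args) {
        // Names taken from the BankAccountTest example above.
        List<String> tests = Arrays.asList(
                "testCreate", "testOneDeposit", "testTwoDeposits", "testWithdraw");
        System.out.println(toTitle("BankAccountTest"));
        for (String test : tests) {
            System.out.println("  - " + toSentence(test));
        }
    }
}
```

Run against the example, this prints "create", "one deposit", "two deposits" and "withdraw" under the heading "BankAccount". That is readable, but it says nothing about overdrafts. A hypothetical name such as testWithdrawalMayOverdrawTheAccount would come out as "withdrawal may overdraw the account", which is why the quality of the output depends entirely on the quality of the names.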

RSpec offers the same capability, but with two advantages: (i) more output format options and (ii) potentially more meaningful information. Since the "tests" are written as specifications, the resulting text can, in principle, be more useful to a reader who is trying to understand how a component works. Dave Astels' sSpec version for Smalltalk will probably also support these options at some point (no pressure, Dave!).

All of this is interesting because typical javadoc/rdoc-style documentation usually lacks information on proper usage of the component, such as the contract of each method call (allowed inputs and guaranteed results) and any requirements on the order in which methods are called. This information will be in a complete and comprehensive test suite! Some developers may embed this information in the comments. However, since comments aren't testable, they are notoriously unreliable: developers often don't keep them current with the changing code, if the comments exist at all. The testdox approach is a little better, since its output is derived from executable code, but only if the developer uses meaningful method names. Again, there is no enforcement mechanism.

So, the "xspec" approach shows the most promise for giving us useful, automatically generated documentation that reflects the real behavior of the code.

Whatever you do, make your test and spec method names descriptive and make sure the tests and specs demonstrate all the usage issues for your code. Also, let's work together to improve our tools for extracting this documentation from our tests. Can we extract and repackage the information inside the test methods?
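As a crude starting point for that last question, the sketch below scans test source text for calls made on the object under test and builds an index from each production method to the tests that exercise it. Everything here is an assumption for illustration: the TestCallIndex class, the regular expressions, and the premise that the fixture lives in a single variable (ba in the example above). A real tool would need a proper parser, since this line-by-line scan is easily fooled by comments, strings, and helper methods.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestCallIndex {

    // Matches the header of a void method and captures its name.
    private static final Pattern METHOD_HEADER =
            Pattern.compile("(?:public|protected|private)\\s+void\\s+(\\w+)\\s*\\(");

    /**
     * Maps each method called on the fixture variable to the names of the
     * test methods that call it. Calls outside test methods are ignored.
     */
    public static Map<String, Set<String>> index(String source, String fixture) {
        Pattern call = Pattern.compile("\\b" + Pattern.quote(fixture) + "\\.(\\w+)\\s*\\(");
        Map<String, Set<String>> index = new TreeMap<>();
        String current = null; // test method being scanned, or null outside one
        for (String line : source.split("\n")) {
            Matcher header = METHOD_HEADER.matcher(line);
            if (header.find()) {
                String name = header.group(1);
                current = name.startsWith("test") ? name : null;
                continue;
            }
            if (current == null) {
                continue;
            }
            Matcher m = call.matcher(line);
            while (m.find()) {
                index.computeIfAbsent(m.group(1), k -> new LinkedHashSet<>()).add(current);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // A trimmed copy of the BankAccountTest source shown earlier.
        String source = String.join("\n",
                "protected void setUp() throws Exception {",
                "    ba = new BankAccount(1);",
                "}",
                "public void testCreate() {",
                "    assertEquals(0.0, ba.getBalance());",
                "}",
                "public void testTwoDeposits() {",
                "    ba.deposit(10.0);",
                "    ba.deposit(5.0);",
                "    assertEquals(15.0, ba.getBalance());",
                "}",
                "public void testWithdraw() {",
                "    ba.withdraw(10.0);",
                "    assertEquals(-10.0, ba.getBalance());",
                "}");
        index(source, "ba").forEach(
                (method, tests) -> System.out.println(method + " <- " + tests));
    }
}
```

Even this naive index is informative. For the example it lists tests under deposit, getBalance and withdraw, and the absence of any entry for setBalance shows that no test documents that method.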

 Sat, 5 Aug 2006 15:14:05, Dave Astels, sSpec doc output
"Dave Astels' sSpec version for Smalltalk will probably also support these options at some point (no pressure, Dave!)."

It does already. Check the documentation at www.behaviourdriven.org/sspec

Dave
 Sun, 6 Aug 2006 11:24:49, ,
Protest (http://xspec.sf.net/protest.html) does this for Python.
 Sun, 6 Aug 2006 12:24:32, John Roth, Good idea
I think this is a good idea. I've got a little program I use for PyFIT to create a list of test method names, organized by test module and class. It's not part of the distribution because I haven't found it all that useful ... yet.

One issue you raise here is quite interesting: how do you relate the test method names to a specific method? I don't organize my tests by method, and trying to pull the methods out by inspecting the source code seems like it would be a lot of work to create a hodge-podge.

This bears some thinking about. Possibly a Related: item somewhere so I know where to index the test?

John Roth
 Sun, 6 Aug 2006 12:39:32, John Roth, Can't find protest...
The path for "protest" seems to get a 404 however I try it.
 Sun, 6 Aug 2006 18:09:49, Dean Wampler, sSpec doc output
Dave, thanks for the link! I'm glad sspec is already there.
 Sun, 6 Aug 2006 18:12:57, Dean Wampler, RE: Good idea
I agree that the first goal would be to associate whatever information is easy to extract with the appropriate method, etc. In Java, you could use yet-another annotation, which risks the usual "annotation pollution", though.
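A rough sketch of what the annotation route might look like follows. The @Documents annotation, the AnnotationIndex class, and the stand-in SampleTest are all hypothetical; nothing like them exists in JUnit. The test bodies are left empty because only the metadata matters for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class AnnotationIndex {

    /** Hypothetical marker naming the production methods a test documents. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Documents {
        String[] value();
    }

    /** Stand-in for a real test class; bodies omitted on purpose. */
    public static class SampleTest {
        @Documents("getBalance")
        public void testCreate() {}

        @Documents({"deposit", "getBalance"})
        public void testTwoDeposits() {}

        @Documents("withdraw")
        public void testWithdraw() {}
    }

    /** Maps each documented method name to the sorted names of its tests. */
    public static Map<String, Set<String>> index(Class<?> testClass) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Method m : testClass.getDeclaredMethods()) {
            Documents d = m.getAnnotation(Documents.class);
            if (d == null) {
                continue; // unannotated methods contribute nothing
            }
            for (String target : d.value()) {
                index.computeIfAbsent(target, k -> new TreeSet<>()).add(m.getName());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        index(SampleTest.class).forEach(
                (method, tests) -> System.out.println(method + " <- " + tests));
    }
}
```

The cost is visible even in this small sample: every test carries a hand-maintained list of method names that nothing verifies against the test body, so it can drift out of date just as comments do.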

I think the format of the "xSpec" specifications could easily support this kind of detail extraction just by using suitable conventions. For example, the rspec example at http://rspec.rubyforge.org/examples.html has a lot of "@stack.pop", "@stack.push", etc. What if the documentation output just found all calls to "@stack.pop" and somehow associated them with the rdoc for Stack#pop? It would be a start. Of course, specs that contain calls to several different methods would get associated with each one of those methods, which is still okay, it seems to me.
 Sun, 6 Aug 2006 20:45:59, David Chelimsky, generating docs from the example code
We've talked about generating docs from the example code, but we've felt that we wouldn't be able to generate consistently useful docs. It is a goal in the back of our minds, though, as it would reduce the duplication between names and code.
 Mon, 7 Aug 2006 09:13:36, John Roth, Crossreference
Associating tests with methods is a bit of a hassle. However, there might be a real simple way of doing it: just index the method at the beginning of a line that actually contains a test. That way other methods that occur in the specify can be ignored.

It remains to be seen whether this is too simple.

John Roth