Lee Burgess Wrote:
> So I spent some time Tuesday talking to Ben and Karl about the client
> test suite. Basically what is needed is something that is fully
> automated rather than partially automated.
>
> The test suite is not satisfactory in at least two ways:
>
> * The real, qualitative result of each client operation is not
> checked;
...
> The way I see it, I have two choices: Perl or Python. I am more
> fluent in Perl, but I like Python more.
Daniel Stenberg Wrote:
> Is portability an issue? I mean, are there any plans of ever
> bringing this to something like windows and is that then an issue
> when selecting language?
Greg Hudson <ghudson@MIT.EDU>:
> As a site integrator, I find it obnoxious when packages rely on perl
> for any part of the build or regression test procedure.
Hi all.
It looks like keeping this thread from degenerating into
an all-out language advocacy war is going to be very
interesting.
About 8 months ago I sat down and started writing
a regression test system that does much of what
you describe here. My focus was on testing Java
compilers, but the mechanics are basically the
same as what subversion needs. A high level
overview of the results of that work can be
found here:
http://www-106.ibm.com/developerworks/library/l-jacks/?dwzone=linux
Since the Jacks project was begun, a large number
of tests have been added. There are currently
about 1600 individual tests. As the number of test
cases increased, there were a number of
"scalability" issues that came up.
I found that solving these problems as they
came up was made significantly easier because
the test suite was implemented in Tcl
(vs. a compiled language like C, Java, ...).
Now, at this point the more reactionary elements
will be thinking, "Noooo! We can't use Tcl,
both RMS and ESR said it was bad!" I am
going to try to avoid emotional issues
and focus on the actual problems and how
solving them with a scripting language
like Tcl was the right solution for this
problem set.
First the "scalability issue". When you have
a small number of tests, the mechanics of
how the test runs, how you examine results,
and how new tests are integrated into the
suite are not that critical. When you
have to deal with a couple of hundred test
cases, the mechanics become really important.
Let me provide one quick example of a
"scalability" problem and how it
was solved in the Jacks regression test
suite.
A Jacks test involves sending some known
input to a compiler and then checking
for the return status of the compiler
and possibly the output (a .class file).
Early on, we were saving the test case
in a .java file and then writing a test
case that would compile that given
.java file and check the result.
(assume that One.java is on the filesystem)
test example-1 { compile One.java } {
    compile One.java
} PASS
Looks simple, right? What kind of
"scalability problem" could this test have?
Well, there are quite a few, but let's just
focus on the actual input to the test case
for right now.
The One.java file needs to exist on the
filesystem for this test to work. That
means you as the developer need to keep track
of One.java. Of course, you need to create
it, then you need to add it to CVS, possibly
modify a ChangeLog, and so on. Not too hard for
1 file, but it becomes a big deal when you want
to add 50 new tests. You also run into an ugly
"lookup" problem here. The mapping from test
"example-1" to source code "One.java" exists
only in the test case. When it comes time to review
test cases for correctness or diagnose a failure, you
end up with a bunch of files open in an editor.
Believe me, it is quite a pain and can be very
error prone.
Things get a lot easier if you combine the
test input and expected output. In this
example, the source code is saved and then
compiled by the test case:
test example-1 { compile example } {
    saveas One.java {
        class One {}
    }
    compile One.java
} PASS
That one change means you no longer have to deal
with another file that stores the test input.
The next step is to store more than one test case
in the same file. It seems simple, but it is
quite important that the system provide the
ability to do this. I cannot say this strongly
enough: a system that depends on a 1 test
case to 1 file mapping is doomed! Tests
need to be grouped by function. When regressions
show up during development (and they will),
simply knowing the general location of the
set of tests that are having problems can be
half the battle.
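To make that concrete, here is what a
hypothetical file holding a small group of
related cases could look like. The file name,
test names, and the FAIL result token are
only illustrative, following the conventions
of the examples above.

# tests/addition.test (hypothetical file name)
# Several related cases live in one file, so a regression
# in "addition" points straight at this group.

test addition-binary-1 { add two int literals } {
    saveas AddOne.java {
        class AddOne { int f() { return 1 + 2; } }
    }
    compile AddOne.java
} PASS

test addition-binary-2 { adding an int and an Object must fail } {
    saveas AddTwo.java {
        class AddTwo { int f() { return 1 + new Object(); } }
    }
    compile AddTwo.java
} FAIL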
Now let's talk about the really hard problem.
The regression test system needs to provide
consistent results and the results need to
be interpreted automatically. The expected
test results must not change from one run
to the next. This is critical since we
need the test system to examine results
and inform us when there is a problem.
Let's be honest here, nobody likes to
be blamed for adding a bug to the system.
Developers like to fix bugs not add them.
Most of the time, bugs are added accidentally.
Changes in one part of the code broke
something else in another part of the
code and the developer did not know
about it. This is one of the main
things the test system needs to
help us avoid. To do that, a developer
really needs to be able to press a button
and then wait for the system to tell him
if there were any regressions. The
system must not require anything of
the developer at this stage. If the
developer needs to examine test results
and compare them to previous runs,
we are just asking for trouble. Yes,
someone could do all this by hand,
but they could also just bust out an
abacus and avoid the middleman!
Here is a quick example of the kind
of test results and logging provided
by the Jacks test suite. This snip
is from the logging/changes file
that is automatically generated
after a full test run.
2001-02-17 {Passed {1448 1450} Failed {160 158}} {
    15.18.1-2 {FAILED PASSED}
    15.18.1-3 {FAILED PASSED}
    15.18.1-7 {FAILED PASSED}
    15.28-null-4 {PASSED FAILED}
}
This shows that on 2001-02-17, bugs
that caused 3 test cases to fail were
fixed. In the process, an unrelated
test case regressed.
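In case it is not obvious how an entry like
that can be produced, the basic idea is to
save each run's per-test results and diff the
two tables. Here is a rough sketch of that
idea; the proc and array names are
illustrative, not the actual Jacks internals.

# Compare a previous run's results to the current run's and
# print one entry in the same layout as the changes file above.
# The prev and curr arrays map a test name to PASSED or FAILED.

proc log_changes {prevName currName date} {
    upvar 1 $prevName prev $currName curr

    set passedPrev 0; set passedCurr 0
    set failedPrev 0; set failedCurr 0
    set delta {}

    foreach name [lsort [array names curr]] {
        set after $curr($name)
        if {$after eq "PASSED"} {incr passedCurr} else {incr failedCurr}
        if {![info exists prev($name)]} {
            continue ;# brand new test, nothing to diff against
        }
        set before $prev($name)
        if {$before eq "PASSED"} {incr passedPrev} else {incr failedPrev}
        if {$before ne $after} {
            append delta "    $name {$before $after}\n"
        }
    }

    puts "$date {Passed {$passedPrev $passedCurr} Failed {$failedPrev $failedCurr}} {"
    puts -nonewline $delta
    puts "}"
}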
When the system provides you concrete
data like this, it actually becomes
hard to break something without
noticing. As the number of test cases
increases, the chances of accidentally
breaking something also decrease.
This also greatly simplifies porting
to a new system. Obviously, there
are some details I am glossing
over here. I have not really talked
about the ability to restart tests
that have crashed or about restarting
the test suite itself after a crash.
I have also avoided the issue of
in-process testing vs. an exec
of a subprocess (this is a big
deal on Mac OS, since exec is not
really supported on Mac OS Classic).
Before I sign off on this brain
dump, I just want to point out
a couple of other really nice
Jacks features that make actually
using the regression testing system
easy.
Earlier, I presented an example of
a test case that suffered a regression:
15.28-null-4
Looking up this test case is very
easy in Jacks since the suite
automatically generates test case
documentation from the test
cases themselves. Try it out
for yourself, go to:
http://oss.software.ibm.com/developerworks/opensource/cvs/jikes/~checkout~/jacks/docs/tests.html
The test case is located in section
15.28, scroll down to the link for
section 15.28 and click on it. You
can now scroll down the list and
click on the link for 15.28-null-4.
Neat eh?
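If you are curious how that works, the trick
is that the test files double as the source
for the documentation. Here is a rough sketch
of the idea (not the actual Jacks doc
generator; the file glob and HTML layout are
just for illustration):

# Generate a crude HTML index from the test files themselves.
# Redefining "test" means that sourcing a test file emits one
# entry per test instead of running anything. Illustration only.

proc test {name description body expected} {
    puts "<li><a name=\"$name\">$name</a> $description</li>"
}

puts "<ul>"
foreach file [lsort [glob tests/*.test]] {
    source $file
}
puts "</ul>"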
I would like to implement this same
testing framework for subversion.
In fact, I have already started
working on it. At this point, I
have only written tests for the
svn client front end. I had
hoped to get it 95% working
before letting people try it out,
but since folks are talking about
the issue now it seemed like a
good time to mention it.
Here are a couple of quick examples
from the tests I have written for
the svn client. The check_err command
just execs svn with the given args
and returns a pair (list) containing
the exit status and the output of
the svn command.
set help_add {add (ad, new): Add new files and directories to version
control.
usage: add [TARGETS]
}

test client-help-add-1 { print help text on help add } {
    check_err {svn help add}
} [list 0 $help_add]

test client-help-add-2 { print help text on add without args } {
    check_err {svn add}
} [list 1 $help_add]
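For reference, check_err itself is just a
thin wrapper around exec, roughly along these
lines (a sketch of the idea, not necessarily
the exact code I am using):

# Run the given command line and return a two element list:
# the exit status and whatever the command printed.

proc check_err {cmd} {
    set status 0
    if {[catch {eval exec $cmd} output]} {
        # exec raises an error on a non-zero exit status; dig the
        # real status out of errorCode when it is available.
        global errorCode
        if {[lindex $errorCode 0] eq "CHILDSTATUS"} {
            set status [lindex $errorCode 2]
        } else {
            set status 1
        }
    }
    return [list $status $output]
}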
I don't see any major problems incorporating
this test infrastructure into subversion.
Initially, we would want to simply exec
the existing test cases written in C
and check the expected results. Once
the initial framework is done, I
would like to work on a Tcl API
to the subversion libraries themselves.
That would make it easy to convert the
C test cases over to scripts (test
cases that do not need to be compiled
are a real benefit). It would also
be easy to make use of the Tcl API
to Berkeley DB to write tests
that examine the database directly
(no dbdump needed).
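As a first cut, wrapping one of the existing
compiled tests would not need to be any
fancier than something like this (the program
name is just a placeholder):

# Exec one of the existing C test programs and check that it
# exits cleanly. "./fs-test" is only a placeholder name.

test libsvn-exec-1 { existing C test program exits cleanly } {
    lindex [check_err {./fs-test}] 0
} 0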
What do you folks think? Does that
sound good? I know what needs to
be done and I am willing to do
the work. It seems like this
"language advocacy" thing is
the only thing that folks
might object to. Thing is,
I have already written and
debugged the tools needed
to implement this so I
am not really too interested
in rewriting them in another
language to make some
language advocates happy.
Don't take that the wrong
way, I am just lazy :)
Mo DeJong
Red Hat Inc