
SVN test suite requirements (was CVS update ...)

From: Mo DeJong <mdejong_at_cygnus.com>
Date: 2001-03-28 07:38:07 CEST

I went ahead and changed the subject for this
thread since it was getting overloaded.

Also note that I have attached an updated
goals.html file to this email.

On Tue, 27 Mar 2001, Greg Stein wrote:

> > Can folks take a look at these requirements and suggest
> > any revisions or clarifications you think might be needed?
>
> I like it, but for a couple nits:
 
> *) automatic recovery is kind of a "3rd generation" type of thing for a test
> harness. that can get difficult. not saying it shouldn't be there, but
> our expectations should be set properly to "probably not for a while"

Funny, I was thinking this was needed sooner rather than later.
I updated the Automatic Recovery: section to try to explain
why this is needed a bit better. Does it help?

> *) the "Platform Specific Results" section is very unclear, and I think not
> entirely correct/appropriate. Specifically, I would think that we would
> NOT have platform-specific results -- that all platforms should end up
> the same. if there *are* differences, then CVS should have separate
> copies of each platform's expected output (where that differs from the
> "portable, expected output")

I think my description was less than clear about which "results"
I was talking about (the raw test output, as opposed to the
PASSED/FAILED verdict one gets by comparing a test's return value
to the expected result). I have added a much more detailed example
to the "Platform Specific Results" section in hopes of addressing this.

> *) you imply that a set of information is saved from "the last run". I'm not
> sure that I buy into saving info about the last run. Essentially, I think
> each test suite run should be stateless. But if we *do* have a need for
> retaining state, then the doc might want to discuss where that is
> properly kept.

Well, yes. Each run should be stateless. I am only talking
about comparing the final results from one run to the next.
Instead of just implying, I added an "Automatic Logging of Results:"
section to try to deal with this. Does that clear things up?
I also added a section about why test names need to be unique
and how developers would interface with the svntest driver.

> *) what's with the 40 column formatting? do you use really huge fonts or
> something, and so you have fewer columns on your screen? I'm finding it
> is actually harder to read the text when there is a newline every five
> words or so. (in the past, I've actually reformatted your emails before
> reading them)

I am just strange :)
 
> I'd like to see your next draft checked into source control. I think it is
> very well done, overall.

I have my own CVS repo for the svntest code I am working on right
now. I guess I would rather just merge the whole directory in
later, if possible. If folks like my test harness, I would
like to get it added in subdirectories of a new
subversion/subversion/svntest directory. Of course, that
is still a number of weeks off (regular job, you know).
My hope is to have something I can demo at the SVLUG meeting.

Mo

<html>
<head>
<title>SVN Test</title>
</head>

<body bgcolor="white">

<h1>Design goals for the SVN test suite</h1>

<ul>
<li>
<A HREF="#WHY">Why Test?</A>
</li>
<li>
<A HREF="#AUDIENCE">Audience</A>
</li>
<li>
<A HREF="#REQUIREMENTS">Requirements</A>
</li>
<li>
<A HREF="#EASEOFUSE">Ease of Use</A>
</li>
<li>
<A HREF="#LOCATION">Location</A>
</li>
<li>
<A HREF="#EXTERNAL">External dependencies</A>
</li>
</ul>



<A NAME="WHY"><H3>Why Test?</H3></A>

<p>
Regression testing is an essential
element of high quality software.
Unfortunately, some developers
have not had first hand exposure
to a high quality testing framework.
Lack of familiarity with the positive
effects of testing can be blamed
for statements like:
<br>
<blockquote>
"I don't need to test my code,
I know it works."
</blockquote>
It is safe to say that the
idea that developers do not
introduce bugs
has been disproved.
</p>


<A NAME="AUDIENCE"><H3>Audience</H3></A>

The test suite will be used by
both developers and end users.

<p>
<b>Developers</b> need a test suite to help with:
</p>

<p>
<b><i>Fixing Bugs:</i></b><br>
Each time a bug is fixed, a test case should be
added to the test suite. Creating a test case
that reproduces a bug is a seemingly obvious
requirement. If a bug cannot be reproduced,
there is no way to be sure a given change
will actually fix the problem. Once a
test case has been created, it can be used
to validate the correctness of a given patch.
Adding a new test case for each bug also
ensures that the same bug will not be
introduced again in the future.
</p>

<p>
<b><i>Impact Analysis:</i></b><br>
A developer fixing a bug or adding
a new feature needs to know if
a given change breaks other parts
of the code. It may seem obvious,
but keeping a developer from
introducing new bugs is one
of the primary benefits of
using a regression test
system.
</p>

<p>
<b><i>Regression Analysis:</i></b><br>
When a test regression occurs,
a developer will need to manually
determine what has caused the failure.
The test system is not able
to determine why a test case
failed. The test system should
simply report exactly which test
results changed and when the
last results were generated.
</p>

<p>
<b>Users</b> need a test suite to help with:
</p>

<p>
<b><i>Building:</i></b><br>
Building software can be a scary process.
Users that have never built software
may be unwilling to try. Others may
have tried to build a piece of software
in the past, only to be thwarted by
a difficult build process. Even if
the build completed without an error,
how can a user be confident that the
generated executable actually works?
The only workable solution to this
problem is to provide an easily
accessible set of tests that the
user can run after building.
</p>

<p>
<b><i>Porting:</i></b><br>
Often, users become porters when
the need to run on a previously
unsupported system arises. This
porting process typically requires
some minor tweaking of include files.
It is absolutely critical that
testing be available when porting
since the primary developers
may not have any way to test
changes submitted by someone
doing a port.
</p>


<p>
<b><i>Testing:</i></b><br>
Different installations
of the exact same OS can
contain subtle differences
that cause software to
operate incorrectly.
Only testing on different
systems will expose problems
of this nature. A test suite
can help identify these sorts
of problems before a program
is actually put to use.
</p>




<A NAME="REQUIREMENTS"><H3>Requirements</H3></A>

Functional requirements of
an acceptable test suite include:

<p>
<b><i>Unique Test Identifiers:</i></b><br>
   Each test case must have a globally
   unique test identifier; this identifier
   is just a string. A globally unique
   string is required so that test cases
   can be individually identified by name,
   sorted, and even looked up on the web.
   It seems simple, perhaps even blatantly
   obvious, but some other test packages
   have failed to maintain uniqueness in
   test identifiers and developers have
   suffered because of it. It is even
   desirable for the system to actively
   enforce this uniqueness requirement.
</p>
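
<p>
A minimal sketch of how the harness might
enforce uniqueness when tests are registered,
assuming a Python-based driver (the
<code>TestRegistry</code> class and
<code>DuplicateTestId</code> error are
hypothetical names, not part of any
existing code):
</p>

<pre><code>
# Hypothetical sketch: enforce globally unique test identifiers
# at registration time.
class DuplicateTestId(Exception):
    pass

class TestRegistry:
    def __init__(self):
        self.tests = {}  # maps identifier string to test function

    def register(self, test_id, test_func):
        if test_id in self.tests:
            # Refuse to register a second test under the same name.
            raise DuplicateTestId("duplicate test id: " + test_id)
        self.tests[test_id] = test_func

registry = TestRegistry()
registry.register("client-1", lambda: 1)
registry.register("client-2", lambda: 1)
# registry.register("client-1", lambda: 0)  # would raise DuplicateTestId
</code></pre>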

<p>
<b><i>Exact Results:</i></b><br>
   A test case must have one expected
   result. If the result of running the
   tests does not exactly match the
   expected result, the test must fail.
</p>
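
<p>
Sketched in Python, the comparison itself
is nothing more than an exact equality
check against the single expected result
(the <code>evaluate</code> helper is a
hypothetical name):
</p>

<pre><code>
# Hypothetical sketch: a test passes only when its actual result
# exactly matches its single expected result.
def evaluate(actual, expected):
    return "PASSED" if actual == expected else "FAILED"

print(evaluate(0, 1))  # FAILED: 0 is not an exact match for 1
print(evaluate(1, 1))  # PASSED
</code></pre>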

<p>
<b><i>Reproducible Results:</i></b><br>
   Test results should be reproducible.
   If a test result matches the expected
   result, it should do so every time
   the test is run. External
   factors like time stamps must
   not affect the results of a test.
</p>

<p>
<b><i>Self-Contained Tests:</i></b><br>
   Each test should be self-contained.
   Results for one test should not
   depend on side effects of previous
   tests. This is obviously a good
   practice, since one is able to
   understand everything a test is
   doing without having to look
   at other tests. The test system
   should also support random access
   so that a single test or set of
   tests can be run. If a test is not
   self-contained, it cannot be run
   in isolation.
</p>
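
<p>
One way a Python-based driver might keep
tests self-contained is to give each test
its own scratch directory; this sketch is
an assumption about the design, not a
description of existing code:
</p>

<pre><code>
# Hypothetical sketch: give each test its own scratch directory so
# its results never depend on files left behind by earlier tests.
import shutil
import tempfile

def run_in_sandbox(test_id, test_func):
    sandbox = tempfile.mkdtemp(prefix=test_id + "-")
    try:
        return test_func(sandbox)   # the test touches only its own area
    finally:
        shutil.rmtree(sandbox)      # leave nothing behind for later tests

print(run_in_sandbox("client-1", lambda sandbox: "PASSED"))
</code></pre>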

<p>
<b><i>Selective Execution:</i></b><br>
   It may not be possible to run
   a given set of tests on certain
   systems. The suite must provide
   a means of selectively running
   test cases based on the
   environment. The test system
   must also provide a way to
   selectively run a given
   test case or set of test
   cases on a per invocation
   basis. It would be incredibly
   tedious to run the entire
   suite to see the results
   for a single test.
</p>
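
<p>
The following Python sketch shows what
environment-based selection might look
like; the <code>needs_exec</code> flag
and the platform check are illustrative
assumptions:
</p>

<pre><code>
# Hypothetical sketch: skip tests whose requirements the current
# environment cannot satisfy, e.g. tests that must exec a child
# process on a platform without that ability.
import sys

def can_exec_child_processes():
    # Assumption for illustration: treat classic Mac OS as unable
    # to exec external programs.
    return not sys.platform.startswith("mac")

def select_runnable(tests):
    runnable = []
    for test in tests:
        if test["needs_exec"] and not can_exec_child_processes():
            continue    # not runnable in this environment
        runnable.append(test)
    return runnable

tests = [
    {"id": "client-1", "needs_exec": True},
    {"id": "lib-fs-1", "needs_exec": False},
]
print([t["id"] for t in select_runnable(tests)])
</code></pre>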

<p>
<b><i>No Monitoring:</i></b><br>
   The tests must run from start to
   end without operator intervention.
   Test results must be generated
   automatically. It is critical
   that an operator not need to
   manually compare test results
   to figure out which tests failed
   and which ones passed.
</p>


<p>
<b><i>Automatic Logging of Results:</i></b><br>
   The system must store test
   results so that they can be
   compared later. This applies
   to machine readable results
   as well as human readable
   results. For example, assume
   we have a test named
   <code>client-1</code> that expects
   a result of 1 but instead returns
   0. We should expect the system to
   store two distinct pieces of
   information. First,
   that the test failed. Second,
   how the test failed, meaning
   how the expected result
   differed from the actual result.
</p>

<p>
   The following example shows
   the kind of results we might
   record in a results log file.
</p>

<p>
   <pre><code>
   client-1 FAILED
   client-2 PASSED
   client-3 PASSED
   </code></pre>
</p>
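
<p>
A rough Python sketch of how the driver
might store this summary and report any
result that changed since the previous
run; the <code>results.log</code> file
name and format are assumptions, not a
fixed design:
</p>

<pre><code>
# Hypothetical sketch: store one "id STATUS" line per test and
# report any status that changed since the previous run.
import os

def read_results(path):
    results = {}
    if os.path.exists(path):
        for line in open(path):
            test_id, status = line.split()
            results[test_id] = status
    return results

def write_results(path, results):
    f = open(path, "w")
    for test_id in sorted(results):
        f.write("%s %s\n" % (test_id, results[test_id]))
    f.close()

def report_changes(old, new):
    for test_id in sorted(new):
        if test_id in old and old[test_id] != new[test_id]:
            print("%s: %s changed to %s" % (test_id, old[test_id], new[test_id]))

previous = read_results("results.log")
current = {"client-1": "FAILED", "client-2": "PASSED", "client-3": "PASSED"}
report_changes(previous, current)
write_results("results.log", current)
</code></pre>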

<p>
<b><i>Automatic Recovery:</i></b><br>
   The test system must be able to recover
   from crashes and unexpected delays.
   For example, a child process might
   go into an infinite loop and would
   need to be killed. The test shell
   itself might also crash or go into
   an infinite loop. In these cases,
   the test run must automatically
   recover and continue with the tests
   directly after the one that crashed.
</p>

<p>
   This is critical for a couple of
   reasons. Nasty crashes and infinite
   loops most often appear on users'
   (not developers') systems. Users
   are not well equipped to deal with
   these sorts of exceptional situations.
   It is unrealistic to expect that
   users will be able to manually
   recover from disaster and restart
   crashed test cases. It is an
   accomplishment just to get them
   to run the tests in the first
   place!
</p>

<p>
   Ensuring that the test system
   actually runs each and every
   test is critical, since a failing
   test near the end of the suite
   might never be noticed if a
   crash halfway through kept
   all the tests from being run.
   This process must be completely
   automated, no operator intervention
   should be required.
</p>
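
<p>
One plausible way to get this behavior,
sketched in Python, is to run each test
in its own child process with a timeout;
the <code>svntest.py</code> command line
shown is hypothetical:
</p>

<pre><code>
# Hypothetical sketch: run each test in its own child process with a
# timeout, so a crash or infinite loop in one test cannot stop the run.
import subprocess

def run_one(test_id, timeout_seconds=60):
    try:
        # Assumed command line: the driver re-invokes itself to run
        # exactly one named test in a child process.
        child = subprocess.run(["python", "svntest.py", test_id],
                               timeout=timeout_seconds)
        return "PASSED" if child.returncode == 0 else "FAILED"
    except subprocess.TimeoutExpired:
        # The child was killed; record the failure and keep going.
        return "FAILED (timed out)"

for test_id in ["client-1", "client-2", "client-3"]:
    print(test_id, run_one(test_id))
</code></pre>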


<p>
<b><i>Report Results Only:</i></b><br>
   When a regression is found, a developer
   will need to manually determine the reason
   for the regression.
   The system should tell the developer exactly what
   tests have failed, when the last set of
   results were generated, and what the previous
   results actually were.
   Any additional functionality is outside the
   scope of the test system.
</p>

<p>
<b><i>Platform Specific Results:</i></b><br>
   Each supported platform should
   have an associated set of
   test results. The naive
   approach would be to maintain
   a single set of results and
   compare the output for any platform
   to the known results. The problem
   with this approach is that it does
   not provide a way to keep track
   of how results differ from one
   platform to another. The following
   example attempts to clarify
   this.</p>

   <p>
   Assume you have the following
   test results generated on a
   reference platform before
   and after a set of changes
   were committed.
   </p>

<table BORDER=1 COLS=2>

<tr>
<td><b>Before</b> (Reference Platform)</td>

<td><b>After</b> (Reference Platform)</td>
</tr>

<tr>
<td><code>client-1 PASSED</code></td>
<td><code>client-1 PASSED</code></td>
</tr>

<tr>
<td><code>client-2 PASSED</code></td>
<td><code>client-2 FAILED</code></td>
</tr>

</table>

   <p>
   It is clear that the change you made introduced
   a regression in the <code>client-2</code> test.
   The problem shows up when you try to compare
   results generated from this modified code on
   some other platform. For example, assume
   you got the following results:
   </p>

<table BORDER=1 COLS=2>

<tr>
<td><b>Before</b> (Reference Platform)</td>

<td><b>After</b> (Other Platform)</td>
</tr>

<tr>
<td><code>client-1 PASSED</code></td>
<td><code>client-1 FAILED</code></td>
</tr>

<tr>
<td><code>client-2 PASSED</code></td>
<td><code>client-2 PASSED</code></td>
</tr>

</table>

   <p>
   Now things are not at all clear. We know that
   <code>client-1</code> is failing but we don't
   know if it is related to the change we just
   made. We don't know if this test failed the
   last time we ran the tests on this platform
   since we only have results for the reference
   platform to compare to. We might have fixed
   a bug in <code>client-2</code>, or we might
   have done nothing to affect it.
   </p>

   <p>
   If we instead keep track of test results
   on a platform by platform basis, we can
   avoid much of this pain. It is easy to
   imagine how this problem could get
   considerably worse if there were
   50 or 100 tests that behaved differently
   from one platform to the next.
   </p>
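
<p>
Keeping results on a platform by platform
basis could be as simple as naming the
stored log after the platform, as in this
Python sketch (the file naming is an
assumption):
</p>

<pre><code>
# Hypothetical sketch: keep one results log per platform, so each
# run is compared against the previous run on the same system.
import sys

def results_log_path(base_dir="."):
    # sys.platform is a coarse platform name such as "linux" or "sunos5".
    return "%s/results-%s.log" % (base_dir, sys.platform)

print(results_log_path())
</code></pre>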

<p>
<b><i>Test Types:</i></b><br>
   The test suite should support two
   types of tests. The first makes
   use of an external program
   like the svn client.
   These kinds of tests will need
   to exec an external program and
   check the output and exit status
   of the child process. Note that
   it will not be possible to run
   this sort of test on Mac OS.
   The second type of test will
   load subversion shared libraries
   and invoke methods in-process.
</p>

<p>
   This provides the ability to
   do extensive testing of the
   various subversion APIs without
   using the svn client. This also
   has the nice benefit that it
   will work on Mac OS, as well
   as Windows and Unix.
</p>
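
<p>
As an illustration of the first, exec-style
kind of test, here is a Python sketch that
runs the svn client and checks its exit
status and output; the command and the
output check are placeholders, not real
expected results:
</p>

<pre><code>
# Hypothetical sketch of an exec-style test: run the svn client as a
# child process and check both its exit status and its output.
import subprocess

def client_version_test():
    child = subprocess.run(["svn", "--version"],
                           capture_output=True, text=True)
    if child.returncode != 0:
        return "FAILED"
    # Placeholder check; a real test would compare exact output.
    if "version" not in child.stdout:
        return "FAILED"
    return "PASSED"

print("client-version", client_version_test())
</code></pre>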

<A NAME="EASEOFUSE"><H3>Ease of Use</H3></A>

<p>
Developers will tend to avoid using
a test suite if it is not easy to
add new tests and maintain old ones.
If developers are uninterested in
using the test suite, it will
quickly fall into disrepair
and become a burden instead of
an aid.
</p>

<p>
Users will simply avoid running
the test suite if it is not
extremely simple to use. A
user should be able to build
the software and then run:

<blockquote>
<code>
% make check
</code>
</blockquote>

This should run the test suite
and provide a very high level
set of results that include
how many test results have
changed since the last run.
</p>

<p>
While this high level report
is useful to developers, they
will often need to examine
results in more detail.
The system should provide a
means to manually examine
results, compare output,
invoke a debugger, and
other sorts of low level
operations.
</p>

<p>
The next example shows how
a developer might run a
specific subset of tests
from the command line. The
pattern given would be used
to do a glob style match on
the test case identifiers,
and run any that matched.
</p>

<blockquote>
<code>
% svntest "client-*"
</code>
</blockquote>
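
<p>
The glob matching itself is easy to sketch
in Python with the standard
<code>fnmatch</code> module; the test
identifiers listed are placeholders:
</p>

<pre><code>
# Hypothetical sketch: select test identifiers that match a glob
# pattern such as "client-*", as the command-line driver might do.
import fnmatch

all_tests = ["client-1", "client-2", "fs-1", "fs-2", "repos-1"]

def select(pattern):
    return [t for t in all_tests if fnmatch.fnmatch(t, pattern)]

print(select("client-*"))   # ['client-1', 'client-2']
</code></pre>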

<A NAME="LOCATION"><H3>Location</H3></A>

<p>
The test suite should be packaged
along with the source code instead
of being made available as a separate
download. This significantly
simplifies the process of running
tests since they are
already incorporated into
the build tree.
</p>

<p>
The test suite must support
building and running inside
and outside of the source
directory. For example,
a developer might want to
run tests on both Solaris
and Linux. The developer
should be able to run the
tests concurrently in two
different build directories
without having the tests
interfere with each other.
</p>


<A NAME="EXTERNAL"><H3>External program dependencies</H3></A>

<p>
As much as possible, the test suite should avoid
depending on external programs or libraries.

Of course, there is a nasty bootstrap problem
with a test suite implemented in a
scripting language. A wide variety
of systems provide no support for modern
scripting languages. We will avoid
this issue for now and assume that
the scripting language of choice is
supported by the system.
</p>

<p>For example, the test suite should not depend
on CVS to generate test results. Many users
will not have access to CVS on the system
they want to test subversion on.</p>

<hr>

</body>
</html>
This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.