Nathan Hartman wrote:
> Random input tests are not deterministic in the sense that one never
> knows in advance which inputs will be tested.
> While that's great for finding bugs that a deterministic test won't
> surface, I wonder if we should try to keep the regression test suite
> deterministic (as much as feasible)
I was thinking about this yesterday, as I noticed on some test runs one
of the tests that I marked as XFAIL can unexpectedly pass (XPASS). I
agree there is merit in keeping the default run repeatable.
The (pseudo-)random number generator (RNG) used in the tests is a
deterministic algorithm, under our control.
The way mergeinfo-test.c is currently written, the existing tests 6 and
10 each seed the RNG from a time function before using it, while the tests
22-25 that I recently added don't. If I run 'mergeinfo-test 22' on its
own, the seed starts at zero and the test runs with a deterministic
sequence, the same on every run.
When I was developing these tests, I was running one of them at a time
and the sequence was repeatable. Only when the whole test suite runs, so
that these tests execute after tests 6 and 10 (and/or in parallel with
them, I suppose), does the sequence become non-repeatable.
To make each test use a repeatable sequence independent of other tests,
each test should maintain its own RNG state. That is doable.
> and move any random input testing
> to a separate "fuzz testing" program.
We do have an AFL tests subdirectory, but ...
> Meanwhile, I'm *not* suggesting to remove the test from the regression
> test suite. Rather, I'm suggesting to make it deterministic by giving
> it a fixed list of test inputs. ([...])
A good random-testing infrastructure would make it easy to capture sets
of inputs that caused failures in the past, and inputs that exercise a
high proportion of code paths, and construct a regression test
configuration that re-plays just the selected cases.
However, we don't have such an infrastructure. It is easier to manage
if, instead, we parameterize the tests so that they run repeatably (and
reasonably quickly) by default, with the option of running them
non-repeatably (and optionally for a much longer time) on demand. Then
we can keep the test code where it is, alongside all the similar tests,
where they can easily share test infrastructure.
For now, I propose to make each test use a repeatable sequence,
independent of the other tests. I think that will be enough; the
options I mentioned can be added if and when there is a demand for them.
How does that sound?
- Julian
Received on 2020-01-07 10:47:52 CET