
Random data in Unit Tests?

I have a coworker who writes unit tests for objects, filling their fields with random data. His reason is that it gives a wider range of testing, since it will test a lot of different values, whereas a normal test only uses a single static value.

I've given him a number of different reasons against this, the main ones being:

random values means the test isn't truly repeatable (which also means that if the test can randomly fail, it can do so on the build server and break the build)

if it's a random value and the test fails, we need to a) fix the object and b) force ourselves to test for that value every time, so we know it works, but since it's random we don't know what the value was

Another coworker added:

If I am testing an exception, random values will not ensure that the test ends up in the expected state

random data is used for flushing out a system and load testing, not for unit tests

Can anyone else add additional reasons I can give him to get him to stop doing this?

(Or alternately, is this an acceptable method of writing unit tests, and I and my other coworker are wrong?)

"random values means the test isn't truly repeatable" not true, since the tets will be using pseudo-random numbers. Provide the same initial seed, get the same sequence of "random" tests.
Anecdote: I once wrote a CSV export class, and random testing revealed a bug when control characters were placed at the end of a cell. Without random testing, I would not have ever thought to add that as a test case. Did it always fail? No. Is it a perfect test? No. Did it help me catch and fix a bug? Yes.
Tests can also serve as documentation, to explain what the code expects as input and what is expected as output. Having a test with clear, arbitrary data can be simpler and more explanatory than code that generates random data.
If your unit test fails because of a randomly generated value, and this value is not part of an assert, good luck with debugging your unit test.
A small cost of random tests is that they can make refactoring harder: when a random test fails during refactoring, you're not certain whether the bug is due to the refactoring. It also helps to be able to debug the same test payload on both the master and refactoring branches, on different machines, so random testing really needs a framework that makes it easy to export and import test payloads.

RobertB

There's a compromise. Your coworker is actually onto something, but I think he's doing it wrong. I'm not sure that totally random testing is very useful, but it's certainly not invalid.

A program (or unit) specification is a hypothesis that there exists some program that meets it. The program itself is then evidence for that hypothesis. What unit testing ought to be is an attempt to refute that hypothesis: to find counter-evidence showing that the program does not work according to the spec.

Now, you can write the unit tests by hand, but it really is a mechanical task. It can be automated. All you have to do is write the spec, and a machine can generate lots and lots of unit tests that try to break your code.

I don't know what language you're using, but see here:

Java http://functionaljava.org/

Scala (or Java) http://github.com/rickynils/scalacheck

Haskell http://www.cs.chalmers.se/~rjmh/QuickCheck/

.NET: http://blogs.msdn.com/dsyme/archive/2008/08/09/fscheck-0-2.aspx

These tools will take your well-formed spec as input and automatically generate as many unit tests as you want, with automatically generated data. They use "shrinking" strategies (which you can tweak) to find the simplest possible test case to break your code and to make sure it covers the edge cases well.
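To make the idea concrete, here is a minimal hand-rolled sketch in plain Java of what these tools automate; it is an illustration of the technique only, not the API of any of the libraries above:

import java.util.Random;

// Random inputs checked against a property, with naive shrinking on failure.
public class MidpointProperty {
    // Code under test, with the classic bug: (lo + hi) can overflow int.
    static int midpoint(int lo, int hi) {
        return (lo + hi) / 2; // the correct form would be lo + (hi - lo) / 2
    }

    // The "spec": for 0 <= lo <= hi, the midpoint must lie between them.
    static boolean holds(int lo, int hi) {
        int mid = midpoint(lo, hi);
        return lo <= mid && mid <= hi;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed keeps the run repeatable
        for (int i = 0; i < 1000; i++) {
            int hi = rng.nextInt(Integer.MAX_VALUE);
            int lo = rng.nextInt(hi + 1);
            if (!holds(lo, hi)) {
                // Shrink: halve the inputs while the property still fails,
                // so a small counterexample is reported instead of a huge one.
                while (!holds(lo / 2, hi / 2)) {
                    lo /= 2;
                    hi /= 2;
                }
                System.out.println("Falsified: lo=" + lo + ", hi=" + hi);
                return;
            }
        }
        System.out.println("1000 cases passed");
    }
}

A real tool derives the generators from the types in your spec and applies far smarter shrinking strategies; the point is only that this part of testing is mechanical and automatable.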

Happy testing!


+1 to this. ScalaCheck does a phenomenal job of generating minimized, random test data in a repeatable way.
It's not random. It's arbitrary. Big difference :)
reductiotest.org no longer seems to exist, and Google did not point me anywhere else. Any idea where it is now?
It's now part of the Functional Java library. Link edited. But I would just use Scalacheck to test Java code.
@Apocalisp This is how one does testing of digital hardware while simulating it. The semiconductor industry term is 'constrained random verification'.
Silver Dragon

This kind of testing is called a Monkey test. When done right, it can smoke out bugs from the really dark corners.

To address your concerns about reproducibility: the right way to approach this is to record the failed test entries and generate a unit test which probes for the entire family of the specific bug, including in that unit test the one specific input (from the random data) which caused the initial failure.


It's also possible to print the seed that was used for the random data generator each time the test fails. It will therefore be possible to reproduce the failure using the same seed in your local environment.
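A sketch of that pattern in Java (the test.seed property name here is an assumption of this sketch, not a standard):

import java.util.Random;

// Print the seed on every run, and allow an override so that a failing
// run can be replayed exactly.
public class SeedLogging {
    public static void main(String[] args) {
        String override = System.getProperty("test.seed");
        long seed = (override != null) ? Long.parseLong(override) : System.nanoTime();
        System.out.println("seed=" + seed + " (re-run with -Dtest.seed=" + seed + " to reproduce)");
        Random rng = new Random(seed);
        // ... feed rng into the test's data generation ...
        System.out.println("sample value: " + rng.nextInt());
    }
}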
Will Dean

There is a half-way house here which has some use: seed your PRNG with a constant. That allows you to generate 'random' data which is repeatable.

Personally I do think there are places where (constant) random data is useful in testing - after you think you've done all your carefully-thought-out corners, using stimuli from a PRNG can sometimes find other things.


I have seen this work well on a system that had lots of locking and threads. The 'random' seed was written to a file on each run; then, if a run failed, we could work out the path the code took and write a hand-written unit test for the case we had missed.
What does PRNG stand for?
Pseudo Random Number Generator
BeeOnRope

I am in favor of random tests, and I write them. However, whether they are appropriate in a particular build environment and which test suites they should be included in is a more nuanced question.

Run locally (e.g., overnight on your dev box), randomized tests have found bugs both obvious and obscure. The obscure ones are arcane enough that I think random testing was really the only realistic way to flush them out. As a test, I took one tough-to-find bug discovered via randomized testing and had a half dozen crack developers review the function (about a dozen lines of code) where it occurred. None were able to detect it.

Many of your arguments against randomized data are flavors of "the test isn't reproducible". However, a well-written randomized test will capture the seed used to start the randomized generator and output it on failure. In addition to allowing you to repeat the test by hand, this allows you to trivially create a new test which targets the specific issue by hardcoding the seed for that test. Of course, it's probably nicer to hand-code an explicit test covering that case, but laziness has its virtues, and this even allows you to essentially auto-generate new test cases from a failing seed.

The one point you make that I can't debate, however, is that it breaks build systems. Most build and continuous integration tests expect the tests to do the same thing every time. So a test that randomly fails will create chaos, randomly failing and pointing fingers at changes that were harmless.

A solution, then, is to still run your randomized tests as part of the build and CI tests, but run them with a fixed seed, for a fixed number of iterations. Hence the test always does the same thing, but still explores a bunch of the input space (if you run it for multiple iterations).

Locally, e.g., when changing the class concerned, you are free to run it for more iterations or with other seeds. If randomized testing ever becomes more popular, you could even imagine a specific suite of tests which are known to be random, which could be run with different seeds (hence with increasing coverage over time), and where failures wouldn't mean the same thing as in deterministic CI systems (i.e., runs aren't associated 1:1 with code changes, so you don't point a finger at a particular change when things fail).
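A sketch of that setup in plain Java; the rand.seed and rand.iters property names are hypothetical:

import java.util.Random;

// Fixed default seed and iteration count, so CI is deterministic; both can
// be overridden for longer or differently-seeded local runs.
public class RandomizedCiTest {
    public static void main(String[] args) {
        long seed = Long.getLong("rand.seed", 12345L);
        int iterations = Integer.getInteger("rand.iters", 200);
        Random rng = new Random(seed);
        for (int i = 0; i < iterations; i++) {
            int input = rng.nextInt();
            // Stand-in property: bit reversal is its own inverse.
            if (Integer.reverse(Integer.reverse(input)) != input) {
                throw new AssertionError("failed for " + input + " (seed " + seed + ")");
            }
        }
        System.out.println(iterations + " iterations passed with seed " + seed);
    }
}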

There is a lot to be said for randomized tests, especially well written ones, so don't be too quick to dismiss them!


Hey, I'm wondering if you can share an example where randomized tests found a bug? I've read this argument frequently but somehow cannot imagine :) Usually, I'm trying to write easy and obvious code and cannot come up with an example where I could benefit from a randomized test. Thanks!
Jimmy Bosse

If you are doing TDD then I would argue that random data is an excellent approach. If your test is written with constants, then you can only guarantee your code works for those specific values. If your test randomly fails on the build server, there is likely a problem with how the test was written.

Random data will help ensure any future refactoring will not rely on a magic constant. After all, if your tests are your documentation, then doesn't the presence of constants imply it only needs to work for those constants?

I am exaggerating; however, I prefer to inject random data into my tests as a sign that "the value of this variable should not affect the outcome of this test".

I will say, though, that if you use a random variable and then fork your test based on that variable, that is a smell.


mreggen

In the book Beautiful Code, there is a chapter called "Beautiful Tests", where the author goes through a testing strategy for the binary search algorithm. One paragraph is called "Random Acts of Testing", in which he creates random arrays to thoroughly test the algorithm. You can read some of this online at Google Books, page 95, but it's a great book worth having.

So basically this just shows that generating random data for testing is a viable option.
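A sketch in the spirit of that chapter, checking java.util.Arrays.binarySearch on random sorted arrays against a brute-force oracle (seeded, so the run is repeatable):

import java.util.Arrays;
import java.util.Random;

// Random sorted arrays, with a linear scan as the trivial oracle.
public class BinarySearchRandomTest {
    public static void main(String[] args) {
        Random rng = new Random(2024L);
        for (int run = 0; run < 1000; run++) {
            int[] a = rng.ints(rng.nextInt(50), -100, 100).sorted().toArray();
            int key = rng.nextInt(200) - 100;
            boolean present = false;
            for (int x : a) present |= (x == key); // the oracle
            boolean found = Arrays.binarySearch(a, key) >= 0;
            if (found != present) {
                throw new AssertionError("mismatch for key " + key + " in " + Arrays.toString(a));
            }
        }
        System.out.println("1000 random arrays checked");
    }
}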


Robert Gould

Your co-worker is doing fuzz testing, although he doesn't know it. It is especially valuable in server systems.


but isn't this a fundamentally different thing from unit tests? and done at a different time?
@endolith there is no law of physics forcing you to run particular tests at particular times
@immibis But there are good reasons to do particular tests at particular times. You don't run a battery of unit tests every time a user clicks an "Ok" button.
@user253751 I think we should print some t-shirts with your quote! :)
Trystan Spangler

One advantage for someone looking at the tests is that arbitrary data is clearly not important. I've seen too many tests that involved dozens of pieces of data, where it can be difficult to tell what needs to be that way and what just happens to be that way. For example, if an address validation method is tested with a specific zip code and all other data is random, then you can be pretty sure the zip code is the only important part.
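A small Java sketch of that signal; the validator and field names are hypothetical stand-ins:

import java.util.Random;

// Only the zip code is meaningful here; every other field is seeded random
// noise, signalling that it cannot affect the outcome.
public class AddressValidationTest {
    // Hypothetical validator: a real one would check against a zip database.
    static boolean isValidZip(String zip) {
        return zip.matches("\\d{5}");
    }

    public static void main(String[] args) {
        Random rng = new Random(7L); // seeded, so still repeatable
        String street = rng.nextInt(9999) + " Some St"; // irrelevant to this test
        String city = "City" + rng.nextInt(9999);       // irrelevant to this test
        String zip = "1234";                            // the one value under test
        if (isValidZip(zip)) {
            throw new AssertionError("4-digit zip accepted for " + street + ", " + city);
        }
        System.out.println("rejected as expected");
    }
}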


John Millikin

if it's a random value and the test fails, we need to a) fix the object and b) force ourselves to test for that value every time, so we know it works, but since it's random we don't know what the value was

If your test case does not accurately record what it is testing, perhaps you need to recode the test case. I always want to have logs that I can refer back to for test cases, so that I know exactly what caused the failure, whether I'm using static or random data.


Outlaw Programmer

Can you generate some random data once (I mean exactly once, not once per test run), then use it in all tests thereafter?

I can definitely see the value in creating random data to test those cases that you haven't thought of, but you're right, having unit tests that can randomly pass or fail is a bad thing.


Ohad Bruker

You should ask yourselves what the goal of your test is. Unit tests are about verifying logic, code flow and object interactions. Using random values tries to achieve a different goal, and thus reduces test focus and simplicity. It is acceptable for readability reasons (generating UUIDs, ids, keys, etc.). Specifically for unit tests, I cannot recall even one case where this method was successful in finding problems. But I have seen many determinism problems (in the tests) caused by trying to be clever with random values, mainly with random dates. Fuzz testing is a valid approach for integration tests and end-to-end tests.


I would add that using random input for fuzzing is a poor substitute for coverage-guided fuzzing when it is possible.
craigb

If you're using random input for your tests, you need to log the inputs so you can see what the values are. This way, if you come across some edge case, you can write a test to reproduce it. I've heard the same reasons from people for not using random input, but once you have insight into the actual values used for a particular test run, it isn't as much of an issue.

The notion of "arbitrary" data is also very useful as a way of signifying something that is not important. We have some acceptance tests that come to mind where there is a lot of noise data that is no relevance to the test at hand.


Leonardo Mangano

I think the problem here is that the purpose of unit tests is not catching bugs. The purpose is being able to change the code without breaking it. So how are you going to know that you broke your code when your random unit tests are green in your pipeline, just because they didn't happen to touch the right path? Doing this seems insane to me. A different situation would be running them as integration tests or e2e tests, not as part of the build, and only for some specific things, because in some situations you will need a mirror of your code in your asserts to test that way. And having a test suite as complex as your real code is like not having tests at all, because who is going to test your suite then? :p


maybesomename

In my experience, unit tests and randomized tests should be separated. Unit tests serve to give certainty about the correctness of some cases, not only to catch obscure bugs. All that said, randomized testing is useful and should be done, separately from unit tests, but it should test a series of randomized values. I can't help thinking that testing one random value per run is just not enough, either to be a sufficient randomized test or to be a truly useful unit test.

Another aspect is validating the test results. If you have random inputs, you have to calculate the expected output for them inside the test. This will, at some level, duplicate the tested logic, making the test only a mirror of the tested code itself. This will not sufficiently test the code, since the test might contain the same errors the original code does.


EBGreen

Depending on your object/app, random data would have a place in load testing. I think more important would be to use data that explicitly tests the boundary conditions of the data.


Tim James

We just ran into this today. I wanted pseudo-random data (so it would look like compressed audio data in terms of size). I noted in a TODO that I also wanted it to be deterministic. rand() was different on OSX than on Linux, and unless I re-seeded, it could change at any time. So we changed it to be deterministic, but still pseudo-random: the test is repeatable, as much as using canned data (but more conveniently written).

This was NOT testing by some random brute force through code paths. That's the difference: still deterministic, still repeatable, still using data that looks like real input to run a set of interesting checks on edge cases in complex logic. Still unit tests.
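A Java rendering of that approach (the original was presumably C, given rand()); java.util.Random's algorithm is specified, so the same seed yields the same bytes on every platform:

import java.util.Random;

// Bytes that look like compressed audio (incompressible noise) but are
// fully deterministic, unlike C's rand() across platforms.
public class CannedNoise {
    static byte[] pseudoAudioFrame(int size, long seed) {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data); // same seed -> same bytes, everywhere
        return data;
    }

    public static void main(String[] args) {
        byte[] frame = pseudoAudioFrame(4096, 0xC0FFEEL);
        System.out.println("first byte: " + frame[0]); // stable across runs and OSes
    }
}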

Does that still qualify as random? Let's talk over beer. :-)


Tom W

I can envisage three solutions to the test data problem:

Test with fixed data

Test with random data

Generate random data once, then use it as your fixed data

I would recommend doing all of the above. That is, write repeatable unit tests with both some edge cases worked out using your brain, and some randomised data which you generate only once. Then write a set of randomised tests that you run as well.

The randomised tests should never be expected to catch something your repeatable tests miss. You should aim to cover everything with repeatable tests, and consider the randomised tests a bonus. If they find something, it should be something that you couldn't have reasonably predicted; a real oddball.
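A sketch of the third option, generate once and then fix: run something like this once by hand, paste the printed literal into the test, and the data stays frozen (and reviewable) from then on:

import java.util.Random;

// One-off fixture generator: the output is pasted into a test as a constant.
public class FixtureGenerator {
    public static void main(String[] args) {
        Random rng = new Random(); // a true one-off, so no seed is needed
        StringBuilder sb = new StringBuilder("static final int[] FIXTURE = {");
        for (int i = 0; i < 10; i++) {
            if (i > 0) sb.append(", ");
            sb.append(rng.nextInt(1000));
        }
        sb.append("};");
        System.out.println(sb); // e.g. static final int[] FIXTURE = {417, 88, ...};
    }
}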


Keeely

A unit test is there to ensure the correct behaviour in response to particular inputs; in particular, all code paths/logic should be covered. There is no need to use random data to achieve this. If you don't have 100% code coverage with your unit tests, then fuzz testing by the back door is not going to achieve it, and it may even mean you occasionally don't achieve your desired code coverage. It may (pardon the pun) give you a 'fuzzy' feeling that you're getting to more code paths, but there may not be much science behind this. People often check code coverage when they run their unit tests for the first time and then forget about it (unless enforced by CI), so do you really want to be checking coverage on every run as a result of using random input data? It's just another thing to potentially neglect.

Also, programmers tend to take the easy path, and they make mistakes. They make just as many mistakes in unit tests as they do in the code under test. It's way too easy for someone to introduce random data and then tailor the asserts to the output order of a single run. Admit it, we've all done this. When the data changes, the order can change and the asserts fail, so a portion of the executions fail. This portion needn't be 1/2; I've seen exactly this result in failures 10% of the time. It takes a long time to track down problems like this, and if your CI doesn't record enough data about enough of the runs, then it can be even worse.

Whilst there's an argument for saying "just don't make these mistakes", in a typical commercial programming setup there'll be a mix of abilities, sometimes relatively junior people reviewing code for other junior people. You can write literally dozens more tests in the time it takes to debug one non-deterministic test and fix it, so make sure you don't have any. Don't use random data.


armandino

This is an old question, but I wanted to mention a library I created that generates objects filled with random data. It supports reproducing the same data if a test fails by supplying a seed. It also supports JUnit 5 via an extension.

Example usage:

Person person = Instancio.create(Person.class);

Or a builder API for customising generation parameters:

Person person = Instancio.of(Person.class)
    .generate(field("age"), gen -> gen.ints().min(18).max(65))
    .create();

The GitHub repo has more examples: https://github.com/instancio/instancio

You can find the library on Maven Central:

<dependency>
    <groupId>org.instancio</groupId>
    <artifactId>instancio-junit</artifactId>
    <version>LATEST</version>
</dependency>


Greg Whitfield

How can your guy run the test again when it has failed, to see whether he has fixed it? In other words, he loses the repeatability of the tests.

While I think there is probably some value in flinging a load of random data at tests, as mentioned in other replies it falls more under the heading of load testing than anything else. It is pretty much a "testing-by-hope" practice. I think that, in reality, your guy is simply not thinking about what he is trying to test, and is making up for that lack of thought by hoping randomness will eventually trap some mysterious error.

So the argument I would use with him is that he is being lazy. Or, to put it another way, if he doesn't take the time to understand what he is trying to test, it probably shows he doesn't really understand the code he is writing.


It's possible to log the random data or random seed so that the test can be reproduced.
In addition to logging the random numbers: even if the random inputs are not logged, a test that fails with random inputs can generally be re-run in a while(true) loop, and it will most likely fail again in the time it takes to check your Snapchat feed.