
I recently had an upstream reviewer tell me that I should not randomise my test input because “randomness does not provide a useful input to the test and sets a bad example of practices for writing tests”.
I am going to explain here why this is wrong, and why it’s actually good practice to randomise inputs. Let me start by saying that random test failures are not the same thing as spurious test failures. I’ll come back to that later.
Consider this simple code under test. It’s a contrived example, but you will get the idea:
    def myfunc(thing):
        """This function just returns what it's given."""
        return "foo"
OK, so let’s consider this a stub implementation, as it has an obvious bug: it always returns "foo" instead of its input. If we wanted to write a test for it, we might write something like this:
    import unittest


    class TestMyfunc(unittest.TestCase):
        def test_myfunc_returns_same_input(self):
            # Fixed input: happens to match the stub's hard-coded return value.
            returned = myfunc("foo")
            self.assertEqual(returned, "foo")
Here, I am using a fixed input of “foo” as many people like to do in tests as a way of saying “this value is irrelevant”.
The bug should be obvious here: the test passes when it should not, because the code under test returns the same value that the test happens to use. As I say, it’s a fairly contrived example, but it illustrates the point that tests should never assume anything about the code under test.
Here’s a better way of writing the test:
    import random
    import unittest


    class TestMyfunc(unittest.TestCase):
        def test_myfunc_returns_same_input(self):
            # Random input: the stub cannot pass by returning a hard-coded value.
            expected = random.randint(0, 1000)
            returned = myfunc(expected)
            self.assertEqual(returned, expected)
(A further improvement could be to generate a random string, but I’ll leave that for a future blog entry.)
Here, we’re generating a random input and asserting that the returned value is the same as the input. This not only avoids the bug above but it is far better at demonstrating test intent. It will also never fail unless the code under test is buggy, and that brings me back to the point above about random vs spurious test failures.
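As a rough sketch of the random-string variant mentioned in passing above, something like the following could work. The `random_string` helper, its default length, and its alphabet are my own assumptions for illustration, and `myfunc` is shown here in a corrected form (returning its argument) so that the test passes:

```python
import random
import string
import unittest


def myfunc(thing):
    """Corrected version of the code under test: returns what it's given."""
    return thing


def random_string(length=12):
    """Hypothetical helper: a random string of ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


class TestMyfunc(unittest.TestCase):
    def test_myfunc_returns_same_input(self):
        # The value itself is irrelevant; only "output equals input" matters.
        expected = random_string()
        returned = myfunc(expected)
        self.assertEqual(returned, expected)
```

With the original stub that always returns "foo", this test would fail unless the helper happened to generate exactly "foo", which a 12-character string never will.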
A random test failure is good. It means you found a bug! A spurious test failure is one that indicates you’re not testing properly. An example is a test that depends on network connectivity to complete: networks are inherently unreliable, so this is a bad test that will fail spuriously whenever the network does.
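One way to make a random failure easy to act on is to record the seed that produced it, so the failing input can be replayed. The sketch below is only one possible approach, not something the test above requires: the `seed` and `rng` attributes are names I have made up, and `myfunc` is again shown in its corrected form.

```python
import random
import unittest


def myfunc(thing):
    """Corrected version of the code under test: returns what it's given."""
    return thing


class TestMyfunc(unittest.TestCase):
    def setUp(self):
        # Choose a fresh seed per test and keep it so a failure can be replayed.
        self.seed = random.SystemRandom().randrange(2**32)
        self.rng = random.Random(self.seed)

    def test_myfunc_returns_same_input(self):
        expected = self.rng.randint(0, 1000)
        returned = myfunc(expected)
        # The seed appears in the failure message if the assertion fails.
        self.assertEqual(returned, expected, f"failed with seed {self.seed}")
```

If the test ever fails, constructing `random.Random` with the reported seed regenerates the same input, which turns a one-off random failure into a repeatable bug report.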
Finally, I can recommend that you look at a tool called Hypothesis, which is a property-based testing utility. My friend Jono explains it in his blog here: https://jml.io/2016/06/evolving-toward-property-based-testing-with-hypothesis.html