How do you unit test a concurrent data structure?

You have written a data structure D, and you want to test that D is safe for concurrency.

What sort of unit tests do you usually make to ensure that your data structure D is safe for concurrency, and to catch any possible concurrency related bugs ?

1 Like

I’d rather rely on design review, code review, as well as on the race detector and deadlock detection.

  • Design & code review: Code is obviously concurrency-safe if it shares no memory between goroutines, and if it does not access limited resources in a competing way. If the code does either of these, the review extends to verifying proper decoupling (channels), locking, and resource pooling.

  • Race detector: indispensable for detecting data race conditions that the review might have missed.

  • Deadlock detection: also super useful, but keep in mind that the deadlock alarm only goes off if all goroutines in a binary are blocked. Even if you only start an HTTP server besides your concurrent worker code, the deadlock detector remains silent if your workers lock up.

  • For unit testing, I see two ways of increasing the chance to catch a deadlock, data race, or other concurrency-induced issues:

    • Run parallel tests
    • Do fuzzy testing (requires Go 1.18+)
1 Like

hi @christophberger ,

Great points, thank you for an excellent answer.

Regarding the parallel tests, can you give more details and perhaps link to an example ? Are you talking about a table test that gets to have its test cases run in parallel ?

Also, can you point an example of fuzzy testing used to detect concurrency issues ? My general idea about fuzzy tests is that they generate random input to test numerical algorithms. How can I adapt one to test concurrency, sounds like a very appealing idea, please give more details.

Yes, I am referring to the t.Parallel() feature for parallel test.

How can fuzzing help? Well, concurrency bugs are often hard to spot because they depend on the (non-deterministic) behavior of tasks running independently of each other. Generating lots of input data can greatly increase the chance of triggering a bug. And fuzzing can generate lots of input.

Fuzzing is not restricted to numerical algorithms. Besides numeric types, the input arguments (“fuzzing arguments”) can also be strings, byte slices, runes, or booleans.

hi @christophberger

Here is an algorithm example, along with my fuzzy test:

type Summable interface {
    	~int | ~int8 | ~int16 | ~int32 | ~int64 |
        	~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 | ~uintptr |
	    ~float32 | ~float64 | ~string
}

func Build[S ~[]T, T Summable](input S) S {
    if len(input) == 0 {
	    return *new([]T) 
    }

    result := make([]T, len(input))

    result[0] = input[0]
    for i := 1; i < len(result); i++ {
	    result[i] = result[i - 1] + input[i]
    }

    return result
}

And my fuzzy test is:

func FuzzBuild(f *testing.F) {
    f.Add([]byte{1, 2, 11, 12, 3, 3, 15, 8})
        f.Parallel()

    f.Fuzz(func(t *testing.T, slice []byte) {
	    if len(slice) == 0 {
		    assert.Equal(t, 0, len(prefix_sum.Build(slice)))

		    return
	    }

	    got := prefix_sum.Build(slice)
	    assert.Equal(t, got[0], slice[0])

	    expected := slice[0]

	    for i := 1; i < len(slice); i++ {
		    expected += slice[i]

		    assert.Equal(t, expected, got[i])
	    }
    })
}

Now, in this example:

  • my go test -fuzz FuzzBuild -race -v finds no issue, obviously my code would cause race issues if called several times, in parallel, for the same input slice
  • the way I build the []byte, and add that to f ← can I do better, namely build my fuzz test in such a way that I increase my chance of finding more bugs ?

Please give me tips on how to write this fuzz test, thx!

any code reviews ? does it make sense to move this to code reviews ?

Hi @telo_tade,

Your code should error out at f.Parallel(), as this method does not exist. You would need to use t.Run() and t.Parallel() as outlined in the testing doc

1 Like

And regarding the -race flag, this requires an actual scenario that spawns goroutines. Purely serial code never causes data races.

hi @christophberger

Thx for pointing out the mistake in f.Parallel. What about improving the fuzzy tests with regards to its ability to discover bugs ? It will produce a range of byte slices, but for the vast majority of them the code will have no difficulty.

Is there a way to configure my fuzzy test to do things like a) always send the empty slice as one of the inputs, b) send a slice of many very large numbers to test for overflows, etc

Also, running the fuzzy test with -race should by default test for race condition, as the fuzzy test will run in parallel making use of the processors available ? Is this assumption correct ?

While the generated tests have a good amount of randomness included, it should be possible to influcene the generated test cases by supplying suitable data to the seed corpus.

For a) you’d simply call f.Add([]byte{} to add the empty slice cse.

For b) I guess if you supply an array with very large numbers, or a very long array, the fuzz engine will create more test cases with large numbers or longer arrays. (I did not examine the inner workings of the fuzz engine yet, so this is only a guess, but after all, the seed corpus is what the fuzz engine uses as a starting point.)

If you have some very concrete test cases in mind, then you might want to create normal tests for these cases. I would not rely on the fuzz engine to generate all sorts of edge cases. Fuzzing helps to increase the chance of catching test cases that the standard tests have missed, but it cannot guarantee to catch them.

1 Like