Recursive function and memory

I have a recursive function that can be called as many as 800,000,000 times. However, the depth of the calls never exceeds 150.

The program slows down, and the resource manager shows memory usage approaching the total available the closer we get to 800 million calls.

My thinking is that the program makes calls faster than memory compression can recover memory from the stack(?). The machine has 64 GB.

Does that make sense? Would it help to put in a sleep or something every 50,000 calls to let it catch up?

I am just learning about the profiler, but my guess is that each call uses at most 5 KB; at a depth of 150 that is only 0.75 MB.

I also have a map that grows by one value every 10 calls. Might it help to create it with an initial size? BTW, the map currently has a value struct of only one bool (I realize I can reduce that even further with a nil-value map, and I will be doing that).
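For example, something like this is roughly what I have in mind (the key type and the size are made up here, since they are not in my post):

package main

import "fmt"

func main() {
	// Size hint plus an empty-struct value instead of the bool.
	// expectedEntries is a placeholder; in the real program it would be
	// roughly total calls / 10.
	const expectedEntries = 1_000_000
	seen := make(map[int64]struct{}, expectedEntries)

	seen[42] = struct{}{} // insert
	_, ok := seen[42]     // membership test
	fmt.Println(ok)
}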

Thank you for reading this.

It sounds like you should make the function non-recursive; recursion carries call and stack overhead.
If performance is affected, try an iterative implementation, for example with an explicit stack as sketched below.
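We don't know what your function does, so the tree type and the sum below are invented purely for illustration; the point is that the recursive calls become a loop over an explicit stack, so the call depth stays constant no matter how many nodes you visit:

package main

import "fmt"

// node is a made-up tree type just to illustrate the transformation.
type node struct {
	value    int
	children []*node
}

// sumRecursive is the recursive shape being replaced.
func sumRecursive(n *node) int {
	if n == nil {
		return 0
	}
	total := n.value
	for _, c := range n.children {
		total += sumRecursive(c)
	}
	return total
}

// sumIterative does the same walk with an explicit stack slice.
func sumIterative(root *node) int {
	if root == nil {
		return 0
	}
	total := 0
	stack := []*node{root}
	for len(stack) > 0 {
		// Pop the last node and push its children as pending work.
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]

		total += n.value
		stack = append(stack, n.children...)
	}
	return total
}

func main() {
	root := &node{value: 1, children: []*node{{value: 2}, {value: 3}}}
	fmt.Println(sumRecursive(root), sumIterative(root)) // 6 6
}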

The answer is surely to use the profiler / benchmarking. Allocating your map with a size hint can improve performance, but until you know that’s what your problem is, it might not do much. When you say “sleep or something every 50,000 calls to let it catch up” that sounds like you’re using goroutines, which can add extra overhead as well. Have you tried limiting the number of goroutines to something sane and communicating via channels?
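And if goroutines are in play, a bounded worker pool is the usual shape. A rough sketch, where the worker count and the "work" itself are placeholders:

package main

import (
	"fmt"
	"sync"
)

func main() {
	const workers = 8          // cap on concurrent goroutines (placeholder)
	jobs := make(chan int, 64) // buffered channel of work items
	results := make(chan int, 64)

	// Start a fixed number of workers that drain the jobs channel.
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- j * j // placeholder for the real work
			}
		}()
	}

	// Feed the work and close the channel when done.
	go func() {
		for i := 0; i < 100; i++ {
			jobs <- i
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	fmt.Println(sum)
}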

One other gotcha to look out for: maps don’t shrink. Take a look at this article. I’m assuming you aren’t relying on your map shrinking, but that could be something to consider. Also, I’m assuming you’re using a map instead of a slice because lookup performance is important.
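To see what “maps don’t shrink” means in practice, here is a rough demonstration (the sizes are arbitrary, and this shows general map behaviour, not your specific workload):

package main

import (
	"fmt"
	"runtime"
)

// heapMB forces a GC and reports the live heap in MB.
func heapMB() uint64 {
	runtime.GC()
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc / (1 << 20)
}

func main() {
	m := make(map[int]bool)
	for i := 0; i < 2_000_000; i++ {
		m[i] = true
	}
	fmt.Println("after fill:  ", heapMB(), "MB")

	for i := 0; i < 2_000_000; i++ {
		delete(m, i)
	}
	// The buckets are kept around for reuse, so the heap stays roughly
	// as large as when the map was full.
	fmt.Println("after delete:", heapMB(), "MB")
	runtime.KeepAlive(m)
}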

I wrote some rudimentary benchmarks to give you ideas on how to get started. First the code:

package main

var chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
var stringKeys []string

func init() {
	// Create unique keys.
	for _, c := range chars {
		for _, c2 := range chars {
			for _, c3 := range chars {
				stringKeys = append(stringKeys, string(c)+string(c2)+string(c3))
			}
		}
	}
}

func main() {
}

// stringMap fills a string-keyed map with no capacity hint, so the map
// has to grow and rehash as it fills.
func stringMap() int {
	var m = make(map[string]int)
	for i, v := range stringKeys {
		m[v] = i
	}
	return len(m)
}

// intMap does the same with int keys, again without a capacity hint.
func intMap() int {
	var m = make(map[int]int)
	for i := range stringKeys {
		m[i] = i
	}
	return len(m)
}

// stringMapCapacityHint pre-sizes the string-keyed map so it never has
// to grow while being filled.
func stringMapCapacityHint() int {
	var m = make(map[string]int, len(stringKeys))
	for i, v := range stringKeys {
		m[v] = i
	}
	return len(m)
}

// intMapCapacityHint pre-sizes the int-keyed map.
func intMapCapacityHint() int {
	var m = make(map[int]int, len(stringKeys))
	for i := range stringKeys {
		m[i] = i
	}
	return len(m)
}

type data struct {
	key   string
	value int
}

// sliceAppend grows the slice with append, starting from zero capacity.
func sliceAppend() int {
	var s = make([]data, 0)
	for i, v := range stringKeys {
		s = append(s, data{v, i})
	}
	return len(s)
}

// slicePreallocate allocates the full slice up front and fills it in place.
func slicePreallocate() int {
	var s = make([]data, len(stringKeys))
	for i, v := range stringKeys {
		s[i] = data{v, i}
	}
	return len(s)
}

And the benchmarks:

package main

import "testing"

func BenchmarkStringMap(b *testing.B) {
	for n := 0; n < b.N; n++ {
		stringMap()
	}
}

func BenchmarkIntMap(b *testing.B) {
	for n := 0; n < b.N; n++ {
		intMap()
	}
}

func BenchmarkStringMapCapacityHint(b *testing.B) {
	for n := 0; n < b.N; n++ {
		stringMapCapacityHint()
	}
}

func BenchmarkIntMapCapacityHint(b *testing.B) {
	for n := 0; n < b.N; n++ {
		intMapCapacityHint()
	}
}

func BenchmarkSliceAppend(b *testing.B) {
	for n := 0; n < b.N; n++ {
		sliceAppend()
	}
}

func BenchmarkSlicePreallocate(b *testing.B) {
	for n := 0; n < b.N; n++ {
		slicePreallocate()
	}
}

… which results in:

$ go test -bench=. -benchmem
goos: windows
goarch: amd64
pkg: mapperformance
cpu: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
BenchmarkStringMap-8                          87          13981674 ns/op        15480435 B/op       4725 allocs/op
BenchmarkIntMap-8                            127           9309430 ns/op        10933689 B/op       4773 allocs/op
BenchmarkStringMapCapacityHint-8             206           5936893 ns/op         7241753 B/op          2 allocs/op
BenchmarkIntMapCapacityHint-8                255           4585321 ns/op         5038069 B/op         14 allocs/op
BenchmarkSliceAppend-8                       130           9029515 ns/op        16402408 B/op         29 allocs/op
BenchmarkSlicePreallocate-8                  841           1487319 ns/op         3375104 B/op          1 allocs/op
PASS
ok      mapperformance  11.421s

Anyway, start running benchmarks with -benchmem. And use the profiler. And finally: yeah, any time you can initialize a map with a relevant capacity hint, do it.
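For the profiler side, the same benchmarks can write profiles for you to dig into, along these lines (the file names are whatever you like):

$ go test -bench=. -benchmem -cpuprofile cpu.out -memprofile mem.out
$ go tool pprof -top cpu.out
$ go tool pprof -top mem.out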


The stack should usually be reclaimed immediately after function calls return. What you describe sounds like you are making heap allocations, which the garbage collector then needs to reclaim.
You can search for “escape analysis” and analyze your code by compiling with go build -gcflags=-m (or by using the VS Code plugin).
If you can prevent those heap allocations you should be able to solve your problem (if my hunch is right).
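A tiny made-up example of the difference that flag reports:

package main

// sink forces the compiler to keep the returned pointer alive.
var sink *int

// escapes returns the address of a local variable, so the compiler has to
// move n to the heap, and the garbage collector later has to reclaim it.
func escapes() *int {
	n := 42
	return &n
}

// staysOnStack never lets the address of n leave the function, so no heap
// allocation is needed.
func staysOnStack() int {
	n := 42
	return n * 2
}

func main() {
	sink = escapes()
	_ = staysOnStack()
}

Built with go build -gcflags=-m, the compiler prints something like “moved to heap: n” for the first function and nothing of the sort for the second.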
