Generics benchmarks

Similar results were obtained for different platforms and architectures.
It’s not a language problem at all, is it?

type Signed interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64
}

type Getter interface {
	Get(int) int
}

type GetObj struct{}

func (o *GetObj) Get(i int) int {
	return i * i
}

type GenGetter[T Signed] interface {
	Get(T) T
}

type GenGetObj[T Signed] struct{}

func (o *GenGetObj[T]) Get(i T) T {
	return i * i
}

func reg(obj *GetObj, i int) int {
	return obj.Get(i)
}

func iface(obj Getter, i int) int {
	return obj.Get(i)
}

func gen[T Signed](obj GenGetter[T], i T) T {
	return obj.Get(i)
}

func BenchmarkReg(b *testing.B) {
	var obj = &GetObj{}
	var i = 0
	for b.Loop() {
		reg(obj, i)
		i++
	}
}

func BenchmarkIface(b *testing.B) {
	var obj = &GetObj{}
	var i = 0
	for b.Loop() {
		iface(obj, i)
		i++
	}
}


func BenchmarkGen(b *testing.B) {
	var obj = &GenGetObj[int]{}
	var i = 0
	for b.Loop() {
		gen(obj, i)
		i++
	}
}
BenchmarkReg-4          441981427                2.743 ns/op
BenchmarkReg-4          428439991                2.810 ns/op
BenchmarkReg-4          422647483                2.775 ns/op
BenchmarkReg-4          414114366                2.839 ns/op
BenchmarkReg-4          429933127                2.720 ns/op
BenchmarkIface-4        260450607                4.558 ns/op
BenchmarkIface-4        258018102                4.488 ns/op
BenchmarkIface-4        248518578                4.655 ns/op
BenchmarkIface-4        257133632                4.517 ns/op
BenchmarkIface-4        272843784                4.386 ns/op
BenchmarkGen-4          245088946                4.932 ns/op
BenchmarkGen-4          248562715                4.826 ns/op
BenchmarkGen-4          245683516                4.812 ns/op
BenchmarkGen-4          239086440                4.948 ns/op
BenchmarkGen-4          228736507                5.074 ns/op

And this version is probably the wrong way to write the benchmark, but I’ll give it anyway:

func BenchmarkReg(b *testing.B) {
	var obj = &GetObj{}
	for i := range b.N {
		reg(obj, i)
	}
}

func BenchmarkIface(b *testing.B) {
	var obj = &GetObj{}
	for i := range b.N {
		iface(obj, i)
	}
}


func BenchmarkGen(b *testing.B) {
	var obj = &GenGetObj[int]{}
	for i := range b.N {
		gen(obj, i)
	}
}
BenchmarkReg-4          1000000000               0.3700 ns/op
BenchmarkReg-4          1000000000               0.3609 ns/op
BenchmarkReg-4          1000000000               0.3643 ns/op
BenchmarkReg-4          1000000000               0.3616 ns/op
BenchmarkReg-4          1000000000               0.3778 ns/op
BenchmarkIface-4        1000000000               0.3492 ns/op
BenchmarkIface-4        1000000000               0.3651 ns/op
BenchmarkIface-4        1000000000               0.3571 ns/op
BenchmarkIface-4        1000000000               0.3629 ns/op
BenchmarkIface-4        1000000000               0.3499 ns/op
BenchmarkGen-4          419146522                2.703 ns/op
BenchmarkGen-4          420824724                2.777 ns/op
BenchmarkGen-4          416835702                2.740 ns/op
BenchmarkGen-4          415282959                2.803 ns/op
BenchmarkGen-4          426988256                2.732 ns/op

I think for these simple use cases, looking at the assembly will probably give more insight than the numbers alone. My first guess would be that the regular call is heavily optimized by inlining and might even be optimized away completely if you don’t use the returned value at all.
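To make that guess testable, here is a minimal sketch (not the original poster's code) that stores every result in a package-level variable, so the computation has a visible use and is not simply dead code. The names `sink`, `benchReg`, `benchIface`, and `benchGen` are my own, introduced for illustration; the types are copied from the first post. The calls may still be inlined, so this only rules out the "optimized away completely" case.

```go
package main

import (
	"fmt"
	"testing"
)

type Signed interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64
}

type Getter interface {
	Get(int) int
}

type GetObj struct{}

func (o *GetObj) Get(i int) int { return i * i }

type GenGetter[T Signed] interface {
	Get(T) T
}

type GenGetObj[T Signed] struct{}

func (o *GenGetObj[T]) Get(i T) T { return i * i }

func reg(obj *GetObj, i int) int { return obj.Get(i) }

func iface(obj Getter, i int) int { return obj.Get(i) }

func gen[T Signed](obj GenGetter[T], i T) T { return obj.Get(i) }

// sink is a hypothetical package-level variable (not in the original post).
// Assigning each result to it gives the result a use outside the loop body.
var sink int

func benchReg(b *testing.B) {
	obj := &GetObj{}
	for i := 0; i < b.N; i++ {
		sink = reg(obj, i)
	}
}

func benchIface(b *testing.B) {
	obj := &GetObj{}
	for i := 0; i < b.N; i++ {
		sink = iface(obj, i)
	}
}

func benchGen(b *testing.B) {
	obj := &GenGetObj[int]{}
	for i := 0; i < b.N; i++ {
		sink = gen[int](obj, i)
	}
}

func main() {
	// Init registers the testing flags; it has no effect if already called.
	testing.Init()
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("reg:  ", testing.Benchmark(benchReg))
	fmt.Println("iface:", testing.Benchmark(benchIface))
	fmt.Println("gen:  ", testing.Benchmark(benchGen))
}
```

To see what the compiler actually did, `go build -gcflags=-m` prints its inlining decisions and `go build -gcflags=-S` prints the generated assembly. If the three variants come out close once the result is used, the earlier gap was probably an artifact of the benchmark, not of generics.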


Well, I’ve obtained different results, which suggest the opposite:

test-generics % go test -bench=.
goos: darwin
goarch: arm64
pkg: test-generics
cpu: Apple M1 Pro
BenchmarkReg-10         538556432                2.121 ns/op
BenchmarkIface-10       569057384                2.085 ns/op
BenchmarkGen-10         582057588                2.066 ns/op
PASS
ok      test-generics   3.715s

test-generics % go test -bench=. -cpu 1
goos: darwin
goarch: arm64
pkg: test-generics
cpu: Apple M1 Pro
BenchmarkReg    511943661                2.148 ns/op
BenchmarkIface  576335656                2.070 ns/op
BenchmarkGen    574286329                2.066 ns/op
PASS
ok      test-generics   3.745s

What is the platform? How many cores did you run on? We need more details.


Hello!
Thanks for the info. I will try to dig into it further.