Benchmark for actual size of string - wrong results?


(Les Way) #1

Here is my benchmark:

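// Save in a _test.go file and run with: go test -bench=. -benchmem
package main

import (
    "fmt"
    "testing"
    "unsafe"
)

// get builds a 3-byte string ("ABC") from a byte slice.
func get() string {
    return string([]byte{0x41, 0x42, 0x43})
}

// g is a package-level sink so the result of get() is not optimized away.
var g string

func BenchmarkString(b *testing.B) {
    for i := 0; i < b.N; i++ {
        g = get()
    }

    fmt.Println("unsafe.Sizeof(g)", unsafe.Sizeof(g), b.N, g)
}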

Is it correct, though? Here is the output:
unsafe.Sizeof(g) 16 100000000 ABC
100000000 13.7 ns/op 3 B/op 1 allocs/op

It seems unlikely that the size of a string is 16 bytes while the benchmark only allocates 3 bytes per operation. What is the explanation?

I would expect the actual size of string to be:
len(str) + unsafe.Sizeof(str) * 4

Namely, len(str) is the overhead, and then each rune (len(str) = number of runes) takes another 4 bytes, as a rune has the same size as int32. Please correct me if I am mistaken on this one.


(Jakob Borg) #2

Sizeof(string) is 16 bytes because a string is a structure with a pointer (8 bytes on a 64-bit platform) and a length (8 bytes). There’s no allocation for this structure because it’s on the stack. The allocation is for the pointed-to data, which is three bytes (string data isn’t “wide”, with four bytes per code point; it’s just bytes). It gets allocated on the heap because you pass it to fmt.Println, so the compiler can’t guarantee it doesn’t escape beyond the lifetime of the current function call.

You’re doing a lot of microbenchmarking and questioning the results. Is there an overarching point to your explorations? If there’s one conclusion you should arrive at by now it’s that for these sorts of small things it’s all in the details. Whether there is an allocation or not depends on how the data is used, etc.


(Jakob Borg) #3

Oh, and here’s a good article on the memory allocator. It explains many things in a readable manner: https://povilasv.me/go-memory-management/


(Les Way) #4

I have commented out the print line, so there is no printing. It still allocates 3 B/op. If I increase the number of runes in my string to 4, it allocates 4 B/op, and so on.

It seems there is something more complex going on here.


(Jakob Borg) #5

You are also doing a conversion from byte slice to string, so that allocation might need to happen regardless. That the allocation is four bytes for a four byte string seems expected.
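
One way to check this outside a benchmark is testing.AllocsPerRun, comparing the byte-slice conversion against a plain string literal. This is only a sketch: the function names and the sink variable are invented for illustration, and the exact counts can depend on the Go version and on escape analysis.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable, so whatever is stored in it escapes.
var sink string

// fromBytes mirrors get() from the first post: it converts a byte slice
// to a string, which copies the bytes into newly allocated string data.
func fromBytes() string {
	return string([]byte{0x41, 0x42, 0x43})
}

// fromLiteral returns a constant string; its data lives in the binary.
func fromLiteral() string {
	return "ABC"
}

// conversionAllocs reports the average allocations per call of fromBytes.
func conversionAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = fromBytes() })
}

// literalAllocs reports the average allocations per call of fromLiteral.
func literalAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = fromLiteral() })
}

func main() {
	fmt.Println("allocs per conversion:", conversionAllocs())
	fmt.Println("allocs per literal:   ", literalAllocs())
}
```

If the conversion reports an allocation and the literal does not, the allocation comes from copying the bytes into the new string, which would match the 1 allocs/op and the len(str) B/op seen in the benchmark.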