A question about string for range

charviki · November 22, 2022, 7:42am

A benchmark of ranging over a string:

package main

import (
	"strings"
	"testing"
)

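// s is the benchmark input, built once in init.
var s string

func init() {
	var builder strings.Builder
	// Write code points 0 through 99999. Values in the surrogate range
	// (0xD800-0xDFFF) are not valid runes and are encoded as U+FFFD.
	for r := rune(0); r < 100000; r++ {
		builder.WriteRune(r)
	}
	s = builder.String()
}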

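// BenchmarkForStr ranges over the string itself, decoding one rune per iteration.
func BenchmarkForStr(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for range s {
		}
	}
}

// BenchmarkForConvertedStr ranges over the string's bytes, one byte per iteration.
func BenchmarkForConvertedStr(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for range []byte(s) {
		}
	}
}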

result:

goos: windows
goarch: amd64
cpu: 12th Gen Intel(R) Core(TM) i7-12700K
BenchmarkForStr-20             	    5216	    226701 ns/op	       0 B/op	       0 allocs/op
BenchmarkForConvertedStr-20    	   21382	     57160 ns/op	       0 B/op	       0 allocs/op
PASS

What confuses me is why BenchmarkForConvertedStr is faster than BenchmarkForStr.

ncw · November 25, 2022, 1:33pm

In BenchmarkForStr, Go is iterating over the string by Unicode code points. These are decoded from their UTF-8 representation, which isn’t free.
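
To make the decoding step visible, here is a small sketch of what ranging over a string yields. The helper name runeOffsets and the sample string are made up for illustration; the point is only that the index jumps by the encoded width of each code point, so the loop has to inspect the bytes to find where the next one starts.

```go
package main

import "fmt"

// runeOffsets (hypothetical helper) returns the byte offset at which each
// code point of s starts, as reported by a range loop over the string.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// 'a' is 1 byte, U+00E9 is 2 bytes, U+4E16 is 3 bytes in UTF-8.
	s := "a\u00e9\u4e16"

	// Each iteration decodes one code point and reports where it starts.
	for i, r := range s {
		fmt.Printf("offset %d: %c (U+%04X)\n", i, r, r)
	}

	fmt.Println(runeOffsets(s)) // [0 1 3]
}
```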

In BenchmarkForConvertedStr, the Go compiler is converting the string to a sequence of bytes. I suspect that under the hood, since the []byte is never modified, the Go compiler is just returning a pointer to the string’s data, so the conversion is free. That would be consistent with the 0 allocs/op in your results.

Iterating a slice of bytes is very quick.
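
As a rough illustration of the difference between the two loops (countRunes and countBytes are hypothetical names, not anything from the standard library), the byte loop actually runs more iterations than the rune loop on non-ASCII input. Each of its iterations is just an index increment, though, with no decoding.

```go
package main

import "fmt"

// countRunes counts iterations of a range loop over the string itself:
// one iteration per decoded code point.
func countRunes(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

// countBytes counts iterations of a range loop over the converted []byte:
// one iteration per byte, no UTF-8 decoding involved.
func countBytes(s string) int {
	n := 0
	for range []byte(s) {
		n++
	}
	return n
}

func main() {
	// U+00E9 takes 2 bytes; U+4E16 and U+754C take 3 bytes each.
	s := "h\u00e9llo, \u4e16\u754c"

	fmt.Println(countRunes(s)) // 9
	fmt.Println(countBytes(s)) // 14
	fmt.Println(len(s))        // 14, len reports bytes, not code points
}
```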

charviki · November 26, 2022, 2:49am

Thanks!
