Comparing the efficiency of slice access using index and element in time and memory.
(edit) Accessing a slice by index means reading each item with square brackets (i.e. slice[i]), while accessing by element means using the value returned by range (i.e. for index, item := range slice).
Go version devel +ca47157 Thu Dec 31 00:20:54 2015 +0000 linux/amd64
Result
In this little benchmark we sum a slice of 100000 integers: one version accesses each element by index and the other accesses it through range.
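The two access styles can be sketched as below. The function names SumByIndex and SumByElm come from the profile output further down; the bodies, the parameter, and the slice values are my assumptions, not the original source behind [1].

```go
package main

import "fmt"

// SumByIndex adds up the items by reading each one as s[i].
func SumByIndex(s []int) int {
	sum := 0
	for i := 0; i < len(s); i++ {
		sum += s[i]
	}
	return sum
}

// SumByElm adds up the items using the value that range yields.
func SumByElm(s []int) int {
	sum := 0
	for _, item := range s {
		sum += item
	}
	return sum
}

func main() {
	// 100000 integers, as in the text; the values are arbitrary (assumption).
	s := make([]int, 100000)
	for i := range s {
		s[i] = i
	}
	fmt.Println(SumByIndex(s), SumByElm(s))
}
```

Both functions should return the same total; only the way each item is read differs.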
I wrote it as a benchmark first [1], but I couldn't figure out how to compare the memory usage, so I split it into two programs and used profiling instead.
Here is the output of the benchmark:
ms 2 % go test -bench=.
testing: warning: no tests to run
PASS
BenchmarkSumByIndex10000-8 50000 24779 ns/op
BenchmarkSumByElm10000-8 100000 23206 ns/op
BenchmarkSumByIndex1000000-8 1000 2163662 ns/op
BenchmarkSumByElm1000000-8 1000 1977207 ns/op
ok _/home/ms/Unduhan/sandbox/go/benchmark 8.618s
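The output above reports only time per operation. As a possible way to see memory in the same run, the sketch below uses testing.Benchmark from the standard library, which runs a benchmark function from a plain program and returns a result that also carries allocation figures. The helpers makeInts and bench, the sink variable, and the function bodies are hypothetical; unlike the profiled programs, this sketch allocates the slice outside the timed region, so it isolates the access cost. With go test, the -benchmem flag prints the same allocation columns.

```go
package main

import (
	"fmt"
	"testing"
)

// makeInts is a hypothetical helper that returns a slice holding 0..n-1.
func makeInts(n int) []int {
	s := make([]int, n)
	for i := range s {
		s[i] = i
	}
	return s
}

// SumByIndex reads each item as s[i] (assumed body).
func SumByIndex(s []int) int {
	sum := 0
	for i := 0; i < len(s); i++ {
		sum += s[i]
	}
	return sum
}

// SumByElm reads each item from the value range yields (assumed body).
func SumByElm(s []int) int {
	sum := 0
	for _, item := range s {
		sum += item
	}
	return sum
}

// sink stores the result so the summing call is not optimized away.
var sink int

// bench runs sum over a slice of n integers under the testing harness.
func bench(n int, sum func([]int) int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		s := makeInts(n) // setup, excluded from the measurement by ResetTimer
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			sink = sum(s)
		}
	})
}

func main() {
	for _, n := range []int{10000, 1000000} {
		ri, re := bench(n, SumByIndex), bench(n, SumByElm)
		fmt.Printf("SumByIndex%d: %d ns/op, %d B/op\n", n, ri.NsPerOp(), ri.AllocedBytesPerOp())
		fmt.Printf("SumByElm%d:   %d ns/op, %d B/op\n", n, re.NsPerOp(), re.AllocedBytesPerOp())
	}
}
```

Because the slice is built before ResetTimer, the B/op figure here should be close to zero for both versions; moving makeInts inside the loop would include the allocation, which is closer to what the profiles below measure.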
DISREGARD my first post. I ran the test again using pkg/profile as recommended above; the result shows that both use the same amount of memory, but slice-by-index access takes a little longer (still by a ~8% margin) than slice-by-element. I am surprised, because I assumed it would be the other way around.
The source for each benchmark is still at the same links.
Slice by Index
Mem profile output,
ms 0 % go tool pprof -text sumbyindex mem.pprof
788.52kB of 788.52kB total ( 100%)
Dropped 4 nodes (cum <= 3.94kB)
flat flat% sum% cum cum%
784kB 99.43% 99.43% 784kB 99.43% main.SumByIndex
4.52kB 0.57% 100% 4.52kB 0.57% runtime.allocm
0 0% 100% 784kB 99.43% main.main
...
CPU profile output,
ms 0 % go tool pprof -text sumbyindex cpu.pprof
24.24s of 25.82s total (93.88%)
Dropped 112 nodes (cum <= 0.13s)
flat flat% sum% cum cum%
12.56s 48.64% 48.64% 18.53s 71.77% main.SumByIndex
5.19s 20.10% 68.75% 5.19s 20.10% runtime.memclr
0.97s 3.76% 72.50% 1.10s 4.26% runtime.scanblock
0.94s 3.64% 76.14% 0.94s 3.64% runtime.futex
...
Slice by Element
Mem profile output,
ms 0 % go tool pprof -text sumbyelm mem.pprof
788.52kB of 788.52kB total ( 100%)
Dropped 6 nodes (cum <= 3.94kB)
flat flat% sum% cum cum%
784kB 99.43% 99.43% 784kB 99.43% main.SumByElm
4.52kB 0.57% 100% 4.52kB 0.57% runtime.allocm
0 0% 100% 784kB 99.43% main.main
...
CPU profile output,
ms 0 % go tool pprof -text sumbyelm cpu.pprof
22.76s of 24.36s total (93.43%)
Dropped 106 nodes (cum <= 0.12s)
flat flat% sum% cum cum%
11.23s 46.10% 46.10% 16.76s 68.80% main.SumByElm
4.53s 18.60% 64.70% 4.53s 18.60% runtime.memclr
1.27s 5.21% 69.91% 1.36s 5.58% runtime.scanblock
0.93s 3.82% 73.73% 0.93s 3.82% runtime.futex
...