Need help with profiling (pprof)

(pprof) list NewTokenPosition
Total: 39.22s
ROUTINE ======================== github.com/z7zmey/php-parser/parser.(*PositionBuilder).NewTokenPosition in /Users/z7zmey/work/go/src/github.com/z7zmey/php-parser/parser/position_builder.go
     3.66s      3.96s (flat, cum) 10.10% of Total
         .          .    104:	return pos
         .          .    105:}
         .          .    106:
         .          .    107:// NewTokenPosition returns new Position
         .          .    108:func (b *PositionBuilder) NewTokenPosition(t *scanner.Token) *position.Position {
         .      300ms    109:	pos := b.PositionPool.Get().(*position.Position)
     3.65s      3.65s    110:	pos.StartLine = t.Position.StartLine
         .          .    111:	pos.EndLine = t.Position.EndLine
         .          .    112:	pos.StartPos = t.Position.StartPos
         .          .    113:	pos.EndPos = t.Position.EndPos
         .          .    114:
      10ms       10ms    115:	return pos
         .          .    116:}

I can not understand why line 110 spend so much time.

to reproduce:

go get -u github.com/z7zmey/php-parser
cd $GOPATH/src/github.com/z7zmey/php-parser
make cpu_pprof

(I ran your test)

So the context here is that this is about 10% of the whole >30s runtime of the benchmark. I guess your position builder gets called for every token, and the rest is not expensive enough to dwarf the pointer chasing and memcopy here to more than 90%.

I expect the four lines get merged to one copy, which is why the profiler just sees it as the first line.

1 Like

If I understood it correctly, Golang allocates memory when we use the object for the first time instead of when we are creating it.

I have created positionBuffer slice, see @commit. After that, the function New in PositionPool spends about 6.5s. It is like a sum of createToken and NewTokenPosition functions that most actively created the positions.

After the commit, allocs/op is decreased from 6281 to 4440, but ns/op is increased a little. And now profiler shows that the time spends to growing slice

have found that this behavior documented at allocation_new

new(T) does not initialize the memory, it only zeros it

The article above explains to me why profiler shows so weird timing.

At scanner/lexer.go:488 we creating a new zeroed storage for position.Position and returns its address, it is quick. After that at scanner/lexer.go:538 and parser/position_builder.go:110 we assign data to zeroed object and Go starts memory initialization, it is slow.

I have tried to change scanner/lexer.go:488 to force initialize memory, see gist

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.