Hello! I’m a Golang newbie trying to understand various concepts like goroutines, LWPs, async pre-emption, and timing. To that end I have run some experiments here [1] and learned a lot.
During my adventures I tried using pprof on a very simple Golang program whose performance characteristics I already know from close scrutiny. The pprof figures I got did not seem to reflect reality very well, so much so that I’m wondering whether I’m running something incorrectly. Any hints, tips, or further explanation would be greatly appreciated.
Other than that, the execution tracer seems closest to my needs. Here are the three things on my wish list that I have not been able to figure out so far, and I’m hoping somebody in this forum can point me in the right direction:
Wish list item 1: I’d like to very accurately profile certain function call chains – not all of them – in a larger Golang program. Seems like the execution tracer is a good way to go, but I’m worried that the trace file on disk will end up too big. Is there a way to use the execution tracer but only on certain sections of Golang code? Any examples of how to do that?
Wish list item 2: I really like that the execution tracer shows me (a) how many times a function is called, (b) what the total elapsed wall-clock time is for each call, and (c) during that elapsed wall-clock time, how much time was spent processing other goroutines, GC, or other work. Is it possible to get hold of this info programmatically at run time without first having to save to a file and then run go tool trace?
Wish list item 3: If I’m forced to go down the go tool trace command-line route, is there a way to extract the timing info without launching the web browser? Ideally I’d like to process this info in an automated way (e.g. for automated run-time performance comparisons between runs), and the web interface gets in the way of that.
And are there any other methods, packages, or techniques that I can use to accurately profile the short example program presented here [1] which can be run in a variety of different ways to change its run-time performance characteristics?
[1] https://gist.github.com/simonhf/351f91aae5366081b7742d25205f7534