I parse pretty large JSON files (20-30 GB each) line by line with the bufio package, extract values, and do some math on them. Profiling with pprof showed pretty good results and I am happy with the single-file performance, so there is not much to discuss on that end.
I have an 8-core (16-thread) workstation, so the next logical step was to process 8 files in parallel with goroutines, which I did. At first glance I would expect close to an 8x speedup, but it is barely 4x. I asked myself what the reason could be. Perhaps the SSD's disk I/O, in combination with the bufio package, is the limit(?!). So I ran a dummy test: I copied a big file over the network (about 120 MB/s load) and re-ran the parsing in the meantime. If the disk were the bottleneck, the extra load should have slowed things down further, but it did not: the speedup was the same, barely 4x. So how can I identify what the bottleneck is? Why is the speedup only 4x when it should be close to 8x? Any suggestions are appreciated.