Goroutine access methods seem to get optimized away depending on logging

Hello,

I'm seeing some strange behaviour in my code. I have tried it on four different machines: two Manjaro machines (4.20/5.0) and two VPSes (Debian 9/4.15, Ubuntu 16.04 LTS/4.4).
On three of the four machines the code runs as expected. However, on the fourth machine (Debian), one goroutine (which is responsible for collecting and pushing metrics) never seems to do any work. All machines use the same Go version (1.11.5).

Here’s the relevant part of the metrics task and the handler that calls it:

 
 ### METRICS TASK
 
type MetricsTask struct {
    running     bool
    ticker      *time.Ticker

    counters map[string]float64
    gauges   map[string]float64

    counterBuf chan metricsAction
    gaugeBuf   chan metricsAction
}


func New() *MetricsTask {
// Init maps, chans, ticker
}

type metricsAction struct {
    name string
    val  float64
}

// Call this func from other goroutine to increase counter for given name
func (m *MetricsTask) CounterIncrease(name string, val float64) {
    m.counterBuf <- metricsAction{name: name, val: val}
}

func (m *MetricsTask) loop() {
    for m.running {
        select {
        case item := <-m.counterBuf:
            // Push measurement to named val
            m.ensureCounter(item.name)
            m.counters[item.name] += item.val
        case <-m.ticker.C:
            // Push metrics
            // ...
        }
    }
}


### HANDLER

// Try to insert metrics, fails silently on one machine
func handler() {

    // This log entry somehow fixes metrics task, without it CounterIncrease isn't called or is called with empty params
    logrus.Debug("Inserting metrics")
    metrics.CounterIncrease("my_counter", 1)

}

So, I can see from the logs that the metrics task is started, but the method ‘CounterIncrease’ never gets called. The only change I then make is to add a log line before calling that method, and somehow it works. Why is that?
When I try to debug this remotely with GoLand and Delve, it looks like ‘CounterIncrease’ does get called, but with empty parameters, although that might just be how the debugger displays it. The strangest part is still why this works on three of my four machines.

EDIT: I compiled a few times with and without the log line, and it consistently fixes/breaks the code as described.

This looks like a race condition. Inserting the log line makes the function slightly slower, which can change the timing enough to hide the problem. But the code isn’t complete; for example, how is handler called? So it is hard to tell what is happening.
