Synchronization example [bad code]


(Les Way) #1
var a string
var done bool

func setup() {
    a = "hello, world"
    done = true
}

func main() {
    go setup()

    for !done {
    }

    print(a)
}

This tutorial claims that the main function above might, in some cases, never finish:
“Worse, there is no guarantee that the write to done will ever be observed by main, since there are no synchronization events between the two threads. The loop in main is not guaranteed to finish.”

How is that possible? Is main no longer reading the heap variable done? Please explain.


(Christophe Meessen) #2

The compiler may optimize the loop so that it never re-reads done (for example, by reading done once before the loop starts), because the loop body doesn’t do anything. It can do this because it assumes that done will never be modified by another goroutine. That assumption is allowed by the language specification, so the compiler can aggressively optimize the compiled code.

The consequence is that we then need to be explicit and use special variables and functions when we want to avoid such optimizations.

It is not a very good example to justify the use of synchronization, because the problem isn’t obvious to a reader who isn’t aware of the assumption made by the compiler.
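As a rough sketch of what “special variables and functions” can look like in Go, the flag below is an atomic.Bool from sync/atomic (this assumes Go 1.19 or later). The helper name waitForSetup and the use of local variables instead of globals are choices made for this illustration; they are not part of the tutorial.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// waitForSetup is a hypothetical helper: it starts the writer goroutine
// and polls an atomic flag until the writer has finished.
func waitForSetup() string {
	var (
		a    string
		done atomic.Bool // atomic flag instead of a plain bool
	)

	go func() {
		a = "hello, world"
		done.Store(true) // atomic store, ordered after the write to a
	}()

	// The atomic load cannot be hoisted out of the loop, and once it
	// observes true, the earlier write to a is visible as well.
	for !done.Load() {
		runtime.Gosched() // let other goroutines run while polling
	}
	return a
}

func main() {
	fmt.Println(waitForSetup())
}
```

This keeps the polling structure of the original example, so it is still a busy loop; it only removes the data race.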


(Holloway) #3

There is a data race between the main goroutine (whose for loop keeps reading the done variable) and the setup goroutine (which writes to done): neither one coordinates with the other.

The scenario is like slicing a spring onion while your sibling pushes the long vegetable toward you. The two of you don’t talk or synchronize with one another. With luck, the two of you get the job done. However, there is a great chance that:

  1. You chop your sibling’s hand because he/she pushes it too fast.
  2. The job doesn’t get done because your sibling thought the work was completed.
  3. The two of you get into a fight when you accidentally chop your sibling’s fingernail. The job doesn’t get done and possibly you two burn the house down.

It is a concurrency problem, not a matter of heap or stack memory. In concurrent code, whether a memory location (variable) is accessed atomically or non-atomically should be clearly planned and stated.


Atomic Synchronization

Whenever two or more concurrent processes both read and write shared memory, the accesses need to be atomic.

In the example above, notice that setup sets the done variable to true at its own pace, while main polls done at its own discretion. Five scenarios can occur:

  1. done is expected false, main is reading it, setup is not writing it
  2. done is expected true, main is not reading it, setup is writing it
  3. done is expected true, main is reading it, setup is writing it
  4. done is expected false, main is reading it, setup is writing it
  5. done is expected true, main is reading it, setup is still writing it

Ideally, we expect cases 1 and 2, which is how the sample code will most likely run. However, in multitasking, cases 3, 4, and 5 are the ones we must handle; they are the foundation of race conditions. To handle cases 3 and 4, we want one side to wait so that the value is properly read or properly written (hence the word guarantee). Such a guarantee usually works like this:

  1. Check whether done is free for access. If yes, lock access to one process. If no, go to step 5.
  2. Perform the read/write operations on done.
  3. Release the lock for others to use.
  4. Finish processing the variable. Go to step 1 again for the next process.
  5. Enter wait mode, pending a signal that done is free. When it is ready for use, go to step 1.

This is known as mutex locking, the most conventional way to do synchronization. Go also has channels, which facilitate a different approach to synchronization (message passing). That is also powerful.
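The locking steps above can be sketched with sync.Mutex from the standard library. The type guardedFlag and the helper runWithMutex are hypothetical names invented for this illustration; Lock covers the "check, lock, or wait" steps, and Unlock covers the release step.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// guardedFlag is a hypothetical type that pairs the flag with its lock.
type guardedFlag struct {
	mu   sync.Mutex
	done bool
}

// set writes the flag while holding the lock.
func (f *guardedFlag) set() {
	f.mu.Lock()   // acquire the lock, waiting if another goroutine holds it
	f.done = true // perform the write
	f.mu.Unlock() // release the lock for others
}

// get reads the flag while holding the lock.
func (f *guardedFlag) get() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.done
}

// runWithMutex is a hypothetical helper mirroring main and setup.
func runWithMutex() string {
	var (
		a    string
		flag guardedFlag
	)

	go func() {
		a = "hello, world"
		flag.set()
	}()

	for !flag.get() {
		runtime.Gosched() // yield between polls
	}
	return a
}

func main() {
	fmt.Println(runWithMutex())
}
```

Every read and write of done now happens under the same lock, so main can no longer observe a half-finished update. The sketch still polls, so it only illustrates the locking steps.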

Case 5 is very chaotic and depends on the system, the CPU specification, operating system rules, etc. Either way, it will end up nasty. It is the ultimate race condition scenario, which we must avoid at all times.


Relating this back to the code: the easiest way to synchronize main and setup is to make done a channel instead of a global variable, which guarantees ordered data access. Here is the tested, amended code:

package main

var (
	a = ""
)

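func setup(done chan bool) {
	a = "hello world"
	done <- true // signal main; the write to a comes before this send
}

func main() {
	done := make(chan bool) // unbuffered channel used only as a signal
	go setup(done)
	<-done // block until setup has sent its signal
	print(a)
}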

Notice that it is now clear that main waits for the done signal before proceeding, leaving setup full freedom to write the signal. The receive blocks until setup sends, so once main has received the signal, it continues its work.
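A common variation, offered here only as a sketch, is to close the channel instead of sending a value on it. The element type struct{} and the helper name runWithClose are choices made for this illustration. A receive on a closed channel returns immediately, so any number of waiters would be released by the single close.

```go
package main

import "fmt"

// runWithClose is a hypothetical helper: the writer goroutine signals
// completion by closing the channel rather than sending on it.
func runWithClose() string {
	var a string
	done := make(chan struct{}) // carries no data, only the signal

	go func() {
		a = "hello world"
		close(done) // the write to a is ordered before the close
	}()

	<-done // blocks until the channel is closed
	return a
}

func main() {
	fmt.Println(runWithClose())
}
```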


One Vital Note

Looping on a sentinel variable to synchronize two or more processes is inefficient and can cause trouble on most CPUs (except those specifically designed for it). The loop keeps consuming CPU cycles to perform the “wait”, rather than releasing the CPU for other threads as expected.

That’s why Go always recommends using mutex locking or channels, whichever makes sense.

In short, wait is not a simple function (https://golang.org/src/sync/mutex.go - see lockSlow).

