Channel and select - Go tour : tree

FractalWire · August 13, 2017, 12:21pm

Hi everybody,

I’m a new go user and, while doing the go tour, I got stuck with an incomprehensible behaviour, at least for me.

It specifically relate to channel and select behaviour when trying to resolve the tree exercise.

Here is my code :


import (
	"fmt"
	"golang.org/x/tour/tree"
)

// Walk walks the tree t sending all values
// from the tree to the channel ch.
func Walk(t *tree.Tree, ch chan int) {
	walk(t, ch)
	close(ch)
}

func walk(t *tree.Tree, ch chan int) {
	if t != nil {
		walk(t.Left, ch)
		ch <- t.Value
		walk(t.Right, ch)
	}
}

// Same determines whether the trees
// t1 and t2 contain the same values.
func Same(t1, t2 *tree.Tree) bool {
	ch1 := make(chan int)
	ch2 := make(chan int)
	go Walk(t1, ch1)
	go Walk(t2, ch2)

	m1, m2 := map[int]bool{}, map[int]bool{}

	var ok1, ok2 bool
	var v1, v2 int
	for {
		select {
		case v1, ok1 = <-ch1:
			if !ok1 {
				break
			}
			m1[v1] = true
		case v2, ok2 = <-ch2:
			if !ok2 {
				break
			}
			m2[v2] = true
		}
		if !ok1 && !ok2 {
			break
		}
	}
	
	fmt.Println(m1, m2)
	
	if len(m1) != len(m2) {
		return false
	}
	
	for k := range m1 {
		_, exists := m2[k]
		if !exists {
			return false
		}
	}
	return true
}

func main() {
	ch := make(chan int)
	go Walk(tree.New(1), ch)
	/*for v := range ch {
		fmt.Println(v)
	}*/

	fmt.Println(Same(tree.New(1), tree.New(1)))
	fmt.Println(Same(tree.New(1), tree.New(2)))
}

Let zoom in a little bit to the part that is confusing me :

// t1 and t2 contain the same values.
func Same(t1, t2 *tree.Tree) bool {
	ch1 := make(chan int)
	ch2 := make(chan int)
	go Walk(t1, ch1)
	go Walk(t2, ch2)

	m1, m2 := map[int]bool{}, map[int]bool{}

	var ok1, ok2 bool
	var v1, v2 int
	for {
		select {
		case v1, ok1 = <-ch1:
			if !ok1 {
				break
			}
			m1[v1] = true
		case v2, ok2 = <-ch2:
			if !ok2 {
				break
			}
			m2[v2] = true
		}
		if !ok1 && !ok2 {
			break
		}
	}
	
	fmt.Println(m1, m2)
	
	if len(m1) != len(m2) {
		return false
	}
	
	for k := range m1 {
		_, exists := m2[k]
		if !exists {
			return false
		}
	}
	return true
}

In this for select, the case that involve ch1 is never treated. Which lead to m1 never being populated and in return the Same function gives a bad result.

My understanding is that select should choose one of those two channels pseudo randomly if both channels are available. But for this code, it only chooses ch2.

If I remove the case with ch2, the map m1 is correctly populated.

So here it is, I’m not exactly sure what’s going on here, if someone could enlight me.

Thanks,

Olivier

joncalhoun · August 13, 2017, 4:55pm

I’ll reply in more detail later when not on mobile, but at first glance it looks like your issue may be the break statements inside of your for loop. If !ok2 { break } will terminate the entire for loop before ch1 is ever read from if ch2 gets closed early on.

FractalWire · August 13, 2017, 6:00pm

Thanks for the answer.

This !ok2 { break } was to avoid adding a non value (0) to the map when ch2 closes. It only terminate the select statement, not the for loop however.

I can replace that with

if ok1 {
	m1[v1] = true
}

However, the problem (if the problem is here) will still be there next lines with :

		if !ok1 && !ok2 {
			break
		}

This is meant to stop the loop when both channel closed. However, as ok1 is false by default, if the select statement never choose to select the case with ch1, then ok1 will stayed false, breaking the loop when ch2 closes.

And this is precisely what I don’t understand : why select never chooses the ch1 case ? If it’s a random choice, it is unlikely that it chooses 10 times the case with ch2…

To check that the problem is not entirely due to the fact that ch2 closes early on, I can also initialise ok1 and ok2 to true before the loop.

But then the problem is something else : it becomes an infinite loop. As if ch1 had nothing to send (and thus never updating ok1 to false when it closes)…

NobbZ · August 13, 2017, 7:24pm

A closed channel is always ready to receive. Therefore, after c2 has been closed c1 will never have the slightliest chance to get a timeslice in the select. Whatever channel you close first, will never ever give the other one a chance. A nil channel though will never ever have a message, therefore, set the channel to nil on first !ok, also check for channels beeing nil for final break, then we can remove outer declaration of v and ok variables:

for {
	select {
	case v, ok := <-ch1:
		if !ok {
			ch1 = nil
		}
		m1[v] = true
	case v, ok := <-ch2:
		if !ok {
			ch2 = nil
		}
		m2[v] = true
	}
	if ch1 == nil && ch2 == nil {
		break
	}
}

FractalWire · August 13, 2017, 8:33pm

A closed channel is always ready to receive.

Oh.
Well, that’s weird, I was actually expecting the opposite…
(can’t find this in the doc however : The Go Programming Language Specification - The Go Programming Language)

Also, I would expect select to check if ch2 is ready to send, not to receive as it looks like this : case v, ok := <-ch2:. Does select doesn’t care about channel direction when checking on a case ? Or is it that I mixed up the definition of sending and receving ?
(for me ch2 send something to v,ok here)

Finally, I’m still not sure to understand why ch2 get the priority here, as the two goroutine Walk are executed concurrently, there should be some data coming from ch1 and ch2 almost simultaneously.
But here, it’s not the case and ch2 get treated before ch1.

NobbZ · August 13, 2017, 8:40pm

It is hidden well:

A receive operation on a closed channel can always proceed immediately, yielding the element type’s zero value after any previously sent values have been received. (The Go Programming Language Specification - The Go Programming Language)

In a select you receive from the channel and assign the received “message” to the variables on the LHS.

<-ch means “receive from channel” while ch<- means “send into channel”.

Just think about it, that you are the variable and ch is your Radio. When the radio:

you ← radio: You are receiving
radio ← you: you are sending

FractalWire · August 13, 2017, 9:21pm

Thanks it makes more sense now.

joncalhoun · August 14, 2017, 1:15am

You are right. I glanced at the code too quickly.

@NobbZ is correct - the closed channel is likely starving your other goroutine. If you change the var ok1, ok2 bool line to this you can see it more explicitly:

var ok1, ok2 bool = true, true

By default, both of these are false, so as soon as it reads ok2 as false it terminates the for loop (ok1 defaulted to false, ok2 became false due to the closed channel).

On the go playground things are less random iirc (as this is better for looking at examples) so that might be what is happening. Eg Go Playground - The Go Programming Language should always print in the same order for you on the go playground, but won’t necessarily locally.

Given everything covered, my best advice is to rethink how you are doing this. Try to break it into functions that perhaps aren’t concurrent at first, then think about how you could use concurrency and closures to turn them concurrent. Eg you might start with this:

package main

import (
	"fmt"

	"golang.org/x/tour/tree"
)

func main() {
	t := tree.New(1)
	m := ValueMap(t)
	fmt.Println(m)
}

func ValueMap(t *tree.Tree) map[int]bool {
	res := make(map[int]bool, 0)
	intChan := make(chan int)
	go func() {
		walk(t, intChan)
		close(intChan)
	}()
	for {
		val, ok := <-intChan
		if !ok {
			break
		}
		res[val] = true
	}
	return res
}

func walk(t *tree.Tree, ch chan int) {
	if t != nil {
		walk(t.Left, ch)
		ch <- t.Value
		walk(t.Right, ch)
	}
}

(See Go Playground - The Go Programming Language to run it)

This ValueMap function will walk a tree using a goroutine and reads values back into a map then returns the map when it is all done. Since we wrote this specifically for one tree we don’t need to think about both being done, we simply need to think about the simpler use case of one tree.

From there we can expand into the more complicated case - how do we do this for two trees then compare the results? Perhaps walking a tree is complicated work and we don’t want to calculate each map individually, so ideally we have both maps being calculated concurrently.

This is MUCH easier to do (at least for me) with closures using the functions we just wrote. We can also use a sync.WaitGroup to make sure both pieces of code finish before we compare the maps.

The code to calculate both maps would be something like this:


// Same determines whether the trees
// t1 and t2 contain the same values.
func Same(t1, t2 *tree.Tree) bool {
	var m1, m2 map[int]bool
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		m1 = ValueMap(t1)
		wg.Done()
	}()
	go func() {
		m2 = ValueMap(t2)
		wg.Done()
	}()
	wg.Wait()
	// Yay our maps are ready!
}

From there we can use the same code you had before to compare the maps, and we get code like this: Go Playground - The Go Programming Language

You could technically make your code you wrote before work, but imo it gets confusing because channel buffer sizes changing could make one correct implementation suddenly break. It is much easier for me to write the simple case - process a single tree and give me a map of its values - then to make that run concurrently.

Hope that helps, and sorry for the bad advice earlier!

FractalWire · August 14, 2017, 9:54am

Thanks for this complete answer and the good advices.

On the go playground things are less random iirc (as this is better for looking at examples) so that might be what is happening. Eg Go Playground - The Go Programming Language should always print in the same order for you on the go playground, but won’t necessarily locally.

After playing a little bit with your exemple, I think the non-randomness of the case selection is actually due to the fact that the work done took no time. As the randomness of Select is probably linked to the current time, it will select one channel then the other.
If the fill function take some time to execute (with a time.Sleep), then select start to mix both channels as expected. (Go Playground - The Go Programming Language)

You could technically make your code you wrote before work, but imo it gets confusing because channel buffer sizes changing could make one correct implementation suddenly break.

I agree your code is cleaner and would avoid cascading effect but I’m not sure to get your point on buffer size. Could you show me how the buffer size could break the code ?

joncalhoun · August 14, 2017, 4:45pm

I’m not sure how easily it could be done on the code I showed you, but here is an example: Go Playground - The Go Programming Language

That code works fine as is because all int channels must be written to before done ever gets a message, but if we were to add a buffer to int channels it would make this code invalid because our fill function could mark work as done before the for doWork { ... } loop actually processed all the data. That is, the program might terminate with data still buffered in our channels.

To see this in action, change the ch1 and ch2 lines to add a buffer like so:

ch1 := make(chan int, 10)
ch2 := make(chan int, 10)

Go Playground - The Go Programming Language - see how this only prints out 11 values now? 9 of them are written to channels, but never get read before the done channel receives a message and is processed in the select block.

FractalWire · August 14, 2017, 8:07pm

Got it, thank you !

system · November 12, 2017, 8:07pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.