I’m fairly new to Go, so sorry if I’m asking something stupid here.
I’m using wait groups to make sure goroutines have finished before continuing with other parts of my application; as I understand it, that’s the idiomatic way to do it.
So for example I have this:
package main

import (
	"fmt"
	"sync"
)

var wg sync.WaitGroup

func main() {
	wg.Add(2)
	go doStuff()
	go doOtherStuff()
	wg.Wait()
	fmt.Println("Stuff done")
}

func doStuff() {
	fmt.Println("Doing stuff")
	wg.Done()
}

func doOtherStuff() {
	fmt.Println("Doing other stuff")
	wg.Done()
}
If I wanted to add a new goroutine, I would also need to remember to increase the integer passed to wg.Add from 2 to 3. Why was it designed this way, instead of letting you pass the function directly to wg.Add and not forcing you to call wg.Done() every time, which is easy to forget? Is there a technical reason for this?
So for example you could write:
package main

import (
	"fmt"
	"sync"
)

var wg sync.WaitGroup

func main() {
	// Hypothetical API: the real sync.WaitGroup.Add takes an int, so this does not compile.
	wg.Add(doStuff())
	wg.Add(doOtherStuff())
	wg.Wait()
	fmt.Println("Stuff done")
}

func doStuff() {
	fmt.Println("Doing stuff")
}

func doOtherStuff() {
	fmt.Println("Doing other stuff")
}
While researching this I found the Paralellizer library, so it seems I’m not alone in trying to make sense of this design decision.
Thank you and again, sorry if this is a terribly stupid question.
You could treat wg.Add(1) as a statement that has to be added before each concurrent function call. Therefore you would do:
wg.Add(1)
go doStuff()
wg.Add(1)
go doOtherStuff()
wg.Add(1)
go doMoreStuff()
wg.Wait()
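One reason to keep wg.Add(1) right next to the go statement, in the launching goroutine, is that the counter is then guaranteed to be incremented before wg.Wait() runs. If the increment happened inside the new goroutine, Wait could observe a zero counter and return too early. Here is a minimal sketch of the pairing in a loop; the names worker and squares are made up for illustration and are not part of the question:

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical task: it stores id*id in its own slot
// and signals completion through the shared WaitGroup.
func worker(id int, wg *sync.WaitGroup, results []int) {
	defer wg.Done() // runs even if the function returns early
	results[id] = id * id
}

// squares starts one goroutine per index and waits for all of them.
func squares(n int) []int {
	var wg sync.WaitGroup
	results := make([]int, n)
	for i := 0; i < n; i++ {
		wg.Add(1) // counted here, before the goroutine is started
		go worker(i, &wg, results)
	}
	wg.Wait() // returns only after every worker has called Done
	return results
}

func main() {
	fmt.Println(squares(4)) // prints [0 1 4 9]
}
```

Each goroutine writes to a different slice element, so no extra locking is needed in this sketch. Note that the WaitGroup is passed by pointer; copying it would give each worker its own counter.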
You can separate the concern of concurrency from the functions by using function literals. For example:
func main() {
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		doStuff()
	}()

	wg.Add(1)
	go func() {
		defer wg.Done()
		doOtherStuff()
	}()

	wg.Wait()
	fmt.Println("Stuff done")
}
This way, your functions need to know nothing about concurrency, which makes the API nicer. This change also lets you move wg inside main so that it is no longer global.
The advantages become more apparent when you have to launch many goroutines inside a loop as showcased in the official WaitGroup example.
If you have many functions of the same signature, you could even use the pattern shown in the example:
func main() {
	var wg sync.WaitGroup
	var fns = []func(){
		doStuff,
		doOtherStuff,
		doMoreStuff,
		doEvenMoreStuff,
	}
	for _, f := range fns {
		wg.Add(1)
		go func(fn func()) {
			defer wg.Done()
			fn()
		}(f)
	}
	wg.Wait()
	fmt.Println("Stuff done")
}
Last but not least, if you really like the compactness achieved by using the Paralellizer library, you could accomplish a similar result with just 10 lines of code:
func main() {
	var wg waitGroup
	wg.Go(doStuff)
	wg.Go(doOtherStuff)
	wg.Go(doMoreStuff)
	wg.Go(doEvenMoreStuff)
	wg.Wait()
	fmt.Println("Stuff done")
}

type waitGroup struct {
	sync.WaitGroup
}

func (wg *waitGroup) Go(f func()) {
	wg.Add(1)
	go func(fn func()) {
		defer wg.Done()
		fn()
	}(f)
}
The existing API is more flexible and thus more general, and can be used to build the API you want. We could not have done the opposite.
For example, imagine we are building a distributed database of some kind. We have n peers that we talk to, and we want to track when they have all applied an update. We can broadcast the update, call wg.Add(n), and launch a goroutine that calls wg.Wait(). In some other event loop we track incoming acks and call wg.Done() for each one. This works fine with the current WaitGroup API, but wouldn’t be possible if all it could do was wrap goroutines.
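To make that scenario concrete, here is a rough sketch under simplifying assumptions: the peers are simulated by goroutines that send their ID on a channel, and broadcastAndWait, acks and applied are hypothetical names invented for this illustration. The point is that the counter tracks expected acks, not goroutines, and that Done is called from an event loop that did not start the work.

```go
package main

import (
	"fmt"
	"sync"
)

// broadcastAndWait pretends to send an update to n peers and returns
// the number of acks handled once all peers have confirmed.
func broadcastAndWait(n int) int {
	var wg sync.WaitGroup
	acks := make(chan int) // peer IDs acknowledging the update

	wg.Add(n) // one unit per expected ack, not per goroutine

	// Waiter goroutine: signals when every ack has been counted.
	applied := make(chan struct{})
	go func() {
		wg.Wait()
		close(applied)
	}()

	// Simulated peers (assumption: a real system would use the network).
	for id := 0; id < n; id++ {
		go func(peer int) { acks <- peer }(id)
	}

	// Event loop: handles incoming acks and calls Done for each one.
	received := 0
	for {
		select {
		case <-acks:
			received++
			wg.Done()
		case <-applied:
			return received
		}
	}
}

func main() {
	fmt.Println("acks received:", broadcastAndWait(3)) // prints "acks received: 3"
}
```

A wrapper that only accepts a function to run in a goroutine could not express this, because here no single goroutine corresponds to one unit of the counter.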