I’m writing a program to scan an IP block and find which of its servers host a domain. The program is working fine, and now I want to add workers to improve performance.
I was reading a little bit and realized that the jobs channel was blocked because of the numJobs const. Then I did the following:
numJobs := len(ipAddresses2)
Now it creates as many jobs as the length of the slice. I think the problem now is how to queue all those jobs, because if I use a /20 subnet it will start a huge number of jobs at the same time, and I could end up with a DoS sponsored by my own lack of knowledge hehehe
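For reference, my code roughly follows the standard worker-pool shape. Here is a minimal sketch of what I mean (checkHost is just a stand-in for my real scan, and the addresses are placeholders):

package main

import (
	"fmt"
	"sync"
)

// checkHost is a stand-in for the real scan: it reports whether the host at
// addr serves the domain.
func checkHost(addr string) bool {
	// ... expensive network I/O would go here ...
	return false
}

// worker reads IP addresses from jobs and sends matching addresses to results.
func worker(wg *sync.WaitGroup, jobs <-chan string, results chan<- string) {
	defer wg.Done()
	for addr := range jobs {
		if checkHost(addr) {
			results <- addr
		}
	}
}

func main() {
	ipAddresses2 := []string{"192.0.2.1", "192.0.2.2", "192.0.2.3"}

	// Buffer the channels with the number of jobs so queuing never blocks.
	numJobs := len(ipAddresses2)
	jobs := make(chan string, numJobs)
	results := make(chan string, numJobs)

	// Start a fixed number of workers.
	wg := new(sync.WaitGroup)
	for w := 0; w < 3; w++ {
		wg.Add(1)
		go worker(wg, jobs, results)
	}

	// Queue every address and close the channel so the workers stop.
	for _, addr := range ipAddresses2 {
		jobs <- addr
	}
	close(jobs)

	wg.Wait()
	close(results)

	for addr := range results {
		fmt.Println("hosts the domain:", addr)
	}
}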
Since the core of your tool is expensive and slow I/O, what you need to limit is the number of those operations running concurrently, not the number of goroutines. You can start every job in its own goroutine, since goroutines are cheap; you only have to make sure that only a limited number of them executes the expensive and slow I/O operations at the same time.
The Go way is not to use worker pools but semaphores and wait groups to limit concurrency. Try something like this.
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// The number of jobs to execute. Instead of looping over a list, we simply
// loop this many times.
var jobcount = 1000

// We limit the number of concurrently executing jobs.
var limit = 10

func main() {
	// The `main` func must not finish before all jobs are done. We use a
	// WaitGroup to wait for all of them.
	wg := new(sync.WaitGroup)

	// We use a buffered channel as a semaphore to limit the number of
	// concurrently executing jobs.
	sem := make(chan struct{}, limit)

	// We run each job in its own goroutine but use the semaphore to limit
	// their concurrent execution.
	for i := 0; i < jobcount; i++ {
		// This job must be waited for.
		wg.Add(1)

		// Acquire the semaphore by writing to the buffered channel. If the
		// channel is full, this call will block until another job has
		// released it.
		sem <- struct{}{}

		// Now we have acquired the semaphore and can start a goroutine for
		// this job. Note that we must capture `i` as an argument.
		go func(i int) {
			// When the work of this goroutine has been done, we decrement the
			// WaitGroup.
			defer wg.Done()

			// When the work of this goroutine has been done, we release the
			// semaphore.
			defer func() { <-sem }()

			// Do the actual work.
			result := work(i)
			fmt.Printf("[%d] done, result is %d\n", i, result)
		}(i)
	}

	// Wait for all jobs to finish.
	wg.Wait()
}

func work(i int) int {
	// Here we simulate an expensive and slow operation.
	time.Sleep(time.Duration(rand.Intn(100)) * time.Millisecond)
	return i * 2
}
Thank you very much for such an amazing example. I will check how to implement it with my scanBlock function, iterating over the subnet, and let you know.
I did some experiments, but so far it is not working for me.
What I did was change the for loop over jobcount to loop over the ipAddresses2 slice, so each job gets the IP it should scan. Somehow my “brilliant” idea didn’t work hehehe.
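For reference, the shape I am going for is something like the sketch below, with scanHost as a stand-in for the per-address check inside my scanBlock (its real signature may differ):

package main

import (
	"fmt"
	"sync"
	"time"
)

// scanHost is a stand-in for the real per-address check; it reports whether
// the host at addr serves the domain.
func scanHost(addr string) bool {
	time.Sleep(50 * time.Millisecond) // simulate slow network I/O
	return false
}

func main() {
	ipAddresses2 := []string{"192.0.2.1", "192.0.2.2", "192.0.2.3"}

	// Limit the number of concurrently running scans.
	limit := 10
	sem := make(chan struct{}, limit)
	wg := new(sync.WaitGroup)

	// Same pattern as in the example above, but ranging over the slice
	// instead of counting up to jobcount.
	for _, addr := range ipAddresses2 {
		wg.Add(1)
		sem <- struct{}{} // acquire the semaphore

		go func(addr string) {
			defer wg.Done()
			defer func() { <-sem }() // release the semaphore

			if scanHost(addr) {
				fmt.Println(addr, "hosts the domain")
			}
		}(addr)
	}

	wg.Wait()
}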