I have been working on a terminal app and need to load data from 1,250 URLs. Currently I am doing the following:
    var wg sync.WaitGroup  // note: a WaitGroup must not be copied, so it is passed by pointer below
    var lock sync.RWMutex  // guards the shared map

    for _, text := range urls {
        time.Sleep(time.Millisecond) // crude throttle so the goroutines don't all start at once
        wg.Add(1)
        go parseData(text, &wg, &lock)
    }
    wg.Wait()
    func parseData(url string, wg *sync.WaitGroup, lock *sync.RWMutex) {
        defer wg.Done()
        htmlBody := readHTML(url) // fetch outside the lock so reads can run in parallel
        lock.Lock()               // only the map write is serialized
        defer lock.Unlock()
        addToMap(htmlBody)
    }
Is there a better way to spin up these goroutines so that they execute as fast as the connection allows? Without the sleep, the goroutines are created far too quickly and I get a "too many connections" error.
However, if I put the lock around the readHTML call and let them all spin up "at once", it takes far too long to get all the data.
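For what it's worth, the direction I was considering is a fixed pool of worker goroutines fed from a channel, so concurrency is capped no matter how many URLs there are. This is just a sketch, not something I've settled on: the pool size of 50 is a guess, and it reuses lock, urls, readHTML, and addToMap from above:

    const numWorkers = 50 // guess; tune to what the server and connection tolerate

    jobs := make(chan string)
    var wg sync.WaitGroup

    // Start a fixed number of workers; each pulls URLs until the channel closes.
    for i := 0; i < numWorkers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for url := range jobs {
                htmlBody := readHTML(url)
                lock.Lock()
                addToMap(htmlBody)
                lock.Unlock()
            }
        }()
    }

    // Feed every URL to the pool, then signal the workers to exit.
    for _, url := range urls {
        jobs <- url
    }
    close(jobs)
    wg.Wait()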
Something I noticed with the time.Sleep() approach is that occasionally the HTML fetch still errors out. It's rare, maybe 1 in 100, but still undesirable.
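The occasional failure makes me think I need a retry around the fetch either way. Something like this minimal sketch is what I had in mind; readHTMLErr is a hypothetical error-returning variant of readHTML, and the three attempts / 100ms backoff are arbitrary:

    // readHTMLWithRetry retries the fetch a few times with a growing backoff.
    // Assumes readHTMLErr, an error-returning version of readHTML (hypothetical).
    func readHTMLWithRetry(url string) (string, error) {
        var lastErr error
        for attempt := 0; attempt < 3; attempt++ {
            body, err := readHTMLErr(url)
            if err == nil {
                return body, nil
            }
            lastErr = err
            time.Sleep(time.Duration(attempt+1) * 100 * time.Millisecond) // arbitrary backoff
        }
        return "", lastErr
    }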
Any and all thoughts are welcome, thanks!