I am new to golang. An employer sent me a test exercise and then turned me down without further explanation.
I'd appreciate it if someone could tell me what's wrong with my code; it's only 110 lines. Here's a link with the test description and implementation:
I guess it looks reasonable enough; my complaints would be stylistic, or just niggles.
A timeout of 1e9 is better written as time.Second, and a one-second timeout may be too short to begin with.
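That is, something along these lines (a sketch; I'm going off the description, so the client construction is assumed):

    client := &http.Client{
        Timeout: time.Second, // clearer than the untyped 1e9; consider a larger value
    }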
Silently ignoring errors isn’t great, although there is a comment to that effect.
You could have counted the string instances by reading the response body in chunks. That would be a little trickier (one chunk might end with "G" and the next start with "o", and you need to match across that boundary), so it would show a little more skill perhaps, and it would use less memory than buffering each response in its entirety.
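A minimal sketch of what I mean (the countInReader name is mine; it assumes the usual bytes and io imports, and the simple carry-over is only correct because a pattern like "Go" can't overlap itself):

    // countInReader counts occurrences of sep in r without buffering the
    // whole stream. It carries the last len(sep)-1 bytes of each chunk
    // forward so a match split across two reads is still found.
    func countInReader(r io.Reader, sep []byte) (int, error) {
        buf := make([]byte, 32*1024)
        keep, total := 0, 0
        for {
            n, err := r.Read(buf[keep:])
            data := buf[:keep+n]
            total += bytes.Count(data, sep)
            // Carry a tail shorter than sep into the next iteration.
            keep = len(sep) - 1
            if keep > len(data) {
                keep = len(data)
            }
            copy(buf, data[len(data)-keep:]) // copy handles the overlap
            if err != nil {
                if err == io.EOF {
                    err = nil
                }
                return total, err
            }
        }
    }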
On that same note, string(body) causes another copy of the response in memory when there is a bytes.Count you could have used directly on the body.
Maybe gen should have taken a reader instead of hardcoding stdin, or take strings over a channel and have some other place do the reading. Then again, no need to make it more general than it is to fulfill the specifications.
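For example, gen could look something like this (I haven't seen the original signature, so this is a sketch):

    // gen sends each line of r on the returned channel and closes it when
    // input is exhausted. Taking an io.Reader instead of hardcoding
    // os.Stdin makes it trivial to test with a strings.Reader.
    func gen(r io.Reader) <-chan string {
        out := make(chan string)
        go func() {
            defer close(out)
            scanner := bufio.NewScanner(r)
            for scanner.Scan() {
                out <- scanner.Text()
            }
            if err := scanner.Err(); err != nil {
                log.Println(err) // at least report read failures
            }
        }()
        return out
    }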
Breaking out the semaphore mechanics into a minimal semaphore type would make it clearer what that code does, but it's no big deal.
Error handling again, on bufio.Scanner. There's not much to do if a read fails except complain about it, but it should be noted somewhere.
All in all, small things. This probably wouldn’t have been a “fail” to me, more “need a little polish”.
Can you please clarify your comment on the semaphore? What exactly do you mean by "breaking out the semaphore mechanics into a minimal semaphore type"?
I only quickly skimmed what you posted, so if I've read some of the requirements wrong then my bad. But based on what you and Jakob are talking about, rather than a semaphore-based approach I'd personally go with something simpler: just a limited number (maxWorkers) of worker goroutines running concurrently. For example:
package main

import (
    "bufio"
    "bytes"
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
    "os"
    "sync"
)

type response struct {
    url   string
    err   error
    count int
}

func worker(urls <-chan string, resps chan<- *response, wg *sync.WaitGroup) {
    // grab urls until the urls channel is closed.
    for url := range urls {
        res, err := http.Get(url)
        if err != nil {
            resps <- &response{url, err, 0}
            continue
        }
        slurp, err := ioutil.ReadAll(res.Body)
        res.Body.Close()
        if err != nil {
            resps <- &response{url, err, 0}
            continue
        }
        resps <- &response{
            url,
            nil,
            bytes.Count(slurp, []byte("Go")),
        }
    }
    wg.Done() // decrement wg
}

func main() {
    maxWorkers := 5 // could be a flag instead.
    urls := make(chan string)
    resps := make(chan *response)
    var wg sync.WaitGroup

    // start maxWorkers workers.
    wg.Add(maxWorkers)
    for i := 0; i < maxWorkers; i++ {
        go worker(urls, resps, &wg)
    }

    go func() {
        scanner := bufio.NewScanner(os.Stdin)
        for scanner.Scan() {
            urls <- scanner.Text()
        }
        if err := scanner.Err(); err != nil {
            log.Fatalln(err)
        }
        // input is done, close urls channel.
        close(urls)
        // wait for wg counter to be 0 and then close
        // the resps channel to exit the for loop below.
        wg.Wait()
        close(resps)
    }()

    var total int
    for r := range resps {
        if r.err != nil {
            log.Printf("%s: %s\n", r.url, r.err)
            continue
        }
        total += r.count
        fmt.Printf("Count for %s: %d\n", r.url, r.count)
    }
    fmt.Println(total)
}
This is definitely not perfect, but it's just an example of what I mean by using worker goroutines for concurrency limiting, as opposed to using semaphores.
P.S. Even in a test I’d probably try to make it a habit to always check errors where necessary (almost everywhere).
Ah, OK, I didn't see that, so you probably wouldn't go with my type of worker example in that case. But either way, it would be pretty easy to build an abstraction around it that only starts a worker when fewer than maxWorkers goroutines are running at a time.
Also, in case nobody else mentioned it, one other thing that stuck straight out at me: every time you call get(url) you're creating a new http.Client, when instead you should create one and pass it around.
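A sketch of what I mean (get and the timeout value are assumptions based on the earlier comments):

    // One client, created once and reused by every request; its Transport
    // caches TCP connections across calls.
    var client = &http.Client{Timeout: time.Second}

    func get(url string) (*http.Response, error) {
        return client.Get(url)
    }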
Anyway, I still think that, given you say you're new to Go, your effort was pretty good.
Edit: By the way, like Jakob was saying, even an abstraction around a semaphore could be easier on the eyes. Even something as simple as this works (similar to Jakob's, but I like to fill as I acquire so I don't need to pre-fill, not that it matters either way lol):
type semaphore chan struct{}

func newSemaphore(capacity int) semaphore {
    return make(chan struct{}, capacity)
}

func (s semaphore) acquire() {
    s <- struct{}{}
}

func (s semaphore) drain() {
    <-s
}
From the http.Client documentation:

// The Client's Transport typically has internal state (cached TCP
// connections), so Clients should be reused instead of created as
// needed. Clients are safe for concurrent use by multiple goroutines.
But there should be no memory leak, since the http.Clients will be garbage collected, right? So it's just a performance issue.