Newbie request for (very small) code review

I am new to golang. An employer sent me a test exercise and then turned me down without further explanation.

I’d appreciate it if someone could tell me what’s wrong with my code; it’s only 110 lines. Here’s a link with the test description and my implementation:

https://gist.github.com/dmokhov/6539a2d920dec7e140ba02ea6f917ed8

If this isn’t the right place to ask for such help, please point me to a more appropriate one.

I guess it looks reasonable enough; my complaints would be stylistic ones, or niggles.

  • A timeout of 1e9 is better written as time.Second, and a one second timeout may be too short to begin with.
  • Silently ignoring errors isn’t great, although there is a comment to that effect.
  • You could have counted the string instances by reading the response body in chunks. That would be a little trickier (as one chunk might end with G and the next start with o and you need to match that), so it would show a little more skill perhaps, and it would use less memory than buffering each response in its entirety; there’s a sketch of the idea just after this list.
  • On that same note, string(body) causes another copy of the response in memory when there is a bytes.Count you could have used directly on the body.
  • Maybe gen should have taken a reader instead of hardcoding stdin, or take strings over a channel and have some other place do the reading. Then again, no need to make it more general than it is to fulfill the specifications.
  • Breaking out the semaphore mechanics into a minimal semaphore type would make it clearer what that does, but no big deal.
  • Error handling again, on bufio.Scanner. There isn’t much to do if a read fails except complain about it, but it should be noted somewhere.
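
To make the chunked-counting point concrete, here’s a minimal sketch of one way to do it (the name countOccurrences, the 32 KiB chunk size and the little demo in main are all mine, not from the gist). The trick is to carry the last len(needle)-1 bytes of the previous chunk into the next count, so a match split across two reads is still found without ever buffering the whole body:

package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// countOccurrences counts non-overlapping occurrences of needle in the
// stream r, reading fixed-size chunks instead of slurping the whole body.
// It assumes a non-empty needle.
func countOccurrences(r io.Reader, needle []byte) (int, error) {
	const chunkSize = 32 * 1024
	carry := len(needle) - 1 // bytes to keep between reads
	buf := make([]byte, 0, chunkSize+carry)
	chunk := make([]byte, chunkSize)
	total := 0
	for {
		n, err := r.Read(chunk)
		if n > 0 {
			buf = append(buf, chunk[:n]...)
			total += bytes.Count(buf, needle)
			// Keep only the tail: it is shorter than needle, so it can
			// never hold a full match and nothing gets counted twice.
			if len(buf) > carry {
				copy(buf, buf[len(buf)-carry:])
				buf = buf[:carry]
			}
		}
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
	}
}

func main() {
	// OneByteReader forces one-byte reads, so the second "Go" really is
	// split across read boundaries and still gets counted.
	r := iotest.OneByteReader(strings.NewReader("Go gopher, Go!"))
	n, err := countOccurrences(r, []byte("Go"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(n) // 2
}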

All in all, small things. This probably wouldn’t have been a “fail” for me, more of a “needs a little polish”.


@calmh Thanks for the help!

Can you please clarify the comment on the semaphore? What exactly do you mean by “breaking out the semaphore mechanics into a minimal semaphore type”?

Hey @dmokhov,

I only quickly skimmed what you posted, so my bad if I’ve read some of the requirements wrong, but based on what you and Jakob are talking about, rather than using a semaphore-based approach I’d personally go with something simpler: just a limited number (maxWorkers) of worker goroutines running at a time. For example:

package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"sync"
)

type response struct {
	url   string
	err   error
	count int
}

func worker(urls <-chan string, resps chan<- *response, wg *sync.WaitGroup) {
	// grab urls until the urls channel is closed.
	for url := range urls {
		res, err := http.Get(url)
		if err != nil {
			resps <- &response{url, err, 0}
			continue
		}
		slurp, err := ioutil.ReadAll(res.Body)
		res.Body.Close()
		if err != nil {
			resps <- &response{url, err, 0}
			continue
		}
		resps <- &response{
			url,
			nil,
			bytes.Count(slurp, []byte("Go")),
		}
	}

	wg.Done() // decrement wg
}

func main() {
	maxWorkers := 5 // could be flag instead.

	urls := make(chan string)
	resps := make(chan *response)

	var wg sync.WaitGroup

	// start maxWorkers workers.
	wg.Add(maxWorkers)
	for i := 0; i < maxWorkers; i++ {
		go worker(urls, resps, &wg)
	}

	go func() {
		scanner := bufio.NewScanner(os.Stdin)
		for scanner.Scan() {
			urls <- scanner.Text()
		}
		if err := scanner.Err(); err != nil {
			log.Fatalln(err)
		}
		// input is done, close urls channel.
		close(urls)

		// wait for wg counter to be 0 and then close
		// the resps channel to exit the for loop below.
		wg.Wait()
		close(resps)
	}()

	var total int
	for r := range resps {
		if r.err != nil {
			log.Printf("%s: %s\n", r.url, r.err)
			continue
		}
		total += r.count
		fmt.Printf("Count for %s: %d\n", r.url, r.count)
	}
	fmt.Println(total)
}

This is definitely not perfect, but it gives you an example of what I mean by using worker goroutines to limit concurrency as opposed to using semaphores.

P.S. Even in a test I’d probably try to make it a habit to always check errors where necessary (almost everywhere).


@radovskyb

Thanx for your comment! And yes, there’s a requirement on workers - if there is no input, then there should be no workers.

I thought I’d missed something really serious here, but I guess I didn’t (I posted this here, on Google+ and on reddit - everybody says my code is OK).

I mean using something like

type semaphore struct {
	vals chan struct{}
}

func newSemaphore(capacity int) *semaphore {
	s := &semaphore{
		vals: make(chan struct{}, capacity),
	}
	for i := 0; i < capacity; i++ {
		s.put()
	}
	return s
}

func (s *semaphore) take() {
	<-s.vals
}

func (s *semaphore) put() {
	s.vals <- struct{}{}
}

to make the main logic a teensy bit cleaner.

sem := newSemaphore(MaxWorkers)
for ... {
    sem.take()
    go func() {
        // ...
        sem.put()
    }()
}

Ah ok, I didn’t see that, so yeah, you probably wouldn’t go with my type of worker example in that case, but either way it would be pretty easy to create an abstraction around it that only starts a worker when there are fewer than maxWorkers goroutines running at a time.

Also, in case nobody else mentioned it, one other thing that stuck straight out at me was that every time you call get(url) you’re creating a new http client, when instead you should just create one and pass it around.
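
For example, a minimal sketch of what that could look like (the countGo helper, the 10-second timeout and the example URL are placeholders, not anything from your gist):

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"time"
)

// One shared client, created once. Its Transport caches TCP connections
// and it is safe for concurrent use, so every worker can reuse it.
var client = &http.Client{Timeout: 10 * time.Second}

// countGo fetches url with the shared client and counts "Go" in the body.
func countGo(url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	return bytes.Count(body, []byte("Go")), nil
}

func main() {
	n, err := countGo("https://golang.org")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(n)
}

The package-level client is just for brevity here; passing it into your get function as a parameter works just as well.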

Anyway, given that you say you’re new to Go, I still think your effort was pretty good :smiley:

Edit: By the way, like Jakob was saying, even an abstraction around a semaphore could be easier on the eyes. Even something as simple as this works, with a short usage sketch after it (similar to Jakob’s, but I like to fill as I acquire so I don’t need to pre-fill, not that it matters either way lol):

type semaphore chan struct{}

func newSemaphore(capacity int) semaphore {
	return make(chan struct{}, capacity)
}

func (s semaphore) acquire() {
	s <- struct{}{}
}

func (s semaphore) drain() {
	<-s
}

Thanx again!

Good point about http.Client; the source says:

// The Client's Transport typically has internal state (cached TCP
// connections), so Clients should be reused instead of created as
// needed. Clients are safe for concurrent use by multiple goroutines.

But there should be no memory leak, since the http.Clients will be garbage collected, right? So it’s just a performance issue.


A few of my inputs regarding the code:

  1. missing tests
  2. try using a struct and associating methods with it (like classes in Java)
  3. missing validation of the input data
  4. bufio.NewScanner(os.Stdin) should take an io.Reader instead, which makes it easier to write tests (see the sketch after this list)
  5. ioutil.ReadAll(resp.Body): the body could be read in chunks instead
  6. in return strings.Count(string(body), “Go”), “Go” should be a constant or a parameter of the function
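
Points 4 and 6 are easy to show in code. A rough sketch (readURLs, searchTerm and the sample input are made up for the example, not taken from the gist):

package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// The search term as a named constant instead of a literal buried in the
// Count call (point 6); it could just as well be a parameter.
const searchTerm = "Go"

// readURLs collects one URL per line from r. Taking an io.Reader instead
// of hardcoding os.Stdin (point 4) lets a test feed it any input.
func readURLs(r io.Reader) ([]string, error) {
	var urls []string
	s := bufio.NewScanner(r)
	for s.Scan() {
		urls = append(urls, s.Text())
	}
	return urls, s.Err()
}

func main() {
	in := strings.NewReader("https://golang.org\nhttps://golang.org/doc/\n")
	urls, err := readURLs(in)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(urls), "urls, searching for", searchTerm)
}

In the real program main would pass os.Stdin to readURLs, while a _test.go file can pass a strings.Reader instead, which also goes some way towards point 1.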
