Concurrent request handling?

I have the following code:

package main

import (
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("OK"))
	})
	http.ListenAndServe(":3000", mux)
}

I ran this with go run main.go, then opened two terminal instances and executed the following command in each of them:

for i in `seq 1 1000`; do
  curl localhost:3000/status
done

Some requests eventually took a long time to process. I see this happen sometimes in production (a request takes more time than it should), and I was able to reproduce it easily with this recipe.

To aid understanding, here is a video showing what happens.

I actually found this with labstack/echo, so I first opened an issue there, but the behavior is the same with the framework-free example provided here.

Am I doing something wrong? How can I fix this?

For further info:

$ go version
go version go1.6 darwin/amd64

$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/carlos/Code/Go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.6/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.6/libexec/pkg/tool/darwin_amd64"
GO15VENDOREXPERIMENT="1"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fno-common"
CXX="clang++"
CGO_ENABLED="1"

I suspect ephemeral port exhaustion. (That is, it’s the client being slow, not Go)

So, curl is the one holding things up?

Can you try your experiment with a real load-testing tool? wrk or rakyll/boom are good choices. Do not use ab; it is known to be broken, and Go programmers do not consider the results from that tool reliable.


Yeah, it doesn’t seem to happen with rakyll/boom:

$ boom -n 20000 -c 100 http://localhost:3000/status
Summary:
  Total:	1.1593 secs
  Slowest:	0.0746 secs
  Fastest:	0.0002 secs
  Average:	0.0055 secs
  Requests/sec:	17251.9174
  Total data:	40000 bytes
  Size/request:	2 bytes

Status code distribution:
  [200]	20000 responses

Response time histogram:
  0.000 [1]	|
  0.008 [17396]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.015 [1994]	|∎∎∎∎
  0.022 [290]	|
  0.030 [109]	|
  0.037 [51]	|
  0.045 [19]	|
  0.052 [40]	|
  0.060 [73]	|
  0.067 [20]	|
  0.075 [7]	|

Latency distribution:
  10% in 0.0024 secs
  25% in 0.0034 secs
  50% in 0.0045 secs
  75% in 0.0058 secs
  90% in 0.0085 secs
  95% in 0.0117 secs
  99% in 0.0314 secs

I also tried ab with the same params and got apr_socket_recv: Operation timed out (60).

Well, I guess the problem lies with curl, then. I'd just like to understand exactly why this happens…


The problem isn’t with curl per se. It is a fundamental property of TCP that when a connection is closed (and exiting the program closes the connection), the side that initiated the close must pass through the FIN_WAIT_2 and TIME_WAIT states before the ip:port pair can be safely reused. During that time the ip:port pair is unavailable for another outgoing (or incoming, if you’re benchmarking over the loopback) connection, so the kernel must find another free pair. There are at most 65536 ports when connecting over the loopback, and not all of them are available.

It is possible to avoid most of the TIME_WAIT delay with careful coding on both the client and the server, by keeping connections alive and reusing them. That cannot happen when the client exits after every request, which is exactly what happens when curl is run in a loop in a shell script.


Understood. Thanks for your time :smiley:

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.