Do I need to read the body before closing it?

I've found mixed answers, and the official documentation doesn't give me a clear response. If I call resp.Body.Close(), is it going to read the body so the connection can be reused? Or do I need to do an io.Copy(ioutil.Discard, resp.Body) every time?
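
In other words, the pattern I'm unsure about is something like this (just a sketch; example.com is a placeholder, and the drain line is the part in question):

resp, err := http.Get("http://example.com/")
if err != nil {
	return err
}
// Is this drain required for the connection to be reused,
// or does Close already take care of it?
io.Copy(ioutil.Discard, resp.Body)
resp.Body.Close()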

Found a good blog post about this: http://tleyden.github.io/blog/2016/11/21/tuning-the-go-http-client-library-for-load-testing/

I tested the sample code on my machine, and the results suggest that we do need to read the whole body before closing.

No. Calling Body.Close will drain it if required.


@dfc, take a look at this sample code (taken from the link posted above). If we just close the body without reading from it, the number of TIME_WAIT connections goes through the roof:

package main

import (
	"fmt"
	"html"
	"log"
	"net/http"
)

func startWebserver() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
	})

	go http.ListenAndServe(":8080", nil)
}

func startLoadTest() {
	count := 0
	for {
		resp, err := http.Get("http://localhost:8080/")
		if err != nil {
			panic(fmt.Sprintf("Got error: %v", err))
		}
		// Close without draining the body first.
		resp.Body.Close()
		log.Printf("Finished GET request #%v", count)
		count++
	}
}

func main() {
	startWebserver()
	startLoadTest()
}

Like this:

root@14952c2356a7:/# netstat -n | grep -i 8080 | grep -i time_wait | wc -l
166
root@14952c2356a7:/# netstat -n | grep -i 8080 | grep -i time_wait | wc -l
231
root@14952c2356a7:/# netstat -n | grep -i 8080 | grep -i time_wait | wc -l
293
root@14952c2356a7:/# netstat -n | grep -i 8080 | grep -i time_wait | wc -l
349
... 

Is this expected? Maybe it's not a leak, but it seems slower because the runtime doesn't reuse the connections and keeps creating new ones.

resp.Body.Close makes the connection available for reuse by others using the same http.Client. It doesn't close the underlying HTTP or TCP connection.

Would you be able to state the underlying problem you're investigating? This feels a little X/Y-ish.

Last week I got tons of "too many open files" alerts for an app running in a restricted environment. In this environment there is a hard limit of 5k file descriptors per Docker container, and the app was constantly exceeding that limit.

This app makes lots and lots of requests to several microservices to generate a final object that is indexed in Elasticsearch. Most of these requests come from blocks of code like this:

resp, err := c.httpClient.Do(req)
if err != nil {
	return nil, errors.Wrap(err, "error during http request execution")
}
defer resp.Body.Close()

if resp.StatusCode != http.StatusOK {
	// Early return: the deferred Close runs, but the body is never read.
	return nil, errors.New("invalid response status code")
}

content, err := ioutil.ReadAll(resp.Body)
if err != nil {
	return nil, errors.Wrap(err, "error during response body read")
}

I'm always closing the response body, but sometimes the services return an invalid status code and the function just returns without reading the response body. The number of TIME_WAIT connections goes up, and then the file descriptor problem happens.

I tried changing MaxIdleConnsPerHost and MaxIdleConns, without success; the number of connections still goes up.
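
For reference, this is roughly how I set them (the values here are illustrative, not the production ones):

client := &http.Client{
	Transport: &http.Transport{
		MaxIdleConns:        100, // illustrative values
		MaxIdleConnsPerHost: 100,
	},
}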

I think this happens because there is still content in the body to be read, so the runtime doesn't reuse the connection and creates a new one every time.
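
If that's right, draining the body on every return path should fix it. A minimal sketch of what I mean, using the same c.httpClient and req as in the snippet above (it needs the io and io/ioutil imports):

resp, err := c.httpClient.Do(req)
if err != nil {
	return nil, errors.Wrap(err, "error during http request execution")
}
defer func() {
	// Drain whatever is left in the body before closing,
	// so the connection can go back to the idle pool.
	io.Copy(ioutil.Discard, resp.Body)
	resp.Body.Close()
}()

if resp.StatusCode != http.StatusOK {
	return nil, errors.New("invalid response status code")
}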

What does lsof -p $PID on a failing process show?

You issue the defer resp.Body.Close() before checking the status code, so this should not matter.

@dfc, I know you are saying that resp.Body.Close() will drain the connection if needed for later reuse, but the numbers are showing the opposite. I'm not saying you are a liar, just doing some tests and benchmarks to understand how the Go HTTP runtime works.

Take a look at this snippet; it is a slightly modified version of the first one:

package main

import (
	"fmt"
	"html"
	"net/http"
	"sync"
	"time"

	"github.com/kr/pretty"
)

var (
	mutex sync.Mutex
	count int
)

func startWebserver() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
	})

	go http.ListenAndServe(":8080", nil)
}

func startLoadTest() {
	for {
		resp, err := http.Get("http://localhost:8080/")
		if err != nil {
			panic(fmt.Sprintf("Got error: %v", err))
		}
		// Uncomment to drain the body before closing
		// (requires the "io" and "io/ioutil" imports).
		// io.Copy(ioutil.Discard, resp.Body)
		resp.Body.Close()

		mutex.Lock()
		count++
		mutex.Unlock()
	}
}

func main() {
	startWebserver()
	for i := 0; i < 1; i++ { // bump this to raise concurrency
		go startLoadTest()
	}
	<-time.After(10 * time.Second)
	mutex.Lock()
	pretty.Println("request count: ", count)
	mutex.Unlock()
}

Using the snippet, with or without the io.Copy, the result of lsof is very similar to this:

lsof -p 23069

COMMAND   PID  USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
go      23069 diego  cwd    DIR   8,18     4096 14686472 /home/diego/projects/go/1.8.1/src/sample
go      23069 diego  rtd    DIR   8,18     4096        2 /
go      23069 diego  txt    REG   8,18 10073055 13523309 /home/diego/programs/go/1.8.1/bin/go
go      23069 diego  mem    REG   8,18  1981712  8129743 /usr/lib/libc-2.25.so
go      23069 diego  mem    REG   8,18   146568  8129721 /usr/lib/libpthread-2.25.so
go      23069 diego  mem    REG   8,18   168656  8129742 /usr/lib/ld-2.25.so
go      23069 diego    0u   CHR  136,0      0t0        3 /dev/pts/0
go      23069 diego    1u   CHR  136,0      0t0        3 /dev/pts/0
go      23069 diego    2u   CHR  136,0      0t0        3 /dev/pts/0

So my application has very few open files, but why does ss -s show so many connections in closed/TIME_WAIT state without the io.Copy?

Total: 1602 (kernel 0)
TCP:   14283 (estab 56, closed 14194, orphaned 23, synrecv 0, timewait 14192/0), ports 0

Transport Total     IP        IPv6
*         0         -         -
RAW       0         0         0
UDP       17        8         9
TCP       89        60        29
INET      106       68        38
FRAG      0         0         0

And almost zero with io.Copy?

Total: 1602 (kernel 0)
TCP:   70 (estab 56, closed 4, orphaned 0, synrecv 0, timewait 1/0), ports 0

Transport Total     IP        IPv6
*         0         -         -
RAW       0         0         0
UDP       16        7         9
TCP       66        43        23
INET      82        50        32
FRAG      0         0         0

Also, the last line prints the number of requests completed; the version with io.Copy is almost 5x faster on my machine.

When the number of goroutines doing requests is increased to 10, the ss -s output is very similar (both versions show a very high number of closed/TIME_WAIT connections, +/- 25k), but the version with io.Copy is still 5x faster.

I can't understand what is happening.
Are the connections being reused?
Why are there so many TIME_WAIT/closed connections in ss -s, while they don't show up in lsof?
Why is the version with io.Copy faster?
Why do the TIME_WAIT/closed connections go from almost zero to 25k when I increase the concurrency with io.Copy?

Sorry for all the questions; I'm just trying to understand this situation better.
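
(A way I could probably check the reuse question directly, assuming Go 1.7+ for net/http/httptrace, is something like this:)

trace := &httptrace.ClientTrace{
	GotConn: func(info httptrace.GotConnInfo) {
		log.Printf("connection reused: %v, was idle: %v", info.Reused, info.WasIdle)
	},
}
req, err := http.NewRequest("GET", "http://localhost:8080/", nil)
if err != nil {
	panic(err)
}
req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
resp, err := http.DefaultClient.Do(req)
if err != nil {
	panic(err)
}
io.Copy(ioutil.Discard, resp.Body)
resp.Body.Close()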

It used to be the case that you needed to drain the body to get connection reuse. Maybe that’s not the case any more.

However, the mechanism to add connections to the set that is available for reuse happens on a background goroutine in the http package. This takes some time and your tight loop may simply be faster than that mechanism, so new connections are opened instead.

Draining the body yourself slows down your loop. Assuming the http package does this itself on Close, skipping the manual drain would just shift that work to the reuse mechanism and slow it down instead.

(I filed a bug on this behavior a few years ago, but fixing it is non-obvious.)

I cannot explain it, I'll bow out and let someone with more experience take over.

Oh, I think I know. If you call Close before the body has been completely consumed, it will drain the body and then close the connection at the HTTP level. I think the behaviour differs if you have consumed the whole body: then it puts the connection back into the pool for reuse. This behaviour also changed around the 1.5 timeframe.

As I said, this is not my area of expertise; I'll butt out and let someone with more experience assist you.

I found some issues on GitHub about this problem; this one is the most similar:

It seems the problem is related to readers not returning (n, io.EOF) on the final read.

bradfitz: As background, Go used to auto-drain (around Go 1.0 and Go 1.1 era) but it was problematic for a number of reasons and we changed the behavior to be more predictable but also pushed io.EOFs earlier for many readers and made the final Read-with-EOF on the connection also auto-recycle the TCP connection for people who forget to Close. It seems there’s a Reader not doing early (n, io.EOF) instead of (n, nil)+(0, io.EOF) somewhere.

This CL has some nice information about the readers in the standard library.

net/http and other things work better when io.Reader implementations return (n, io.EOF) at the end, instead of (n, nil) followed by (0, io.EOF). Both are legal, but the standard library has been moving towards n+io.EOF.
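
To make that concrete, here is a toy reader (my own illustration, not from the CL) that returns (n, io.EOF) together with the final bytes:

package main

import (
	"fmt"
	"io"
)

// eofReader returns (n, io.EOF) on the read that consumes the last
// byte, instead of (n, nil) followed by a separate (0, io.EOF).
type eofReader struct{ data []byte }

func (r *eofReader) Read(p []byte) (int, error) {
	if len(r.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, r.data)
	r.data = r.data[n:]
	if len(r.data) == 0 {
		return n, io.EOF // final bytes and EOF in the same call
	}
	return n, nil
}

func main() {
	r := &eofReader{data: []byte("hello")}
	buf := make([]byte, 8)
	fmt.Println(r.Read(buf)) // prints: 5 EOF
}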

In the snippet this behaviour is not present, I think, but in my app it may be; I need to investigate more.

Thanks for your help guys.

