Different behavior when reading the last bytes from two kinds of reader: gzip.Reader and bytes.Reader

I found that two kinds of reader, gzip.Reader and bytes.Reader, behave differently when reading the last bytes. Maybe I should create an issue on GitHub?

Details follow.

On the read that returns the last byte, gzip.Reader returns 1 and io.EOF, but bytes.Reader returns 1 and a nil error. I wrote this code: Go Playground - The Go Programming Language

It contains a function like this:

func compareReader(r, b io.Reader) error {
	var bufa [1]byte
	var bufb [1]byte
	for {

		na, erra := r.Read(bufa[:])
		nb, errb := b.Read(bufb[:])

		if erra == nil && errb == nil && na == nb && bufa[0] == bufb[0] {
			continue
		}
		if erra == errb && erra == io.EOF {
			return nil
		}
		if erra != nil {
			if erra == io.EOF && errb != io.EOF {
				return fmt.Errorf("reader b has more data than a")
			}
			return fmt.Errorf("read on a error: %s", erra)
		}
		if errb != nil {
			if errb == io.EOF && erra != io.EOF {
				return fmt.Errorf("reader a has more data than b")
			}
			return fmt.Errorf("read on b error: %s", errb)
		}
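		// Both reads succeeded but the byte counts or contents differ.
		return fmt.Errorf("readers returned different data")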
	}
}

I don’t care about things like POSIX, the Linux man pages, or anything else. I just think the whole standard library should follow one uniform standard.

The caller only knows that this is an io.Reader and shouldn’t have to care about the reader’s underlying implementation.

The documentation of io.Reader says that what you’re seeing is expected behavior for some implementations. Specifically from the third paragraph:

When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call. An instance of this general case is that a Reader returning a non-zero number of bytes at the end of the input stream may return either err == EOF or err == nil. The next Read should return 0, EOF.
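To make the quoted rule concrete, here is a minimal sketch (not from the original post) of a read loop that handles the n > 0 bytes before looking at the error, so it gives the same result for both styles of reader. The names countBytes, lateEOF, and earlyEOF are made up for illustration, and iotest.DataErrReader from testing/iotest is only a stand-in for a reader that, like the gzip.Reader in the question, reports EOF together with the final data.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// countBytes reads r one byte at a time and returns how many bytes it saw.
// It counts the n bytes from each call before checking err, so it does not
// matter whether EOF arrives with the last byte or on a separate call.
func countBytes(r io.Reader) (int, error) {
	var buf [1]byte
	total := 0
	for {
		n, err := r.Read(buf[:])
		total += n // process the data first
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
	}
}

// lateEOF returns a reader that reports EOF on a separate call after the
// last byte, which is how strings.Reader and bytes.Reader behave.
func lateEOF(s string) io.Reader {
	return strings.NewReader(s)
}

// earlyEOF returns a reader that reports EOF together with the final data.
func earlyEOF(s string) io.Reader {
	return iotest.DataErrReader(strings.NewReader(s))
}

func main() {
	a, _ := countBytes(lateEOF("hello"))
	b, _ := countBytes(earlyEOF("hello"))
	fmt.Println(a, b) // prints: 5 5
}
```

The important part is the order inside the loop: the data is used first and the error is inspected second, which is what the io.Reader documentation asks callers to do.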

Though you may not care about POSIX or any other OS interface standards, that doesn’t change the fact that the underlying operating system running your Go code does follow POSIX or some other (maybe internal/proprietary) standard and that the underlying OS-level call to read from a file, network socket, etc., might read to the end of the file and not return an EOF. If you’re reading from a TCP socket, for example, you might not know that you’ve reached the end of file until the connection is closed. Your first read could read n bytes and then you sit and wait until the connection is closed at which point you read 0 additional bytes and just get an EOF.

Alternatively, when reading a file, it can be advantageous to return the last chunk of the file from read and an EOF at the same time instead of one call to read the last chunk and no error, and another call to read 0 bytes and an EOF because it saves a context switch to the OS to retrieve the EOF.

So, tl;dr: Submitting an issue to the Go team will not resolve your issue. I would recommend instead:

  1. Change your code to work with the zero and non-zero bytes read EOF conditions

  2. Use or create an abstraction to translate one scenario into the other:

There’s another standard interface called io.ByteReader that is meant for reading single bytes at a time. You could check whether b and r implement that interface and, if they don’t, wrap them in a bufio.Reader, which does:

b2, ok := b.(io.ByteReader)
if !ok {
    b2 = bufio.NewReader(b)
}

// same for r

// this:
//	na, erra := r.Read(bufa[:])
//	nb, errb := b.Read(bufb[:])

// changes to:
ba, erra := r2.ReadByte()
bb, errb := b2.ReadByte()

Implementations of this interface do not mix successfully read bytes with errors: if there’s an error, no byte was read, and if a byte was read, there is no error.
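Putting the pieces together, a version of compareReader built on io.ByteReader might look roughly like this. It is a sketch rather than a drop-in replacement: asByteReader, mustGzip, and plain are helper names invented for this example, and unlike the function in the question it also reports an error when the two bytes differ.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// asByteReader returns r unchanged if it already implements io.ByteReader,
// and wraps it in a bufio.Reader otherwise.
func asByteReader(r io.Reader) io.ByteReader {
	if br, ok := r.(io.ByteReader); ok {
		return br
	}
	return bufio.NewReader(r)
}

// compareReader returns nil when a and b yield exactly the same bytes.
func compareReader(a, b io.Reader) error {
	ra, rb := asByteReader(a), asByteReader(b)
	for {
		ba, erra := ra.ReadByte()
		bb, errb := rb.ReadByte()
		switch {
		case erra != nil && erra != io.EOF:
			return fmt.Errorf("read on a: %w", erra)
		case errb != nil && errb != io.EOF:
			return fmt.Errorf("read on b: %w", errb)
		case erra == io.EOF && errb == io.EOF:
			return nil
		case erra == io.EOF:
			return errors.New("reader b has more data than a")
		case errb == io.EOF:
			return errors.New("reader a has more data than b")
		case ba != bb:
			return fmt.Errorf("readers differ: %#x vs %#x", ba, bb)
		}
	}
}

// mustGzip compresses s in memory and returns a gzip reader over the result.
// It panics on error, which is acceptable only because this is a demo helper.
func mustGzip(s string) io.Reader {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	zr, err := gzip.NewReader(&buf)
	if err != nil {
		panic(err)
	}
	return zr
}

// plain returns an uncompressed in-memory reader over s.
func plain(s string) io.Reader {
	return bytes.NewReader([]byte(s))
}

func main() {
	fmt.Println(compareReader(mustGzip("hello"), plain("hello"))) // prints: <nil>
	fmt.Println(compareReader(mustGzip("hello"), plain("hell")))  // prints: reader a has more data than b
}
```

Because ReadByte never returns a byte and an error together, the switch only has to look at the two errors and, when both are nil, at the two bytes.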


Thank you for such a detailed answer. After reading it, I understand now!
