Stop io.Reader before reaching io.EOF

andrelom · July 5, 2019, 7:17pm

I am working on a custom network protocol and writing the packet reader. I want to create a wrapper on net.Conn so I can control the implementation of io.Reader.

For example, I would like to use io.Copy to write the data stream to a file, passing the wrapped net.Conn as the source. The wrapper should return io.EOF (which stops io.Copy) when an ETX (end-of-text) control character is found. That way I can reuse the same connection to send multiple packets while keeping control over io.Reader.

My question is whether the code below would perform poorly, since it reads byte by byte until it finds the ETX.

package main

import "io"

// ETX is the ASCII end-of-text control character (0x03).
const ETX = 0x03

type reader struct {
	end bool
	raw io.Reader
}

func newReader(raw io.Reader) *reader {
	return &reader{
		end: false,
		raw: raw,
	}
}

// Read reads up to len(buffer) bytes into buffer. It returns the number of
// bytes read (0 <= sum <= len(buffer)) and any error encountered. After the
// ETX control character is found, it returns the io.EOF error on the next Read
// attempt.
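func (rea *reader) Read(buffer []byte) (int, error) {
	if rea.end {
		return 0, io.EOF
	}
	sum := 0
	buf := make([]byte, 1)
	for sum < len(buffer) {
		// Read from the wrapped reader; calling rea.Read here would recurse forever.
		n, err := rea.raw.Read(buf)
		if n == 1 {
			if buf[0] == ETX {
				// Remember the ETX so that the next Read returns io.EOF.
				rea.end = true
				return sum, nil
			}
			// Store the byte at the next free position, not always at index 0.
			buffer[sum] = buf[0]
			sum++
		}
		if err != nil {
			// Report the bytes already copied together with the error.
			return sum, err
		}
	}
	return sum, nil
}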

I know about bufio.Reader, but it wraps an io.Reader itself, and Read is exactly the method I want to control so that the reader can be used safely with other functions. Reading the bufio.Reader implementation, I also noticed that it uses an internal buffer, whereas the code above reads directly from the net.Conn's io.Reader.

Any help and opinion will be welcome.

kync · July 19, 2019, 1:29pm

I would read in chunks rather than one byte at a time, and use bytes.IndexByte to check whether the delimiter is in the bytes received.

Say you are reading 32 bytes from a file, and count one seek plus one read for every read operation:
One byte at a time: 32 operations × (1 seek + 1 read), so the cost is 64.
16 bytes at a time: 2 operations × (1 seek + 1 read), so the cost is 4.
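The operation counts in that comparison can be checked with a small sketch. The countingReader type and countReads helper are hypothetical, the source is an in-memory strings.Reader, and only Read calls are counted, so this illustrates the arithmetic rather than measuring real file or network cost.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countingReader counts how many times Read is called on the wrapped reader.
type countingReader struct {
	raw   io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.raw.Read(p)
}

// countReads reads total bytes from an in-memory source in chunks of size
// bytes and returns the number of Read calls that were needed.
func countReads(total, size int) int {
	src := &countingReader{raw: strings.NewReader(strings.Repeat("x", total))}
	buf := make([]byte, size)
	for read := 0; read < total; {
		n, err := src.Read(buf)
		read += n
		if err != nil {
			break
		}
	}
	return src.calls
}

func main() {
	fmt.Println("1 byte at a time:", countReads(32, 1))    // 32 Read calls
	fmt.Println("16 bytes at a time:", countReads(32, 16)) // 2 Read calls
}
```

With one seek and one read charged per call, 32 calls give the cost of 64 and 2 calls give the cost of 4.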

This example doesn’t mean that a larger chunk size always gives the best performance; many other factors affect it, such as bandwidth, network buffers, open connections…
