Using zip, xml, io and fs together

Hi, I am new to Go and want to read a struct from an XML file in a zip archive. I am having trouble fitting the pieces together. I can get something to work, but it feels like there should be a more straightforward way.

To start, I believe I need to use zip.OpenReader to get a zip.ReadCloser. At the end I need to have a []byte to pass to xml.Unmarshal.

In the middle I had two thoughts:

  1. Use zip.ReadCloser.Open to open a file. But this returns an fs.File, which as far as I can see doesn’t have an easy way to read its entire contents as bytes. It also seems to require copying the bytes, which looks unnecessary to me because I assumed the zip file is already decompressed in memory.
  2. Iterate through zip.ReadCloser.File searching for the file name I want. Each *zip.File does have an easy way to read it: call Open and then io.ReadAll. But it seems unfortunate that I need to do a linear iteration of the files myself.

So I am just looking for comments really, is there a smarter way to do this that I haven’t spotted?

Thanks in advance.

Here is some code. I know I’ve skipped all the error handling and deferred Close calls for simplicity.

func one() {
	// var r *zip.ReadCloser
	r, _ := zip.OpenReader("na.zip")
	// var f fs.File
	f, _ := r.Open("custom.xml")
	// var bytes []byte
	// ** Any way to avoid implementing this function myself?
	bytes, _ := readfsfile(f)
	// var custom Custom
	custom := Custom{}
	xml.Unmarshal(bytes, &custom)
	fmt.Println(custom)
}
func two() {
	// var r *zip.ReadCloser
	r, _ := zip.OpenReader("na.zip")
	// var file *zip.File
	// ** Any way to avoid this iteration?
	for _, file := range r.File {
		if file.Name == "custom.xml" {
			// var reader io.ReadCloser
			reader, _ := file.Open()
			// var bytes []byte
			bytes, _ := io.ReadAll(reader)
			// var custom Custom
			custom := Custom{}
			xml.Unmarshal(bytes, &custom)
			fmt.Println(custom)
		}
	}
}

Hello there. As far as I can see, in the first option Open returns an fs.File, which has Read and Close methods just like the io.ReadCloser in the second option, so you can read it the same way.

f, _ := r.Open("file in archive")
defer f.Close()

data, _ := io.ReadAll(f)

IMHO the second option is preferable, since the first one requires you to know the file path in advance.
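
For reference, here is a minimal sketch of the Open plus io.ReadAll approach with the error handling and Close call put back in. It assumes an archive built in memory via zip.NewReader instead of na.zip on disk, so that it runs on its own; readEntry and buildArchive are hypothetical helper names, not part of archive/zip. A missing entry comes back as an error wrapping fs.ErrNotExist.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"io"
	"io/fs"
	"log"
)

// readEntry (hypothetical helper) opens the named entry and returns all of its bytes.
func readEntry(r *zip.Reader, name string) ([]byte, error) {
	f, err := r.Open(name) // f is an fs.File, which has Read and Close
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}

// buildArchive (hypothetical helper) creates a one-entry zip archive in memory.
func buildArchive(name, content string) (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	entry, err := w.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := entry.Write([]byte(content)); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	r, err := buildArchive("custom.xml", "<custom><name>example</name></custom>")
	if err != nil {
		log.Fatal(err)
	}
	data, err := readEntry(r, "custom.xml")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", data)

	// Asking for an entry that is not in the archive reports a not-exist error.
	_, err = readEntry(r, "missing.xml")
	fmt.Println(errors.Is(err, fs.ErrNotExist))
}
```

With a file on disk, the *zip.ReadCloser returned by zip.OpenReader embeds a zip.Reader, so &r.Reader can be passed to readEntry in the same way.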

Great, thanks. I really was missing something!
Is it correct to say that fs.File is a Reader (and hence can be passed to io.ReadAll) just because it has the Read(p []byte) (n int, err error) method? I read something about this but obviously didn’t quite grasp it. I was expecting the docs to mention “implements Reader” or something.

Also in my case I do know the file path within the zip in advance, it’s always the same, so that’s OK.

Yeah, sometimes the docs do not state this directly. But you can check the source code to see whether a type implements the methods of the interface or, as in this case, whether the function returns an interface that itself declares those methods.
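
To make the implicit interface satisfaction concrete, here is a small sketch. Go has no "implements" declaration: any type with a Read(p []byte) (n int, err error) method satisfies io.Reader. The names upperReader and readAllUpper are hypothetical and used only for illustration; the two var _ lines are compile-time checks that fs.File satisfies io.Reader and io.ReadCloser.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"strings"
)

// Compile-time checks: fs.File declares Read and Close, so it satisfies
// both interfaces. If it did not, this file would fail to compile.
var _ io.Reader = fs.File(nil)
var _ io.ReadCloser = fs.File(nil)

// upperReader is a hypothetical type. It never mentions io.Reader, but it
// satisfies the interface simply because it has a matching Read method.
type upperReader struct {
	src io.Reader
}

// Read upper-cases ASCII letters as they pass through.
func (u upperReader) Read(p []byte) (int, error) {
	n, err := u.src.Read(p)
	for i := 0; i < n; i++ {
		if p[i] >= 'a' && p[i] <= 'z' {
			p[i] -= 'a' - 'A'
		}
	}
	return n, err
}

// readAllUpper passes an upperReader to io.ReadAll, which accepts any io.Reader.
func readAllUpper(s string) (string, error) {
	data, err := io.ReadAll(upperReader{src: strings.NewReader(s)})
	return string(data), err
}

func main() {
	out, err := readAllUpper("custom.xml")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(out)
}
```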

Just a quick note if you want to try to simplify things a bit. Take a look at what xml.Unmarshal is doing:

func Unmarshal(data []byte, v any) error {
	return NewDecoder(bytes.NewReader(data)).Decode(v)
}

And if you track down the function signature of xml.NewDecoder you will see it takes an io.Reader:

func NewDecoder(r io.Reader) *Decoder

Your io.ReadCloser is a reader. So you can just pass it to NewDecoder and not read all the bytes into memory first:

if file.Name == "custom.xml" {
	// var reader io.ReadCloser
	reader, _ := file.Open()
	// var custom Custom
	custom := Custom{}
	err := xml.NewDecoder(reader).Decode(&custom)
	// Handle err
	fmt.Println(custom)
}
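
Putting the two suggestions together, a sketch of opening the entry by name and decoding straight from it might look like this. The Custom struct and its Name field are assumptions, since the original type was not posted, and the archive is built in memory by a hypothetical demoArchive helper so the example is self-contained. decodeCustom takes an fs.FS, which *zip.Reader satisfies through its Open method.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/xml"
	"fmt"
	"io/fs"
	"log"
)

// Custom is an assumed shape for the struct in the question.
type Custom struct {
	Name string `xml:"name"`
}

// decodeCustom opens the named entry and streams it into the XML decoder,
// without first reading the whole entry into a []byte.
func decodeCustom(fsys fs.FS, name string) (Custom, error) {
	var c Custom
	f, err := fsys.Open(name)
	if err != nil {
		return c, err
	}
	defer f.Close()
	if err := xml.NewDecoder(f).Decode(&c); err != nil {
		return c, fmt.Errorf("decode %s: %w", name, err)
	}
	return c, nil
}

// demoArchive (hypothetical helper) builds a zip in memory holding custom.xml.
func demoArchive() (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	entry, err := w.Create("custom.xml")
	if err != nil {
		return nil, err
	}
	if _, err := entry.Write([]byte("<custom><name>example</name></custom>")); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	r, err := demoArchive()
	if err != nil {
		log.Fatal(err)
	}
	c, err := decodeCustom(r, "custom.xml")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(c.Name)
}
```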