Transferring files over network

First off, Hi!
Thanks for your help in advance

I have done research for this and tried a lot of things but now I have reached a point where I have no idea what I did wrong and what to do right. So I’m asking for your help. I don’t know what exactly to tell you and don’t want to bother you with an enormous wall of text, so please ask for further information.

So what I’m trying to do is essentially simply transferring a (fairly large) file over the network (then do some processing and send them back, but that’s not my first problem). Yes, I could just use Windows (Samba) shares but I took the chance to learn some golang.
First I fooled around a bit, seeing how far I could get (not very far). Then I googled and found out the “proper” way to do it is to use io.Copy(), which I did, and it actually worked. Somewhat. When it worked it was great, but either our network is wonky or something else is wrong, because about 50%-80% of the time the transfer would fail somewhere in the middle (it takes 10+ minutes per file).
So I wanted to make something more robust - but I didn’t manage to get anything that worked. At all.
Long story short, I tried manual .Read() and .Write(), .CloseWrite(), copying chunks of the file by reading and sending []byte slices of limited size, sending MD5 checksums, bufio, bytes.Buffer and whatnot, in a lot of different combinations and configurations.
But everything resulted in either 0 bytes getting sent, the receiver just waiting endlessly (I guess for an EOF) or receiving garbage in every form and shape. I’ll spare you all the connection errors, the spaghetti code from troubleshooting and the cursing involved (not to forget copious amounts of coffee).
Lastly I even managed to get a runtime panic for no reason apparent to me. That’s when I started to really feel like a moron and realized I am probably missing something really obvious or doing something really stupid and needed help.

So instead of asking you to troubleshoot all of my ■■■■■■ code, what way would you go to tackle such a problem? I want to reliably transfer big files over a network connection which may be prone to errors and handle these properly, noticing junk and retrying, preferably not the whole file as to waste bandwidth for 10 minutes.

Sorry, it turned out to be a wall of text anyway… but at least I spared you from the pages and pages of code that didn’t work. If you need any further information, please ask.

Again thanks for your help!

I’d start with doing what your initial guess was; establish a connection and use io.Copy. Get this to work reliably for small files of a few hundred kilobytes or so. When that works, and it doesn’t with larger files, post your code example and whatever errors or problems you’re running into and we’ll help you.

Note that the receiver will need some way to note that the transfer is complete. You can do this for example by prefixing the transfer with the expected length, or by closing the connection when you’re done and reacting to the EOF. In either case you’ll probably want to involve some sort of hash verification and acknowledgement to ensure the file was actually transferred successfully.

You can also break the file download up into chunks, requesting offsets from a file server. And retry chunks that fail.


Thanks for your replies!

@kardianos That was one of my many tries (I did it with []byte slices) but I couldn’t get a single file or even a single chunk to transfer correctly. Something of my approach of sending must be wrong.

@calmh I tried again and yet again feel really stupid… See what happened:

So I closed all files to go at the problem unbiased. And I didn’t get it to work. But it’s a good example of the weird stuff I get and have no idea why, so I figured I’d show you.

Over here is a repository so you don’t see a wall of code here.

The “new” folder is what I just wrote from scratch.
While the sender gives:
2016/01/29 15:42:08 Successfully sent file: {test [some arguments] 136407597}
The receiver goes:
2016/01/29 15:30:18 Failed to receive file: expected size: 136407597 received size: 136403610
The received size stays the same on every retry btw - so it’s not an error but I’ve done something wrong. (The timestamps are irrelevant btw, I just copied one of countless tries for each)

I couldn’t figure out why (that’s exactly the type of problem I had with every approach I took before). So I dug out the first try I had at this (that’s what you find in the “old” folder, I just cleaned it up a bit, but it’s still cluttered with my debug comments).
And hella-why it works:
2016/01/29 16:07:29 Sucessfully received Job and File: {somefile [] 136407597}
even with the new receiver (the old one looked a little bit different), so I figured the problem has to be with the sender.

So I tried in the new sender the following:

  • use net.DialTCP instead of net.Dial (in order to use c.CloseWrite)
  • use os.OpenFile instead of os.Open

Didn’t work. Then I started taking apart the old sender, which had a lot more clutter in it, till it looked like the one you see in the repo. Now I can see virtually no significant difference between the two, and yet one works and one doesn’t.

At this point I give up and hope you can tell me why I am stupid and it won’t work. Sorry for the long explanations of what I’ve tried.

My guess is that the gob.Decode() is reading past the end of your struct on the receive side. Probably using a buffer or something.

To keep with what you’re trying, perhaps gob-encode into a temporary buffer, get its size, and send that size first. On the receiving side, read the size first, read exactly that many bytes into a temporary buffer, then pass only the temporary buffer to the gob decoder.

Gob is padding the data written on the wire, so the first few bytes received after the gob header are junk.

Quick and dirty solution

You might get some performance wins with using instead of io.Copy.

Of course if you are sending large files, you may want to consider building up the whole system on being able to resume and/or add throttling.

But wouldn’t then the file be too big (include the padded bytes)? I am receiving too small a file, so I’d think gob is “gobbling” (badum tss) too much data, as @howeyc suggested?

I don’t quite understand your code. Why are you using encoding/binary instead of just reading/writing the file? Should I just pass on my other fields from my “Job” struct one by one like filename and fileSize?

That actually worked. But I ran into some more weirdness.
I simply let gob .Encode() into a bytes.Buffer and then io.Copy() that - and the receiver suddenly worked (really, I checksummed the file manually).
Then I thought ‘Why would it work, the receiver could still just read as much as it wants’. So I added a buffer to the receiver in a similar fashion, and (to no real surprise) it just kept on reading till dawn and gobbled up the file in the process as well. (I’ve had a bit of a problem with (missing) EOFs in the past already.)
So I tried (probably in a dirty hack fashion) to send the buffer length ahead of time, and it works, since I found that reading into a fixed-size []byte works and found io.CopyN(), which I must have missed somehow in the past.
(also see the repo, I made 3 commits for each step)

Now I am one step further and will continue building up to see where it breaks next. But I still don’t know WHY it did or did not work.
How were my old and new senders different so one would work but not the other?
Why did just sending a buffer work without actually restricting the receiver?
I feel like I need to understand what is going on or I’ll just hang at the next problem around the corner…

Again thank you to all, you’ve helped me freshen up my mind which was stuck in the matter!

What I’m doing:

  1. Write the size of the filename
  2. Write the filename bytes
  3. Write the size of the file
  4. Write the file content.

On the receiving side:

  1. Read size of the filename
  2. Read the filename
  3. Read the size of the file
  4. Create a file with appropriate size
  5. Copy content from the connection to file

I’m writing the sizes with encoding/binary to ensure that the int64 is encoded the same way on all platforms. There are other ways to do it as well, e.g. Write([]byte{byte(size>>24), byte(size>>16), byte(size>>8), byte(size)}) and the matching read on the other side. Note that those four bytes only carry the low 32 bits of the size; a full int64 needs eight.

With regards to struct. Instead of sending a single filename, send JSON serialized data.


I use this code to transfer large files over the network. It works for files of around 2 GB, and the speed is also good.

