Is it possible to reuse an HTTP Request Body multiple times without reading it into memory?

cxy77 · February 25, 2023, 7:58pm

I am using curl -T <FILE> to upload a file, which could be as large as 5GB or even larger. The server implementation could be an HTTP server in Go, Rust, or Java, based on HTTP/1.1.

I need to calculate the checksums of the file on the server side, including MD5, SHA1, and SHA256. This requires three calculations, so it is best to calculate them in parallel, and I must not read the whole file into memory, because that would cause the service to OOM (and I also don’t want to write temporary files). However, the biggest problem we encountered is that an HTTP request body cannot be read multiple times: after reading it once to calculate a checksum, any further read returns nothing. We have tried Go’s io.Pipe() for this, but it doesn’t seem very efficient.

After calculating the checksums, we need to decide, based on business logic, whether to upload the file to S3, which requires reading the file stream again.

Is this requirement possible to implement? Could you please give some specific code?

skillian · February 25, 2023, 8:13pm

You can’t re-read the body, just as you can’t re-read from os.Stdin. You’re reading directly from the input stream (for stdin, whatever the user types at the keyboard or is piped in from another application; for the request, whatever the client is sending in its request), and once it’s been consumed, it’s gone. You have to store it somewhere if you want to re-read it, or re-request the information (e.g. read it once to hash it, then request it again and send that directly to S3).

Can you clarify what you mean by it not seeming very efficient?

skillian · February 26, 2023, 2:10am

I found your other question on the Rust language forum where you provided a bit more context about the motivation for your question:

And while that does help me understand why you don’t want to upload to S3 while you’re receiving the data, I still don’t understand why you cannot use local temporary files.
