How to make a 32-bit machine encrypt a 2GB file in AEAD mode with Go

Based on NIST SP 800-38D (GCM), section 5.2.1.1, the maximum plaintext length is 2^39-256 bits ≈ 64 GB. But on my machine, files larger than 1GB result in a memory error with any AEAD mode of operation; it is not even possible to encrypt a 1GB file:

package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"flag"
	"fmt"
	"io"
	"log"
	"os"

	"golang.org/x/crypto/pbkdf2"
)

var (
	dec   = flag.Bool("d", false, "Decrypt instead of encrypt.")
	iter  = flag.Int("i", 1024, "Iterations (for PBKDF2).")
	key   = flag.String("k", "", "256-bit key (hex) to encrypt/decrypt.")
	pbkdf = flag.String("p", "", "Passphrase (for PBKDF2).")
	salt  = flag.String("s", "", "Salt (for PBKDF2).")
)

func main() {
	flag.Parse()

	if len(os.Args) < 2 {
		fmt.Println("Usage of", os.Args[0]+":")
		flag.PrintDefaults()
		os.Exit(1)
	}

	var keyHex string
	if *pbkdf != "" {
		prvRaw := pbkdf2.Key([]byte(*pbkdf), []byte(*salt), *iter, 32, sha256.New)
		keyHex = hex.EncodeToString(prvRaw)
	} else {
		keyHex = *key
	}

	var keyRaw []byte
	var err error
	if keyHex == "" {
		keyRaw = make([]byte, 32)
		if _, err = io.ReadFull(rand.Reader, keyRaw); err != nil {
			log.Fatal(err)
		}
		fmt.Fprintln(os.Stderr, "Key=", hex.EncodeToString(keyRaw))
	} else {
		keyRaw, err = hex.DecodeString(keyHex)
		if err != nil {
			log.Fatal(err)
		}
		if len(keyRaw) != 32 {
			log.Fatal("key must be 32 bytes (64 hex characters)")
		}
	}

	block, err := aes.NewCipher(keyRaw)
	if err != nil {
		log.Fatal(err)
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		log.Fatal(err)
	}

	// The entire input is read into memory here, which is what fails
	// for large files.
	buf := bytes.NewBuffer(nil)
	if _, err := io.Copy(buf, os.Stdin); err != nil {
		log.Fatal(err)
	}
	msg := buf.Bytes()

	if !*dec {
		nonce := make([]byte, aead.NonceSize(), aead.NonceSize()+len(msg)+aead.Overhead())
		if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
			log.Fatal(err)
		}
		out := aead.Seal(nonce, nonce, msg, nil)
		fmt.Printf("%s", out)
		return
	}

	if len(msg) < aead.NonceSize() {
		log.Fatal("ciphertext too short")
	}
	nonce, msg := msg[:aead.NonceSize()], msg[aead.NonceSize():]
	out, err := aead.Open(nil, nonce, msg, nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s", out)
}

runtime: out of memory: cannot allocate 1073741824-byte block (1077837824 in use)
fatal error: out of memory

runtime stack:
runtime.throw(0x4d9c26, 0xd)
c:/go/src/runtime/panic.go:617 +0x64
runtime.largeAlloc(0x3ffffe00, 0x11c10101, 0x11c1e000)
c:/go/src/runtime/malloc.go:1057 +0x10f
runtime.mallocgc.func1()
c:/go/src/runtime/malloc.go:950 +0x39
runtime.systemstack(0x447245)
c:/go/src/runtime/asm_386.s:396 +0x53
runtime.mstart()
c:/go/src/runtime/proc.go:1153

goroutine 1 [running]:
runtime.systemstack_switch()
c:/go/src/runtime/asm_386.s:357 fp=0x11c56d94 sp=0x11c56d90 pc=0x447300
runtime.mallocgc(0x3ffffe00, 0x4c0340, 0x1, 0x11c56e00)
c:/go/src/runtime/malloc.go:949 +0x65b fp=0x11c56de8 sp=0x11c56d94 pc=0x40968b
runtime.makeslice(0x4c0340, 0x3ffffe00, 0x3ffffe00, 0x527ffe00)
c:/go/src/runtime/slice.go:49 +0x4f fp=0x11c56dfc sp=0x11c56de8 pc=0x43521f
bytes.makeSlice(0x3ffffe00, 0x0, 0x0, 0x0)
c:/go/src/bytes/buffer.go:232 +0x61 fp=0x11c56e10 sp=0x11c56dfc pc=0x491821
bytes.(*Buffer).grow(0x11c3bb60, 0x200, 0x10000000)
c:/go/src/bytes/buffer.go:145 +0x12a fp=0x11c56e38 sp=0x11c56e10 pc=0x4913ca
bytes.(*Buffer).ReadFrom(0x11c3bb60, 0x4f5920, 0x11c380d8, 0x3880e0, 0x11c3bb60, 0x1, 0x75)
c:/go/src/bytes/buffer.go:205 +0x45 fp=0x11c56e74 sp=0x11c56e38 pc=0x491645
io.copyBuffer(0x4f5880, 0x11c3bb60, 0x4f5920, 0x11c380d8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4d8408, …)
c:/go/src/io/io.go:388 +0x29a fp=0x11c56eb4 sp=0x11c56e74 pc=0x44e65a
io.Copy(…)
c:/go/src/io/io.go:364
main.main()
H:/PGMM/crypter/aes-gcm/main.go:93 +0x5c6 fp=0x11c56fd0 sp=0x11c56eb4 pc=0x4a9366
runtime.main()
c:/go/src/runtime/proc.go:200 +0x1d7 fp=0x11c56ff0 sp=0x11c56fd0 pc=0x427937
runtime.goexit()
c:/go/src/runtime/asm_386.s:1321 +0x1 fp=0x11c56ff4 sp=0x11c56ff0 pc=0x448b21

How to proceed?

Thanks in advance!

Slice the file into multiple pieces like a cake, do the magic on each of them, then package them nicely back into one single file?

How?

(1) You got 1 big cake (>1GB)
(2) Slice it into edible sizes (~500MB?)
(3) Do the magic on each piece (n pieces at a time, depending on available resources?)
(4) Package them back into 1 file (like, JSON-array it?)

To be clear: I can encrypt large files with non-AEAD modes; it is only with AEAD modes that this memory error occurs.

It is possible to process the file in chunks, but that would not be the correct way. I mean, to stay compatible with other tools of this kind I would need to use a standard encryption scheme, without an extra step that chops the file into several pieces. I believe that by splitting the file, in AEAD mode each piece would end up with a different tag.

It is possible to encrypt a 2GB file in CTR or CBC modes, but with AEAD modes the error occurs.

Anyway, it should be possible to increase the buffer size to 3GB with buf.Grow(3*1024*1024*1024), but this exceeds the maximum value of a 32-bit int (2^31-1, the eighth Mersenne prime), making it impossible on 32-bit machines. I would need the buffer size allocated dynamically according to the file size:

	buf := bytes.NewBuffer(nil)
	var data io.Reader
	var size int64
	data = os.Stdin
	fi, err := os.Stdin.Stat()
	if err != nil {
		log.Fatal(err)
	}
	size = fi.Size()
	buf.Grow(int(size))
	io.Copy(buf, data)
	msg := buf.Bytes()

But it doesn’t work either.

Could anyone give me a technical explanation on why it is not possible to encrypt large files with AEAD modes?

I mean, it's possible to encrypt files up to 1.5GB (the largest I've tested) with non-AEAD modes of operation; with AEAD modes, however, the maximum plaintext size I can manage is only about 250MB. That is, I cannot encrypt files of more than 250MB with AEAD modes on 32-bit machines; larger files result in a memory error.

NIST SP800-38D section 5.2.1.1, says that the maximum length of plaintext is 2^39-256 bits ~ 64 GB.

Assuming buf.Grow(3 * 1024 * 1024 * 1024) could increase the buffer size to 3GB, buf.Grow(64 * 1024 * 1024 * 1024) also causes a memory error; I am not able to encrypt 64GB.

How to proceed?

Thanks in advance.

Then you'll need to study the algorithms of the decrypting tools you want to be compatible with. I sincerely do not think your way (one bulky file) works. Otherwise, AES-XTS wouldn't work for encrypting an entire disk.

AFAIK, confidentiality covers everything between the encrypter and the decrypter. Otherwise, the encryption does not make any sense.

Yes, I know the algorithm. GCM, published in 2007 (NIST SP 800-38D section 5.2.1), claims to encrypt up to 64GB of plaintext, unlike OCB, which would encrypt 4PT. In 2023 I'm trying to settle whether this is pure nonsense or a limitation of Go's own implementation of GCM.

In fact, this question arose because a user of my application ran into this difficulty. I personally don't usually encrypt anything; I just study the subject.

Thanks!

Good. Are you aware that a 32-bit machine supports at most 4GB of RAM? What happens when a 32GB payload arrives?

Go is definitely not the problem, and neither are the AES cipher modes (GCM, XTS, etc.). Otherwise, Go would be buried under a pile of unforgiving CVE tickets and folks would stay away from it.

It's the application's algorithm adopting a full-file payload without much design consideration, rather than taking a streaming approach for large inputs.

If we go back to the cake analogy:

  1. The person decided to eat the entire cake in one go.
  2. Don't blame the fork or spoon (the ciphers) for their small size.

In 2007 64-bit machines were rare, however it doesn’t work on 64-bit machines either.

"In a way, you could say it's a kludge. While it is possible to split the data into smaller chunks and encrypt them separately, this is not an ideal or recommended solution.

Furthermore, by breaking the data into smaller blocks, the security of the encryption can be weakened, as the smaller blocks can be analyzed independently, which increases the likelihood of successful attacks.

Therefore, it is important to follow the recommendations and limits set by standards bodies and cryptography designers to ensure data security and avoid performance or security issues."

I'm fed up with the incongruity. The algorithm documentation limits the maximum plaintext size to 64GB.

According to the solution provided here, that number can be exceeded. That is, with the solution given here there is no maximum plaintext size, since the input can be divided indefinitely, leaving it to the hardware to set the practical limit.

This solution is a workaround and does not explain why the plaintext size was limited to 64GB in the algorithm documentation.

Thank you anyway for trying.

There is no spoon size if we are going to drink the soup directly from the pot. My question is not about cooking.

In the case of the analogy, it means that each piece of cake can be at most 64GB, not that the whole cake is 64GB. This is what the documentation refers to.

Modes of operation from the '70s encrypt such files without problems, and I don't see why the GCM documentation would lie. It means that AEAD modes complement, but do not supplant or replace, legacy modes.

This limitation is based on the structure of GCM's counter blocks: with the standard 96-bit IV, only a 32-bit field is left to count the 128-bit blocks of a message.

The counter effectively starts at 2, because the initial counter block is reserved for computing the authentication tag, so at most 2^32 - 2 blocks are available for plaintext. Since each block is 128 bits, the maximum plaintext size is (2^32 - 2) x 128 = 2^39 - 256 bits, just under 64 GiB, exactly the figure in NIST SP 800-38D. Going past that limit would wrap the 32-bit counter and reuse counter values under the same key, which breaks both confidentiality and the authenticity guarantees.

Therefore, although GCM mode can theoretically support a single message of almost 64 GiB, in practice an implementation that holds the whole message in memory runs out of memory long before reaching it.

It was this explanation I expected.

You are using code that encrypts a blob of bytes in memory.

And as far as I can tell, this happens “out of place”.

So even under optimal circumstances (which you do not have) you need at least twice as much memory as you want to encrypt.

As a 32-bit machine can address at most 4 GiB, your maximal input size is (4 GiB - memory used by the OS - memory used by other software - memory used by other parts of your program) / 2. So at most 2 GiB, assuming nothing else uses memory (unrealistic). Realistically, this will be somewhere in the range of 1 to 1.5 GiB.

The common solution to this is called “chunked” or “streamed” processing.

You always have only some small “window” of the input file in memory, and encrypt that, carrying over any “state” required by the algorithm so that the next “window” is still treated as part of the original input data.

Writing back the result of the encryption window by window. The first byte will not change depending on the last…

Please do not confuse this with individual encryption of chunks!

Though I do not understand enough of AEAD, or Go's implementation of it, to say how to do streaming encryption/decryption properly in Go.

Forget about 32-bit. Assuming it’s a 64-bit machine.

Will this generate a single nonce and a single tag?

The Go team has refused this for a while:

I don't think they will implement it.

Basically, GCM mode only exposes Seal and Open, which can't be used alongside StreamWriter or StreamReader; that forces me to load the entire file into memory. Obviously, this is not ideal with large files.

I argue that it is up to the crypto/cipher library team to make AEAD modes usable with StreamWriter and StreamReader, the way CTR mode can be streamed:

	stream := cipher.NewCTR(ciph, iv)
	buf := make([]byte, 128*1<<10)
	var n int
	for {
		n, err = os.Stdin.Read(buf)
		if err != nil && err != io.EOF {
			panic(err)
		}
		stream.XORKeyStream(buf[:n], buf[:n])
		if _, err := os.Stdout.Write(buf[:n]); err != nil {
			panic(err)
		}
		if err == io.EOF {
			break
		}
	}

But it seems to me that they are not very committed to that purpose, and I don't want to have to reimplement my tools.