Hi - my lack of low-level knowledge is biting me a bit with this one, so I’d appreciate some pointers.
- I have a flat struct containing a number of int and float64 values.
- I have a sorted slice containing around 100,000,000 of these structs.
- I want to cache the slices to disk for convenient re-use, as they are expensive to build.
- This is a write-once, read-many operation, so read speed is the priority.
- Files will be written and read on the same workstation - safety and portability are not an issue.
Failed attempts to date…
I have this working with gob, but it’s painfully slow. Zipping the file doesn’t help much.
There are a wide range of serialization packages, most of them rather poorly documented. Benchmarks are contradictory. Trying them is proving a painful experience and I suspect they aren’t the right solution in any case.
Can I write from memory direct to disk?
Given that I don’t need safety or portability, is there a more direct way to write from memory to disk?
Unfortunately binary files are outside my comfort zone and googling has come up blank - I found a couple of blogs but the code doesn’t compile.
So I’d very much appreciate any pointers that would help speed up the search for the fastest solution.
PS: I’ve done a bit more digging and realised that working with the slice as a whole is becoming memory-bound, so I’m going to have to break up the files. Gob works somewhat better with smaller files, but I’m still looking for the fastest possible route, as this is now the key performance bottleneck in my system.