Why is memory usage triple the size of the downloaded content?

I’m downloading some data via requests/REST and the computer’s memory ends up full. I download about 12GB of data, my computer has 32GB of RAM, and it fills up. I profiled the code with pprof and it seems that no data accumulates in any variable other than the one I store into. It may be something to do with capacity, but I do not set it in advance because I generally do not know the size beforehand. The main code, which runs a million times, is this:

func downloadStats(stor *Fields, field string, dateFromMs int64, dateToMs int64) {
	var cas int64
	var dateLast int64 = dateFromMs

loop2:
	for {
		//! --- get statistics
		statsData, err := req.Get("URLtoEndpoint" + field + "xy" + fmt.Sprint(dateLast))
		if err != nil { // stop on a failed request
			break
		}
		defer statsData.Response().Body.Close()
		status := gjson.Get(statsData.String(), "status").String()
		queryCount := gjson.Get(statsData.String(), "count").Int()
		if status == "OK" && err == nil && queryCount == 0 {
			fmt.Printf("\n%5s   RETURNED_Q 0-counv\n", field)
			return
		}

		//! --- JSON ARRAYS: w/gjson lib
		results := gjson.Get(statsData.String(), "results")
		for _, r := range results.Array() {
			cas = r.Get("time").Int()
			if cas >= dateToMs { // break the forever loop
				break loop2
			}

			var cnd = []int{}
			for _, a := range r.Get("conditions").Array() {
				cnd = append(cnd, int(a.Int()))
			}
			//! --- Store stats' fields !!!!! HERE IS DATA STORAGE, THERE ARE THREE LINES LIKE THIS IN REAL CASE
			stor.Prop1 = append(stor.Prop1, Property1{Ev: "turbine1", Volt: r.Get("voltage").Float(), Rot: r.Get("rotation").Float(), Temp: r.Get("temperature").Float(), Pres: r.Get("presure").Float(), T: cas, C: cnd, Axis: int(r.Get("axNb").Int()), tSort: cas})
		}
		
		time.Sleep(250 * time.Millisecond)
	}
}

stor.Prop1 = append(stor.Prop1, …)

is the line collecting the data. There are three lines like this in the real code, so maybe that is related to memory usage being 3x the content size. How can I keep RAM usage close to the downloaded data size, in this case 12GB? I would also like to understand how RAM management behaves. I’m a newbie here, so any explanation is appreciated. Thank you for your thoughts.

Hi @Rok_Petka,
I believe that you receive raw data ([]byte) and then it gets converted to a string and this string value is also stored:

// Result represents a json value that is returned from Get().
type Result struct {
	// Type is the json type
	Type Type
	// Raw is the raw json
	Raw string
	// Str is the json string
	Str string
	// Num is the json number
	Num float64
	// Index of raw value in original json, zero means index unknown
	Index int
	// Indexes of all the elements that match on a path containing the '#'
	// query character.
	Indexes []int
}

It could be that this package stores double the data size you receive.

Then you iterate on the resulting array and you create one extra “copy” of it, for a total of 3x the size:

for _, r := range results.Array() {
    cnd = append(cnd, int(a.Int()))
    stor.Prop1 = append(stor.Prop1, Property1

So the package allocates about 2x the size of the data actually read, and you allocate another 1x on top of that. This is just a theory, but it could be worth testing.

Thanks for the reply. But the variable “results” is overwritten on every iteration, so I do not see how it can be problematic?

I also tested commenting out the ‘stor.Prop1’ line (_ = stor.Prop1), which is the one that actually stores the data, and the computer’s RAM stays constant and low during the whole process. So this is the line that piles up all the memory, but why 2-3x more than the data?! Yes, the package stores strings, so maybe that is 1x as strings and 2x as bytes… Is there a way to improve the storage?

Hi @Rok_Petka ,

Try this and let’s see how much it helps. Take this loop:

			var cnd = []int{}
			for _, a := range r.Get("conditions").Array() {
				cnd = append(cnd, int(a.Int()))
			}

You already know the maximum capacity of the slice, so preallocate it.
Similarly, you already know the length of results.Array(), but for each of its elements you do:

			stor.Prop1 = append(stor.Prop1, Property1{Ev: "turbine1", Volt: r.Get("voltage").Float(), Rot: r.Get("rotation").Float(), Temp: r.Get("temperature").Float(), Pres: r.Get("presure").Float(), T: cas, C: cnd, Axis: int(r.Get("axNb").Int()), tSort: cas})

This is again wasteful because of the slice reallocation problem. I do not think it is enough on its own to make you need 3x more memory, but if you preallocate the two slices first, you will need less memory.
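As a minimal sketch of the reallocation problem, using only the standard library: countReallocs is a hypothetical helper that counts how often append has to move the data to a bigger backing array, once starting from a nil slice and once from a slice created with make and the final capacity.

```go
package main

import "fmt"

// countReallocs appends n ints to s and reports how many times append had
// to move the data to a larger backing array (detected via capacity changes).
func countReallocs(s []int, n int) int {
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	const n = 100000
	fmt.Println("no preallocation:", countReallocs(nil, n))
	fmt.Println("preallocated:    ", countReallocs(make([]int, 0, n), n))
}
```

Each reallocation briefly keeps the old and the new array alive at the same time, and the final array usually has spare capacity that is never used. In the code from the question this would probably mean keeping the result of r.Get("conditions").Array() in a variable and creating cnd with make([]int, 0, len(...)) of that variable.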

Please try it and post the results, I am really curious what decrease in memory usage this will cause.
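One way to get comparable numbers, as a sketch rather than a proper benchmark, is to read the live heap size from runtime.MemStats before and after building the data. heapInUse and measure are hypothetical helper names, and the slice built in main is only a stand-in for the real download loop:

```go
package main

import (
	"fmt"
	"runtime"
)

// heapInUse forces a garbage collection and returns the bytes of live heap
// objects, so that already-dead garbage does not distort the reading.
func heapInUse() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// measure reports roughly how many bytes of heap the value returned by
// build keeps alive.
func measure(build func() []float64) int64 {
	before := heapInUse()
	data := build()
	after := heapInUse()
	runtime.KeepAlive(data) // keep data reachable until after the second reading
	return int64(after) - int64(before)
}

func main() {
	grew := measure(func() []float64 {
		var s []float64 // stand-in for the real data; grows by append
		for i := 0; i < 1_000_000; i++ {
			s = append(s, float64(i))
		}
		return s
	})
	fmt.Printf("heap grew by about %d KiB\n", grew/1024)
}
```

Running the same measurement with and without preallocation, and dividing by the number of bytes downloaded, should show how far the stored data is from 1x.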
