Padding is hard

From http://dave.cheney.net/2015/10/09/padding-is-hard

I left a number of open questions for the reader. I thought it might be fun to discuss them here.

Thanks for the post!

I followed everything (“yep, makes sense”, “yep, that’s what I expected”, “yep”, “yep”). Or so I thought, because now the footnotes have me confused.

The first one in particular…

The answer is that while empty struct{} values consume no storage, you can take their address. That is, if you have a type

type T struct {
      X uint32
      Y struct{}
}

var t T

It is perfectly valid to take the address of t.Y, and as such, the address of t.Y would point beyond the end of the struct! (The explanation for why this is not a violation of Go’s memory safety guarantee is left as an exercise for the reader.)
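For anyone who wants to check the layout on their own toolchain, here is a minimal sketch using `unsafe.Offsetof` and `unsafe.Sizeof`. The helper name `layout` is made up for illustration. It assumes a gc toolchain that includes the padding described in the post, so the exact size may differ on other compilers or older releases.

```go
package main

import (
	"fmt"
	"unsafe"
)

// T mirrors the struct from the post: a 4-byte field followed by a
// zero-width field.
type T struct {
	X uint32
	Y struct{}
}

// layout (hypothetical helper) reports the offset of Y inside T and the
// total size of T as seen by the compiler.
func layout() (offsetY, size uintptr) {
	var v T
	return unsafe.Offsetof(v.Y), unsafe.Sizeof(v)
}

func main() {
	off, size := layout()
	fmt.Printf("offset of Y: %d, size of T: %d\n", off, size)
	// If the struct is padded, the address of Y falls inside the allocation.
	fmt.Println("&t.Y stays inside t:", off < size)
}
```

If the offset of Y is smaller than the size of T, the address of t.Y lands inside the struct's own storage rather than one past its end.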

Are you saying that it’s valid to take the address of t.Y before or after issue #9401 (“runtime: pointer to struct field can point beyond struct allocation”, golang/go on GitHub) was fixed?

Before reading the footnote, I interpreted that as “before the fix”, which is what I guess the footnote is getting at: taking the address of t.Y is safe because writes to that invalid address cause zero bytes to be written.

What got me confused is this:

var t *struct{}
*t = struct{}{}

does generate a panic because the nil pointer dereference check still happens.

It’s interesting that in 9401 the problem is not the access beyond the end of the struct as such, but the fact that this was causing a crash in the garbage collector because the pointer was not found.

Am I lost?

Let me rewrite your example

var t *struct{ A, B, C int }
*t = struct{ A, B, C int }{ 1, 2, 3}

t is a pointer to struct{ A, B, C int }. It is uninitialised, so it has the zero value for a pointer, nil. *t will dereference nil and panic.

What you probably wanted to write was

var t struct{}
tp := &t // tp is a pointer to t
*tp = struct{}{} // dereference tp, and overwrite it

I actually wanted to write what I wrote :smile:

What I didn’t manage to convey is that even if zero bytes are written to the nil address (there’s no “put these bytes at address zero”), the nil pointer dereference still takes place. I’m probably thinking too much in terms of C, where memcpy(0, 0, 0) won’t generate a segmentation fault because neither the src nor the dst pointers are dereferenced.

But don’t leave me hanging!

Is it valid because no bytes are ever written? Or is it something else?

How is that different to

var i *int
*i = 0

The compiler is going to break down every part of the expression

*i = 0

becomes

tmp := *i  // boom
tmp = 0

memcpy(0, 0, 0) won’t generate a segmentation fault because neither the src nor the dst pointers are dereferenced.

This is a side effect of the number of bytes to copy being used to drive the loop, the check that n > 0 will cause the loop inside memcpy to be executed zero times, so nothing will be loaded from the source addr. I suspect there are probably programs that depend on this side effect.
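To make the loop argument concrete, here is a rough Go analogue of a byte-at-a-time copy. `copyN` is a made-up name and this is not how any real C library implements memcpy. It only illustrates that when n is zero the loop body never runs, so neither slice is indexed.

```go
package main

import "fmt"

// copyN (hypothetical) copies n bytes from src to dst one at a time and
// returns how many bytes it copied. The count n drives the loop, so with
// n == 0 neither dst nor src is ever touched.
func copyN(dst, src []byte, n int) int {
	copied := 0
	for i := 0; i < n; i++ {
		dst[i] = src[i]
		copied++
	}
	return copied
}

func main() {
	// Both slices are nil, but nothing is indexed because n is 0.
	fmt.Println(copyN(nil, nil, 0))

	dst := make([]byte, 3)
	fmt.Println(copyN(dst, []byte("abc"), 3), string(dst))
}
```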

Yep, I get that. I was trying to illustrate…

I didn’t want to go straight for

	typedef struct {} S;
	S *i = 0;
	S j;
	*i = j;

because that’s a GNU extension.

In that program, there’s no access to an invalid memory address because there are no bytes to copy. Memory is not written.

The program in your counter argument is invalid because it’s writing something to an invalid address. This is why I’m (still) guessing that what you left as an exercise for the reader is legal because, in Go’s memory model, it doesn’t matter that you can take an address into that specific memory location: there’s no way to write something there.

Like I said, I might be thinking too much in terms of C.

The part that was left for the reader is why is it safe and legal to take the address of t.Y even if it is known to point beyond the end of the struct, and possibly beyond mapped memory?

And my answer is still because it points to a memory region of 0 width, so no accesses are possible.

In other words, what difference does it make that a pointer points to an invalid address, if that pointer cannot be used to access that address?

But the fact you still haven’t said that that’s the reason makes me guess that that’s not the reason :smile:

The fact that a dereference of a pointer that points to a memory region of 0 width (making the dereference useless) is nevertheless checked reinforces my guess that this is not the reason.

And my answer is still because it points to a memory region of 0 width, so no accesses are possible.

Sure you can access it

type T struct {
      A int
      B struct{}
}

var t T
x := &t.B
b := *x
fmt.Println(b)

In other words, what difference does it make that a pointer points to an invalid address, if that pointer cannot be used to access that address?

It can be used, see above.
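A self-contained version of the snippet above, for anyone who wants to run it. The function name `readB` is only for illustration and is not part of the original example.

```go
package main

import "fmt"

// T has a zero-width field as its last member.
type T struct {
	A int
	B struct{}
}

// readB (hypothetical helper) takes the address of the zero-width field
// and reads the value back through that pointer.
func readB(v *T) struct{} {
	x := &v.B
	return *x
}

func main() {
	var v T
	b := readB(&v)
	fmt.Println(b) // prints {}
}
```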

No, it’s not used. The fact you have a pointer to an invalid address doesn’t mean you are accessing the invalid address. And in the specific case we are talking about, the fact that you are dereferencing a pointer to that memory address, doesn’t mean that you are accessing the contents at that address.

Consider this:

	type T struct { }
	p := (*T)(unsafe.Pointer(uintptr(1)))
	*p = T{}

Yes, I know, I just forfeited Go’s memory safety guarantee by reaching for unsafe. I understand that, but I couldn’t figure out a way of explicitly assigning a specific, invalid address to p without going through unsafe. For argument’s sake, let’s just assume that an invalid address like that landed in p, as was the case before 1.5.

I absolutely hate to reach for this argument, but this program runs:

http://play.golang.org/p/W5JE0ZltJn

while this:

	type T struct { n int }
	p := (*T)(unsafe.Pointer(uintptr(1)))
	*p = T{}

doesn’t:

http://play.golang.org/p/xLbQHrYSuk

It is clear to me why the second one panics, no need to go there.

Where we do not seem to agree is on why the first one doesn’t.

What I’m saying is that the first program does not panic because the program never ever touches the invalid memory address.

As far as I can tell, this is not just an optimization. No code is generated for the “copy” in *p = T{} because literally nothing is getting copied. The code that is generated is for checking that a nil pointer is not dereferenced. Same goes for reading.

Am I misunderstanding what you are saying?

I think the disagreement is that I use examples like this

var t struct{}
tp := &t // address of t

but you use examples of the form

var t *struct{} // nil

The question for the reader is

The part that was left for the reader is why is it safe and legal to take the address of t.Y even if it is known to point beyond the end of the struct, and possibly beyond mapped memory?

Which starts with the structure

type T struct {
       X int
       Y struct{}
}

var t T

t.X consumes some space, and t.Y consumes no space, but by definition the addresses of t.X and t.Y are not the same, so &t.Y points to some location directly after the end of t.X. Why is that not a problem?

Note that &t.X points to valid memory, because t is of type T, not *T – it’s a value.

I’m sorry if you’re asking a different question, I don’t understand that question.

OK, I see what the problem is.

To quote the blog post

It is perfectly valid to take the address of t.Y, and as such, the address of t.Y would point beyond the end of the struct!
The explanation for why this is not a violation of Go’s memory safety guarantee is left as an exercise for the reader.

And the answer is: the type of t.Y is struct{}, so even if &t.Y did overlap some other memory, it would do so with a type that has zero width, so writes to that value would overwrite zero bytes.
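A small sketch of that point: a store through &t.Y writes a zero-width value, so it cannot disturb the neighbouring field. `writeThroughY` is a made-up helper name.

```go
package main

import "fmt"

// T mirrors the struct from the blog post.
type T struct {
	X uint32
	Y struct{}
}

// writeThroughY (hypothetical helper) stores a value through a pointer to
// the zero-width field. The stored value occupies zero bytes.
func writeThroughY(v *T) {
	p := &v.Y
	*p = struct{}{}
}

func main() {
	v := T{X: 42}
	writeThroughY(&v)
	fmt.Println(v.X) // X is untouched: prints 42
}
```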

I think what you are showing with your examples is what caused the garbage collector to trip up in issue 9401.
