
Stack vs heap allocation of structs in Go, and how they relate to garbage collection

I'm new to Go and I'm experiencing a bit of cognitive dissonance between C-style programming, where automatic variables live on the stack and allocated memory lives on the heap, and Python-style programming, where the only things that live on the stack are references/pointers to objects on the heap.

As far as I can tell, the two following functions give the same output:

func myFunction() (*MyStructType, error) {
    var chunk *MyStructType = new(MyStructType)

    // ...

    return chunk, nil
}


func myFunction() (*MyStructType, error) {
    var chunk MyStructType

    // ...

    return &chunk, nil
}

i.e., allocate a new struct and return it.

If I'd written that in C, the first one would have put an object on the heap and the second would have put it on the stack. The first would return a pointer to the heap, the second would return a pointer to the stack, which would have evaporated by the time the function had returned, which would be a Bad Thing.

If I'd written it in Python (or many other modern languages except C#), example 2 would not have been possible.

I get that Go garbage collects both values, so both of the above forms are fine.

To quote:

"Note that, unlike in C, it's perfectly OK to return the address of a local variable; the storage associated with the variable survives after the function returns. In fact, taking the address of a composite literal allocates a fresh instance each time it is evaluated, so we can combine these last two lines."

Source: http://golang.org/doc/effective_go.html#functions

But it raises a couple of questions.

1. In example 1, the struct is declared on the heap. What about example 2? Is that declared on the stack in the same way it would be in C, or does it go on the heap too?
2. If example 2 is declared on the stack, how does it stay available after the function returns?
3. If example 2 is actually declared on the heap, how is it that structs are passed by value rather than by reference? What's the point of pointers in this case?


Sonia

It's worth noting that the words "stack" and "heap" do not appear anywhere in the language spec. Your question is worded with "...is declared on the stack," and "...declared on the heap," but note that Go declaration syntax says nothing about stack or heap.

That technically makes the answer to all of your questions implementation dependent. In practice, of course, there is a stack (per goroutine!) and a heap, and some things go on the stack and some on the heap. In some cases the compiler follows rigid rules (like "new always allocates on the heap") and in others the compiler does "escape analysis" to decide whether an object can live on the stack or must be allocated on the heap.

In your example 2, escape analysis would show the pointer to the struct escaping, and so the compiler would have to allocate the struct on the heap. I think the current implementation of Go follows a rigid rule in this case, however, which is that if the address is taken of any part of a struct, the struct goes on the heap.
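To make the escaping versus non-escaping distinction concrete, here is a minimal sketch. The type point and the two function names are made up for illustration, and where each value actually ends up is the compiler's decision, so treat the comments as what escape analysis permits rather than what any particular toolchain guarantees.

```go
package main

import "fmt"

// point is a hypothetical type used only for this illustration.
type point struct {
	x, y int
}

// sumLocal takes the address of a local, but the pointer never leaves
// the function, so a compiler doing escape analysis is free to keep p
// on the stack.
func sumLocal() int {
	p := point{x: 1, y: 2}
	q := &p
	return q.x + q.y
}

// newPoint returns the address of a local. The value has to outlive the
// call, so the compiler cannot leave it in this function's stack frame
// unless it can prove (for example after inlining) that the pointer
// does not escape the caller either.
func newPoint() *point {
	p := point{x: 3, y: 4}
	return &p
}

func main() {
	fmt.Println(sumLocal())   // 3
	fmt.Println(newPoint().x) // 3
}
```

Both functions are valid Go; the difference is only in where the storage may live, which the language leaves to the implementation.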

For question 3, we risk getting confused about terminology. Everything in Go is passed by value, there is no pass by reference. Here you are returning a pointer value. What's the point of pointers? Consider the following modification of your example:

type MyStructType struct{}

func myFunction1() (*MyStructType, error) {
    var chunk *MyStructType = new(MyStructType)
    // ...
    return chunk, nil
}

func myFunction2() (MyStructType, error) {
    var chunk MyStructType
    // ...
    return chunk, nil
}

type bigStruct struct {
    lots [1e6]float64
}

func myFunction3() (bigStruct, error) {
    var chunk bigStruct
    // ...
    return chunk, nil
}

I modified myFunction2 to return the struct rather than the address of the struct. Compare the assembly output of myFunction1 and myFunction2 now:

--- prog list "myFunction1" ---
0000 (s.go:5) TEXT    myFunction1+0(SB),$16-24
0001 (s.go:6) MOVQ    $type."".MyStructType+0(SB),(SP)
0002 (s.go:6) CALL    ,runtime.new+0(SB)
0003 (s.go:6) MOVQ    8(SP),AX
0004 (s.go:8) MOVQ    AX,.noname+0(FP)
0005 (s.go:8) MOVQ    $0,.noname+8(FP)
0006 (s.go:8) MOVQ    $0,.noname+16(FP)
0007 (s.go:8) RET     ,

--- prog list "myFunction2" ---
0008 (s.go:11) TEXT    myFunction2+0(SB),$0-16
0009 (s.go:12) LEAQ    chunk+0(SP),DI
0010 (s.go:12) MOVQ    $0,AX
0011 (s.go:14) LEAQ    .noname+0(FP),BX
0012 (s.go:14) LEAQ    chunk+0(SP),BX
0013 (s.go:14) MOVQ    $0,.noname+0(FP)
0014 (s.go:14) MOVQ    $0,.noname+8(FP)
0015 (s.go:14) RET     ,

Don't worry that the myFunction1 output here is different from that in peterSO's (excellent) answer. We're obviously running different compilers. Otherwise, see that I modified myFunction2 to return MyStructType rather than *MyStructType. The call to runtime.new is gone, which in some cases would be a good thing. Hold on though, here's myFunction3:

--- prog list "myFunction3" ---
0016 (s.go:21) TEXT    myFunction3+0(SB),$8000000-8000016
0017 (s.go:22) LEAQ    chunk+-8000000(SP),DI
0018 (s.go:22) MOVQ    $0,AX
0019 (s.go:22) MOVQ    $1000000,CX
0020 (s.go:22) REP     ,
0021 (s.go:22) STOSQ   ,
0022 (s.go:24) LEAQ    chunk+-8000000(SP),SI
0023 (s.go:24) LEAQ    .noname+0(FP),DI
0024 (s.go:24) MOVQ    $1000000,CX
0025 (s.go:24) REP     ,
0026 (s.go:24) MOVSQ   ,
0027 (s.go:24) MOVQ    $0,.noname+8000000(FP)
0028 (s.go:24) MOVQ    $0,.noname+8000008(FP)
0029 (s.go:24) RET     ,

Still no call to runtime.new, and yes it really works to return an 8MB object by value. It works, but you usually wouldn't want to. The point of a pointer here would be to avoid pushing around 8MB objects.
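As a rough sketch of that trade-off, the snippet below reuses bigStruct from above and adds three helper functions (setByValue, setByPointer and bigSize are names invented here, not part of the original example). It shows that a by-value parameter is a full copy, while a pointer parameter is a small copy that still refers to the caller's struct.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bigStruct struct {
	lots [1e6]float64
}

// setByValue receives a copy of the whole struct, so the caller's
// value is left untouched.
func setByValue(b bigStruct) {
	b.lots[0] = 1
}

// setByPointer receives a copy of the pointer only; the copy still
// points at the caller's struct.
func setByPointer(b *bigStruct) {
	b.lots[0] = 1
}

// bigSize reports how many bytes one bigStruct value occupies.
func bigSize() uintptr {
	return unsafe.Sizeof(bigStruct{})
}

func main() {
	var b bigStruct

	setByValue(b)
	fmt.Println(b.lots[0]) // 0: only the copy was modified

	setByPointer(&b)
	fmt.Println(b.lots[0]) // 1: the original was modified

	fmt.Println(bigSize())         // 8000000 bytes copied on every by-value pass
	fmt.Println(unsafe.Sizeof(&b)) // size of a pointer: one machine word
}
```

Everything is still passed by value in both calls; what changes is whether the value being copied is the 8MB struct or just its address.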


Excellent thanks. I wasn't really asking "what's the point of pointers at all", it was more like "what's the point of pointers when values appear to behave like pointers", and that case is rendered moot by your answer anyway.
A short explanation of the assembly would be appreciated.
So does new actually always allocate on the heap?
Very nice answer at the beginning, but I couldn't get the point. So, at least in this implementation, all three examples use the heap? What's the difference between the presence and absence of runtime.new? Also, why is it good if runtime.new is gone?
peterSO
type MyStructType struct{}

func myFunction1() (*MyStructType, error) {
    var chunk *MyStructType = new(MyStructType)
    // ...
    return chunk, nil
}

func myFunction2() (*MyStructType, error) {
    var chunk MyStructType
    // ...
    return &chunk, nil
}

In both cases, current implementations of Go would allocate memory for a struct of type MyStructType on the heap and return its address. The functions are equivalent; the compiler asm source is the same.

--- prog list "myFunction1" ---
0000 (temp.go:9) TEXT    myFunction1+0(SB),$8-12
0001 (temp.go:10) MOVL    $type."".MyStructType+0(SB),(SP)
0002 (temp.go:10) CALL    ,runtime.new+0(SB)
0003 (temp.go:10) MOVL    4(SP),BX
0004 (temp.go:12) MOVL    BX,.noname+0(FP)
0005 (temp.go:12) MOVL    $0,AX
0006 (temp.go:12) LEAL    .noname+4(FP),DI
0007 (temp.go:12) STOSL   ,
0008 (temp.go:12) STOSL   ,
0009 (temp.go:12) RET     ,

--- prog list "myFunction2" ---
0010 (temp.go:15) TEXT    myFunction2+0(SB),$8-12
0011 (temp.go:16) MOVL    $type."".MyStructType+0(SB),(SP)
0012 (temp.go:16) CALL    ,runtime.new+0(SB)
0013 (temp.go:16) MOVL    4(SP),BX
0014 (temp.go:18) MOVL    BX,.noname+0(FP)
0015 (temp.go:18) MOVL    $0,AX
0016 (temp.go:18) LEAL    .noname+4(FP),DI
0017 (temp.go:18) STOSL   ,
0018 (temp.go:18) STOSL   ,
0019 (temp.go:18) RET     ,

From the "Calls" section of the language spec:

"In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return parameters of the function are passed by value back to the calling function when the function returns."

All function and return parameters are passed by value. The return parameter value with type *MyStructType is an address.
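A small sketch of what that means in practice. Here MyStructType is given a field n and the constructor is called newChunk; both are additions for this illustration only (with an empty struct, pointers to distinct variables are not guaranteed to compare unequal, so a field is needed to make the point).

```go
package main

import "fmt"

// MyStructType has a hypothetical field n so that instances can be
// told apart; the struct in the question is empty.
type MyStructType struct {
	n int
}

// newChunk returns the address of a local variable. The pointer value
// is copied back to the caller, and the struct it points at stays
// valid after the function returns.
func newChunk(n int) *MyStructType {
	var chunk MyStructType
	chunk.n = n
	return &chunk
}

func main() {
	a := newChunk(1)
	b := newChunk(2)

	fmt.Println(a.n, b.n) // 1 2
	fmt.Println(a != b)   // true: each call produced its own instance
}
```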


Thanks very much! Upvoted, but I'm accepting Sonia's because of the bit about escape analysis.
peterSO, how are you and @Sonia producing that assembly? You both have the same formatting. I can't produce it regardless of command/flags, having tried objdump, go tool, otool.
Ah, got it - gcflags.
Gustavo Chaín

According to Go's FAQ:

if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors.


user

You don't always know if your variable is allocated on the stack or heap. ... If you need to know where your variables are allocated pass the "-m" gc flag to "go build" or "go run" (e.g., go run -gcflags -m app.go).

Source: http://devs.cloudimmunity.com/gotchas-and-common-mistakes-in-go-golang/index.html#stack_heap_vars


muthukumar selvaraj
func Function1() (*MyStructType, error) {
    var chunk *MyStructType = new(MyStructType)

    // ...

    return chunk, nil
}


func Function2() (*MyStructType, error) {
    var chunk MyStructType

    // ...

    return &chunk, nil
}

Function1 and Function2 may be inlined. In that case the returned variable may not escape, and it is not necessary to allocate it on the heap.

My example code:

package main

type S struct {
    x int
}

func main() {
    F1()
    F2()
    F3()
}

func F1() *S {
    s := new(S)
    return s
}

func F2() *S {
    s := S{x: 10}
    return &s
}

func F3() S {
    s := S{x: 9}
    return s
}

According to the output of this command:

go run -gcflags -m test.go

output:

# command-line-arguments
./test.go:13:6: can inline F1
./test.go:18:6: can inline F2
./test.go:23:6: can inline F3
./test.go:7:6: can inline main
./test.go:8:4: inlining call to F1
./test.go:9:4: inlining call to F2
./test.go:10:4: inlining call to F3
/var/folders/nr/lxtqsz6x1x1gfbyp1p0jy4p00000gn/T/go-build333003258/b001/_gomod_.go:6:6: can inline init.0
./test.go:8:4: main new(S) does not escape
./test.go:9:4: main &s does not escape
./test.go:14:10: new(S) escapes to heap
./test.go:20:9: &s escapes to heap
./test.go:19:2: moved to heap: s

If the compiler is smart enough, F1(), F2() and F3() may not be called at all, because calling them has no effect.

Don't worry about whether a variable is allocated on the heap or the stack; just use it. Protect it with a mutex or a channel if necessary.


You can always put //go:noinline before a function to prevent inlining when testing this out. The question is really more about clarifying the concept in the case where the compiler doesn't opt for inlining.
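For anyone who wants to try that, here is a sketch of the example above with the directive applied to F2 and F3. Running it with go run -gcflags -m should then report the escape decisions for the functions themselves rather than for their inlined copies in main; the exact diagnostics vary between Go versions, so none are reproduced here.

```go
package main

import "fmt"

type S struct {
	x int
}

// F2 returns the address of a local. With inlining disabled, the
// compiler cannot fold the body into main, so s has to outlive the call.
//
//go:noinline
func F2() *S {
	s := S{x: 10}
	return &s
}

// F3 returns the struct by value, so the caller receives a copy and
// nothing needs to outlive the call.
//
//go:noinline
func F3() S {
	s := S{x: 9}
	return s
}

func main() {
	fmt.Println(F2().x, F3().x) // 10 9
}
```

Note that the directive has to be written exactly as //go:noinline, with no space after the slashes, directly above the function it applies to.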