Slices of structs vs. slices of pointers to structs

I often work with slices of structs. Here's an example for such a struct:

type MyStruct struct {
    val1, val2, val3    int
    text1, text2, text3 string
    list                []SomeType
}

So I define my slices as follows:

[]MyStruct

Let's say I have about a million elements in there and I'm working heavily with the slice:

- I append new elements often. (The total number of elements is unknown.)
- I sort it every now and then.
- I also delete elements (although not as much as adding new elements).
- I read elements often and pass them around (as function arguments).
- The content of the elements themselves doesn't get changed.

My understanding is that this leads to a lot of shuffling around of the actual struct. The alternative is to create a slice of pointers to the struct:

[]*MyStruct

Now the structs remain where they are and we only deal with pointers which I assume have a smaller footprint and will therefore make my operations faster. But now I'm giving the garbage collector a lot more work.

Can you provide general guidelines of when to work with structs directly vs. when to work with pointers to structs?

Should I worry about how much work I leave to the GC?

Is the performance overhead of copying a struct vs. copying a pointer negligible?

Maybe a million elements is not much. How does all of this change when the slice gets much bigger (but still fits in RAM, of course)?

Your example struct is 12 words (1 per int, 2 per string, 3 for the slice), the pointer is 1. It's the deletes that concern me most, because each will require shifting, on average, half the array. If you could delete an element by swapping it with the last one in the slice and shrinking the slice by 1, or by zeroing a struct field or pointer, those would be constant-time. My intuition also favors pointers if the struct is largish and you're doing a lot with the slice.
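The two delete strategies from the comment above could look roughly like this. It is a sketch only: the function names are made up for illustration, and the generic signatures assume Go 1.18 or later.

```go
package main

import "fmt"

// deleteOrdered removes s[i] and preserves order. Every element after i is
// shifted one slot to the left, so the cost grows with the slice length.
func deleteOrdered[T any](s []T, i int) []T {
	copy(s[i:], s[i+1:])
	var zero T
	s[len(s)-1] = zero // clear the vacated slot so the GC can reclaim what it referenced
	return s[:len(s)-1]
}

// deleteSwap removes s[i] in constant time by moving the last element into
// its place. The order of the remaining elements is not preserved.
func deleteSwap[T any](s []T, i int) []T {
	last := len(s) - 1
	s[i] = s[last]
	var zero T
	s[last] = zero // clear the vacated slot
	return s[:last]
}

func main() {
	a := []string{"a", "b", "c", "d", "e"}
	fmt.Println(deleteOrdered(a, 1)) // [a c d e]

	b := []string{"a", "b", "c", "d", "e"}
	fmt.Println(deleteSwap(b, 1)) // [a e c d]
}
```

Since the slice in the question is re-sorted now and then anyway, the swap variant may be acceptable between sorts.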
FWIW, at the bottom here are some considerations for choosing between []T and []*T--most rehash what folks have said here, but maybe some others factor in (say the concern about holding on to a pointer into a slice after it is reallocated by append).
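The reallocation concern mentioned above can be reproduced in a few lines. This is an illustrative sketch with a made-up point type: a pointer taken into a []T goes stale once append moves the data, while a pointer stored in a []*T keeps referring to the same struct.

```go
package main

import "fmt"

type point struct{ x int }

// stalePointer takes a pointer into a []point, then forces append to
// reallocate. The write through p lands in the old backing array.
func stalePointer() (inSlice, viaPointer int) {
	s := make([]point, 1, 1) // len 1, cap 1: the next append must reallocate
	p := &s[0]
	s = append(s, point{}) // elements are copied to a new backing array
	p.x = 42               // modifies the old copy only
	return s[0].x, p.x
}

// stablePointer does the same with a []*point. Only the pointers are
// copied on reallocation, so p and s[0] still refer to the same struct.
func stablePointer() (inSlice, viaPointer int) {
	s := make([]*point, 1, 1)
	s[0] = &point{}
	p := s[0]
	s = append(s, &point{})
	p.x = 42
	return s[0].x, p.x
}

func main() {
	fmt.Println(stalePointer())  // 0 42
	fmt.Println(stablePointer()) // 42 42
}
```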
Thank you for these hints. That last discussion (via @twotwotwo) is particularly helpful as it lists common scenarios and pitfalls to watch out for.

Russ Egan

Just got curious about this myself. Ran some benchmarks:

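// Save in a file ending in _test.go and run with: go test -bench=.
// (The package name below is arbitrary.)
package bench

import "testing"

type MyStruct struct {
    F1, F2, F3, F4, F5, F6, F7 string
    I1, I2, I3, I4, I5, I6, I7 int64
}

// BenchmarkAppendingStructs appends b.N struct values; each append copies
// the whole struct into the slice.
func BenchmarkAppendingStructs(b *testing.B) {
    var s []MyStruct

    for i := 0; i < b.N; i++ {
        s = append(s, MyStruct{})
    }
}

// BenchmarkAppendingPointers appends b.N pointers; each iteration
// heap-allocates one struct and copies only the pointer.
func BenchmarkAppendingPointers(b *testing.B) {
    var s []*MyStruct

    for i := 0; i < b.N; i++ {
        s = append(s, &MyStruct{})
    }
}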

Results:

BenchmarkAppendingStructs  1000000        3528 ns/op
BenchmarkAppendingPointers 5000000         246 ns/op

Takeaways: we're in nanoseconds per operation, so the difference is probably negligible for small slices. But over millions of appends it adds up: at these rates, a million appends take roughly 3.5 seconds with structs versus about 0.25 seconds with pointers.

Btw, I tried running the benchmark again with slices which were pre-allocated (with a capacity of 1000000) to eliminate overhead from append() periodically copying the underlying array. Appending structs dropped by about 1000 ns/op; appending pointers didn't change at all.
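To make the pre-allocation effect visible without timing anything, here is a small sketch that counts how often append has to move the data. countReallocs is a hypothetical helper, not part of the benchmarks above, and the count without pre-allocation depends on the growth policy of the Go version in use.

```go
package main

import "fmt"

type MyStruct struct {
	F1, F2, F3, F4, F5, F6, F7 string
	I1, I2, I3, I4, I5, I6, I7 int64
}

// countReallocs appends n elements to a slice created with the given
// capacity hint and reports how many times append had to move the data
// to a larger backing array (detected as a change in cap).
func countReallocs(n, capHint int) int {
	s := make([]MyStruct, 0, capHint)
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, MyStruct{})
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	const n = 100000
	fmt.Println("reallocations without pre-allocation:", countReallocs(n, 0))
	fmt.Println("reallocations with pre-allocation:   ", countReallocs(n, n))
}
```

Each reallocation copies every element appended so far, which costs more for a large struct than for a pointer. That would fit with the struct benchmark benefiting from pre-allocation while the pointer benchmark barely moved.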


I took this a step further (in addition to preallocating the slices) and appended non-empty structs with random data. Pointers were ~10% slower than structs:

BenchmarkAppendingStructs-8   5000000   387 ns/op
BenchmarkAppendingPointers-8  3000000   422 ns/op

So is there no downside to using a slice of pointers? If so, why don't I see it used more often than a slice of structs?
How do you benchmark methods like this? Just time them?
The Go toolchain has benchmarking built in. The "Benchmark***" functions above are recognized by go test as benchmarks; run them with go test -bench=. (add -benchmem to also report allocations per operation). See the Go docs for the testing package.
The benchmark shows the immediate benefit of using pointers over structs, but how does one measure the long-term GC impact?
Evan

Can you provide general guidelines of when to work with structs directly vs. when to work with pointers to structs?

No, it depends too much on all the other factors you've already mentioned.

The only real answer is: benchmark and see. Every case is different and all the theory in the world doesn't make a difference when you've got actual timings to work with.

(That said, my intuition would be to use pointers, and possibly a sync.Pool to aid the garbage collector: http://golang.org/pkg/sync/#Pool)
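A minimal sketch of what the sync.Pool suggestion could look like. The acquire and release helpers and the reduced MyStruct are assumptions for illustration. A pool only pays off for values that are actually given back, for example when elements are deleted from the slice and new ones are appended later.

```go
package main

import (
	"fmt"
	"sync"
)

// MyStruct is a reduced stand-in for the struct under discussion.
type MyStruct struct {
	F1 string
	I1 int64
}

// pool hands out *MyStruct values, creating a new one only when it has
// nothing to reuse.
var pool = sync.Pool{
	New: func() interface{} { return new(MyStruct) },
}

// acquire returns a zeroed *MyStruct, reused from the pool when possible.
func acquire() *MyStruct {
	m := pool.Get().(*MyStruct)
	*m = MyStruct{} // reset: a recycled value still holds its old fields
	return m
}

// release gives m back to the pool. The caller must not use m afterwards.
func release(m *MyStruct) {
	pool.Put(m)
}

func main() {
	m := acquire()
	m.F1, m.I1 = "hello", 1
	fmt.Println(m.F1, m.I1)
	release(m) // e.g. after deleting the element from the slice

	n := acquire() // may or may not be the same allocation as m
	fmt.Println(n.F1 == "", n.I1)
}
```

The pool is free to drop idle values at any time, so it reduces allocations only on a best-effort basis.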


Big_Boulard

Unlike maps, slices, channels, functions, and methods, struct variables are passed by copy, which means more memory is copied behind the scenes every time one is passed around. On the other hand, using fewer pointers results in less work for the garbage collector. From my perspective, I would think about three things: the complexity of the struct, the quantity of data to handle, and the functional need once you've created your variable (does it need to be mutable when it's passed into a function? etc.).
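One way to see the garbage collector side of this trade-off is to count heap allocations. The sketch below uses runtime.ReadMemStats for that; the helper names are made up, and the exact counts may vary slightly because the runtime allocates a little on its own.

```go
package main

import (
	"fmt"
	"runtime"
)

type MyStruct struct {
	F1, F2, F3, F4, F5, F6, F7 string
	I1, I2, I3, I4, I5, I6, I7 int64
}

// Package-level sinks keep the slices reachable, so the compiler cannot
// optimize the allocations away.
var (
	valueSink   []MyStruct
	pointerSink []*MyStruct
)

// mallocsDuring reports how many heap objects were allocated while f ran.
func mallocsDuring(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs
}

// buildValues fills a pre-allocated []MyStruct: one large allocation.
func buildValues(n int) {
	valueSink = make([]MyStruct, 0, n)
	for i := 0; i < n; i++ {
		valueSink = append(valueSink, MyStruct{I1: int64(i)})
	}
}

// buildPointers fills a pre-allocated []*MyStruct: one allocation for the
// slice plus one per element.
func buildPointers(n int) {
	pointerSink = make([]*MyStruct, 0, n)
	for i := 0; i < n; i++ {
		pointerSink = append(pointerSink, &MyStruct{I1: int64(i)})
	}
}

func main() {
	const n = 100000
	fmt.Println("heap objects, []MyStruct: ", mallocsDuring(func() { buildValues(n) }))
	fmt.Println("heap objects, []*MyStruct:", mallocsDuring(func() { buildPointers(n) }))
}
```

Every separately allocated struct is one more object for the collector to track and trace, which is the extra work referred to above.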

