Slices are flexible and efficient, but their implementation has some quirky behaviours, leading to unexpected results.
Let us see how they really work and how to avoid running into problems.
Understanding slices
A couple of days ago, this answer appeared on my X feed:
https://x.com/likely_a_DFS/status/1793399293312966777
It made me wonder whether the internal operation of slices is actually well known by many Go developers, especially beginners. I figured that would make for an interesting explanation, so here we go, line by line.
What’s in a slice, really ?
The actual runtime implementation of slices lives in runtime/slice.go
:
|
|
The runtime also defines a runtime.notInHeapSlice
similar type,
for limited use cases outside the control of the garbage collector.
These two equivalent types, both of which represent slices at run time,
carry no method, and are hence just plain three-field structs :
- a pointer, either using the public
unsafe.Pointer
type, or the runtime-internalnotInHeap
type len
: the slice length as elements, not bytescap
: the slice capacity as elements too. It is always larger than or equal tolen
.
The expected
We will be going through that code fragment, one statement at a time,
and see how the memory allocation changes with each instruction.
The explanation is taken from both tracing the runtime code and
checking the generated assembly code on Go 1.22.4 for arm64
, amd64
, wasm
, loong64
, and ibm390
for consistency.
Create a slice from a literal
sliceA := []int{1, 2, 3}
- builds the slice manually at compile time:
- allocates an object:
runtime.newobject(typ)
- where
typ.Size == 24
andtyp.Kind_ == abi.Array
- so this is an
array
for 24 bytes = 3 64-bit integers,
- initializes a
runtime.slice
structure as :- the allocated result, for the
array unsafe.Pointer
field len == 3
cap == 3
- and the 3 literal integers 1, 2, and 3
- the allocated result, for the
Append a single element to a full slice
Our sliceA
is full, meaning its len
equals its cap
.
We create sliceB
by appending a single int
to sliceA
:
sliceB := append(sliceA, 4)
- grows the slice invoking
runtime.growslice
witholdPtr
from the existing slice pointernewLen == 4
because we append one valueoldCap == 3
which is the current slice capacitynum == 1
which is the number of elements being appended
- internally,
growslice
- invokes
runtime.nextslicecap
to get the capacity the appended slice needs, getting twice the existing capacity because our slice is small enough - does some checks and concludes that the proper
newCap
is indeed 6 64-bits integers, hence 48 bytes - accordingly, allocates a new 48 bytes space
- clears the part of that space that will not be copied over from the old slice using
runtime.memclrNoHeapPointers
- copies the values in the old array to the new space using
runtime.memmove
- returns a new
runtime.slice
instance allocated from that new space, the receivednewLen
and adjustednewCap
.
- invokes
sliceB
is full too, andsliceA
is unchanged.
Append a single element to a full slice - again
If we do it again to add a different element to sliceA
, the same mechanism happens :
sliceC := append(sliceA, 5)
- goes through the same motions, allocating a new space for the slice array too
sliceC
is full too, andsliceA
is still unchanged.
Update an element in the append
-ed slice
sliceC[0] = 0
- modifies the first
int
in the (new) array backingsliceC
- because that array is not shared, nothing happens to
sliceA
andsliceB
.
Which gives us the first set of results, which pretty much any developer would expect.
|
|
The unexpected
For the second set, the results are a bit different because of the way the initialization is performed.
Create a slice by appending an element to a slice literal
This time, instead of using a three-element slice literal, we will initialize our slice by appending an element to a two-element slice literal instead.
sliceD := append([]int{1, 2}, 3)
- builds the
[]int{1, 2}
slice manually at compile time, following the same steps as the one initializingsliceA
- at this point, the slice has
cap == len == 2
- now the append happens the same way it did to build
sliceB
andsliceC
, resulting in slice[]int{1, 2, 3}
now having a double-size backing array, hence 4 bytes, not just three likesliceA
did
Append a single element to a non-full slice
Our sliceD
is not full, meaning its len
is strictly less that its cap
(3 vs 4).
We create sliceE
by appending a single int
to sliceD
:
sliceE := append(sliceD, 4)
- here, however, the compiler is smart enough to generate code checking whether the size of the backing array in
sliceD
is less than 4 before invokinggrowslice
- because the array is actually of size 4, that call is omitted and the existing array is reused, code only setting the value 4 in the last position of that array
- the new slice reuses that array as its backing store
sliceD
andsliceE
are now sharing that same array with contents[1 2 3 4]
Append a single element to a non-full slice - again
If we do it again to add a different element to sliceD
, the same mechanism happens :
sliceF := append(sliceD, 5)
- the same compiler check again notices that the backing array has length 4, so a call to
growslice
is not needed either - the new slice reuses that array as its backing store, just like
sliceD
andsliceE
- code then sets the value 4 in the last position of that array to 5, meaning all three slices are backed with array
[1 2 3 5]
Update an element in the append
-ed slice
sliceF[0] = 0
- because this is just an in-place update, the code sets position 0 in the array backing
sliceF
, making it contain[0 2 3 5]
- but because that array is also the backing store for
sliceD
andsliceE
they also get that same update.
And that is how we get these possibly surprising results:
|
|
The cautious ways
The slices.Clip
function
Now, if you do not want to care about what is really going on, and prefer having a little bit of extra work done by your code, there are solutions, and one of those is mentioned by the author of the initial post:
https://x.com/likely_a_DFS/status/1793629501911609691
So, how does slices.Clip
work around this problem, exactly ?
As its documentation says, Clip removes unused capacity from the slice
.
In this example, our problem stems from the fact that we initialized sliceD
with some extra capacity,
thus allowing the subsequent append()
calls to reuse the backing array for sliceD
unchanged.
Now, we could either have avoided that extra capacity by building the slice like we did for sliceA
. But sometimes, we have to extend slices we receive and do not control their prexisting backing
array.
In such a case, we could write something like
|
|
Now, with that extra slices.Clip
call, our memory model changes:
And with that we are back at the exact same situation as we had for sliceA
and we get the expected results:
sliceD
, sliceE
, and sliceF
all have independent backing arrays.
The “full slice expression”
Now, could we have done without that extra function call ,
especially considering that slices.Clip
uses generics ?
See, slices.Clip
is actually a tiny function. As it currently exists in Go 1.22.4:
|
|
That slicing syntax with two colons is what the Go spec call a full slice expression . It performs both slicing and truncation of the backing array.
So we could just as well have written our clipping ourselves:
|
|
And that’s it: no extra import, no generic function call.
But since that complete form of the slicing operator appears to be so little known among Go developers,
using the slices.Clip
function may be a better choice for code readability.