# The Hidden Cost of `append`: Why Pre-allocating Slices and Maps Matters in Go
If you've been writing Go for a while, you've probably used `append` thousands of times without thinking twice. It just works. But under the hood, that innocent-looking function might be silently killing your application's performance.
In this post, I'll show you exactly what happens when you skip pre-allocation, demonstrate the performance impact with real benchmarks, and give you practical rules for writing faster Go code.
## What Actually Happens When a Slice Grows
Slices in Go are backed by arrays. When you call `append` and the slice doesn't have enough capacity, Go must:

- Allocate a new, larger array (capacity roughly doubles for small slices, tapering toward ~1.25x growth for larger ones)
- Copy all existing elements to the new array
- Add your new element
- Let the garbage collector reclaim the old array
Here's what that looks like visually:
Appending to a slice at capacity:

```
Before:  [A][B][C][D]                cap=4, len=4
              ↓ append(E)
Step 1:  [_][_][_][_][_][_][_][_]    allocate new array (cap=8)
Step 2:  [A][B][C][D][_][_][_][_]    copy 4 elements
Step 3:  [A][B][C][D][E][_][_][_]    finally add new element
         [A][B][C][D]                → old array awaits garbage collection
```
For a single append, this overhead is negligible. But in a loop adding thousands of elements? You're paying this cost over and over.
## The Benchmark
I wrote a comprehensive benchmark to measure the real-world impact. Here's the core comparison:
```go
// Bad: No pre-allocation - slice grows dynamically
func SliceNoPrealloc(n int) []int {
	var s []int // len=0, cap=0
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// Good: Pre-allocated with exact capacity
func SlicePrealloc(n int) []int {
	s := make([]int, 0, n) // len=0, cap=n
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// Alternative: Direct index assignment (fastest)
func SliceDirectAssign(n int) []int {
	s := make([]int, n) // len=n, cap=n
	for i := 0; i < n; i++ {
		s[i] = i
	}
	return s
}
```
Running `go test -bench=. -benchmem` reveals the damage:
### Slice Results (integers)
| Elements | Method | Time | Memory | Allocations |
|---|---|---|---|---|
| 100 | No prealloc | 1,077 ns | 2,040 B | 8 |
| 100 | Prealloc | 420 ns | 896 B | 1 |
| 10,000 | No prealloc | 154 μs | 357 KB | 19 |
| 10,000 | Prealloc | 29 μs | 82 KB | 1 |
| 1,000,000 | No prealloc | 15.5 ms | 41.7 MB | 38 |
| 1,000,000 | Prealloc | 1.6 ms | 8 MB | 1 |
At one million elements, pre-allocation is 10x faster and uses 5x less memory. The non-pre-allocated version triggers 38 separate allocations, each one copying increasingly large amounts of data.
### Map Results
Maps behave similarly. They rehash and redistribute buckets when the load factor exceeds a threshold:
```go
// Bad: No size hint
m := make(map[int]int)

// Good: Size hint provided
m := make(map[int]int, expectedSize)
```
| Elements | Method | Time | Memory | Allocations |
|---|---|---|---|---|
| 100 | No prealloc | 7,065 ns | 5,364 B | 16 |
| 100 | Prealloc | 3,777 ns | 2,908 B | 6 |
| 10,000 | No prealloc | 769 μs | 687 KB | 276 |
| 10,000 | Prealloc | 392 μs | 322 KB | 11 |
| 100,000 | No prealloc | 7.7 ms | 5.8 MB | 4,016 |
| 100,000 | Prealloc | 4.4 ms | 2.8 MB | 1,681 |
The difference is less dramatic than with slices (maps are more complex internally), but it's still a consistent ~2x improvement in both speed and memory.
## The Real-World Pattern: Transforming Slices
The most common scenario where this matters is transforming one slice into another with some filtering:
```go
// This pattern appears everywhere
func GetActiveUsers(users []User) []User {
	var active []User
	for _, u := range users {
		if u.IsActive {
			active = append(active, u)
		}
	}
	return active
}
```
You have three options to optimize this:
```go
// Option 1: No pre-allocation (baseline)
func TransformNoPrealloc(input []int) []int {
	var result []int
	for _, v := range input {
		if v%2 == 0 {
			result = append(result, v*2)
		}
	}
	return result
}

// Option 2: Two-pass with exact count
func TransformPreallocExact(input []int) []int {
	count := 0
	for _, v := range input {
		if v%2 == 0 {
			count++
		}
	}
	result := make([]int, 0, count)
	for _, v := range input {
		if v%2 == 0 {
			result = append(result, v*2)
		}
	}
	return result
}

// Option 3: Estimate based on heuristics
func TransformPreallocEstimate(input []int) []int {
	// Assume ~50% will match
	result := make([]int, 0, len(input)/2)
	for _, v := range input {
		if v%2 == 0 {
			result = append(result, v*2)
		}
	}
	return result
}
```
The benchmark results surprised me:
| Method | Time | Memory | Allocations |
|---|---|---|---|
| No prealloc | 912 μs | 1.96 MB | 25 |
| Two-pass exact | 334 μs | 401 KB | 1 |
| Estimate (len/2) | 194 μs | 401 KB | 1 |
The estimate method wins because it skips the counting pass. (Here the estimate happens to be exact — half of sequential integers are even — but the point holds more generally.) Slight over-allocation is cheaper than an extra pass through your data. This is a useful insight: when in doubt, over-allocate slightly rather than under-allocate or iterate twice.
## How to Detect Pre-allocation Issues
### 1. Benchmark with Memory Stats

```shell
go test -bench=. -benchmem
```

Look for high `allocs/op` values. If your function is doing 10+ allocations for what should be a single slice build, you have a problem.
### 2. Escape Analysis

```shell
go build -gcflags="-m" yourfile.go
```

This shows where allocations happen:

```
./main.go:19:11: make([]int, 0, n) escapes to heap
./main.go:27:11: make(map[int]int) escapes to heap
```
### 3. Static Analysis with prealloc

```shell
go install github.com/alexkohler/prealloc@latest
prealloc ./...
```

This linter specifically catches the `var s []T` followed by `append` in a loop pattern.
### 4. Runtime Profiling

For production systems, use pprof to find allocation hotspots:

```go
import _ "net/http/pprof" // registers the /debug/pprof handlers

// Somewhere in main, expose them on a side port:
//     go http.ListenAndServe("localhost:6060", nil)

// Then inspect allocations with:
//     go tool pprof http://localhost:6060/debug/pprof/allocs
```
## Practical Rules
After years of optimizing Go code, here are my rules of thumb:
**Rule 1: Known exact size → pre-allocate exactly**

```go
result := make([]T, 0, len(input))
// or if filling sequentially:
result := make([]T, len(input))
```

**Rule 2: Upper bound known → pre-allocate to the upper bound**

```go
// Filtering from input - output can't exceed input length
result := make([]T, 0, len(input))
```

**Rule 3: Estimate available → use the estimate**

```go
// If you know ~30% typically match
result := make([]T, 0, len(input)/3)
```

**Rule 4: Building from an external source → estimate reasonably**

```go
// Reading lines from a file? Guess based on a typical file
lines := make([]string, 0, 1000)
```

**Rule 5: Maps with predictable keys → always hint**

```go
// Indexing users by ID
userMap := make(map[int64]*User, len(users))
```
## When It Doesn't Matter
Pre-allocation isn't always worth the cognitive overhead. Skip it when:
- The slice will have fewer than ~10 elements
- The code isn't in a hot path
- You genuinely have no idea about the size (rare in practice)
- Readability suffers significantly
The goal is writing fast, maintainable code—not obsessing over micro-optimizations in cold paths.
## Conclusion
Pre-allocating slices and maps is one of the simplest performance wins in Go. The pattern is easy to remember: if you know (or can estimate) the size, tell Go upfront with `make([]T, 0, n)` or `make(map[K]V, n)`.
The benchmarks speak for themselves: 10x speed improvements and 5x memory reduction for large slices. In high-throughput systems processing millions of events, this adds up fast.
Next time you write `var s []T` followed by a loop with `append`, pause and ask: do I know roughly how big this will get? If yes, a single `make` call will save you allocations, copies, and GC pressure.
Your future self (and your servers) will thank you.
Want to verify these numbers yourself? Run the benchmarks with `go test -bench=. -benchmem -benchtime=2s` and see the results on your own hardware.