Maps internals in Go
Table of Contents
Idea
Map is passed as a value, but it consists of a pointer to hmap which has all the details on map implementation. So, if you change/add to map, it will be reflected everywhere. And that’s why you cannot assign to an uninitialized map (no memory allocated, no hash seed generated yet).
runtime/map.go
1 2 3 4 5 6 7 8 9 |
type hmap struct { count int // # live cells == size of map. Must be first (used by len() builtin) B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items) hash0 uint32 // hash seed buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0. oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing nevacuate uintptr // progress counter for evacuation (buckets less than this have been evacuated) } |
Buckets
buckets/oldbuckets
point to arrays of buckets. Each bucket can store up to 8 elements. On overflow, it points to the next bucket (it should not happen often as it leads to performance degradation).
Buckets are addressed by LOB Hash (Low Order Bits).
Inside a bucket, keys are hashed by HOB (High Order Bits).
Bucket Overflow
If we add more than 8 values to one buckets, we will run into bucket overflow. In this case, we create a new bucket, and link it from the original one. If we have too many overflows, it slows down operations on a map (O(1) → O(n)). To avoid it, we arrange new buckets set with more buckets. It is called data evacuation.
Data Evacuation
When an average number of filled slots per bucket exceeds the load factor of 6.5, we start the data evacuation. A new bucket set (sizeX2) is arranged, and is linked by hmap.buckets
from now on. The old buckets are linked as hmap.oldbuckets
. Data from old buckets is not moved immediately, as it will degrade the performance in that moment.
Instead, evacuation process is started: on any modification of the map, some portion of data is moved from old buckets to new ones.
This is the reason why adding to a map is not thread safe and why you cannot take a pointer to a map value: it is not guaranteed to have the same memory address over time.
That’s why it’s important to allocate enough memory on a map creation. It will let you avoid the performance penalties caused by data evacuation.
Resources
Good article in Russian:
https://habr.com/ru/articles/457728/
https://habr.com/ru/companies/avito/articles/774618/ (RU)
Nice video (also in Russian):
https://www.youtube.com/watch?v=P_SXTUiA-9Y
Similar Posts
LEAVE A COMMENT
Для отправки комментария вам необходимо авторизоваться.