May 26, 2024 . 6 min

How does sync.WaitGroup work

One of my colleagues asked this question to a candidate during an interview. Although I've heard it asked many times, this time it felt different. I sat there thinking, "Do I really know how sync.WaitGroup works?" The question was about using it, but I realized I didn't fully understand how it worked internally. This weekend, I took some time to look into it (and decided to write about it).

Starting with documentation #

I started by going to the documentation on pkg.go.dev. Although I’m sure I’ve been to this page, I was not sure I have given the text here much attention. The following is the first paragraph of the documentation for WaitGroup.

A WaitGroup waits for a collection of goroutines to finish. The main goroutine calls Add to set the number of goroutines to wait for. Then each of the goroutines runs and calls Done when finished. At the same time, Wait can be used to block until all goroutines have finished.

I knew Wait is used to wait for something (well, waiting for state of the WaitGroup to be 0) to finish, but I did find them specifying that we wait for “goroutine” to finish over and over kinda odd. Ahh, whatever, lets look into the codebase. Unlike documentation, code is never vague.

Let’s take a look at the codebase #

I’m gonna simplify things a bit so that it does not get overwhelming and because I’m a bit lazy.

The documentation led me to the source on cs.opensource.google. Somehow I never noticed that .google was a valid TLD 😅. Anyways, back to our problem at hand. This is what the sync.WaitGroup struct definition looks like:

type WaitGroup struct {
	noCopy noCopy

	state atomic.Uint64
	sema  uint32
}

Well, looks small enough to be understood by a mere mortal. Let’s see what each of these things are.

`noCopy` #

What in the world is that?

You might have seen messages like the following from go vet. These are due to this noCopy thingy.

./main.go:9:17: doStuff passes lock by value: sync.WaitGroup contains sync.noCopy

./main.go:15:10: call of doStuff copies lock value: sync.WaitGroup contains sync.noCopy

Since this is a syncronization primitive, we don’t want the value to be copied when used from different location. We want it to be passed by “reference” so that all of the instances are pointing to the same underlying object.

Well, if you search for noCopy in the codebase, it shows you a bunch of implementations, but this is what is basically is. See https://golang.org/issues/8005#issuecomment-190753527 for details. go vet gives the warning when it sees a Lock implementation on an interface and we just have a struct which a Lock interface here just to make sure go vet gives a warning when copied.

// noCopy may be embedded into structs which must not be copied
// after the first use.
type noCopy struct{}

// Lock is a no-op used by -copylocks checker from `go vet`.
func (*noCopy) Lock()   {}
func (*noCopy) Unlock() {}

`state` #

OK, now that the thing that you have not seen is out of the way, let’s look at a type you might be more familiar with. state, as you might have guessed, stores the "state" of the WaitGroup. It is of type atomic.Uint64, which means, it is a uint64 which you can do atomic operations on.

But what does it store mate? The value we pass in Add?

Well, you are half way there. The top 32 bits store this value and the bottom 32 bits are used to store the number of waiters, ie the number of goroutines where we have called wg.Wait. The reason why we need this information will become apparent when we look at the implementation of Wait.

Before we move on, I want to make sure you are aware of two operations that you can do on atomic values. If you want to atomically add, ie add without ensuring no race conditions you can call Add function and pass in a uint64. If you want to check if a value is equal to something, and only then update to a new value, you can call CompareAndSwap.

`sema` #

This is a semaphore value. Here is a simple explanation those who might not know how it works. I was supposed to have learnt this is college and I remember hearing it, but I had to look it up again to be sure how it worked.

OK, enough backstory, here is what you need to know. You can think of a semaphonre as an integer. If the integer is positive, you have resources available, if the integer is zero (or negative) you don’t.

There are two general operations you would want to do here.

acquire: this check if the value of semaphore is >0 and if so decrements and returns true(success). If not returns false(count not acquire) or go into a hot loop checking again and again. (code)
release: this increments the value of a semaphore to indicate that a resource is available. (code)

In both cases, the actual implementation in is a little more involved in the go codebase so as to park(pause) and unpark(unpause) the goroutines so that we are doing work other than waiting to acquire the semaphore.

Somewhat of a tangent, but Semaphores in Plan 9 paper mentioned in sema.go was an interesting read in case you are interested.

The methods #

OK, now that we have seen the stuct definition, let’s look at the different methods.

`Add(delta int)` #

This, as you might know, is used to add the counter value. Here is a simplified annotated version of the Add function. I’ve removed some code used for race checks and profiling.

You can view the full code here.

func (wg *WaitGroup) Add(delta int) {
	// Add delta to state (PS: delta can be negative)
	state := wg.state.Add(uint64(delta) << 32)

	v := int32(state >> 32) // upper 32 bit is counter
	w := uint32(state)      // lower 32 bit is count of waiters

	// If counter is >0, no need to do anything
	if v > 0 {
		return
	}

	// If there are no waiters, no need do anything
	if w == 0 {
		return
	}

	// Reset waiters count to 0 as we will be releasing all of them
	wg.state.Store(0)

	// Call release as many times as there are waiters so
	// that all of them can "acquire"
	for ; w != 0; w-- {
		runtime_Semrelease(&wg.sema, false, 0)
	}
}

Here is a quick explanation of the code.

First you add the delta value you pass in into state.

The resulting state value is split into counter and waiter count. The upper 32 bits, obtained by doing state >> 32 and then converting to int32 is the value of the counter and the lower 32 is the value of number of waiters.

If value of v > 0, ie counter is > 0, we’ll not do anything else as we only release once it hits 0(as in when we have Done() all the items we added). If it is 0, then we check if we have any waiters to release, if we don’t (ie w == 0), we return.

Now if the counter is 0 and we have waiters to release, first we set the state to 0 to set the number of waiters to 0 as we are going to release all of them.

Once this is done, we call runtime_Semrelease as many times as there are waiters so that every single wg.Wait calls gets an opportunity to acquire the semaphore.

`Done()` #

This is just simply Add(-1). In fact, that is literally the only thing in the code.

func (wg *WaitGroup) Done() {
	wg.Add(-1)
}

`Wait()` #

And the last one. This one, as you might have guessed, tries to acquire the semaphore value. Here, like before, is a simplified annotated version of the code.

Full code available here

func (wg *WaitGroup) Wait() {
	for {
		// Load the state value
		state := wg.state.Load()

		v := int32(state >> 32) // counter
		w := uint32(state)      // waiter count

		// if counter == 0, no need to wait
		if v == 0 {
			return
		}

		// ... else, increment the waiter count
		if wg.state.CompareAndSwap(state, state+1) {
			// and try to acquire the semaphore
			runtime_Semacquire(&wg.sema)
			return
		}
	}
}

Let’s first look at what is inside the for loop. First we get the value of the state and separate them into the value of the counter and the waiter count like in Add.

If the counter value is already 0, there is no reason for us to wait and we just return. If not, we first try to do a CompareAndSwap to increment the value of the counter and then call runtime_Semacquire which will block until we can acquire the semaphore.

Now, the for loop is there just in case CompareAndSwap was unsuccessful and we have to retry doing the increment of worker count. CompareAndSwap operation can fail if the value of state changed between when we did a Load and when we called CompareAndSwap as the value we give to it to compare against is the one we got from Load.

And that is it. Now you(and I) understand how sync.WaitGroup works.

← Home

How does sync.WaitGroup work

Starting with documentation #

Let’s take a look at the codebase #

noCopy #

state #

sema #