Concurrency in Go

Filed under golang on February 26, 2020

Something I’ve really come to appreciate over the last 2 years or so I’ve been using Go is its concurrency constructs. It’s trivial to push something out into another goroutine, and it takes virtually no effort to synchronise your workers.

The most obvious place to start is with the go keyword

package main

import (
    "log"
)

func worker(id int) {
    log.Printf("Hello from worker %d", id)
}

func main() {
    for i := 0; i < 10; i++ {
        go worker(i)
    }
    log.Printf("Goodbye from main")
}

Simply prepend it to a function call and it will push it off to the background for you. Much easier than trying to spin off a bunch of pthreads or writing a large amount of boilerplate to set up a worker pool.

The other cool thing about goroutines is how lightweight they are, you rarely have to think too much about how many you can feasibly spin up (unless you’re interacting with AWS services, I find myself frequently using a worker pool pattern when doing that).

Of course the above code won’t work so great (check it out in the go playground) since the main thread may finish and exit before all our workers are done, so let’s synchronise it

package main

import (
    "log"
    "sync"
)

func worker(id int, wg *sync.WaitGroup) {
    defer wg.Done()
    log.Printf("Hello from worker %d", id)
}

func main() {
    wg := &sync.WaitGroup{}
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go worker(i, wg)
    }
    wg.Wait()
    log.Printf("Goodbye from main")
}

Go playground link

Here, we initialise a WaitGroup object, tell it we have a number of workers, get the workers to report when they’re done, then get the main thread to wait for everyone to finish before exiting.

What about if goroutines need to message each other? We use channels.

Channels are an incredibly simple to use synchronous communication mechanism. Let’s say we needed a control channel to tell our workers to either finish up or keep going

package main

import (
    "log"
    "sync"
    "time"
)

func worker(id int, wg *sync.WaitGroup, control chan bool) {
    defer wg.Done()
    var keepGoing bool
    for {
        keepGoing =<- control
        if !keepGoing {
            break
        }
        log.Printf("Hello from worker %d", id)
    }
}

func main() {
    wg := &sync.WaitGroup{}
    control := make(chan bool)
    workerCount := 10
    for i := 0; i < workerCount; i++ {
        wg.Add(1)
        go worker(i, wg, control)
    }
    for i := 0; i < 5; i++ {
        time.Sleep(time.Second * 1)
        for j := 0; j < workerCount; j++ {
            control <- true
        }
    }
    for j := 0; j < workerCount; j++ {
        control <- false
    }
    wg.Wait()
    log.Printf("Goodbye from main")
}

Go playground link

The main thread simply writes the control message onto the channel, control <- true, while the worker threads read it off keepGoing =<- control. Probably one of the easiest constructs I’ve seen in a while, almost like a pipe or socket except with types.

Go also provides some more traditional constructs like mutexes to give you better control over your memory.

package main

import (
    "log"
    "sync"
)

var shared int

func worker(id int, wg *sync.WaitGroup, lock *sync.Mutex) {
    defer wg.Done()
    log.Printf("Hello from worker %d", id)
    lock.Lock()
    log.Printf("Worker %d has the lock", id)
    defer lock.Unlock()
    shared += id
}

func main() {
    wg := &sync.WaitGroup{}
    lock := &sync.Mutex{}
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go worker(i, wg, lock)
    }
    wg.Wait()
    log.Printf("Goodbye from main. Shared=%d", shared)
}

Go playground link

Here we pass our workers a common mutex to lock on before modifying the shared variable, which ensures a consistent state between the workers.

I’ve found just these features alone have been enough to cover most of my use cases, anything extra I need is usually something pretty esoteric and weird. The added benefit with splitting up your processing like this as well is that it becomes easier to distribute computation across multiple nodes if needed, with a wrapper around the channels providing the network capability.