Goroutines are perhaps the main reason people choose Go for their projects. A goroutine is a lightweight thread managed by the Go runtime, and it is easy to use: the go keyword is all you need to start one. However, if you need to run a large number of goroutines, or CPU-intensive ones, you might want a goroutine pool to manage them so that you don't run out of memory or CPU resources. So how do you implement one?

Asking ChatGPT

Implementing a goroutine pool is much simpler with ChatGPT. To be honest, the first answers had a few bugs, but after a few rounds of tweaking I got it working. Here is the final code snippet:

package main

import (
	"fmt"
	"sync"
)

func worker(id int, tasks <-chan func(), wg *sync.WaitGroup) {
	defer wg.Done()
	for task := range tasks {
		task()
	}
}

func main() {
	const numWorkers = 4
	const numTasks = 10

	var wg sync.WaitGroup
	tasks := make(chan func(), numTasks)

	// Create worker goroutines
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go worker(i, tasks, &wg)
	}

	// Submit tasks to the channel
	for i := 0; i < numTasks; i++ {
		taskID := i
		tasks <- func() {
			fmt.Printf("Task %d is running\n", taskID)
		}
	}

	// Close the task channel to signal workers to exit
	close(tasks)

	// Wait for all workers to finish
	wg.Wait()
}

The code snippet is fairly self-explanatory if you know a little about channels and wait groups. It creates a buffered channel to hold the tasks and a wait group to wait for all the workers to exit. Then it starts a number of workers to consume the tasks. Each task is just a function, and a worker runs it as soon as it receives it from the channel. Closing the channel ends each worker's range loop once the remaining tasks are drained. You can replace the anonymous function with any function you want, typically one that does CPU-intensive or time-consuming work.
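The same pattern can be wrapped in a small reusable type so that callers do not handle the channel and wait group directly. This is only a minimal sketch: Pool, NewPool, Submit and Wait are hypothetical names introduced here for illustration, not part of the snippet above or of any library.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool is a hypothetical fixed-size worker pool built on the same
// channel + WaitGroup pattern as the snippet above.
type Pool struct {
	tasks chan func()
	wg    sync.WaitGroup
}

// NewPool starts `workers` goroutines that consume tasks until the
// task channel is closed. `queue` is the buffer size of that channel.
func NewPool(workers, queue int) *Pool {
	p := &Pool{tasks: make(chan func(), queue)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for task := range p.tasks {
				task()
			}
		}()
	}
	return p
}

// Submit queues a task. It blocks while the buffer is full.
func (p *Pool) Submit(task func()) { p.tasks <- task }

// Wait closes the queue and blocks until every worker has exited.
// Submit must not be called after Wait.
func (p *Pool) Wait() {
	close(p.tasks)
	p.wg.Wait()
}

func main() {
	p := NewPool(4, 10)
	for i := 0; i < 10; i++ {
		id := i
		p.Submit(func() { fmt.Printf("Task %d is running\n", id) })
	}
	p.Wait()
}
```

Because Submit blocks when the buffer is full, the pool also applies backpressure to the producer, which is what keeps memory bounded.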

The output of one run looks like this:

Task 0 is running
Task 3 is running
Task 4 is running
Task 5 is running
Task 6 is running
Task 7 is running
Task 8 is running
Task 9 is running
Task 1 is running
Task 2 is running

As you can see, the tasks do not print in submission order. The channel hands tasks out in FIFO order, but with several workers running concurrently the scheduler decides which worker gets to run first, so the output order varies from run to run.
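If you need the results in submission order even though execution is unordered, one common approach is to let each task write into its own slot of a pre-sized slice. Here is a minimal sketch; squareAll is a hypothetical helper made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll is a hypothetical helper: it squares each input on a pool
// of `workers` goroutines and returns the results in input order.
func squareAll(inputs []int, workers int) []int {
	results := make([]int, len(inputs))
	indexes := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range indexes {
				// Each index is received by exactly one worker, so
				// writing to results[i] needs no lock.
				results[i] = inputs[i] * inputs[i]
			}
		}()
	}

	for i := range inputs {
		indexes <- i
	}
	close(indexes)
	wg.Wait() // after this, all writes to results are visible
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4, 5}, 3)) // prints [1 4 9 16 25]
}
```

The workers may still finish in any order. Only the position of each result is fixed.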

So the goroutine pool looks very straightforward, but is it enough for serious production use?

Open source libraries

I have a habit of searching GitHub for existing solutions before I decide to reinvent the wheel. If you search for goroutine pool you will find a few libraries. Among them, ants, tunny and pond each have more than 1k stars. There must be reasons why they are more popular than the snippet I showed above, so let's take a look at them.

ants

ants has some features which I think come from the author's experience of running goroutine pools in production, such as:

  • Periodically purging expired goroutines and recycling them
  • Handling panics gracefully and recovering from them
  • Administration APIs, such as getting the number of running goroutines and resizing the pool on the fly
  • A nonblocking option, so that submitting a task never blocks
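Two of the features above, panic recovery and nonblocking submission, can be approximated with the standard library alone. The following is a rough sketch of the ideas and not how ants implements them; safeRun, trySubmit and ErrPoolFull are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrPoolFull is a hypothetical error returned when the queue is full.
var ErrPoolFull = errors.New("pool queue is full")

// safeRun runs task and turns a panic into a log line, so that the
// worker goroutine calling it keeps running.
func safeRun(task func()) (recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered from panic:", r)
			recovered = true
		}
	}()
	task()
	return false
}

// trySubmit never blocks: it reports ErrPoolFull when the buffer is full.
func trySubmit(tasks chan<- func(), task func()) error {
	select {
	case tasks <- task:
		return nil
	default:
		return ErrPoolFull
	}
}

func main() {
	tasks := make(chan func(), 1)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for task := range tasks {
			safeRun(task)
		}
	}()

	for i := 0; i < 5; i++ {
		id := i
		err := trySubmit(tasks, func() {
			if id == 0 {
				panic("task 0 failed")
			}
			fmt.Println("task", id, "done")
		})
		if err != nil {
			fmt.Println("task", id, "rejected:", err)
		}
	}
	close(tasks)
	wg.Wait()
}
```

How many tasks are rejected depends on how fast the single worker drains the queue, so the output differs between runs.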

The author claims it is a high-performance, low-cost goroutine pool and says it is even faster than spawning goroutines without any limit. If that holds for your workload, you should be able to use it wherever you would otherwise spawn a large number of native goroutines.

tunny

Tunny is a compact goroutine pool implementation compared to ants. The core library is implemented in fewer than 500 lines of code, yet it offers essential features like setting a timeout for tasks, changing the pool size and managing goroutine state.
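To give an idea of what a task timeout involves, here is a standard-library sketch in which the caller stops waiting after a deadline. It is not tunny's code or API; job, startWorkers, processTimed, slowDouble and ErrTimeout are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrTimeout is a hypothetical error for a job that missed its deadline.
var ErrTimeout = errors.New("job timed out")

// job carries a payload and a channel for sending the result back.
type job struct {
	payload int
	reply   chan int
}

// startWorkers starts n workers that apply work to each payload.
// Closing the returned channel stops the workers.
func startWorkers(n int, work func(int) int) chan<- job {
	jobs := make(chan job)
	for i := 0; i < n; i++ {
		go func() {
			for j := range jobs {
				j.reply <- work(j.payload) // reply is buffered, never blocks
			}
		}()
	}
	return jobs
}

// processTimed waits at most d for a free worker and for the result.
func processTimed(jobs chan<- job, payload int, d time.Duration) (int, error) {
	timer := time.NewTimer(d)
	defer timer.Stop()

	j := job{payload: payload, reply: make(chan int, 1)}
	select {
	case jobs <- j: // a worker picked the job up
	case <-timer.C:
		return 0, ErrTimeout
	}
	select {
	case v := <-j.reply:
		return v, nil
	case <-timer.C:
		return 0, ErrTimeout
	}
}

// slowDouble simulates a slow task that takes 200ms.
func slowDouble(x int) int {
	time.Sleep(200 * time.Millisecond)
	return x * 2
}

func main() {
	jobs := startWorkers(2, slowDouble)
	defer close(jobs)

	v, err := processTimed(jobs, 21, time.Second)
	fmt.Println(v, err) // the 200ms task fits within the 1s limit

	_, err = processTimed(jobs, 21, 10*time.Millisecond)
	fmt.Println(err) // the 200ms task exceeds the 10ms limit
}
```

Note that in this sketch a timed-out task still runs to completion on its worker. Only the caller stops waiting, so truly cancelling the work would need cooperation from the task itself, for example through a context.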

Pond

Pond seems even more ambitious than ants. Besides the basic features like setting the pool size and timeouts, it also offers a few advanced features, such as sophisticated APIs for managing the pool and pool metrics via Prometheus. High performance and low memory usage are, of course, also selling points of this library. In its benchmark (https://github.com/alitto/pond-benchmarks) it claims to be slightly faster than ants and close to unbounded goroutines.
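Pool metrics boil down to a few counters that are safe to read while the pool is running. The sketch below keeps two of them with atomic operations. It is an illustration only, not pond's API, and it leaves out exporting to Prometheus; meteredPool and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// meteredPool is a hypothetical pool that counts submitted and
// completed tasks. The counters come first in the struct so they stay
// 64-bit aligned on 32-bit platforms.
type meteredPool struct {
	submitted int64
	completed int64
	tasks     chan func()
	wg        sync.WaitGroup
}

func newMeteredPool(workers, queue int) *meteredPool {
	p := &meteredPool{tasks: make(chan func(), queue)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for task := range p.tasks {
				task()
				atomic.AddInt64(&p.completed, 1)
			}
		}()
	}
	return p
}

// Submit counts the task and queues it, blocking while the buffer is full.
func (p *meteredPool) Submit(task func()) {
	atomic.AddInt64(&p.submitted, 1)
	p.tasks <- task
}

// StopAndWait closes the queue and waits for the workers to drain it.
func (p *meteredPool) StopAndWait() {
	close(p.tasks)
	p.wg.Wait()
}

// Submitted and Completed can be called at any time, from any goroutine.
func (p *meteredPool) Submitted() int64 { return atomic.LoadInt64(&p.submitted) }
func (p *meteredPool) Completed() int64 { return atomic.LoadInt64(&p.completed) }

func main() {
	p := newMeteredPool(4, 10)
	for i := 0; i < 100; i++ {
		p.Submit(func() {})
	}
	p.StopAndWait()
	fmt.Printf("submitted=%d completed=%d\n", p.Submitted(), p.Completed())
	// prints submitted=100 completed=100
}
```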

Some thoughts

When you browse the related GitHub repositories you might get lost deciding which one to choose. I think in practice it is good to start with a straightforward implementation like the one I showed at the beginning. Once you understand how it works, you can think about what your use case really needs. If you do need a more sophisticated implementation and more features, you can turn to libraries like ants and pond, which will save you quite a lot of time.

Article references

These two articles, which are mentioned several times in the libraries above, are helpful for understanding how a production goroutine pool is implemented. Here are the links:

http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/
http://nesv.github.io/golang/2014/02/25/worker-queues-in-go.html