Replacing Golang’s Standard Package Functions with Concurrent Functions: Winning Big By Starting Small

Harness the power of concurrency in Golang to improve performance and efficiently manage resources with goroutines and semaphores.

Photo by Michael Dziedzic on Unsplash

While I love creating projects from scratch, the reality is often that I inherit projects. Sometimes, it’s to join a team and extend the capabilities of the current codebase. More often, it’s more akin to: “Hey, uh, something is really broken, can you fix whatever it is?!”

Today’s story revolves around the following code (the actual names and functions have been changed to protect the innocent…). The idea is a simple, standard function that recursively scans a given directory, and collects file and directory information in a struct:

// ScannedFile holds the path and information about a file or directory.
type ScannedFile struct {
	Path string      // Path is the location of the file or directory.
	Info os.FileInfo // Info is the file information.
}

// ScanResult holds a collection of scanned files and directories, providing thread-safe access.
type ScanResult struct {
	Files []ScannedFile
	sync.Mutex
}

// AddFile safely adds a ScannedFile to the ScanResult.
func (result *ScanResult) AddFile(file ScannedFile) {
	result.Lock()
	defer result.Unlock()
	result.Files = append(result.Files, file)
}

// scanUsingWalkDir scans the directory tree rooted at rootDir using filepath.WalkDir.
// It returns a slice of ScannedFile and any error encountered.
func scanUsingWalkDir(rootDir string) ([]ScannedFile, error) {
	var result ScanResult

	err := filepath.WalkDir(rootDir, func(path string, entry os.DirEntry, err error) error {
		if err != nil {
			return err
		}
		info, err := entry.Info()
		if err != nil {
			return err
		}
		result.AddFile(ScannedFile{Path: path, Info: info})
		return nil
	})
	return result.Files, err
}

This code is clear and simple, and uses Golang’s standard filepath package to recursively scan a directory with WalkDir. While the WalkDir function provided by the language does its job, it doesn’t necessarily do its job in the most efficient way possible. Go does a great job of providing necessary standard functions, but it doesn’t know your system:

- How many processors/cores you have
- What other concurrent things are going on in your OS
- In general, it doesn’t know your system’s resources, how you make use of them, when they spike, etc.

For those reasons, Go doesn’t force a standard model of concurrency on you. And it shouldn’t. What it does do, is provide an easy way of implementing concurrency if you decide that it fits your situation. Goroutines and concurrency aren’t a silver bullet in every situation — in fact, they can often be detrimental if you’re not paying attention— so it’s left up to the software engineer to decide what could be most efficient given their specific environment.

For a more basic understanding of concurrency in Golang, see my earlier article:

Demystifying Golang Channels, Goroutines, and Optimal Concurrency

The Problem

The function above was part of a bigger job queue in charge of scanning thousands of users’ ever-changing cloud file storage containers. While the overall concept and execution needed rethinking, the company wanted a quick win in reducing infrastructure resources/costs first.

The scanning pipeline at the time required 10 instances of the job queue in order to handle the almost continuous scanning load of thousands of user directories. After looking at the code and infrastructure, it became apparent that there was a potential win in the WalkDir section of the code. As this is the function doing the heavy lifting in this particular job, it was worth considering alternative options.

A Simple but Effective Transition to Concurrency

Concurrency in Golang is a powerful feature that allows developers to write programs that can handle multiple tasks simultaneously. This can be particularly beneficial for tasks like directory scanning, where the ability to process multiple files or directories in parallel can significantly reduce overall execution time. Goroutines, lightweight threads managed by the Go runtime, make it easy to implement concurrency without the overhead typically associated with traditional threading models. By using goroutines, we can distribute the workload across multiple processors, making better use of the system’s resources and improving the program’s performance.

However, implementing concurrency is not without its challenges. It requires careful consideration of factors such as synchronization, data sharing, and potential race conditions. Misuse of concurrency can lead to issues like deadlocks and increased complexity, making the code harder to maintain and debug. Therefore, it’s essential to understand when and how to apply concurrency effectively. In the following sections, we will explore how to enhance the provided directory scanning code with goroutines, addressing these challenges and demonstrating how to achieve significant performance gains in Golang applications.

Go does a great job of simplifying concurrency when writing applications, but just because your system may be able to spawn 1,000,000 goroutines, you certainly shouldn’t. You should always be in control of how many goroutines your application is executing.

A small number of goroutines can be managed effectively, providing considerable performance boosts. But as the number of goroutines increases, the system’s resources are increasingly diverted to managing them, reducing the net performance benefits.

The Solution

Using a combination of semaphores, goroutines, and channels allows for scanning multiple sections of a directory tree simultaneously. The following code, while providing a significant performance gain, is certainly a bit more complicated. But it uses some important basic concepts:

1. Concurrent Directory Scanning: Uses goroutines to scan directories and subdirectories concurrently, improving efficiency by parallelizing the task.
2. Resource Management with Semaphore: Implements a semaphore to limit the number of concurrent goroutines, preventing excessive resource consumption.
3. Thread-Safe Data Collection: Employs a ScanResult struct with a mutex to safely collect and store the scanned files and directories across multiple goroutines.
4. Error Handling and Signaling: Utilizes channels to capture errors (errChan) and signal the completion of all scanning operations (doneChan).
5. Recursive Function for Directory Traversal: Defines a recursive scanDirectory function to handle the scanning of directories, ensuring all entries, including subdirectories, are processed.

I’ve added inline comments to detail the code:

// ScannedFile holds the path and information about a file or directory.
type ScannedFile struct {
	Path string      // Path is the location of the file or directory.
	Info os.FileInfo // Info is the file information.
}

// ScanResult holds a collection of scanned files and directories, providing thread-safe access.
type ScanResult struct {
	Files []ScannedFile
	sync.Mutex
}

// AddFile safely adds a ScannedFile to the ScanResult.
func (result *ScanResult) AddFile(file ScannedFile) {
	result.Lock()
	defer result.Unlock()
	result.Files = append(result.Files, file)
}

// scanUsingGoroutines scans the directory tree rooted at rootDir using concurrent goroutines.
// It returns a slice of ScannedFile and any error encountered.
func scanUsingGoroutines(rootDir string, numCPU int, concurrencyMultiplier int) ([]ScannedFile, error) {
	var result ScanResult
	var wg sync.WaitGroup

	// Baseline
	if numCPU == 0 {
		concurrencyMultiplier = 1
		numCPU = 1
	}

	semaphore := make(chan struct{}, numCPU*concurrencyMultiplier) // Buffered channel to limit concurrent goroutines.
	errChan := make(chan error, 1)                                 // Channel to capture errors.
	doneChan := make(chan struct{})                                // Channel to signal completion.

	// scanDirectory is a recursive function to scan directories.
	var scanDirectory func(string)
	scanDirectory = func(dirPath string) {
		defer wg.Done()                // Decrement the WaitGroup counter when the function returns.
		semaphore <- struct{}{}        // Acquire a slot in the semaphore.
		defer func() { <-semaphore }() // Release the slot in the semaphore when done.

		entries, err := os.ReadDir(dirPath)
		if err != nil {
			select {
			case errChan <- err: // Send error to errChan if possible.
			default:
			}
			return
		}

		buffer := make([]ScannedFile, 0, len(entries)) // Buffer to store scanned files.
		for _, entry := range entries {
			info, err := entry.Info()
			if err != nil {
				select {
				case errChan <- err: // Send error to errChan if possible.
				default:
				}
				continue
			}
			// Append the scanned file to the buffer.
			buffer = append(buffer, ScannedFile{Path: filepath.Join(dirPath, entry.Name()), Info: info})

			if entry.IsDir() {
				wg.Add(1)                                              // Increment the WaitGroup counter for the new goroutine.
				go scanDirectory(filepath.Join(dirPath, entry.Name())) // Scan the subdirectory.
			}
		}

		// Append the buffer to the result safely.
		result.Lock()
		result.Files = append(result.Files, buffer...)
		result.Unlock()
	}

	// Get information about the root directory.
	rootInfo, err := os.Stat(rootDir)
	if err != nil {
		return nil, err
	}
	result.AddFile(ScannedFile{Path: rootDir, Info: rootInfo}) // Add the root directory to the result.

	wg.Add(1)                 // Increment the WaitGroup counter for the initial scan.
	go scanDirectory(rootDir) // Start scanning from the root directory.

	// Wait for all goroutines to finish and close doneChan.
	go func() {
		wg.Wait()
		close(doneChan)
	}()

	// Wait for completion or an error.
	select {
	case <-doneChan:
		return result.Files, nil // Return the result if done.
	case err := <-errChan:
		return nil, err // Return the error if encountered.
	}
}

The implementation of scanUsingGoroutines provides significant performance and flexibility improvements over Go’s standard filepath.WalkDir package by leveraging concurrent goroutines. Unlike WalkDir, which processes directories sequentially, scanUsingGoroutines utilizes goroutines and a semaphore to limit concurrency, allowing it to take full advantage of multi-core processors. This concurrency model enables the function to perform multiple directory and file operations in parallel, significantly reducing the time needed to traverse large directory structures. The use of a buffered semaphore ensures that the number of active goroutines is kept at an optimal level, balancing CPU utilization and preventing system overload.

Furthermore, scanUsingGoroutines handles I/O-bound operations more efficiently. While WalkDir processes one directory at a time, blocking on I/O operations, the concurrent approach allows other goroutines to continue scanning while some are waiting for I/O completion. This design reduces idle time and maximizes throughput, especially in scenarios involving slow disk access or network file systems. The error handling mechanism in scanUsingGoroutines, using an error channel, ensures that the first encountered error is promptly reported while allowing ongoing operations to complete gracefully. This concurrent and non-blocking design makes scanUsingGoroutines a better choice for high-performance file system traversal in Go in this situation.

This simple change represented an average speed improvement of 5.6x in the current environment. Since none of the job servers were running at capacity and all had some idle time, this simple code change allowed a reduction from 10 job servers to just 2. Technically, we could have gone down to 1 server, but breathing room can be important while you observe the updated code within its new resource limits.

This represented a significant reduction in infrastructure resources by modifying a Golang standard package function.

Sample Code

I’ve provided a simplified example of the code I implemented at: https://github.com/mwiater/golangconcurrentdirscan. The performance results will depend on the number of CPUs/Cores on your system. The application will loop from 1 to runtime.NumCPU() and execute each version of the function.

On Linux, you can run it via: go run . -path="/home/matt/projects"

On Windows, just make sure your path slashes are in line with whichever shell you are using: go run . -path="C:\Users\Matt\projects"

The following example output shows the effects of adding concurrency, one step at a time, resulting in a 5.2x performance increase at a concurrency of 8 when compared to the baseline.

Baseline: No Concurrency | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482278744 1.754791989s 1.3x
Concurrent Read 8 N/A 477859 85515 563374 11650846047 519758552 2.308008297s 1.0x

Test Number: 1 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482193672 1.895348317s 1.2x
Concurrent Read 8 1 477859 85515 563374 11650846047 511525136 2.237825253s 1.0x

Test Number: 2 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482159056 1.783704648s 1.0x
Concurrent Read 8 2 477859 85515 563374 11650846047 495747136 1.166968233s 1.5x

Test Number: 3 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482210408 1.797338872s 1.0x
Concurrent Read 8 3 477859 85515 563374 11650846047 515456544 849.914464ms 2.1x

Test Number: 4 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482200984 1.792817841s 1.0x
Concurrent Read 8 4 477859 85515 563374 11650846047 504578416 603.238314ms 3.0x

Test Number: 5 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482151680 1.745606908s 1.0x
Concurrent Read 8 5 477859 85515 563374 11650846047 520826120 511.29657ms 3.4x

Test Number: 6 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482210200 1.723066673s 1.0x
Concurrent Read 8 6 477859 85515 563374 11650846047 513077144 430.436392ms 4.0x

Test Number: 7 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482178016 1.775414641s 1.0x
Concurrent Read 8 7 477859 85515 563374 11650846047 520799128 397.080972ms 4.5x

Test Number: 8 | Directory Scan Comparison: /home/matt/projects
FUNCTION NUMCPUS CONCURRENCY FILES DIRECTORIES OBJECTS SIZE MEMORY EXECUTION TIME SPEED INCREASE
WalkDir N/A N/A 477859 85515 563374 11650846047 482200984 1.946071665s 1.0x
Concurrent Read 8 8 477859 85515 563374 11650846047 520798416 371.251936ms 5.2x

On my laptop with 8 cores, the scanning time was reduced from 2067ms to 363ms, executing the scan 5.7x faster.

In the code in the repository and inline above, there is a line:

semaphore := make(chan struct{}, numCPU*concurrencyMultiplier) // Buffered channel to limit concurrent goroutines.

When you’re running the code or the compiled binary, along with the required -path flag, you can also use the -concurrencyMultiplier flag. This flag defaults to 1, but if you want to test with more than runtime.NumCPU() goroutines, you can use this flag. E.g.:

go run . -path="/home/matt/projects" -concurrencyMultiplier=2

For a CPU with 8 cores, instead of running the default loop above for concurrency levels 1–8, the multiplier of 2 would loop through concurrency levels of 2–16. For the projects I’ve been working on, using more than runtime.NumCPU() goroutines usually doesn’t produce much benefit. But I wanted a quick way to adjust the number of goroutines while testing.

Considerations and Pitfalls

While the concurrent approach to directory scanning in Golang offers significant performance benefits, it may not be suitable for all situations. The increased complexity of the code can make it harder to maintain and debug, introducing challenges such as race conditions, deadlocks, and synchronization issues. Additionally, the overhead of managing multiple goroutines can sometimes negate performance gains, especially in environments with limited CPU resources or for tasks that are not I/O-bound.

Another potential pitfall is resource exhaustion. If the semaphore is not appropriately sized, the system might spawn too many goroutines, overwhelming the CPU and memory, leading to degraded performance or crashes. This approach assumes that the directory structure is large and complex enough to benefit from parallel processing. For smaller or simpler directory trees, the overhead of managing concurrency might outweigh the benefits. Proper error handling in a concurrent environment is also more complex, requiring careful management to ensure timely and accurate reporting. Therefore, evaluating the applicability of concurrency based on the specific context and requirements of the task is essential.

Conclusion

By replacing Golang’s standard directory scanning capabilities with a new function utilizing concurrency, we achieved substantial performance improvements and resource optimization. The transition to using goroutines and semaphores allowed for efficient concurrent processing, reducing the execution time and server load significantly. This approach not only demonstrates the power of Golang’s concurrency model but also highlights the importance of tailoring standard functions to better utilize system resources.


Replacing Go‘s Standard Package Functions with Concurrent Functions: Winning Big By Starting Small was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.

