Standard solution to protect a variable for concurrent access

I write a lot of webserver code, where often some central state is accessed by multiple goroutines (from concurrent web requests). I usually safeguard access to shared state with a sync.RWMutex like this:

var services = map[string]Service{}
var serviceMutex = sync.RWMutex{}

func UpdateValue(name string, value Service) {
  serviceMutex.Lock()
  defer serviceMutex.Unlock()
  services[name] = value
}

But I find this has several downsides:

  1. One could access the variable without locking the mutex (simply forget it)
  2. One could write a long-running method that accidentally keeps the variable locked the whole time (because a deferred unlock only runs after the whole method returns)
  3. If the access is hidden behind Get/Set methods, one could accidentally write something like v.Set(v.Get()+1) and introduce a race condition.
  4. One could accidentally pass a reference to the variable via return (allowing unlocked access).

So I have written a small library to wrap the access to such variables, and I was wondering whether something like this should be part of the standard library (since it seems universally useful).

I have posted the code for review and discussion on codereview.stackexchange:

And have also created a share on GoPlayground:

I’m happy for any feedback and comments

Kind Regards


Hi @falco467,

These are valid points about the drawbacks of mutexes, and it is always good to see that someone is not only aware of these drawbacks but actively searches for a solution.

However, to me, the solution you propose seems a bit convoluted.

  • I have to define closures and pass them as arguments to Read or Update.
  • The Read operation relies on a side effect to get the data out of the protected data structure:
        var path string
        // possibly quite some lines of code between these lines
        sv.Read(func(v config) { path = v.path })
    As a result, the data flow may become obscured. I would prefer a straightforward syntax like path = sv.Read().

More importantly, the solution does not seem to actively prevent problems #1, #2, and #3. I played the devil’s advocate with your playground code and found the following:

  • I am able to create a pointer to the protected data and access it outside a mutex.
  • I am able to call long-running functions inside a Read or Update function.
  • I am able to call Read inside Update and cause a deadlock.

Here is my modified version. Search for // *** comments to find the modifications I made.

I think the problems you point out are inherent to the way concurrent access to shared memory works. Avoiding these problems either requires programming discipline or replacing concurrent access to shared memory by communication (that is, using channels and goroutine-local data).

If you must maintain a shared state (which I would argue can always be done in a better, non-shared manner), the most correct solution is to use a concurrent cache that handles all the access control for you. Then all your code has to do is normal map-like operations, without worrying about shared state. A simple Google search turns up quite a few open-source options to choose from.


Hi @christophberger
thank you for the feedback. I want to address your points.

  • Closures: I was thinking closures are preferable to defining your own functions for small actions which are only used inside a single bigger function. So you can have the benefits of a function (early returns, defer handling on panic) without having to think of a function name and move the code somewhere else.

  • As a best practice you should usually define variables as close to their usage as possible and in the narrowest scope you can. If you want a copy of the protected value you can just use x := sv.getCopy(), but if you don’t want to copy the whole data type (it might be a big map or struct) you can use a closure to extract only a single data point.

  • You are absolutely right that I cannot stop wrong usage of the package - but I thought it is harder to accidentally get it wrong with this syntax. If you want to extract a pointer to the shared variable, you have to move it out of the closure explicitly. It is clearly visible in the source code by indentation where the variable is locked - so calling a long-running operation inside the Read block should be more obvious and thus easier to avoid.

Thank you for your input! I’m trying to build APIs which encourage correct usage and are harder to use wrong.


Hi @jarrodhroberson
I’m interested in how you would implement a shared state like a map with configuration options in a non-shared manner. There are REST endpoints which can change the configuration, and endpoints which need information from the configuration to operate. So a number of these readers/writers might run concurrently, but they may also run some time apart. Would you create a central goroutine with a shared channel, which is “asked” by any call to deliver the configuration over another channel? I think this could be a lot of code for little benefit compared to a shared variable?

Thank you for the input on concurrent caches. I tried searching for implementations for quite some while, but all I found were either focused on caching with eviction (not a permanent shared state) or quite overengineered (lots of code & dependencies for something which can be solved safely in less than 100 LOC), and they would usually only work on maps and not e.g. on structs or other custom data types which might be used as shared state.

Can you point me to a simple straightforward implementation I might have overlooked? I’m trying to learn and improve by looking at professional solutions.

Have you tried or checked atomic.Value, or the atomic package itself? It allows you to create concurrency-safe variables of different types. An example of its usage is context’s cancel function, where it allows us to use Done() in a select.

I have worked with atomic.Value, which is a nice fit for certain use cases, but it only works for simple values. If you put a map into atomic.Value, you will still get a reference to the same map, not a copy. You cannot safely update an entry in a map or a field in a struct with atomic.Value.

you can if the entries are atomic.XXX

Tbh I’ve read the thread again, and all the described problems look rather irrelevant to me, because if you are the maintainer of the code you write with this kind of shared-state control, only you can make the errors described in the downsides. At the same time, as far as I remember, Go prefers not to share state via global variables. Instead of inventing another wheel, it can simply be wrapped into a struct, as you tried to do in the suggested library. This reduces access to the data, since the field is not exported, and it allows you to add as many methods for controlling or manipulating this data as you like.

type Value struct {
    sync.RWMutex

    data map[string]any
}

func (v *Value) Add() {
    v.Lock()
    defer v.Unlock()

    // code here
}

func (v *Value) Read() {
    v.RLock()
    defer v.RUnlock()

    // code here
}

func (v *Value) Delete() {
    v.Lock()
    defer v.Unlock()

    // code here
}
I completely disagree that some func outside the struct’s operations should hold the lock on the value. The idea is to get the value out of the cache and pass it further.

If you need specific copy methods, you will have to write them yourself. I know it can be frustrating to manually copy all fields of e.g. a struct, but that’s how it works. At the same time you can add closures for copy functions and play with it as you like. But again, there is no common solution, because it depends on the use case. Some of your ideas or specific needs might be unnecessary for others.

Another way to get a global lock is to use sync.WaitGroup. I have a utility which makes a lot of SSH connections, performs some operations and writes the results into a Redis db. To stop all goroutines from sending data via “AddToRedis” I did this:

type RedisConn struct {
    sync.WaitGroup

    // connection, etc...
}

func (rc *RedisConn) Ping(ctx context.Context) {
    // Start as a goroutine.
    // Ping the db connection here and, if it was lost, call the `reconnect` func
    // (after rc.Add(1) on the wait group)...
}

func (rc *RedisConn) reconnect(ctx context.Context) {
    defer rc.WaitGroup.Done()

    // reconnect here...
}

func (rc *RedisConn) AddToRedis(k, v string) error {
    // If the ping goroutine locks the group, all goroutines where this func is called
    // will wait for the release.
    rc.Wait()

    // Add here...
    return nil
}

In the topic of this discussion, this solution can also be changed to apply to global shared-state control, if you need to lock the execution of some code while an up-to-date state is required. And coming back to atomic, there is a Map type implementation which serves the same purpose.

Thank you for your time and suggestions. I’m also using wait groups for synchronization of routines; it helps a lot to keep state clear and succinct. The mentioned Map type (sync.Map), however, is prefixed by the following warning:

// The Map type is specialized. Most code should use a plain Go map instead,
// with separate locking or coordination, for better type safety and to make it
// easier to maintain other invariants along with the map content.
// The Map type is optimized for two common use cases: (1) when the entry for a given
// key is only ever written once but read many times, as in caches that only grow,
// or (2) when multiple goroutines read, write, and overwrite entries for disjoint
// sets of keys. In these two cases, use of a Map may significantly reduce lock
// contention compared to a Go map paired with a separate Mutex or RWMutex.

So it is most likely not the right choice for most use cases like a shared mutable configuration.

In a perfect world, maybe. But I usually work in a team, and projects tend to need updates several months later… So whether it is a junior dev or myself months later - we need all the help we can get to write good code. So I’m using linters and libraries which discourage mistakes through their APIs, and I also try to provide basic packages/libs for my team, so they will hopefully avoid avoidable errors.

Summing up the discussion, I think you might like to check whether it’s possible to implement these services with the use of a struct. If you reuse the same code in several projects, then you’ll be able to create an internal package and maintain it with your team (as an idea).


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.