Will this concurrent operation on a map lead to a panic or other problems?

In this scenario, some configuration is loaded into the map regularly, and real-time consistency is not required, meaning that reading outdated data is acceptable.

  1. So will this concurrent operation lead to a panic or other problems?
  2. Will this operation ensure that I can read the newly loaded data after a period of time?
    (For example: new data is loaded at time 0s;
    when data is read at time 0s, the old data may be returned,
    but when data is read at time 3s, the new data must be returned.)
// global variable
var myMap map[int]string
...
// load regularly
go func() {
    tmpMap := loadMap()
    myMap = tmpMap
}()

// read 
go func() {
    ...
    v, ok := myMap[key]
    ...
}()

In more detail, is it OK if I replace the map with a struct containing a map, like the following?

type myStruct struct {
    m map[int]string
}
// global variable
var s *myStruct

...

// load regularly
go func() {
    tmp := loadStruct()
    s = tmp
}()

// read
go func() {
    ...
    v, ok := s.m[key]
    ...
}()
1 Like

You have a data race condition (both read and write are racing to access the global variable).

No guarantee. If you want guarantees, you need to implement some kind of synchronization, like a Mutex in your myStruct. Then make sure you acquire the lock in both your write and read goroutines before accessing the global variable. Alternatively, if your overall program flow allows the use of a channel, you can use that instead of conventional mutex locking.

If you're using a mutex, the goroutine responsible for reading needs to know how to handle absent or outdated data on its side, for example by waiting or polling.

If you're using a channel, that does not matter, because the read will always wait for data to arrive or for a close signal.

There are no strict rules for choosing between a mutex and a channel. Pick the one that makes your concurrent program less complicated/Frankenstein.
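
For illustration, here is a minimal sketch of the mutex variant applied to the snippet above. loadMap and the key value are placeholders invented for the example:

package main

import (
    "fmt"
    "sync"
    "time"
)

var (
    mu    sync.Mutex
    myMap map[int]string
)

// loadMap stands in for whatever builds the new configuration map.
func loadMap() map[int]string {
    return map[int]string{1: "new-config"}
}

func main() {
    // load regularly
    go func() {
        tmpMap := loadMap() // build the new map outside the lock
        mu.Lock()
        myMap = tmpMap // only the swap is inside the critical section
        mu.Unlock()
    }()

    // read
    go func() {
        mu.Lock()
        v, ok := myMap[1]
        mu.Unlock()
        fmt.Println(v, ok)
    }()

    time.Sleep(time.Second) // crude wait, just for the sketch
}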

1 Like

Thanks for your reply! Yes, a mutex or a channel can guarantee reading the new data after a period of time. However, I worry that using a mutex or a channel may extend the read time. In fact, the map holds configuration: when other data is requested, the configuration map is read to determine which data should be returned. If I use a mutex or a channel, I think the process of requesting that other data may block, because the map may be locked for writing at that moment. In that case, the request for the other data may time out, which is not allowed.

1 Like

Not to a significant extent. That depends entirely on how you code your read handling. Also, locking or channel transmission is a necessary evil for syncing between goroutines.

The extreme case where many concurrent processes saturate the lock is a different issue (usually related to your overall algorithm). You need to experiment with test code on your hardware before concluding that the concern is valid for a non-real-time application.

Anyway, you have to sync all 3 goroutines no matter what because of the data race.

This is entirely related to how you handle waiting while syncing. As I mentioned earlier, the reader must know how to handle absent data, outdated data, corrupted data, when to time out the wait, whether to retry, and so on.

Neither a mutex nor a channel will hog CPU resources like a "delay counter algorithm" (busy waiting) does while waiting, so don't worry about that.

1 Like

If the long wait time is due to writing to the file, perhaps you can change your approach so that:

  1. the map controls the file:
    1.1. load the configuration into memory;
    1.2. either push it to other goroutines via a channel, OR
    1.3. set up a method for other goroutines to pull the configuration;
    1.4. only write to the file when there is a solid update to the config data in memory;
    1.5. facilitate an "update" method to tell the map to pull updates from the file.

Instead of:

  1. everyone reading the same file over and over again;
  2. racing against the file write.

The first approach only causes a short wait when there is a change in the config data. The rest of the time, the config is served from memory, so you only read/write the disk when needed.

The second approach will reduce your disk's lifetime because of the frequent reads and writes.
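
A rough sketch of the first approach, assuming only one goroutine ever touches the map and everyone else pulls a snapshot from it over a channel (loadConfig, the channel layout, and the request type are invented for the example, not a prescribed design):

package main

import "fmt"

// request asks the owner goroutine for the current configuration snapshot.
type request struct {
    reply chan map[int]string
}

// loadConfig stands in for reading the configuration file from disk.
func loadConfig() map[int]string {
    return map[int]string{1: "from-disk"}
}

func main() {
    requests := make(chan request)
    update := make(chan struct{})

    // Owner goroutine: the only goroutine that touches the map, so no lock is needed.
    go func() {
        cfg := loadConfig() // 1.1: load the configuration into memory once
        for {
            select {
            case <-update: // 1.5: explicit "update" signal, pull from the file again
                cfg = loadConfig()
            case req := <-requests: // 1.3: other goroutines pull the current config
                // The map is only ever replaced, never mutated in place,
                // so handing it out as a read-only snapshot is safe.
                req.reply <- cfg
            }
        }
    }()

    // A reader pulls the snapshot from memory instead of re-reading the file.
    req := request{reply: make(chan map[int]string)}
    requests <- req
    cfg := <-req.reply
    fmt.Println(cfg[1])
}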

1 Like

Sorry, maybe I did not explain the code clearly because of my poor English.

// global variable
var myMap map[int]string
...
// load regularly
go func() {
    tmpMap := loadMap()
    myMap = tmpMap
}()

In this code, func loadMap() loads a configuration file into the local variable tmpMap (in memory).
Then the configuration in tmpMap (in memory) is used to replace the old configuration in myMap (also in memory):

myMap = tmpMap

I think you mean that I should use a mutex as follows, and because the update is "myMap = tmpMap", which only touches memory, the update will not take long and will not have a significant effect on the read "v, ok := myMap[key]".

m.Lock()
myMap = tmpMap
m.Unlock()

m.Lock()
v, ok := myMap[key]
m.Unlock()

In my view, this concurrent approach is standard and of course correct.
But on the other hand, I want to know what will happen if the mutex is removed, as in the code in the first post.

As I understand it, a map (the variables tmpMap and myMap) is a pointer in Go, so the operation "myMap = tmpMap" is a pointer assignment. This operation only hands over the "actual map" they point to, during which the "actual map" is not modified.
In Go, concurrent operations on a pointer variable will not cause a panic (while concurrent operations on the map itself will cause a panic, like this:

go func() {
    myMap[key] = value
}()
...
go func() {
    v, ok := myMap[key]
}()

)
Then, I want to discuss what will happen when we operate on a pointer variable concurrently. In my opinion, if the pointer is written and read at the same time, the only problem that may happen is this: the old address and the old data it points to may be read, while the write operation is still executed correctly. In other words, writing and reading at the same time will only lead to a "dirty read". And after a period of time, the new data is guaranteed to have been written into the pointer, so a read after a long time will return the new data.
So I think this also applies to a map. Since I can tolerate the "dirty read", it seems unnecessary to add the mutex. But I still have some uncertainty, because I don't have a deep understanding of the underlying principles of maps and Go, which is why I created this topic to ask for help. Since you said this concurrent operation cannot guarantee the behavior above, would you please tell me what problem it will lead to and why? Thanks a lot!

1 Like

A pointer is a data type (a memory address), so the above still applies regardless of the data type it points to. Also, a map does indeed use a pointer under the hood, but bear in mind that a pointer to a map is &map[...]... (viewed at a low level, it is a pointer to a pointer), whose use case in Go is very rare and usually frowned upon.

Regardless of whether it is a map or a map pointer, you can still corrupt the memory address (that is, the data) of the pointer with the race condition, resulting in reading some unknown, illegal memory location.

That would be non-atomic concurrency.

The reason is that goroutines do not always start in the sequential order they appear in the code, since that depends on the scheduler scheduling the threads at runtime (e.g. the code may have the write goroutine start before the read goroutine, but in reality the scheduler can run the read goroutine before the write goroutine), creating the possibility of interleaving.

Hence, I would expect the following usual results:

  1. If the read is before the write: outdated value
  2. If the write is before the read: latest value
  3. If the write and read happen at the same time:
    3.1. either of the above OR
    3.2. a memory-related error/panic is raised OR
    3.3. the program crashes

In your case, the concerning part of the race is writing the memory address of the pointer (myMap = tmpMap) while reading data from it (v, ok := myMap[key]); this may lead to an error/panic or crash the program entirely.

That is one of the possibilities, where the write operation is always completed. There is another possibility where you read the memory address from the pointer while the write is still in progress, leading to reading an incorrect memory location that points to neither the outdated nor the updated map data.

The chance of this happening is unpredictable. I have not experienced it myself, because I would never want a non-guaranteed algorithm in any of my programs, so my guess would be that an error/panic is raised OR a crash occurs due to accessing an illegal memory location (e.g. the read address points to some part of the kernel).
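
As a side note, one concrete way to see the problem is the race detector built into the Go toolchain: a minimal reproduction of the original snippet, run with go run -race, will typically report the unsynchronized write/read pair on myMap even when nothing visibly breaks.

package main

import "time"

var myMap map[int]string

func main() {
    go func() {
        myMap = map[int]string{1: "new"} // unsynchronized write
    }()
    go func() {
        _, _ = myMap[1] // unsynchronized read
    }()
    time.Sleep(time.Second)
    // Run with: go run -race main.go
}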


I looked hard at the code snippet in your last post and I really can't see a good way to sacrifice the sync, since both goroutines access the same memory location (they are stuck on myMap, with its tendency for memory location corruption).

The only exception is if you have a guaranteed way to make all writes to myMap happen in a serialized manner (no concurrent writes) before any read, and then leave myMap as read-only. In that case, your read goroutines would no longer need any sync.
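
A small sketch of that exception, assuming the configuration can be loaded completely before any reader goroutine is started (the go statement provides the necessary ordering, and myMap is never written again afterwards):

package main

import (
    "fmt"
    "sync"
)

var myMap map[int]string

// loadMap is a placeholder for building the configuration once, up front.
func loadMap() map[int]string {
    return map[int]string{1: "config"}
}

func main() {
    // All writes happen here, before any reader goroutine exists.
    myMap = loadMap()

    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            // Read-only access: no sync needed because nothing writes anymore.
            v, ok := myMap[1]
            fmt.Println(v, ok)
        }()
    }
    wg.Wait()
}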

EDIT: Not sure if this is what you're looking for, but just in case: Go has a thread-safe map called sync.Map. It has a built-in locking mechanism under the hood. If you use sync.Map, you can avoid manual locking.
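
A short sketch of the sync.Map variant; note that its API is any-typed, so a type assertion is needed on the value (the key and value below are placeholders):

package main

import (
    "fmt"
    "sync"
)

func main() {
    var cfg sync.Map // safe for concurrent use without extra locking

    done := make(chan struct{})

    // Writer goroutine: store entries as the new configuration is loaded.
    go func() {
        cfg.Store(1, "new-config")
        close(done)
    }()

    <-done
    // Load returns (value, ok), much like a plain map lookup.
    if v, ok := cfg.Load(1); ok {
        fmt.Println(v.(string))
    }
}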

1 Like

Thanks a lot! After reading your reply, I have decided to add an RWMutex, because the write operation is very rare while the read operation is frequent.
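
Something like this sketch, I suppose: reads take RLock, so they never block each other, and the write lock is held only for the pointer swap (loadMap is still a placeholder):

package main

import (
    "fmt"
    "sync"
)

var (
    mu    sync.RWMutex
    myMap map[int]string
)

// loadMap is a placeholder for building the new configuration map.
func loadMap() map[int]string {
    return map[int]string{1: "config"}
}

// reload swaps in a freshly built map; readers are blocked only for the assignment.
func reload() {
    tmp := loadMap() // do the expensive work outside the lock
    mu.Lock()
    myMap = tmp
    mu.Unlock()
}

// lookup takes a shared read lock, so concurrent lookups do not block each other.
func lookup(key int) (string, bool) {
    mu.RLock()
    defer mu.RUnlock()
    v, ok := myMap[key]
    return v, ok
}

func main() {
    reload()
    fmt.Println(lookup(1))
}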
Another small question: which pattern would you prefer?

m.Lock()
// write operation 
m.Unlock()

or

m.Lock()
defer m.Unlock()
// write operation
1 Like

The defer pattern is my default, because you don't need to track your locking state in your head while you're programming your function/method. defer is a really great buddy.

Manual unlocking is used only when I need explicit control over the lock for certain conditions or need to pass the lock somewhere else. Example:

  1. If I bump into some error, I pass everything to my custom handleError(...) function, along with the lock as a parameter, and handleError(...) executes the unlock steps (see the sketch below).
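
A rough sketch of that idea; handleError, its signature, and the update function are purely illustrative names, not anything prescribed:

package main

import (
    "errors"
    "fmt"
    "sync"
)

var (
    mu    sync.Mutex
    myMap map[int]string
)

// handleError takes ownership of the already-held lock and is
// responsible for releasing it after reporting the problem.
func handleError(mu *sync.Mutex, err error) {
    fmt.Println("update failed:", err)
    mu.Unlock()
}

func update(newMap map[int]string, err error) {
    mu.Lock() // manual locking: no defer, so the lock state must be tracked by hand
    if err != nil {
        handleError(&mu, err) // the lock is handed over; handleError unlocks it
        return
    }
    myMap = newMap
    mu.Unlock()
}

func main() {
    update(map[int]string{1: "ok"}, nil)
    update(nil, errors.New("could not load config"))
}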

When using this, you need to track the locking state as you develop your function/method, like in any other programming language.

Common mistakes are usually:

  1. forgetting to unlock, or
  2. not mentally checking the lock state across your code.

EDIT 1:
Manual unlocking is also useful for high-traffic access (e.g. the variable is very busy being read/written). defer only executes after the function ends, so if your function does a lot of other work, the locking period will be unnecessarily long.

1 Like

OK, thanks a lot!

2 Likes
