Caching results would affect memory usage, but the result could go either way; you’d need to benchmark/profile it. You said you’re using this function in a loop. Are the input strings ever repeated? If so, you might actually save memory by caching the results. For example, if your loop is something like this right now:
inputs := []string{
"test?",
"test?",
"test?",
}
outputs := make([]string, len(inputs))
for i, input := range inputs {
outputs[i] = fixAnnotationKey(input)
}
Then fixAnnotationKey
will create three separate "test_"
strings in memory. If you cached the results, you could re-use that resulting “test_” string:
type fixAnnotationsState struct {
memo map[string]string
}
func (s fixAnnotationsState) fix(key string) string {
if value, ok := s.memo[key]; ok {
return value
}
if reInvalidAnnotationCharacters.MatchString(key) {
// only allocate for ReplaceAllString if we need to
value = reInvalidAnnotationCharacters.ReplaceAllString(key, "_")
}
s.memo[key] = value
return key
}
Then the loop could be something like:
inputs := []string{
"test?",
"test?",
"test?",
}
outputs := make([]string, len(inputs))
fixer := fixAnnotationsState{make(map[string]string, len(inputs))}
for i, input := range inputs {
outputs[i] = fixer.fix(input)
}
fixer.memo = nil // now GC can reclaim the map.
Also, for your particular regular expression and how you’re using ReplaceAllString
, you can switch the implementation of fixAnnotationKey
to use strings.Map
instead:
func fixAnnotationKey(key string) string {
return strings.Map(func(r rune) rune {
switch {
case '0' <= r && r <= '9':
fallthrough
case 'A' <= r && r <= 'Z':
fallthrough
case 'a' <= r && r <= 'z':
return r
default:
return '_'
}
}, key)
}
Even though this does not cut-down on allocations, it results in about an 80% speed-up when I compare the two functions against the text of Moby â– â– â– â– that I found on the Internet.
If I add memoization, the strings.Map
-based implementation gets a little bit slower, but the regular expression-based implementation speeds up to the same speed as the strings.Map
implementation:
regex : 74.725386ms
string.Map : 12.360646ms
memo(regex) : 50.877408ms
memo(string.Map) : 30.786012ms
regex : 86.295341ms
string.Map : 13.819604ms
memo(regex) : 15.419945ms
memo(string.Map) : 15.442404ms
regex : 75.028974ms
string.Map : 12.5782ms
memo(regex) : 14.493085ms
memo(string.Map) : 14.277293ms
regex : 79.612822ms
string.Map : 12.577042ms
memo(regex) : 12.368686ms
memo(string.Map) : 15.611474ms