I understand that this could be somewhat subjective, so I’ll try to outline the parts that I’m mostly sure about to narrow the focus as much as possible.
I want to do some rudimentary login name checking. My goal is to allow all Unicode characters that could appear in a name, including names in non-English languages, plus digits and the basic punctuation that can occur in a name, but not sentence punctuation such as ! and ?, nor control characters such as CR and LF. White space is allowed in this case.
The obvious things to allow would be unicode.IsDigit/unicode.IsNumber and unicode.IsLetter, and the obvious thing to reject would be IsControl. I am unclear about IsMark, though, as it looks like one of its subcategories would be useful but another would not. I have no idea about IsGraphic and am unsure about IsPunct.
Has anyone delved into Unicode enough to have an idea of what would be a reasonable selection of allowed characters?
Update:
After playing around with words from different countries I have so far come up with this:
switch {
case unicode.IsDigit(chr), unicode.IsLetter(chr), unicode.IsSpace(chr),
	unicode.Is(unicode.Dash, chr), unicode.Is(unicode.Hyphen, chr):
	// Allowed: digits, letters, white space, dashes and hyphens.
case unicode.IsPunct(chr):
	// Of the remaining punctuation, only apostrophes are allowed.
	if !strings.ContainsAny(string(chr), "’'ʼ՚ߴߵߵ’'") {
		logger.Error(`Message`)
	}
default:
	logger.Error(`Message`)
}
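For comparison, the same idea can be wrapped in a standalone function. This is only a sketch under a few assumptions of mine, not a settled answer: the names `validNameRune`, `validName` and `apostrophes` are hypothetical, the apostrophe set is written with escapes and is surely not exhaustive, and all marks are accepted via `unicode.IsMark` so that decomposed accents and Indic vowel signs pass. It also checks `unicode.IsControl` first, because `unicode.IsSpace` on its own returns true for CR, LF and tab.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// apostrophes is an assumed, non-exhaustive set of apostrophe-like runes:
// ' (U+0027), U+2019, U+02BC, U+055A, U+07F4, U+07F5.
const apostrophes = "'\u2019\u02bc\u055a\u07f4\u07f5"

// validNameRune reports whether r is acceptable in a login name.
func validNameRune(r rune) bool {
	switch {
	case unicode.IsControl(r):
		// Must come first: IsSpace alone would accept CR, LF and tab.
		return false
	case unicode.IsLetter(r), unicode.IsDigit(r), unicode.IsMark(r), unicode.IsSpace(r):
		return true
	case unicode.Is(unicode.Dash, r), unicode.Is(unicode.Hyphen, r):
		return true
	default:
		// Everything else is rejected unless it is a known apostrophe.
		return strings.ContainsRune(apostrophes, r)
	}
}

// validName reports whether s is non-empty and every rune in it is acceptable.
func validName(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !validNameRune(r) {
			return false
		}
	}
	return true
}

func main() {
	for _, name := range []string{"José", "O'Brien", "Anne-Marie", "नेहा", "Bob!", "a\nb"} {
		fmt.Printf("%q valid=%v\n", name, validName(name))
	}
}
```

Whether every mark and every kind of white space should really be allowed is still the open part of the question; narrowing `IsMark` to specific subcategories or `IsSpace` to `unicode.Zs` would be a stricter variant.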