# Naming convention: differentiate between N and n

Hi,

I am writing a package for survey sampling, specifically for stratification. The relevant formulas all use the following quantities:

• `N` for population size
• `n` for sample size from the population
• `Nh` for the size of the `h`th stratum
• `nh` for the sample size from the `h`th stratum

I have written similar programs in several other languages, and I always named the variables after these quantities:

• `N` for `N` (an integer)
• `n` for `n` (an integer)
• `N_h` for `Nh` (a slice)
• `n_h` for `nh` (a slice)

For anyone who knows stratification (what the algorithms are all about), this would be clear as day. Now, how do I do this in Go? I cannot use a lowercase `n`, because the names have to be exported, but I must differentiate between `n` and `N`. I can make them `PopulationN` and `SampleN`, though that is rather wordy. But what about differentiating the slices `N_h` and `n_h`? I should not be using underscores, so I would make them `PopulationNh` and `SampleNh`. Both then have a capital `N`, even though all the formulas make a clear distinction between `N` (for population size) and `n` (for sample size).

I have been working with this stuff for over 15 years, and I think I would do fine with the following four variable names: `PopulationN`, `SampleN`, `PopulationNh` and `SampleNh`. But they will not read naturally, and this is the very first time a language’s naming convention has made my variable names sound unnatural.

Maybe `PopNh` and `SampNh`? They do not read well. So maybe even `PNh` and `Snh`? Nope, these won’t do: `P` stands for proportion and `S` stands for standard deviation (in the corresponding formulas, not just in statistics generally). Thus I think I should use the longer names, though the shortest version has one advantage: it differentiates capital and lowercase “n”, just as the statistical formulas do. But I don’t think this would work as `PopulationNh` and `Samplenh`, since they do not read well. Unless I made them `Population_Nh` and `Sample_nh`…? But this is like selling the same information twice: the prefixes (`Population_` and `Sample_`) are redundant, being there only because I cannot start an exported name with a lowercase letter.

Are there any situations in which I could break Go’s naming convention, like that of not using the underscore? Or maybe you have some ideas how to make these particular names better? I understand this is a peculiar situation because I need to find the representation in Go of formulas in which `n` differs from `N`, something I cannot directly do in Go.

Sure, I can use various names, like above, but I always pay much attention to good naming, one that well represents the phenomenon (here, statistical formulas) and that at the same time reads well.

Hm, last thought: maybe `_Nh` and `_nh` will do?

Forgive me, for I’m not the slightest bit familiar with “sampling stuff, related to stratification,” so I’m a little unclear on this. What’s the problem with using `Nh` and `nh`? Are these package-level constants?

They are not constants, they are slices. They would both be fields of a struct, but both need to be exported, hence both need to start with capital letters.

I see you said that; sorry. Got it!

Do you ever need to mutate the slices? What about a function with multiple returns?

```go
type S struct {
	vars struct {
		Nh []int
		nh []int
	}
}

// Nhnh returns the stratum sizes (Nh) and the per-stratum sample sizes (nh).
func (s *S) Nhnh() (Nh, nh []int) {
	return s.vars.Nh, s.vars.nh
}
```
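On the mutation question: the accessor above hands out the stored slices themselves, so a caller could change their elements. Here is a minimal sketch of a copying variant, assuming callers should only read the values. The type name `Strata` and the constructor `NewStrata` are hypothetical and not part of the thread.

```go
package main

import "fmt"

// Strata keeps the formula-style names Nh and nh in an unexported field.
// (Hypothetical type, for illustration only.)
type Strata struct {
	vars struct {
		Nh []int // stratum sizes
		nh []int // per-stratum sample sizes
	}
}

// NewStrata is a hypothetical constructor that stores copies of its inputs.
func NewStrata(Nh, nh []int) *Strata {
	s := &Strata{}
	s.vars.Nh = append([]int(nil), Nh...)
	s.vars.nh = append([]int(nil), nh...)
	return s
}

// Nhnh returns copies, so callers cannot mutate the stored slices.
func (s *Strata) Nhnh() (Nh, nh []int) {
	Nh = append([]int(nil), s.vars.Nh...)
	nh = append([]int(nil), s.vars.nh...)
	return Nh, nh
}

func main() {
	s := NewStrata([]int{50, 30, 20}, []int{5, 3, 2})
	Nh, nh := s.Nhnh()
	Nh[0] = 0 // changes only the caller's copy
	again, _ := s.Nhnh()
	fmt.Println(again[0], nh) // prints "50 [5 3 2]"
}
```

If the slices are large and read often, returning them without copying is cheaper, at the price of trusting callers not to write to them.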

Thanks. I will need to think about this, or rather test whether it works as expected. But for the moment it seems like quite an idea. I will play with this and return here later to share my experience.

Thank you, Sean, once more. Here’s how I did it (initially, though — I will see how it goes later):

```go
type Stratification struct {
	Stratum    []int     // stratum assignment (length N)
	Nh         []int     // stratum sizes
	Wh         []int     // stratum weights
	Sh         []float64 // stratum-wise standard deviation
	OptFun     float64   // the value of the optimization function
	Conditions bool      // does the stratification meet all the conditions?
	Population           // representation of the population
	Sample               // representation of a sample for a given stratification
}

type Population struct {
	X    []float64 // auxiliary variable
	N    int       // population size
	L    int       // number of strata
	Mean float64   // overall mean of X
}

type Sample struct {
	n  int     // assumed overall sample size
	cv float64 // assumed coefficient of variation
	nh []int   // sample sizes from the strata
}
```

(I use three structs because each of them represents a different part of the problem to be solved.)

From what you wrote it follows that the capital-letter export rule does not apply to fields in nested structs, right? So, here, I will be able to do the following:

```go
var S Stratification
S.nh = []int{5, 6, 7}
```

and `S.nh` will be exported anyway, even though the `nh` field starts with a lowercase letter. This would not have worked, however, had I tried to access the `Sample.nh` field directly from another package? Please correct me if I have got this wrong.

No, unfortunately, that lower case `nh` field is not exported.

Oops. But this was the source and context of my question: I need to export both `Nh` and `nh`, and that is what causes all the trouble. Otherwise I would simply use them, but I really need to export both.

The short answer is you cannot. I was trying to come up with alternatives for you. You cannot, under any conditions, have lower-cased fields that are exported.
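The rule can be checked mechanically. The sketch below runs the candidate names from this thread through `token.IsExported` from the standard `go/token` package, which reports whether a name starts with an upper-case letter. The wrapper function `exported` is just a hypothetical helper for the example.

```go
package main

import (
	"fmt"
	"go/token"
)

// exported is a hypothetical helper wrapping the standard-library check.
func exported(name string) bool {
	return token.IsExported(name)
}

func main() {
	// Candidate names taken from the discussion in this thread.
	names := []string{"Nh", "nh", "_Nh", "_nh", "Sample_nh", "SampleNh"}
	for _, name := range names {
		fmt.Printf("%-10s exported: %t\n", name, exported(name))
	}
}
```

Names with a leading underscore, such as `_Nh` and `_nh`, are legal identifiers but count as unexported, because the first character is not an upper-case letter. `Sample_nh` is exported and compiles, though it goes against the usual MixedCaps style.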

Can’t you call your `Sample.nh` field just `Sample.Nh`? I recognize that that can potentially be confusing, but because it’s a field on a struct called “Sample”, does that make it clear enough that you’re talking about the sample `nh` and not the population `Nh`?
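A minimal sketch of that suggestion, assuming it is acceptable for the struct name to carry the population/sample distinction. The field layout and the method `SamplingFractions` are hypothetical and only illustrate how the names read in a formula.

```go
package main

import "fmt"

// Population holds the population-side quantities.
type Population struct {
	N  int   // population size
	Nh []int // stratum sizes
}

// Sample holds the sample-side quantities. The fields are capitalized so
// that they are exported; the type name says they belong to the sample.
type Sample struct {
	N  int   // overall sample size (n in the formulas)
	Nh []int // per-stratum sample sizes (nh in the formulas)
}

// Stratification uses named fields rather than embedding, so every access
// spells out which Nh is meant.
type Stratification struct {
	Population Population
	Sample     Sample
}

// SamplingFractions returns nh/Nh for each stratum. It assumes both slices
// have the same length and that every stratum size is positive.
func (s Stratification) SamplingFractions() []float64 {
	f := make([]float64, len(s.Population.Nh))
	for h := range f {
		f[h] = float64(s.Sample.Nh[h]) / float64(s.Population.Nh[h])
	}
	return f
}

func main() {
	s := Stratification{
		Population: Population{N: 100, Nh: []int{50, 30, 20}},
		Sample:     Sample{N: 15, Nh: []int{5, 6, 4}},
	}
	fmt.Println(s.SamplingFractions())
}
```

Named fields are used here on purpose: if both structs were embedded instead, the selector `s.Nh` would be ambiguous and would not compile, so the longer `s.Sample.Nh` form would be needed anyway.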

Well, this is life. So, clearly I must come up with something. Those wordy names are not good, so perhaps I will have to find a better solution. If I fail, I will have to use them.

I have another idea, maybe not perfect, but perhaps good. I am thinking of using `nh` and the related fields (starting with lowercase letters) anyway, since they will be heavily used in the computation and should closely match the corresponding statistical formulas. There is no need to export them, since all the computation is done internally. Once the final solution is derived, I will assign the resulting optimal values of these lowercase fields (like `nh`) to a different struct (e.g. `FinalSolution`), which will hold the final stratification. I can live with wordy field names in this struct, since it no longer needs to reflect any formula. I am not saying this is the ideal solution, but it is one whose names represent the corresponding statistical formulas very well.
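A rough sketch of that plan, under stated assumptions: the unexported working struct `allocation`, the function `Allocate`, and all field names of `FinalSolution` are hypothetical. Proportional allocation (`nh = n * Nh / N`) stands in for the actual optimization, which the thread does not describe.

```go
package main

import (
	"fmt"
	"math"
)

// allocation is unexported working state named after the formulas.
type allocation struct {
	N  int   // population size
	n  int   // overall sample size
	Nh []int // stratum sizes
	nh []int // per-stratum sample sizes
}

// FinalSolution is the exported result with descriptive field names.
type FinalSolution struct {
	PopulationSize int
	SampleSize     int
	StratumSizes   []int
	SampleSizes    []int
}

// Allocate computes nh = n * Nh / N, rounded to the nearest integer, and
// copies the result into a FinalSolution. Proportional allocation is only
// a placeholder here; the rounded values need not sum exactly to n.
func Allocate(Nh []int, n int) FinalSolution {
	a := allocation{n: n, Nh: Nh, nh: make([]int, len(Nh))}
	for _, size := range a.Nh {
		a.N += size
	}
	if a.N > 0 { // skip the division for an empty population
		for h := range a.Nh {
			share := float64(a.n) * float64(a.Nh[h]) / float64(a.N)
			a.nh[h] = int(math.Round(share))
		}
	}
	return FinalSolution{
		PopulationSize: a.N,
		SampleSize:     a.n,
		StratumSizes:   append([]int(nil), a.Nh...),
		SampleSizes:    a.nh,
	}
}

func main() {
	sol := Allocate([]int{50, 30, 20}, 10)
	fmt.Println(sol.PopulationSize, sol.SampleSizes) // prints "100 [5 3 2]"
}
```

Inside the package the code reads like the formulas (`a.n`, `a.Nh[h]`, `a.N`), while users of the package see only the descriptive exported names.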

I think this issue nicely shows that one does have to think quite a lot about code design. Maybe this is all daily stuff for you, but I am accustomed to working with Python and R, and naming in them seems simpler. But this simplicity often leads to rather nasty names if one does not pay enough attention to naming and just uses whatever comes to mind. Go’s naming convention discourages long names (and I like descriptive names), so one needs to ponder a lot which names to use. This is a good thing, something that can greatly enhance code design.
