What is the idiomatic way of reading and parsing/validating configurations?

jtuchel · July 22, 2023, 9:55am

I want to set up my configuration using environment variables. These should contain valid inputs, so I want to validate them or, even better, parse them (Parse, don’t validate).

I’m using Viper and started with

package configuration

import "github.com/spf13/viper"

func init() {
	viper.SetEnvPrefix("MY_APP")
	viper.AutomaticEnv()
}

Given an example for a logging configuration

package logging

type Configuration struct {
	Level int `mapstructure:"LOGGING_LEVEL"`
}

I’m using a function to read the configuration

package logging

import "github.com/spf13/viper"

func GetConfiguration() (Configuration, error) {
	var configuration Configuration

	err := viper.Unmarshal(&configuration)

	return configuration, err
}

and a function to validate it

package logging

func ValidateConfiguration(configuration Configuration) error {
	// validate configuration here

	return nil
}

but there are several things I don’t like

- I think it’s better to parse the variables
- How can I set useful fallback values (defaults)?

I think I’m looking for a way to describe a configuration schema and parse against it. How do you handle your configurations?

Dean_Davidson · July 22, 2023, 3:33pm

I just read that article and find little value in it personally. It’s arguing semantics mostly. And it’s arguing incorrect semantics in my opinion.

These two functions are nearly identical: they check if the provided list is empty, and if it is, they abort the program with an error message. The difference lies entirely in the return type: validateNonEmpty always returns (), the type that contains no information, but parseNonEmpty returns NonEmpty a, a refinement of the input type that preserves the knowledge gained in the type system. Both of these functions check the same thing, but parseNonEmpty gives the caller access to the information it learned, while validateNonEmpty just throws it away.

You can’t parse that something is not empty, but you can validate that it is not empty or assert that it is not empty. You can’t parse that something looks like a valid email address, but you can validate/verify/assert that it looks like a valid email address. In the same way, just because something is called “validate” doesn’t automatically mean it can’t return information along with whatever errors it discovered along the way.
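For readers who have not seen the article, the distinction in the quoted passage can be sketched in Go. This is a loose adaptation rather than the article's Haskell: `NonEmpty`, `ErrEmpty`, and both function bodies are written here for illustration, errors are returned instead of aborting the program, and because Go has zero values the guarantee is weaker than in the original.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEmpty is returned by both functions when the input has no elements.
var ErrEmpty = errors.New("list is empty")

// NonEmpty is a hypothetical Go counterpart of the article's NonEmpty type.
// A value obtained from parseNonEmpty always has a Head.
type NonEmpty[T any] struct {
	Head T
	Tail []T
}

// validateNonEmpty checks the input and reports only success or failure.
func validateNonEmpty[T any](xs []T) error {
	if len(xs) == 0 {
		return ErrEmpty
	}
	return nil
}

// parseNonEmpty performs the same check but returns a refined value that
// carries the result of the check with it.
func parseNonEmpty[T any](xs []T) (NonEmpty[T], error) {
	if len(xs) == 0 {
		return NonEmpty[T]{}, ErrEmpty
	}
	return NonEmpty[T]{Head: xs[0], Tail: xs[1:]}, nil
}

func main() {
	dirs := []string{"/etc/my-app", "/usr/local/etc/my-app"}

	if err := validateNonEmpty(dirs); err != nil {
		fmt.Println("validate:", err)
		return
	}
	// dirs is still a plain slice; indexing it relies on the check above.
	fmt.Println("validated first:", dirs[0])

	ne, err := parseNonEmpty(dirs)
	if err != nil {
		fmt.Println("parse:", err)
		return
	}
	// ne.Head needs no further length check.
	fmt.Println("parsed first:", ne.Head)
}
```

Whether the second form counts as "parsing" or as "validation that returns something" is exactly the naming question raised above; the code is the same either way.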

Why do you think it’s better to parse the variables? Can you give me an example in your specific instance where “parsing” would be preferable to “validating” them? What types of bugs/problems are you trying to avoid in your specific case?

I don’t have a lot of experience with Viper, but you can set defaults.

At a high level, it depends on my target. For cloud/containerized apps I prefer environment variables. For closer-to-metal / local dev without containers, I prefer config files (I default to JSON because it’s so ubiquitous, but many Go devs like TOML for whatever reason). But regardless of where my config comes from, what I do with it is the same:

1. Attempt to read config. This includes doing things like falling back from environment variables to config files in the event that env vars aren’t set, etc. This includes things like parsing strings to make sure they are whatever data type the target is (which is NOT the same as validating them IMO).
2. Attempt to validate the config to get it to a state where I’m mostly satisfied that the app at least has some chance of running (like if I need a SQL connection string and it’s empty, for example, I know the app can’t run; if log level isn’t set, I can just use some default).
3. If I know the app can’t run, log.Fatalf something useful to the console. The idea is that when Google Cloud Run won’t start, you should hopefully be able to check the logs and see that you forgot to use the secret manager to set the APP_CONNECTION_STRING env var or something. Again, try to be useful/specific.
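The three steps above could look roughly like this with only the stdlib. This is a minimal sketch under a few assumptions: `Config`, `readConfig`, `validate`, the `APP_LOG_LEVEL` variable, and the default level of 1 are made up for illustration, and the fallback to a config file is left out. The lookup function is injected so the logic can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strconv"
)

// Config holds the two settings used as examples in this thread.
type Config struct {
	ConnectionString string
	LogLevel         int
}

// lookupFunc has the same signature as os.LookupEnv.
type lookupFunc func(key string) (string, bool)

// readConfig is step 1: read raw strings, apply fallbacks, convert types.
func readConfig(lookup lookupFunc) (Config, error) {
	cfg := Config{LogLevel: 1} // assumed default when APP_LOG_LEVEL is unset

	cfg.ConnectionString, _ = lookup("APP_CONNECTION_STRING")

	if raw, ok := lookup("APP_LOG_LEVEL"); ok {
		level, err := strconv.Atoi(raw)
		if err != nil {
			return Config{}, fmt.Errorf("APP_LOG_LEVEL %q is not an integer: %w", raw, err)
		}
		cfg.LogLevel = level
	}
	return cfg, nil
}

// validate is step 2: decide whether the app has a chance of running.
func (c Config) validate() error {
	if c.ConnectionString == "" {
		return errors.New("APP_CONNECTION_STRING is not set; the app cannot reach its database")
	}
	if c.LogLevel < 0 || c.LogLevel > 2 {
		return fmt.Errorf("APP_LOG_LEVEL must be 0, 1, or 2, got %d", c.LogLevel)
	}
	return nil
}

func main() {
	// Demo source so the sketch runs anywhere; a real service would pass
	// os.LookupEnv to readConfig instead.
	env := map[string]string{"APP_CONNECTION_STRING": "postgres://localhost/app"}
	lookup := func(key string) (string, bool) {
		v, ok := env[key]
		return v, ok
	}

	cfg, err := readConfig(lookup)
	if err != nil {
		log.Fatalf("reading configuration: %v", err) // step 3
	}
	if err := cfg.validate(); err != nil {
		log.Fatalf("invalid configuration: %v", err) // step 3
	}
	log.Printf("starting with log level %d", cfg.LogLevel)
}
```

Keeping the type conversion in `readConfig` and the "can the app run" checks in `validate` mirrors the split between steps 1 and 2.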

To be honest, I have usually used nothing but the stdlib. If you really want to, you can go crazy with struct tags to define data about your config:

type appConfig struct {
	LogLevel int    `json:"logLevel" env:"MY_APP_LOG_LEVEL" default:"0" desc:"Can be set to 0, 1, 2."`
	Env      string `json:"env" env:"MY_APP_ENVIRONMENT" default:"Dev" desc:"Dev, Test, or Production."`
}

… and then use reflection to read them:

// using fmt, reflect

func main() {
	conf := appConfig{}
	printStructTags(reflect.ValueOf(conf))
}

func printStructTags(f reflect.Value) {
	// Iterate over our fields and grab tags
	for i := 0; i < f.NumField(); i++ {
		field := f.Type().Field(i)
		tag := field.Tag
		fmt.Printf("Field: %v.\n\tExpected JSON tag: %v.\n\tEnvironment variable: %v.\n\tDefault: %v.\n\tDescription: %v.\n", field.Name, tag.Get("json"), tag.Get("env"), tag.Get("default"), tag.Get("desc"))
	}
}

This prints:

Field: LogLevel.
	Expected JSON tag: logLevel.
	Environment variable: MY_APP_LOG_LEVEL.
	Default: 0.
	Description: Can be set to 0, 1, 2..
Field: Env.
	Expected JSON tag: env.
	Environment variable: MY_APP_ENVIRONMENT.
	Default: Dev.
	Description: Dev, Test, or Production..

For now, I’d just stick with Viper’s examples. Viper has been used successfully on many production projects including Hugo. I’d be willing to bet it can meet your needs just fine. If not, configuration is a relatively simple topic and you could just use the stdlib. As noted, you can use struct tags to build your own configuration object with metadata embedded in the struct itself.
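To make the struct-tag idea a bit more concrete, here is one way the `env` and `default` tags could be used to fill a struct rather than just print them. It is only a sketch: the `load` function is hypothetical, it handles nothing but `int` and `string` fields, and an `int` field with neither an environment value nor a `default` tag produces an error.

```go
package main

import (
	"fmt"
	"os"
	"reflect"
	"strconv"
)

type appConfig struct {
	LogLevel int    `env:"MY_APP_LOG_LEVEL" default:"0"`
	Env      string `env:"MY_APP_ENVIRONMENT" default:"Dev"`
}

// load fills the struct that dst points to. For each field it looks up the
// variable named in the `env` tag and falls back to the `default` tag.
func load(dst any, lookup func(string) (string, bool)) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Pointer || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("load: expected pointer to struct, got %T", dst)
	}
	v = v.Elem()
	t := v.Type()

	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)
		key := field.Tag.Get("env")
		if key == "" {
			continue // field is not configurable through the environment
		}
		raw, ok := lookup(key)
		if !ok {
			raw = field.Tag.Get("default")
		}
		switch field.Type.Kind() {
		case reflect.String:
			v.Field(i).SetString(raw)
		case reflect.Int:
			n, err := strconv.Atoi(raw)
			if err != nil {
				return fmt.Errorf("%s: %q is not an integer: %w", key, raw, err)
			}
			v.Field(i).SetInt(int64(n))
		default:
			return fmt.Errorf("%s: unsupported field type %s", key, field.Type)
		}
	}
	return nil
}

func main() {
	var conf appConfig
	if err := load(&conf, os.LookupEnv); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", conf)
}
```

This is roughly what the existing tag-driven configuration libraries do internally, so it is mostly worth writing by hand when avoiding a dependency matters.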

jtuchel · July 22, 2023, 8:19pm

Thanks for your reply. I tried to follow your suggestions, would you mind having a look? As a Go beginner I have to get used to the idiomatic way…

(The real code is split into multiple files; this is just an example.)

// using fmt, zerolog, viper

type LogLevelTooLowError struct {
	ActualLogLevel  int8
	MinimumLogLevel int8
}

func (logLevelTooLowError LogLevelTooLowError) Error() string {
	return fmt.Sprintf("Log level is too low. Actual log level is %d, but the minimum log level is %d.", logLevelTooLowError.ActualLogLevel, logLevelTooLowError.MinimumLogLevel)
}

// ##########################################################

type LogLevelTooHighError struct {
	ActualLogLevel  int8
	MaximumLogLevel int8
}

func (logLevelTooHighError LogLevelTooHighError) Error() string {
	return fmt.Sprintf("Log level is too high. Actual log level is %d, but the maximum log level is %d.", logLevelTooHighError.ActualLogLevel, logLevelTooHighError.MaximumLogLevel)
}

// ##########################################################

type Configuration struct {
	Level int8 `mapstructure:"LOGGING_LEVEL"`
}

// ##########################################################

func GetConfiguration() (Configuration, error) {
	viper.SetDefault("LOGGING_LEVEL", int8(zerolog.WarnLevel))

	var configuration Configuration

	err := viper.Unmarshal(&configuration)

	return configuration, err
}

// ##########################################################

func ValidateConfiguration(configuration Configuration) error {
	minimumLogLevel := int8(zerolog.TraceLevel)

	if configuration.Level < minimumLogLevel {
		return LogLevelTooLowError{
			MinimumLogLevel: minimumLogLevel,
			ActualLogLevel:  configuration.Level,
		}
	}

	maximumLogLevel := int8(zerolog.PanicLevel)

	if configuration.Level > maximumLogLevel {
		return LogLevelTooHighError{
			MaximumLogLevel: maximumLogLevel,
			ActualLogLevel:  configuration.Level,
		}
	}

	return nil
}

Any suggestions?

Dean_Davidson · July 22, 2023, 10:46pm

This is pretty verbose. It seems you want to declaratively define your validation logic in structs rather than just writing code to validate them. Is that correct? If I were going to do something like this, I think I’d either import a validation module of some kind or quickly roll my own. For the “import existing module” version, maybe take a look at this?

I think most gophers would just keep it simple and write something like this:

type Configuration struct {
	Level int8 `mapstructure:"LOGGING_LEVEL"`
}

func (c Configuration) Validate() error {
	minimumLogLevel := int8(zerolog.TraceLevel)
	if c.Level < minimumLogLevel {
		return fmt.Errorf("log level is too low: actual log level is %d, but the minimum log level is %d", c.Level, minimumLogLevel)
	}
	maximumLogLevel := int8(zerolog.PanicLevel)
	if c.Level > maximumLogLevel {
		return fmt.Errorf("log level is too high: actual log level is %d, but the maximum log level is %d", c.Level, maximumLogLevel)
	}
	return nil
}

func GetConfiguration() (Configuration, error) {
	viper.SetDefault("LOGGING_LEVEL", int8(zerolog.InfoLevel))
	var configuration Configuration
	err := viper.Unmarshal(&configuration)
	return configuration, err
}

… and then write unit tests to make sure it behaves how you want it to and continues to do so in the future. The fast compile times mean unit tests are inexpensive to run. This code is extremely easy to read and reason about.
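A table-driven layout is the usual shape for such tests. The sketch below is written as a small standalone program so it runs without zerolog or the `testing` package: `traceLevel` and `panicLevel` are stand-ins for `zerolog.TraceLevel` (-1) and `zerolog.PanicLevel` (5), and `validateCases` and `failedCases` are hypothetical names. In a real project the loop would sit in a `_test.go` file inside a `func TestValidate(t *testing.T)` and call `t.Errorf` for each mismatch.

```go
package main

import (
	"fmt"
	"os"
)

// Stand-ins for zerolog.TraceLevel and zerolog.PanicLevel so the sketch
// compiles without the zerolog dependency.
const (
	traceLevel int8 = -1
	panicLevel int8 = 5
)

type Configuration struct {
	Level int8
}

// Validate mirrors the range check from the snippet above.
func (c Configuration) Validate() error {
	if c.Level < traceLevel {
		return fmt.Errorf("log level %d is below the minimum %d", c.Level, traceLevel)
	}
	if c.Level > panicLevel {
		return fmt.Errorf("log level %d is above the maximum %d", c.Level, panicLevel)
	}
	return nil
}

// validateCases is the test table: one row per boundary worth pinning down.
var validateCases = []struct {
	name    string
	level   int8
	wantErr bool
}{
	{"below minimum", -2, true},
	{"minimum", -1, false},
	{"info level", 1, false},
	{"maximum", 5, false},
	{"above maximum", 6, true},
}

// failedCases returns the name of every case whose outcome differs from
// its wantErr expectation.
func failedCases() []string {
	var failed []string
	for _, tc := range validateCases {
		err := Configuration{Level: tc.level}.Validate()
		if (err != nil) != tc.wantErr {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	if failed := failedCases(); len(failed) > 0 {
		fmt.Println("failed cases:", failed)
		os.Exit(1)
	}
	fmt.Println("all cases passed")
}
```

Adding a new boundary or a regression case is then a one-line change to the table.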

drornir.dev · August 9, 2023, 1:27pm

I’m sorry for the shameless self-plug, but I’ve written a blog post about this:

A Declarative Config for Golang

The tl;dr is that I feel there isn’t a good idiomatic way to do this. The best approach I’ve seen is how kubebuilder does it, with code generation on top of structs. Specifically, they created a concept called “markers”, which are like tags, but for build time.

Here’s what they look like, from the kubebuilder book:

// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
    //+kubebuilder:validation:MinLength=0

    // The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
    Schedule string `json:"schedule"`

    //+kubebuilder:validation:Minimum=0

    // Optional deadline in seconds for starting the job if it misses scheduled
    // time for any reason.  Missed jobs executions will be counted as failed ones.
    // +optional
    StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`

    // ... (remaining fields omitted)
}
