De/Serializing interface data

mAdkins23 · October 14, 2021, 12:47am

I would like to de/serialize (go both ways, as it were) Go structs from/into various forms such as JSON, YAML, BSON, and so forth. This is pretty simple until one considers deserializing interfaces.

Solutions to this are available on the web, but there isn’t a simple framework to deal with the necessary steps. I’ve been wrestling with this problem for a while as I’d really like it to just work and it always seems to require a lot of application-specific (or perhaps I should say interface-specific) coding.

My current solution is in github.com/madkins23/go-serial. It sort of does what I want, but using it in an existing code base wasn’t as simple as I’d hoped. I find myself wondering if it’s worth putting more time into it.

Mostly I find myself wondering if I’m not trying to do something anti-Go. Like when I really, really want dynamic lookup of “subclass” methods because I’m thinking of my data as coming in classes instead of composited structs (which would most likely solve the current problem…I believe I have code for that solution in Java in some partially completed personal project). Usually I have to reset my expectations and think like a Gopher before things get clear and simple and I think perhaps this may be true here.

I wouldn’t mind some general discussion. I’m not trying to build some uber-framework that will solve everyone’s problems. This is just something that has fallen out of my own personal (and admittedly silly) projects. I’m more interested in how I’ve really misunderstood the language and I should just do this or that and all will be well.

Since I’m hoping to keep comments and discussion to one forum (this one) I’ve turned off commentary and discussion in github. And issues. Really not ready for issues.

mje · October 14, 2021, 2:55pm

Probably so. The strong static typing of Go requires you to define types ahead of time. The fact that reflection is onerous should provide incentive to do that.

Sometimes for quick deserialization hacks I decode into a map[string]interface{}, but that’s just for prototyping before I settle on a better definition of the struct to decode into.
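As a minimal sketch of that prototyping approach (the helper name decodeLoose and the sample JSON are made up for illustration), decoding into a generic map looks roughly like this. Note that every value comes back as interface{}, so each access needs a type assertion:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeLoose is a hypothetical helper: it unmarshals a JSON object
// into a generic map without any predefined struct.
func decodeLoose(data []byte) (map[string]interface{}, error) {
	var m map[string]interface{}
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, err
	}
	return m, nil
}

func main() {
	m, err := decodeLoose([]byte(`{"name":"Rex","barks":true,"age":3}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// encoding/json stores JSON numbers as float64 in an interface{} value.
	if age, ok := m["age"].(float64); ok {
		fmt.Println("age:", age)
	}
	fmt.Println("name:", m["name"])
}
```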

christophberger · October 15, 2021, 2:47pm

Hi @mAdkins23,

Your readme says,

> An interface may be filled with instances of any type that implements the interface, so the decoder can’t know what type to generate to fill the interface.

Ok, so the Go interface{} type naturally provides no clue about the type to be read. However, there is still the JSON data itself.

Question: wouldn’t the definition of the JSON data provide enough type information for unmarshalling the data?

What I mean is, if the unmarshaller knows that the JSON struct contains, for example, a string at the place it is going to read next, it knows it has to turn that into a Go string.

What am I (obviously) missing?

mAdkins23 · October 15, 2021, 3:12pm

Deserialization (or unmarshaling) in Go works as you describe for struct objects. The unmarshaler uses reflection to figure out that the next field is a string, looks it up by the field name, and fills in the field in the struct from the contents of the JSON field.
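For illustration, here is a small sketch of the struct case using encoding/json. The Dog type, its tags, and the decodeDog helper are assumptions made for this example only:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dog is a hypothetical concrete type; the tags map JSON keys to fields.
type Dog struct {
	Name  string `json:"name"`
	Barks bool   `json:"barks"`
}

// decodeDog lets the decoder use reflection on Dog to pick the Go type
// for each JSON value (string for Name, bool for Barks).
func decodeDog(s string) (Dog, error) {
	var d Dog
	err := json.Unmarshal([]byte(s), &d)
	return d, err
}

func main() {
	d, err := decodeDog(`{"name":"Rex","barks":true}`)
	fmt.Println(d.Name, d.Barks, err)

	// A JSON number cannot be stored in a string field, so this fails.
	_, err = decodeDog(`{"name":5}`)
	fmt.Println("mismatch error:", err != nil)
}
```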

Where the next field is an interface, this is not possible, as the interface could be instantiated by pretty much any type. There is no way to attach a method to the interface itself to figure out which type to instantiate, and any unmarshaling code must be attached to a type implementing that interface. The only place to attach the code that figures out the type, instantiates it, and then unmarshals that instantiated object is the struct that has that interface as a field.
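A minimal sketch of the failure, assuming a hypothetical Animal interface with a single Dog implementation: marshaling works because the encoder sees the dynamic value, but unmarshaling into the nil interface field returns an error because the decoder has no concrete type to build.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Animal, Dog and Cage are illustrative types for this sketch.
type Animal interface{ Sound() string }

type Dog struct {
	Name string `json:"name"`
}

func (d Dog) Sound() string { return "Woof" }

type Cage struct {
	Contains Animal `json:"contains"`
}

// roundTrip marshals a Cage holding a Dog, then tries to read it back.
func roundTrip() error {
	out, err := json.Marshal(Cage{Contains: Dog{Name: "Rex"}})
	if err != nil {
		return err
	}
	// The encoder had no trouble: out is {"contains":{"name":"Rex"}}.
	var back Cage
	// The decoder cannot choose a concrete type for the Animal field.
	return json.Unmarshal(out, &back)
}

func main() {
	if err := roundTrip(); err != nil {
		fmt.Println("unmarshal failed:", err)
	}
}
```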

JSON data itself is generally type-agnostic. In the most likely case, where the data is a Go struct, the JSON equivalent will be an object (effectively a map from field names to values). In fact, JSON can be unmarshaled into a map[string]interface{}.

You could theorize that the different implementations of an interface could be teased apart by looking at the data in the JSON map (for example an Animal interface might distinguish between Cat and Dog implementations by looking for the Barks field) but that requires the deserialization code to be tightly coupled to the actual implementations of the interface. And this code must be attached not to the interface but to every struct that has a field with that interface.
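A rough sketch of that field-sniffing idea, with made-up Cat and Dog types and JSON keys. Note how the UnmarshalJSON method has to sit on Cage and has to know about every implementation of Animal:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// All types and JSON keys here are hypothetical.
type Animal interface{ Sound() string }

type Cat struct {
	Name  string `json:"name"`
	Purrs bool   `json:"purrs"`
}

type Dog struct {
	Name  string `json:"name"`
	Barks bool   `json:"barks"`
}

func (Cat) Sound() string { return "Meow" }
func (Dog) Sound() string { return "Woof" }

type Cage struct {
	Contains Animal `json:"contains"`
}

// UnmarshalJSON guesses the concrete type from the fields present.
func (c *Cage) UnmarshalJSON(data []byte) error {
	var raw struct {
		Contains json.RawMessage `json:"contains"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	var probe map[string]json.RawMessage
	if err := json.Unmarshal(raw.Contains, &probe); err != nil {
		return err
	}
	if _, ok := probe["barks"]; ok { // fragile: coupled to Dog's fields
		var d Dog
		if err := json.Unmarshal(raw.Contains, &d); err != nil {
			return err
		}
		c.Contains = d
		return nil
	}
	var cat Cat // anything without "barks" is assumed to be a Cat
	if err := json.Unmarshal(raw.Contains, &cat); err != nil {
		return err
	}
	c.Contains = cat
	return nil
}

// decodeCage is a small helper around json.Unmarshal.
func decodeCage(s string) (Cage, error) {
	var c Cage
	err := json.Unmarshal([]byte(s), &c)
	return c, err
}

func main() {
	c, err := decodeCage(`{"contains":{"name":"Rex","barks":true}}`)
	fmt.Printf("%T %v\n", c.Contains, err)
}
```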

More general approaches either add a “type” field to the JSON map or wrap the object in a map with a type field. The current go-serial implementation uses the latter. But again, that wrapping/unwrapping process must be attached to every “parent” struct with fields that are interfaces.
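To make the wrapper approach concrete, here is a sketch using an assumed envelope layout of {"type": ..., "data": ...}. The envelope struct, the type names "cat" and "dog", and the helper roundTrip are inventions for this example and are not necessarily the format go-serial uses:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Animal interface{ Sound() string }

type Cat struct {
	Name string `json:"name"`
}

type Dog struct {
	Name string `json:"name"`
}

func (*Cat) Sound() string { return "Meow" }
func (*Dog) Sound() string { return "Woof" }

// envelope is the hypothetical wrapper: a type name plus the raw payload.
type envelope struct {
	Type string          `json:"type"`
	Data json.RawMessage `json:"data"`
}

type Cage struct {
	Contains Animal
}

// MarshalJSON wraps the animal in an envelope naming its concrete type.
func (c Cage) MarshalJSON() ([]byte, error) {
	var name string
	switch c.Contains.(type) {
	case *Cat:
		name = "cat"
	case *Dog:
		name = "dog"
	default:
		return nil, fmt.Errorf("unknown animal type %T", c.Contains)
	}
	data, err := json.Marshal(c.Contains)
	if err != nil {
		return nil, err
	}
	return json.Marshal(struct {
		Contains envelope `json:"contains"`
	}{envelope{Type: name, Data: data}})
}

// UnmarshalJSON unwraps the envelope and builds the named type.
func (c *Cage) UnmarshalJSON(b []byte) error {
	var raw struct {
		Contains envelope `json:"contains"`
	}
	if err := json.Unmarshal(b, &raw); err != nil {
		return err
	}
	var a Animal
	switch raw.Contains.Type {
	case "cat":
		a = &Cat{}
	case "dog":
		a = &Dog{}
	default:
		return fmt.Errorf("unknown animal type %q", raw.Contains.Type)
	}
	if err := json.Unmarshal(raw.Contains.Data, a); err != nil {
		return err
	}
	c.Contains = a
	return nil
}

// roundTrip encodes a Cage and decodes it again, returning both results.
func roundTrip(in Cage) (string, Cage, error) {
	out, err := json.Marshal(in)
	if err != nil {
		return "", Cage{}, err
	}
	var back Cage
	err = json.Unmarshal(out, &back)
	return string(out), back, err
}

func main() {
	text, back, err := roundTrip(Cage{Contains: &Dog{Name: "Rex"}})
	fmt.Println(text, err)
	fmt.Printf("%T says %s\n", back.Contains, back.Contains.Sound())
}
```

Both methods still live on Cage, so any other struct with an Animal field would need the same pair again.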

christophberger · October 15, 2021, 4:08pm

> You could theorize that the different implementations of an interface could be teased apart by looking at the data in the JSON map (for example an Animal interface might distinguish between Cat and Dog implementations by looking for the Barks field) but that requires the deserialization code to be tightly coupled to the actual implementations of the interface.

What if the deserializer would get help from metadata that describes all possible incarnations of the incoming data? Then the deserializer would not have to know about actual implementations. (Like, e.g., "I found a Barks field.

And maybe the JSON data itself can also be defined to contain enough type information so that distinguishing between “Cat” and “Dog” would not require deducing the type from detected fields (like, “here comes a dog”, instead of, “umm, if it barks it probably is a dog”).
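One way to picture this suggestion is a registry that maps a type name carried in the JSON to a constructor. The sketch below assumes an inline "type" key and a hand-written registry map; both are hypothetical choices for illustration, not an established convention:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Animal interface{ Sound() string }

type Cat struct {
	Name string `json:"name"`
}

type Dog struct {
	Name string `json:"name"`
}

func (*Cat) Sound() string { return "Meow" }
func (*Dog) Sound() string { return "Woof" }

// registry is the "metadata": every known type name and how to create it.
var registry = map[string]func() Animal{
	"cat": func() Animal { return &Cat{} },
	"dog": func() Animal { return &Dog{} },
}

// decodeAnimal reads the "type" key first, then decodes the same bytes
// into the matching concrete type. Unknown keys such as "type" are
// ignored by encoding/json when filling the struct.
func decodeAnimal(data []byte) (Animal, error) {
	var head struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(data, &head); err != nil {
		return nil, err
	}
	newAnimal, ok := registry[head.Type]
	if !ok {
		return nil, fmt.Errorf("unregistered animal type %q", head.Type)
	}
	a := newAnimal()
	if err := json.Unmarshal(data, a); err != nil {
		return nil, err
	}
	return a, nil
}

func main() {
	a, err := decodeAnimal([]byte(`{"type":"dog","name":"Rex"}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%T says %s\n", a, a.Sound())
}
```

The decoder no longer inspects implementation fields, but something still has to call decodeAnimal for each interface-typed field.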

mAdkins23 · October 15, 2021, 11:57pm

Adding data into the JSON in the form of an extra “type” field or a wrapper containing the type and a pointer to the actual data is the general solution.

Deserializing these data formats is done by the appropriate library (e.g. encoding/json or gopkg.in/yaml.v3). Both libraries use metadata from the Go code and provide hooks to override behavior for specific types. This doesn’t work for interfaces, which are abstractions that can’t have methods directly attached.

So deserializing a struct such as:

type Cage struct {
    // Exported so that encoding/json can see the field.
    Contains Animal
}

where Animal is an interface that may be Cat or Dog is the problem.

There are a lot of articles about this (e.g. GOLANG JSON SERIALIZATION WITH INTERFACES) on the interwebs. I have been trying to build myself a framework to make this simpler than engineering a custom solution for each occurrence.

christophberger · October 21, 2021, 9:35pm

~~Do you have an example of the JSON data that you want to deserialize into that struct?~~

~~And is this attempt to process inheritance-based data with a non-OOP language of academic nature or is there a real-world problem attached to it?~~

Edit: So a general deserializer does not get enough info from the Go code for analyzing the incoming JSON data, and without that data, or suitable standardized metadata, the deserializer has de facto no chance.

For this situation, would a package like gojsonq be a feasible option?

In a nutshell, gojsonq parses any JSON and provides a query interface instead of filled structs.

The rest (that is, knowing the structure of the data and how to query for the desired info) is left to the user of the package.
