I need to extract a few fields from an XML file and wonder how to go about it.
First, some context:
I need to fetch a libvirt VM’s snapshot information, which is encoded in XML.
I do this with data, _ := snap.GetXMLDesc(0)
(data is a string var… so yes, libvirt returns roughly 8 kB of XML in a single string var).
If I write that string to an XML file, it amounts to 188 lines (7.8 kB) of fields + attributes. I only need 3 of them, really.
From various readings on the net, I gather that to unmarshal the XML file, I would need to create a data struct where I’d map all fields/attributes, and so on. Surely there’s a better way, when I need about 3 lines of that XML file?
One workaround I thought about is to dump that string var to a file and “grep” within that file to get my info, but I found that inelegant. There must be a way to map only the info I need from that file?
Here’s a sample of the XML file. Let’s say I wanted only the “parent”, “creation type” and “type arch” fields + attributes. Besides dumping the XML to a file, I do not see my way around that.
You might take a look at “sax” or “stream” parsing of XML.
Alternatively XPath.
If the library you decide to use is implemented well, memory consumption should stay low, apart from some short-lived, garbage-collectable objects that are typically created while reading through the input.
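To make the stream-parsing suggestion concrete, here is a minimal sketch using only the standard library's `encoding/xml` decoder. The sample document, the element paths and the names `sampleXML`, `snapshotFields` and `scanSnapshot` are assumptions made up for illustration; the real libvirt output may be laid out differently.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// sampleXML is an invented, trimmed-down snapshot document. The element
// names and nesting are assumptions, not copied from real libvirt output.
const sampleXML = `<domainsnapshot>
  <name>snap2</name>
  <creationTime>1700000000</creationTime>
  <parent>
    <name>snap1</name>
  </parent>
  <domain type="kvm">
    <name>vm1</name>
    <os>
      <type arch="x86_64" machine="pc">hvm</type>
    </os>
  </domain>
</domainsnapshot>`

// snapshotFields holds the few values we want out of the document.
type snapshotFields struct {
	Parent       string
	CreationTime string
	Arch         string
}

// scanSnapshot reads the document token by token, keeps track of the
// current element path, and copies out only the values at known paths.
func scanSnapshot(data string) (snapshotFields, error) {
	var out snapshotFields
	var path []string
	dec := xml.NewDecoder(strings.NewReader(data))
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil // whole document consumed
		}
		if err != nil {
			return out, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			path = append(path, t.Name.Local)
			if strings.Join(path, "/") == "domainsnapshot/domain/os/type" {
				for _, a := range t.Attr {
					if a.Name.Local == "arch" {
						out.Arch = a.Value
					}
				}
			}
		case xml.EndElement:
			path = path[:len(path)-1]
		case xml.CharData:
			switch strings.Join(path, "/") {
			case "domainsnapshot/parent/name":
				out.Parent += strings.TrimSpace(string(t))
			case "domainsnapshot/creationTime":
				out.CreationTime += strings.TrimSpace(string(t))
			}
		}
	}
}

func main() {
	f, err := scanSnapshot(sampleXML)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", f)
}
```

Nothing is written to disk here: the string returned by `GetXMLDesc` could be passed to such a function directly.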
Oh… I guess I’ve missed XPath, for some reason?! It is very similar to a Python3 lib I used a while back (must be a port/fork). This is the closest I’ve come to a solution today.
I was going the workaround way I mentioned in my OP, pinching my nose all the way down.
Thanks, @NobbZ. I do not know why I missed XPath in my search! Never mind, I have a solution now.
Easier yet than XPath, @clbanning, thanks. I did not think you could “partial map” between a struct and an XML doc. I was going to test it in the playground, but, well, got carried away with @nobbz’s solution.
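For readers wondering what such a “partial map” could look like, here is a small sketch. The document and the names `sample`, `archOnly` and `readArch` are invented for illustration. As far as I know, `encoding/xml` skips any element that has no matching field, and a `a>b>c` tag descends into nested elements. That chain form is only valid for elements, so the attribute is mapped through a small nested struct.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// sample is an invented document that contains more elements than the
// struct below maps.
const sample = `<domainsnapshot>
  <name>snap2</name>
  <state>running</state>
  <domain type="kvm">
    <memory unit="KiB">1048576</memory>
    <os>
      <type arch="x86_64" machine="pc">hvm</type>
    </os>
  </domain>
</domainsnapshot>`

// archOnly maps a single attribute of a single nested element.
// Every other element in the document is simply ignored.
type archOnly struct {
	Type struct {
		Arch string `xml:"arch,attr"`
	} `xml:"domain>os>type"`
}

// readArch unmarshals the document and returns the arch attribute.
func readArch(data string) (string, error) {
	var v archOnly
	if err := xml.Unmarshal([]byte(data), &v); err != nil {
		return "", err
	}
	return v.Type.Arch, nil
}

func main() {
	arch, err := readArch(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("arch:", arch)
}
```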
OK, @clbanning, I thought it worked, but it does not (note to self: compiles != works). I’m not sure if it’s my so-far limited knowledge of Go, or my rusty recollection of XML docs. Here’s an edited version of the XML I need to parse:
Now, I need the snapshot name, parent name (if present) and creationTime.
I’ve built the following structs:
type ParentElement struct {
	XMLName xml.Name `xml:"parent"`
	Name    string   `xml:"parentname"`
}
type SnapshotXMLstruct struct {
	SnapshotName string        `xml:"snapname"`
	Creationdate uint64        `xml:"creationdate"`
	ParentName   ParentElement `xml:"parent",omitempty`
}
ParentName has omitempty set as this tag might be missing in the XML doc.
My code to retrieve the XML and append the 3 needed fields in my own struct is thus:
var snapXMLdata SnapshotXMLstruct
var snaps []SnapshotXMLstruct
<snip>
snapshots, _ := domain.ListAllSnapshots(0)
for _, snap := range snapshots {
	data, _ := snap.GetXMLDesc(0)
	err := xml.Unmarshal([]byte(data), &snapXMLdata)
	if err != nil {
		fmt.Println("Error: ", err)
		os.Exit(0)
	}
	snaps = append(snaps, snapXMLdata)
}
The data var is non-empty, so it’s not a question of not fetching a valid XML doc.
… yet, snapXMLdata is empty, and err == nil.
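One possible explanation, sketched under assumptions: `xml.Unmarshal` matches struct tags against element names, and elements without a matching field are skipped without any error. If the document uses names such as `name` and `creationTime` (an assumption here, since the sample above is invented), tags like `snapname` or `creationdate` match nothing, so the struct keeps its zero values while `err` stays nil. The types `mismatched` and `matched` and the function `decodeBoth` are hypothetical names for this illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// doc is a hypothetical, heavily trimmed snapshot document.
const doc = `<domainsnapshot>
  <name>snap2</name>
  <creationTime>1700000000</creationTime>
  <parent>
    <name>snap1</name>
  </parent>
</domainsnapshot>`

// mismatched uses tags like the ones in the post. No element in doc is
// called "snapname" or "creationdate", so nothing gets filled in.
type mismatched struct {
	SnapshotName string `xml:"snapname"`
	Creationdate uint64 `xml:"creationdate"`
}

// matched uses the element names that occur in doc. "parent>name"
// descends into <parent>; the field stays "" when <parent> is absent.
type matched struct {
	SnapshotName string `xml:"name"`
	CreationTime uint64 `xml:"creationTime"`
	ParentName   string `xml:"parent>name"`
}

// decodeBoth unmarshals the same document into both struct types.
func decodeBoth(data string) (mismatched, matched, error) {
	var bad mismatched
	var good matched
	if err := xml.Unmarshal([]byte(data), &bad); err != nil {
		return bad, good, err
	}
	err := xml.Unmarshal([]byte(data), &good)
	return bad, good, err
}

func main() {
	bad, good, err := decodeBoth(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("mismatched tags: %+v\n", bad)
	fmt.Printf("matched tags:    %+v\n", good)
}
```

Two smaller points about the posted code. Options belong inside the quoted tag value, as in `xml:"parent,omitempty"`, and as far as I know `omitempty` only affects marshalling, so it is not needed for reading. Declaring the target struct inside the loop also avoids carrying values over from a previous snapshot when an element such as the parent is missing.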