In my organization I am working on a problem where we hit an internal URL to get pages of XML information that we need to parse and upload to an S3 bucket. Currently we do that in a Lambda running a Node.js application. Because of Node's leaky futures implementation we are unable to use parallel processing. Instead we have to hit the URL and traverse the chain of URLs sequentially to work around that problem.
I want to use Go to parallelise this job and move the Node app to Go.
A caveat is that the HTML files can be 1 MB to 4 MB in size and some minimal parsing needs to be done in the app. Node.js has good support for XML parsers.
What does XML parsing look like in Go?
I came across this issue, which put something of a damper on my plan.
Can someone please suggest the best way to approach XML parsing in Go?
On one app I need only very cursory parsing; on the other I need more detailed parsing. I am interested in validating whether the parallelism gained through Go would be an improvement over the Node.js approach we use today.
XML parsing is mainly I/O bound. Parsing multiple XML files concurrently may very well be faster than parsing them one after the other, but you will have to try it and benchmark.
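To give a feel for what this could look like, here is a minimal sketch using only the standard library's encoding/xml and a bounded set of goroutines. The function names (countElements, parseAll), the result type, and the sample documents are made up for illustration. The documents are in-memory strings here so the example runs on its own; in your job the reader would presumably be the HTTP response body for each URL. Counting start tags is just a stand-in for whatever cursory parsing you need.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
	"sync"
)

// countElements streams through one XML document token by token and counts
// start tags by local name. It never holds the whole tree in memory, which
// suits cursory parsing of multi-megabyte pages. (Hypothetical helper.)
func countElements(r io.Reader) (map[string]int, error) {
	counts := make(map[string]int)
	dec := xml.NewDecoder(r)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return counts, nil // clean end of document
		}
		if err != nil {
			return nil, err // malformed XML or read failure
		}
		if start, ok := tok.(xml.StartElement); ok {
			counts[start.Name.Local]++
		}
	}
}

// result holds the outcome of parsing one document.
type result struct {
	counts map[string]int
	err    error
}

// parseAll parses every document concurrently, with at most `workers`
// goroutines running at once (workers must be at least 1). Results are
// returned in the same order as the input.
func parseAll(docs []string, workers int) []result {
	results := make([]result, len(docs))
	sem := make(chan struct{}, workers) // limits concurrency
	var wg sync.WaitGroup
	for i, doc := range docs {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(i int, doc string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			// Assumption: in the real job this reader would be resp.Body.
			counts, err := countElements(strings.NewReader(doc))
			results[i] = result{counts: counts, err: err}
		}(i, doc)
	}
	wg.Wait()
	return results
}

func main() {
	// Made-up sample documents standing in for the fetched pages.
	docs := []string{
		`<feed><item id="1"/><item id="2"/></feed>`,
		`<feed><item id="3"/></feed>`,
	}
	for i, r := range parseAll(docs, 4) {
		if r.err != nil {
			fmt.Println("doc", i, "error:", r.err)
			continue
		}
		fmt.Println("doc", i, "items:", r.counts["item"])
	}
}
```

For the more detailed parsing case, xml.Unmarshal or Decoder.DecodeElement into structs is the usual route. To check whether concurrency helps for your data, time parseAll with workers set to 1 and then to a larger number on a set of real pages.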