Trying to parse a website with login


(wk) #1

Hello,
I’m trying to parse items from a supplier’s website and cannot figure out why my code is not working. I’m setting my cookies with MercuryEngineering/CookieMonster and I do get a page back, but it’s the login page and not the product page:

package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"

	cookiemonster "github.com/MercuryEngineering/CookieMonster"
	"github.com/PuerkitoBio/goquery"
)

func main() {
	// Load the cookies that were exported to a file.
	cookies, err := cookiemonster.ParseFile("MyCookieFile.txt")
	if err != nil {
		panic(err)
	}

	// Named jar so the variable does not shadow the cookiejar package.
	jar, err := cookiejar.New(nil)
	if err != nil {
		panic(err)
	}

	u, err := url.Parse("MySite")
	if err != nil {
		panic(err)
	}

	// SetCookies returns nothing, so there is no error to check here.
	jar.SetCookies(u, cookies)

	// jujujar, err := cookiejar.New(&cookiejar.Options{
	// 	Filename: cookies,
	// })

	// if err != nil {
	// 	panic(err)
	// }

	client := &http.Client{
		Jar: jar,
	}

	response, err := client.Get("MySiteDirectoryWithProducts")
	if err != nil {
		panic(err)
	}
	defer response.Body.Close()

	query, err := goquery.NewDocumentFromReader(response.Body)
	if err != nil {
		panic(err)
	}

	// Each prints every link itself; the Selection it returns is not printed.
	query.Find("body a").Each(func(index int, item *goquery.Selection) {
		link, _ := item.Attr("href")
		fmt.Printf("Link #%d: '%s' - '%s'\n", index, item.Text(), link)
	})
}

(Ali Hassan) #2

Show me your code.


(Johann Forster) #3

Make sure that the cookies in MyCookieFile.txt are still valid.
You cannot simply copy the cookies from your browser: they generally expire after some time and may be tied to your browser (the server might check the User-Agent too).
To query websites that require a login programmatically, you can also do the login in Go: send your credentials to the login page to obtain fresh cookies.