Am new to GO doing POC for my project .Can anyone please help me with the sample program to generate 5million records in loop and inserting these records to couchbase db using golang to check performance .
Any sample program to run and check the performance would be very helpful due to timeline constraint .
Do you have anythin already you can show use? You will need:
a Couchbase DB installation
a Couchbase DB client library for Go
a simple Go program to connect to the DB and generate and insert records
What exactly do you want to measure? The Go part will not be the problem. If anything affects performance, it is how fast the DB can insert records and how fast or slow your network connection is.
Benchmarking databases or performance testing is not an easy task, first you should have full understanding of the database/s. You need to define correct schemes/tables/collections/indexes/analysers/engines in the given db and then you can run your tests. If you don’t have strong knowledge on the dbs, I recommend you to make research, check existing benchmark in your use case and then consult someone who knows better.
If it is a fun project, start from somewhere, do your tests, try to increase performance, decrease resource usages, and then if you think this is all you can do, ask for help.
i already have completed all these steps . Below is my program . I have strong knowledge on database side but am new to GOLANG ,writting program for testing is taking more time for me than defined timeline for my project .Could you please help me to modify my code how can i insert 5lakhs record by configuring 5 threads , each thread inserts 1lakh records parallelly
package main
import (
"fmt"
"gopkg.in/couchbase/gocb.v1"
"strconv"
"github.com/zenthangplus/goccm"
"time"
"runtime"
)
var (
bucket *gocb.Bucket
)
type Person struct {
FirstName, LastName string
}
func main(){
cluster, err := gocb.Connect("couchbase://ip") //Connects to the cluster
if err != nil {
fmt.Println(err.Error())
return
}
cluster.Authenticate(gocb.PasswordAuthenticator{
Username: "",
Password: "",
})
fmt.Println("Cluster:%v",cluster)
bucket, err = cluster.OpenBucket("", "") //Connects to the bucket
if err != nil {
fmt.Println(err.Error())
return
}
fmt.Println("Bucket:%v",bucket)
// Limit 3 goroutines to run concurrently.
c := goccm.New(3)
dt := time.Now()
fmt.Println("Start Current date and time is: ", dt.String())
fmt.Println(runtime.NumCPU())
for i := 1; i <= 500000; i++ {
fmt.Println("Starting new thread... %v",i )
person := Person{FirstName: "Syeda", LastName: "Anjum"}
test := "Go_Demo_"+strconv.Itoa(i)
// This function have to call before any goroutine
c.Wait()
go createDocument(test, &person)
fmt.Println("Completed thread... %v",i)
c.Done()
}
fmt.Println(runtime.NumCPU())
// This function have to call to ensure all goroutines have finished after close the main program.
c.WaitAllDone()
fmt.Println("End Current date and time is: ", dt.String())
fmt.Println(runtime.NumCPU())
}
func createDocument(documentId string, person *Person) {
fmt.Println("Upserting a full document...")
_, error := bucket.Upsert(documentId, person, 0)
if error != nil {
fmt.Println(error.Error())
return
}
fmt.Println("After Upsert fetching the document")
}
i need to check how much time golang takes to perform bulk insert and get operartion from to couchbase .How it is more efficient than other programming languages and how spawning threads i can implement through GO and check the performance as i have tested in c++ like below
Number of Requests per thread : 100000
Total Number of threads : 5
WRITE PROPORTION : 50%
READ PROPORTION : 50%
THREAD_ID 3 TOTAL WRITE : 50000, TOTAL READ : 50000, TOTAL OPERATION TIME : 30.13 S, AVG_LATENCY = 0.000301 S, OPS_PER_SECOND = 3319.12
THREAD_ID 2 TOTAL WRITE : 50000, TOTAL READ : 50000, TOTAL OPERATION TIME : 30.22 S, AVG_LATENCY = 0.000302 S, OPS_PER_SECOND = 3309.02
THREAD_ID 1 TOTAL WRITE : 50000, TOTAL READ : 50000, TOTAL OPERATION TIME : 30.25 S, AVG_LATENCY = 0.000303 S, OPS_PER_SECOND = 3305.39
THREAD_ID 0 TOTAL WRITE : 50000, TOTAL READ : 50000, TOTAL OPERATION TIME : 30.29 S, AVG_LATENCY = 0.000303 S, OPS_PER_SECOND = 3301.78
THREAD_ID 4 TOTAL WRITE : 50000, TOTAL READ : 50000, TOTAL OPERATION TIME : 30.31 S, AVG_LATENCY = 0.000303 S, OPS_PER_SECOND = 3298.83
Here are an example worker/thread implementation in go. You can’t control OS thread in go, but you can limit go scheduler to use only specified number go routine on the given jobs.
We need first limited number go routine to reach 5 concurrency as you desire:
for i := 0; i < 5; i++ {
// create jobs per worker/thread
workerJobs := make(chan interface{}, jobsCount)
jobs = append(jobs, workerJobs)
go worker(i, workerJobs, results)
}
and then send jobs to all created workers
for _, job := range jobs {
for j := 1; j <= jobsCount; j++ {
job <- j
}
close(job)
}
Wait all workers to complete their jobs
// wait all workers to complete their jobs
for a := 1; a <= jobsCount*len(jobs); a++ {
//wait all the workers to return results
<-results
}
Change the jobs from workerJobs := make(chan interface{}, jobsCount) to workerJobs := make(chan Person, jobsCount)
And call createDocument in worker:
func worker(id int, jobs <-chan Person, results chan<- interface{}) {
// do your thread measurements here
fmt.Printf("worker %d started \n", id)
for person := range jobs {
createDocument("randomdoc", person)
results <- job
}
fmt.Printf("worker %d completed \n", id)
}
modified my code as suggested ,but am stuck on this below error. Tried declaring j as struct but couldn’t get to use it later for for operation
command-line-arguments
./goprocs.go:54:8: cannot use j (type int) as type Person in send
package main
import (
"fmt"
"gopkg.in/couchbase/gocb.v1"
)
var (
bucket *gocb.Bucket
)
type Person struct {
FirstName, LastName string
}
func main() {
// if you have more than 5 core, make it 5 to use 5 os thread
//runtime.GOMAXPROCS(5)
jobsCount := 50
cluster, err := gocb.Connect("couchbase://") //Connects to the cluster
if err != nil {
fmt.Println(err.Error())
return
}
cluster.Authenticate(gocb.PasswordAuthenticator{
Username: "",
Password: "",
})
fmt.Println("Cluster:%v",cluster)
bucket, err = cluster.OpenBucket("", "") //Connects to the bucket
if err != nil {
fmt.Println(err.Error())
return
}
fmt.Println("Bucket:%v",bucket)
person := Person{FirstName: "Syeda", LastName: "Anjum"}
var jobs []chan Person
results := make(chan interface{}, jobsCount)
//jobs := make(chan int, jobsCount)
//results := make(chan int, jobsCount)
fmt.Println("value of results: %v",&results)
for i := 0; i < 5; i++ {
// create jobs per worker/thread
workerJobs := make(chan Person, jobsCount)
jobs = append(jobs, workerJobs)
go worker(i, workerJobs, results)
}
for _, job := range jobs {
for j:= 1; j <= jobsCount; j++ {
job <- j
}
close(job)
}
// wait all workers to complete their jobs
for a := 1; a <= jobsCount*len(jobs); a++ {
//wait all the workers to return results
<-results
}
fmt.Println("finished")
}
func worker(id int, jobs <-chan Person, results chan<- interface{}) {
// do your thread measurements here
fmt.Printf("worker %d started \n", id)
for person := range jobs {
createDocument("randomdoc", &person)
results <- person
}
fmt.Printf("worker %d completed \n", id)
}
func createDocument(documentId string, person *Person) {
fmt.Println("Upserting a full document...")
_, error := bucket.Upsert(documentId, person, 0)
if error != nil {
fmt.Println(error.Error())
return
}
fmt.Println("After Upsert fetching the document")
}
variable job defined in main() is not being picked in function worker ,used the same function worker given by you . i edited it to jobs it worked ,but only one document inserted in COUCHBASE
command-line-arguments
./goprocs.go:63:14: undefined: job
Didn’t understand what this below job does?
for _, job := range jobs {
for j := 1; j <= jobsCount; j++ {
job <- Person{FirstName: "Syeda", LastName: "Anjum"}
}
close(job)
}
Could you please check the ## comments mentioned in code ?
Below is the complete code
package main
import (
"fmt"
"gopkg.in/couchbase/gocb.v1"
)
var (
bucket *gocb.Bucket
)
type Person struct {
FirstName, LastName string
}
func main() {
// if you have more than 5 core, make it 5 to use 5 os thread runtime.GOMAXPROCS(5)
jobsCount := 500
cluster, err := gocb.Connect("couchbase://10.10.216.53") //Connects to the cluster
if err != nil {
fmt.Println(err.Error())
return
}
cluster.Authenticate(gocb.PasswordAuthenticator{
Username: "golang",
Password: "mavenir",
})
fmt.Println("Cluster:%v",cluster)
bucket, err = cluster.OpenBucket("golang", "") //Connects to the bucket
if err != nil {
fmt.Println(err.Error())
return
}
fmt.Println("Bucket:%v",bucket)
var jobs []chan Person
results := make(chan interface{}, jobsCount)
for i := 0; i < 5; i++ {
// create jobs per worker/thread
workerJobs := make(chan Person, jobsCount)
jobs = append(jobs, workerJobs)
go worker(i, workerJobs, results) ########### jobs should be passed right instead of workerJobs? i tried passing jobs got error ./goprocs.go:43:18: cannot use jobs (type []chan Person) as type <-chan Person in argument to worker
}
for _, job := range jobs {
for j := 1; j <= jobsCount; j++ {
job <- Person{FirstName: "Syeda", LastName: "Anjum"}
}
close(job)
}
// wait all workers to complete their jobs
for a := 1; a <= jobsCount*len(jobs); a++ {
//wait all the workers to return results
<-results
}
fmt.Println("finished")
}
func worker(id int, jobs <-chan Person, results chan<- interface{}) {
// do your thread measurements here
fmt.Printf("worker %d started \n", id)
for person := range jobs {
createDocument("randomdoc", &person)
results <- jobs
}
fmt.Printf("worker %d completed \n", id)
}
func createDocument(documentId string, person *Person) int {
//fmt.Println("Upserting a full document...")
_, error := bucket.Upsert(documentId, person, 0)
if error != nil {
fmt.Println(error.Error())
return -1
}
return 1
}
Thank you so much for your extreme help , Am able to insert records with multiple threads configured . Must appreciate your timely response for solving the problem Thanks again