Golang performance degrades with increased document size


Can anyone help me understand how to tackle a Go performance problem?
I am inserting 5 lakh (500,000) documents into a Couchbase database (I tried both the channels approach and the sync.WaitGroup approach).

4 KB documents (5 lakh records in total): 191 secs
1 KB documents (5 lakh records in total): 53 secs

That is a huge difference in performance, whereas in C++ the same workload takes 7 secs and 11 secs respectively.

Below is my code:

package main
import (
type Insert_doc struct {
    Thread_id int 
    KTAB, SyncBuffer string

type Configuration struct {
    NumofDoc  int
    ServerIp  string
    Username   string
    Password   string
    Randstr   int
    BucketName string
    ThreadCount int
    Port int
    OP_TYPE int
func main() {
    configuration := Configuration{}
    _ = gonfig.GetConf("Couchbase_config.json", &configuration)
    fmt.Println("Config File name Passed :  Couchbase_config.json")
    fmt.Println("ThreadCount : ",configuration.ThreadCount)  //ThreadCount:2
    fmt.Println("Server IP : ",configuration.ServerIp)
    fmt.Println("Number of Requests per thread : ",configuration.NumofDoc) //NumofDoc: 250000
    var wg sync.WaitGroup 
    for i := 0; i < configuration.ThreadCount; i++ {   //ThreadCount : 2
        go worker(&wg,i,configuration.OP_TYPE)


func worker(wg *sync.WaitGroup,id int,s int) {
    configuration := Configuration{}
    _ = gonfig.GetConf("Couchbase_config.json", &configuration)
    var insertCount int64 = 0
    var readCount int64 = 0
    var readproportion int
    var updateproportion int
    var opsSequence[100]int
    operation_type := s
    cluster, err := gocb.Connect(configuration.ServerIp) //Connects to the cluster

    if err != nil {

        Username: configuration.Username,
        Password: configuration.Password,

    var bucket *gocb.Bucket
    bucket, err = cluster.OpenBucket(configuration.BucketName, "") //Connects to the bucket
    if err != nil {

    if operation_type == 1 {
        updateproportion = 100
        readproportion = 0
    } else if operation_type == 2 {
        updateproportion = 0
        readproportion = 100
    } else if operation_type == 3 {
        updateproportion = 50
        readproportion = 50

    for b := 0; b < updateproportion; b++ {
        opsSequence[b] =1

    for b := 0; b < readproportion; b++ {
    Thread_Start := time.Now().Unix()
    for j :=0; j < configuration.NumofDoc; j++ {   //NumofDoc : 250000
       k := j%100;
       optype := opsSequence[k];
       var x int = int(readCount % 5000);
           case 1:
               document := Insert_doc{Thread_id: id, KTAB: "INSERT", SyncBuffer: RandomString(configuration.Randstr)} // Randstr - 4000
               test := "Go_Demo_"+strconv.Itoa(id)+"_"+strconv.Itoa(int(insertCount))
               createDocument(bucket,test, &document)
           case 2:
               test := "Go_Demo_"+strconv.Itoa(id)+"_"+strconv.Itoa(x)

               fmt.Println("Invalid Operation Type ",optype)

    Thread_End := time.Now().Unix()
    timediff := Thread_End - Thread_Start
    var avgLatency float64 = float64(timediff)/float64(insertCount+readCount);
    var opsPerSec float64 = 1/avgLatency;
    fmt.Printf("THREAD_ID %d TOTAL WRITE : %d, TOTAL READ : %d, TOTAL OPERATION TIME : %d, AVG_LATENCY = %f S, OPS_PER_SECOND = %f \n",id, insertCount,readCount, timediff, avgLatency,opsPerSec);

func createDocument(bucket *gocb.Bucket,documentId string, document *Insert_doc) {
    _, error := bucket.Upsert(documentId, document, 0)
    if error != nil {

func RandomString(n int) string {
    // Note: this rune slice is rebuilt on every call.
    var letter = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
    b := make([]rune, n)
    for i := range b {
        b[i] = letter[rand.Intn(len(letter))]
    }
    return string(b)
}

func getDocument(bucket *gocb.Bucket,documentId string) {
    var get_data Insert_doc
    _, error := bucket.Get(documentId, &get_data)
    if error != nil {


I spotted 2 pitfalls and have some questions:



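You have what amounts to a nested loop (an O(n^2) pattern) in your for j :=0; j < configuration.NumofDoc; j++ { //NumofDoc : 250000 loop, because every iteration calls RandomString(..), which itself loops configuration.Randstr times. The total work is proportional to NumofDoc × Randstr, so when configuration.Randstr is a large value you will see large differences in run time.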

Inefficient usage of rune type (do this first and benchmark again)

Secondly, there is no need to use rune here, since the random string characters are all ASCII. You only need rune when your character list contains multi-byte characters (e.g. Japanese, Chinese, Hindi, Russian, etc.). Here’s an example of using []byte for randomness: https://gist.github.com/dopey/c69559607800d2f2f90b1b1ed4e550fb#file-main-go-L31

FYI, a rune is a 4-byte (int32) value, so a []rune takes four times the memory of the equivalent ASCII []byte, and converting it back with string(b) has to re-encode every element as UTF-8. The difference in performance between []rune and []byte can be large, even for plain string processing.
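As a minimal sketch of the []byte approach, assuming the same ASCII-only character set as the posted RandomString: the names letters and randomString are illustrative, and math/rand is kept to match the original code (the linked gist uses crypto/rand instead).

```go
package main

import (
	"fmt"
	"math/rand"
)

// letters is declared once at package level, so it is not rebuilt on every call.
const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// randomString returns n random characters drawn from letters.
// Working on bytes is safe here only because every character in letters is ASCII.
func randomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	s := randomString(4000)
	fmt.Println(len(s)) // for ASCII-only output the byte length equals n
}
```

If the character set ever gains non-ASCII characters, indexing the string by byte would split them, and the rune-based version would be needed again.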



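Does the C++ code run in parallel? If yes, you need to make sure Go runs in parallel as well. Go is concurrency first. (Since Go 1.5, GOMAXPROCS defaults to the number of available CPUs, so this mainly matters if it has been overridden.) See: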

  1. https://www.ardanlabs.com/blog/2014/01/concurrency-goroutines-and-gomaxprocs.html
  2. https://stackoverflow.com/questions/44039223/golang-why-are-goroutines-not-running-in-parallel
  3. https://stackoverflow.com/questions/52975260/concurrency-vs-parallelism-when-executing-a-goroutine
  4. https://blog.golang.org/waza-talk
  5. https://golang.org/pkg/runtime/#GOMAXPROCS

Running on same hardware and OS?

This may sound silly, but just to double check: both the C++ and Go programs are symbol-free (stripped) compiled binaries running on the same hardware and the same operating system, with the same network configuration, right?


If the above are cleared and you still get slower results, then the problem is likely caused by the integration with the packages. In this case, you will need to mock each package and identify which one is causing the performance problem.

Then, proceed with your investigation.
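One way to isolate the packages is sketched below, under the assumption that the worker only needs a single upsert call from the database client. The upserter interface, noopStore, and insertAll are hypothetical names, not part of gocb; a thin adapter around the real bucket would be needed to satisfy the interface.

```go
package main

import (
	"fmt"
	"time"
)

// upserter is a hypothetical interface covering the one call the worker makes.
type upserter interface {
	Upsert(key string, value interface{}) error
}

// noopStore is a stand-in that discards every document, so only local work is timed.
type noopStore struct{ calls int }

func (s *noopStore) Upsert(key string, value interface{}) error {
	s.calls++
	return nil
}

// insertAll writes n documents produced by gen and returns the elapsed time.
func insertAll(store upserter, n int, gen func() string) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < n; i++ {
		if err := store.Upsert(fmt.Sprintf("Go_Demo_%d", i), gen()); err != nil {
			return 0, err
		}
	}
	return time.Since(start), nil
}

func main() {
	store := &noopStore{}
	elapsed, err := insertAll(store, 1000, func() string { return "payload" })
	if err != nil {
		fmt.Println("insert failed:", err)
		return
	}
	fmt.Println("documents:", store.calls, "elapsed without database:", elapsed)
}
```

Running the same loop once against the no-op store and once against the real bucket shows how much of the total time is spent generating documents and how much in the database client and network.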


Also, move the letter set in RandomString into a constant group. You don’t need to assign the same string to a variable on every iteration. Combined with the rune conversion and the O(n^2) effect, this is a complete waste of computing cycles.
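To see how much the per-call declaration and the rune conversion cost on a given machine, a rough timing comparison could look like the sketch below. All names here (charset, runesPerCall, bytesShared, timeIt) are illustrative, and the sizes are assumptions chosen to resemble the 4 KB case. For serious measurements the standard testing package benchmarks are the better tool.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// runesPerCall mirrors the posted RandomString: the rune slice is rebuilt on every call.
func runesPerCall(n int) string {
	letter := []rune(charset)
	b := make([]rune, n)
	for i := range b {
		b[i] = letter[rand.Intn(len(letter))]
	}
	return string(b)
}

// bytesShared indexes the package-level constant directly and builds a byte slice.
func bytesShared(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = charset[rand.Intn(len(charset))]
	}
	return string(b)
}

// timeIt runs fn(size) the given number of times and returns the total elapsed time.
func timeIt(fn func(int) string, size, iterations int) time.Duration {
	start := time.Now()
	for i := 0; i < iterations; i++ {
		_ = fn(size)
	}
	return time.Since(start)
}

func main() {
	const size, iterations = 4000, 1000
	fmt.Println("[]rune, rebuilt per call:", timeIt(runesPerCall, size, iterations))
	fmt.Println("[]byte, shared constant: ", timeIt(bytesShared, size, iterations))
}
```

The absolute numbers depend on the hardware and Go version, so only the ratio between the two lines is meaningful.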


Thank you Hooloway. Yep, this was the issue. It is solved now. Thanks a lot 🙂


Would you mind letting us know which fix had the greatest impact on performance? Was it the variable declaration you indicated or one of the other suggestions? And what is your performance now (in seconds)?

