Mysql Performance between go and php7

Glenn_Hancock · February 20, 2019, 12:08am

We are a PHP7 shop and have been programming in it for 12+ years. I have been playing with the idea of breaking our webserver down into serviceable units that handle the different tasks and have been looking at doing this in Go for a number of years. One of the issues I have with doing this is the huge amount of code required to work with queries in Go versus PHP. In PHP you just setup a variable to hold your query, execute it, set a row to it or loop through and each variable comes out as a named array (MAP) and you just grab what you need from the result array.

In GO you have to first define variables for each field you’re getting back that shares the type of the field in the database, setup your query string, run it, set a row back from the result and then manually set each field from the result to each variable you defined before. It seems like a lot more work, but work I’m willing to put up with if the benefits are there.

So, I setup a very simple test in my head to compare the difference between PHP and Go with relation to dealing with MYSQL. NOTE: Our database holds more than 1,000 tables so we’re not just talking about querying names and addresses.

Before I list my code I’ll state the problem: I am connecting to the database and then looping 1 million times. In each loop I am grabbing a random number which will match up to an ID in the table and then querying for it. I retrieve the row in both languages but in PHP this function actually retrieves the data and sets it to an array that I can then use. In Go this would require calling rows.Scan which I’m not doing due to not being able to compile with variables I’m not using. So I left that out. But in my mind, the Go program is doing less work than the PHP7 program.

Results: In PHP7 I am able to do my 1 million searches in 55seconds and it seems to be using around 50% CPU average. The Go app performs the same million lookups in just over a minute and never seems to use more than 30% of the CPU.

All tests done on same computer so hardware is not relevant. Also I removed the last line that fetches the row from php and it only saved 4 seconds so not very intense to get data in php. Not sure about golang as it was already slower and didn’t want to further complicate it.

#################################################
PHP Code:

$startrecord = 9086240;
$endrecord = 9644302;

$conn = new mysqli( $server, $user, $pass, "support" );

//loop 1 million times through randomly grabbed error log entries.
for ( $i = 1; $i < 1000000; $i++ )  {   
   $ran = rand( $startrecord, $endrecord );
   $query = "select * from logErrors where ErrorLogID=$ran";

   $result = $conn->query( $query );

   $row = $result->fetch_assoc();
}

#################################################
Go Code:

func main() {
startrecord := 9086240;
endrecord := 9644302;
var DBConnect string


DBConnect = dbUser + ":" + dbPass + "@" 
DBConnect = DBConnect + "(" + dbHost + ":" + dbPort + ")/" + dbName + "?charset=utf8&autocommit=true"
db, _ := sql.Open("mysql", DBConnect)

rand.Seed(time.Now().UnixNano())

//loop 1 million times
for i := 0; i < 1000000; i++ {
   ran := rand.Intn(endrecord-startrecord) + startrecord
   query := "select ErrorLogID, ErrorID from logErrors " +
            "where ErrorLogID=" + strconv.Itoa(ran)
   if rows, err := db.Query( query ); err != nil {
      fmt.Printf( "Error: ?", err )
      return 
   } else {
      for rows.Next() {

      }
    }   
  }   
}

Glenn_Hancock · February 20, 2019, 1:00am

I found another query method called QueryRow that looks for a single record coming back. I tweaked the code to use it since it is more what php is doing and would eliminate the extra for loop. However, using it has brought my time up to 1 min 38seconds.

Since our program deals with thousands of tables you can imagine how many queries go on every second of the day. Most of our logic is built around the results of queries and therefore Go being so much slower than PHP is causing me some grief with pushing this project forward. I hope I am doing something wrong but with such a simple program I’m not sure what.

I also realize that I could spawn off multiple threads to do the searches but that isn’t what php is doing so wouldn’t be a fair comparison. It would also greatly complicate an otherwise fairly simple process if we had to do that every time.

Glenn_Hancock · February 20, 2019, 2:22am

Tweaking things a bit more: Here is a new version along with an error that pops up after 45 seconds of it running. I added the following lines:

db.SetMaxIdleConns(0)                                                                                                     
db.SetMaxOpenConns(100)

Then got this error after looping for a while
Error: dial tcp 127.0.0.1:3306: connect: cannot assign requested address

So my question is: Is it possible go is using multiple mysql connections to run each query since I only added the maxopenconns lines? Here is my code running the loop and query:

var err error                                                   
var errorid int                                                 
var query string                                                
  
 //loop 1 million times                                          
  for i := 0; i < 1000000; i++ {                                  
     //grab random number between our range                       
     ran := rand.Intn(endrecord-startrecord) + startrecord                                          
     query = fmt.Sprintf("select ErrorID from logErrors " +       
              "where ErrorLogID=%v ",ran)                                                                                        
     if err = db.QueryRow( query ).Scan(&errorid);  err != nil {  
        fmt.Printf( "Error: %v - %v\n", err, query )              
     } else {                                                     
        //fmt.Printf( "errorid %v\n", errorid )                   
     }                                                           
  }

geosoft1 · February 20, 2019, 6:45am

I see an issue in your code that can slow you down (not to much but can). Instead using strconv use the following construction:

"where ErrorLogID=?",ran

LE: Try to not use fmt in benchmarks for the same reason of speed.

johandalabacka · February 20, 2019, 7:10am

db.SetMaxIdleConns(0) means no connections are allowed to be reused so for each request it must open a new connection. And since you just scans one row you will not “scan until the end” which closes the connection you will soon run reach the maximum connections.

lutzhorn · February 20, 2019, 7:51am

This line may have a performance impact. But what is worse is that it leads to bad style. Remember Bobby Tables.

Always use proper parameter binding, never use string operations to build SQL.

BTW, I don’t think this

$query = "select * from logErrors where ErrorLogID=$ran";

is recommended in PHP. Just imagine $ran being user input. In PHP parameter binding must be used, too.

bins · February 20, 2019, 7:58am

you’re using the math/rand.Intn function that uses behind the scene the default, global, thread-safe rand.Source.

I bet that by replacing:

rng := rand.New(rand.NewSource(time.Now().UnixNano()))

and then:

ran := rng.Intn(endrecord-startrecord) + startrecord

you’ll get a fair performance improvement and a somewhat fairer comparison.
(unless the equivalent PHP’s rand is also thread safe?)

geosoft1 · February 20, 2019, 8:29am

I think you can use even a simple incremental counter instead random numbers both in PHP and Go to have the same reference. Is posible to have a slightly different algorithm for rand which will affect the time of execution

Glenn_Hancock · February 20, 2019, 7:18pm

Thanks, I had already made that change and it didn’t make much difference if any. But point taken.

Glenn_Hancock · February 20, 2019, 7:22pm

I’m not sure about the random process but wanted to use it so that the system didn’t just loop down through the records. My thought is that it would keep the mysql server on its toes looking up records. I guess at the end of the day it probably doesn’t matter much but is go that much slower grabbing random numbers? I’ll give it a shot and see what happens.

Glenn_Hancock · February 20, 2019, 7:23pm

I understand about style and wasn’t trying to write a proper app. Was trying to test performance alone and trying to keep everything as simple as possible. Will adding proper wrapping effect performance? If so I can try that but will be shocked if it matters.

Glenn_Hancock · February 20, 2019, 7:29pm

I changed the random method around like you suggested but it didn’t make any difference at all. If anything it might be a few seconds longer now but could also just be computer running a little slower atm.

So far nothing I’ve tweaked has improved GO with respect to simply reading records.

Thanks,

Glenn

Glenn_Hancock · February 20, 2019, 7:52pm

I removed the random process from each program. This also reduced the number of records I am searching through so it is hard to know exactly how much that was slowing me down. However, with the new programs running I am getting 26seconds for php7 and 34 seconds for Go. So while faster, Go is still a good bit slower searching through the records.

Any other ideas?

geosoft1 · February 20, 2019, 8:50pm

Are you using this driver?

Glenn_Hancock · February 20, 2019, 9:28pm

Yes, I am. However, I just noticed something odd that I’m trying to figure out.

I opened up wireshark to see if I could see differences in the network activity and have an issue.

Both programs are set to communicate with mysql over localhost. The Go program does indeed and I can capture traffic just fine. However, PHP is not communicating over any network connection I can see. Is it possible that it is talking to mysql differently than going over the network? I’ve never noticed this before.

To Clarify, I have the script and mysql running on the same computer over localhost.

NobbZ · February 20, 2019, 9:37pm

Some MySQL client implementations are trying to be smart and connect via the default socket if you connect to localhost, for those you need to use the IP to actually enforce network mode.

Of course using the socket might be faster as it bypasses about half of the network stack, if not more, in the kernel.

For equality of chances either enforce network mode in PHP or use the socket in go as well.

Glenn_Hancock · February 20, 2019, 9:57pm

how do you force go?

NobbZ · February 20, 2019, 9:58pm

Force go what?

Glenn_Hancock · February 20, 2019, 10:15pm

to use the socket instead of localhost?

johandalabacka · February 20, 2019, 10:28pm

Hi

Run the tests myself. Both php7 and go talking to the database over tcp (it was in a docker container). php was slightly faster ~10%. I also tried using a prepered question in both languages and it didn’t make any difference.