Performance of a database application is determined by the efficiency of both the client and the server. Naturally, client performance is affected by the language in which it is written. Comparing clients written in different languages is always a challenge, due to the myriad of possible configurations. Even if you settle on one configuration, writing equivalent code in any two languages can be tricky.
Despite the difficulties involved, I followed the rationale that some data is better than none, and decided to write a simple benchmark for the languages discussed in this book: C, PHP, Perl, and Java. I ran it on one test system (Pentium III 500 with 256MB RAM running Linux); the results appear in this section. I’ve provided the source code, so you can try it on the systems you have access to and modify the benchmark code to accommodate your particular needs.
In each language, the benchmark is a command-line application that accepts three arguments from the user:
■ num_rows: Number of rows in the table
■ num_cols: Number of string columns in the table in addition to the primary key, which is always an integer
■ num_loops: Number of select queries to run in the benchmark
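The argument handling can be sketched in Java as follows. This is a minimal illustration, not the code from the book's Web site: the class name BenchmarkArgs and the choice to reject non-positive values are assumptions made for the example.

```java
/** Hypothetical holder for the three benchmark arguments (not the book's code). */
final class BenchmarkArgs {
    final int numRows;   // num_rows: rows in the table
    final int numCols;   // num_cols: string columns besides the primary key
    final int numLoops;  // num_loops: select queries to run

    private BenchmarkArgs(int numRows, int numCols, int numLoops) {
        this.numRows = numRows;
        this.numCols = numCols;
        this.numLoops = numLoops;
    }

    /**
     * Parses {num_rows, num_cols, num_loops} from the command line.
     * Throws IllegalArgumentException on a wrong argument count, a
     * non-numeric value (NumberFormatException is a subclass), or a
     * value that is not positive (an assumption of this sketch).
     */
    static BenchmarkArgs parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                "usage: benchmark num_rows num_cols num_loops");
        }
        int rows = Integer.parseInt(args[0]);
        int cols = Integer.parseInt(args[1]);
        int loops = Integer.parseInt(args[2]);
        if (rows <= 0 || cols <= 0 || loops <= 0) {
            throw new IllegalArgumentException(
                "all arguments must be positive integers");
        }
        return new BenchmarkArgs(rows, cols, loops);
    }
}
```

A main method would call BenchmarkArgs.parse(args) first and print the usage message if it throws.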
The benchmark creates a table with an integer primary key and num_cols string columns of type CHAR(10) NOT NULL. The code populates the table with data using num_rows one-row inserts, timing the insert operation in the process.
Then, it performs num_loops selects of one column (physically in the middle of the record) in a random row on the primary key, also timing the operation.
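The statements the benchmark issues can be sketched as below. This is an illustration under stated assumptions rather than the actual benchmark source: the column names (id, s0, s1, ...), the 10-character filler value, and the class name BenchmarkSql are all hypothetical, and the real code may build its queries differently.

```java
import java.util.Random;

/** Hypothetical SQL builder mirroring the benchmark steps described above. */
final class BenchmarkSql {

    /** CREATE TABLE with an integer primary key and numCols CHAR(10) NOT NULL columns. */
    static String createTable(String table, int numCols) {
        StringBuilder sql = new StringBuilder(
            "CREATE TABLE " + table + " (id INT NOT NULL PRIMARY KEY");
        for (int i = 0; i < numCols; i++) {
            sql.append(", s").append(i).append(" CHAR(10) NOT NULL");
        }
        return sql.append(")").toString();
    }

    /** One-row INSERT; every string column gets the same assumed filler value. */
    static String insertRow(String table, int id, int numCols) {
        StringBuilder sql = new StringBuilder(
            "INSERT INTO " + table + " VALUES (" + id);
        for (int i = 0; i < numCols; i++) {
            sql.append(", 'abcdefghij'");
        }
        return sql.append(")").toString();
    }

    /** SELECT of the middle string column for a random primary key in [0, numRows). */
    static String selectMiddle(String table, int numRows, int numCols, Random rnd) {
        return "SELECT s" + (numCols / 2) + " FROM " + table
            + " WHERE id = " + rnd.nextInt(numRows);
    }

    /** Throughput in operations per second from a count and an elapsed time in nanoseconds. */
    static double opsPerSecond(long ops, long elapsedNanos) {
        return ops * 1e9 / elapsedNanos;
    }
}
```

In a timing loop, you would read System.nanoTime() before and after the num_rows inserts (and again around the num_loops selects) and pass the difference to opsPerSecond.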
The benchmark code is available on the book’s Web site. Each benchmark assumes you have a MySQL server running on the local machine, that you can log in to it as root with no password, and that you have a database called test.
If this is not the case, you can either temporarily modify the server configuration or modify the appropriate lines in the source code of the benchmarks you would like to run. The instructions to run the benchmark for each particular language are provided in the following subsections.
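For the Java client, the connection settings you might need to change would look roughly like the following. This is only a sketch of the assumptions stated above (local server, user root, empty password, database test); the class and constant names are hypothetical, and the real Benchmark.java may organize these values differently.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical connection settings matching the benchmark's stated assumptions. */
final class BenchmarkConnection {
    static final String HOST = "localhost"; // server on the local machine
    static final String DATABASE = "test";  // database called test
    static final String USER = "root";      // log in as root
    static final String PASSWORD = "";      // with no password

    /** Builds the JDBC URL for the MySQL server, e.g. jdbc:mysql://localhost/test. */
    static String jdbcUrl() {
        return "jdbc:mysql://" + HOST + "/" + DATABASE;
    }

    /** Opens a connection; requires a MySQL JDBC driver on the classpath. */
    static Connection open() throws SQLException {
        return DriverManager.getConnection(jdbcUrl(), USER, PASSWORD);
    }
}
```

If your server uses a different account or database, these four constants are the values to edit.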
C
The source file is called benchmark.c. To compile it, use the standard C client compilation procedure described in Chapter 8. For example:
gcc -o benchmark -I/usr/include/mysql benchmark.c -L/usr/lib/mysql -lmysqlclient -lz -static
To run the benchmark, use the following syntax:
./benchmark num_rows num_cols num_loops
For example, you can run it with a table of 1000 rows and 10 string columns, performing 2000 select queries, as follows:
./benchmark 1000 10 2000
PHP
The source file is called benchmark.php. To run it, you need a PHP command-line interpreter, which you can produce on a Unix system by downloading the PHP source from www.php.net and running the following:
./configure --with-mysql
make
make install
On Windows, the binary distribution contains the command-line interpreter PHP.EXE.
To run the benchmark, use the following syntax:
php benchmark.php num_rows num_cols num_loops
For example, you can run it with a table of 1000 rows and 10 string columns, performing 2000 select queries, as follows:
php benchmark.php 1000 10 2000
Perl
The source file is called benchmark.pl. You need to have Perl installed. The benchmark code uses the Time::HiRes module available from www.cpan.org.
To run the benchmark, use the following syntax:
perl benchmark.pl num_rows num_cols num_loops
For example, you can run it with a table of 1000 rows and 10 string columns, performing 2000 select queries, as follows:
perl benchmark.pl 1000 10 2000
Java
The source file is called Benchmark.java. To be able to compile and run it, you need to have a JDK (Java Development Kit) installed. You also need the MySQL Connector/J JDBC driver.
To compile the source, use the following command:
javac Benchmark.java
To run it, use the following syntax:
java Benchmark num_rows num_cols num_loops
For example, you can run it with a table of 1000 rows and 10 string columns, performing 2000 select queries, as follows:
java Benchmark 1000 10 2000
Test Results
The tests (run on a Pentium III 500 with 512KB CPU cache and 256MB RAM running Linux 2.4.19) yielded the results shown in Table 6.1 for 1000 rows with 5000 loop iterations.
Table 6.1 Client Language Benchmark Results
LANGUAGE/OPERATION        NUM_COLS=10   NUM_COLS=50   NUM_COLS=100   NUM_COLS=200
C/inserts per second      1367          984           693            445
C/selects per second      1748          1738          1708           1575
PHP/inserts per second    1283          838           607            393
PHP/selects per second    1493          1361          1336           1321
Perl/inserts per second   716           612           402            332
Perl/selects per second   801           759           782            827
Java/inserts per second   659           523           424            262
Java/selects per second   856           820           802            803
The tests were run against MyISAM tables on MySQL 3.23.52, using PHP version 4.0.4pl1, Perl version 5.005_03, and Sun JDK version 1.3.0 with MySQL Connector/J version 2.0.14. I ran the tests several times for each configuration, and the variations were quite significant due to caching. In each case, I ignored the off-the-curve slow results that occurred due to bad caching.
As you would expect, as the record becomes longer, the speed of the insert operation decreases for all languages. The speed of a one-column select tends to decrease somewhat as the record length increases, but the change is not significant; in some cases, a longer record even yields faster performance, contrary to expectations, due to variations introduced by caching.
The C client is fastest. However, PHP is not far behind: in Table 6.1 it trails C by roughly 6 to 15% on inserts and 15 to 22% on selects. This result is easy to understand if you consider the PHP client architecture: all calls are basically direct wrappers around the C client library routines.
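The gap between two clients can be computed directly from the throughput figures in Table 6.1. The following is a minimal sketch using the C and PHP insert rates from the table; the class and method names are hypothetical and exist only for this illustration.

```java
/** Hypothetical helper for comparing throughput figures from Table 6.1. */
final class RelativeSpeed {
    // Inserts per second for num_cols = 10, 50, 100, 200 (from Table 6.1).
    static final double[] C_INSERTS = {1367, 984, 693, 445};
    static final double[] PHP_INSERTS = {1283, 838, 607, 393};

    /** Percentage by which `other` trails `baseline`; positive means slower. */
    static double percentSlower(double baseline, double other) {
        return (baseline - other) / baseline * 100.0;
    }

    /** Applies percentSlower element by element to two equally long series. */
    static double[] gaps(double[] baseline, double[] other) {
        double[] out = new double[baseline.length];
        for (int i = 0; i < baseline.length; i++) {
            out[i] = percentSlower(baseline[i], other[i]);
        }
        return out;
    }
}
```

For example, RelativeSpeed.percentSlower(1367, 1283) gives about 6.1, meaning PHP inserts were about 6.1% slower than C inserts at 10 columns.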
Perl and Java are nearly tied (Perl is usually a little faster on inserts, Java on selects), but both fall behind C and PHP. This result is also easy to understand. The Perl client has a thick DBI layer to penetrate before it gets down to the low-level C API calls.
This is the price you pay for the portability of DBI. Java is unique in that it does not call the low-level C API routines. The Connector/J JDBC driver implements the MySQL client/server communication protocol. The speed loss is due to the JDBC overhead, as well as the Java-versus-C overhead in the protocol implementation.