YCSB

Benchmark YSQL performance using YCSB.

Note

For more information about YCSB, see:

Overview

This benchmark uses a YSQL-specific YCSB binding to test the YSQL API.

Running the benchmark

1. Prerequisites

Note

The binaries are compiled with Java 13, so running them with Java 13 is recommended.
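As a quick check before downloading, the following sketch compares the installed Java major version against 13. The helper name java_major_version is hypothetical, and the parsing assumes that `java -version` prints a quoted version string such as "13.0.2" or "1.8.0_292".

```shell
# Extract the major version from a Java version string.
# Handles both the legacy "1.8.0_292" scheme and the modern "13.0.2" scheme.
# (java_major_version is a hypothetical helper, not part of YCSB.)
java_major_version() {
  version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # legacy scheme: drop the leading "1."
  esac
  echo "${version%%[!0-9]*}"          # keep only the leading digits
}

# `java -version` writes to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
  installed="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  major="$(java_major_version "$installed")"
  if [ "$major" != "13" ]; then
    echo "Warning: found Java $major, but the binaries were compiled with Java 13" >&2
  fi
fi
```

The check only warns; other Java versions may still work, which is why the note above is a recommendation rather than a requirement.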

Download the YCSB binaries by running the following commands:

$ cd $HOME
$ wget https://github.com/yugabyte/YCSB/releases/download/1.0/ycsb.tar.gz
$ tar -zxvf ycsb.tar.gz
$ cd YCSB

Make sure that the YSQL shell, ysqlsh, is on your PATH. You can download ysqlsh if you do not have it.

$ export PATH=$PATH:/path/to/ysqlsh
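To confirm that the export took effect, you can test whether the shell resolves ysqlsh. This is a minimal sketch; require_cmd is a hypothetical helper name, and it relies only on the standard `command -v` builtin.

```shell
# Return success if the named command can be found on PATH.
# (require_cmd is a hypothetical helper, not part of YCSB or YugabyteDB.)
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd ysqlsh; then
  echo "ysqlsh found at: $(command -v ysqlsh)"
else
  echo "ysqlsh is not on PATH; add its directory with: export PATH=\$PATH:/path/to/ysqlsh" >&2
fi
```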

2. Start YugabyteDB

Start your YugabyteDB cluster by following the steps for manual deployment.

Tip

You will need the IP addresses of the nodes in the cluster for the next step.

3. Configure db.properties

Update the db.properties file in the YCSB directory with the following contents. Replace <ip1>, <ip2>, and <ip3> in the db.url field with the IP addresses of your nodes.

db.driver=org.postgresql.Driver
db.url=jdbc:postgresql://<ip1>:5433/ycsb;jdbc:postgresql://<ip2>:5433/ycsb;jdbc:postgresql://<ip3>:5433/ycsb;
db.user=yugabyte
db.passwd=

Details about other configuration parameters are described in Core Properties on the YCSB repository.

Note

The db.url field should be populated with the IPs of all the nodes that are part of the cluster.
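For clusters with more than a few nodes, typing the db.url value by hand is error-prone. The following sketch assembles it from a list of IP addresses in the same format as the example above (port 5433, database ycsb, one semicolon-terminated JDBC URL per node). The helper name build_db_url and the 10.0.0.x addresses are placeholders for illustration.

```shell
# Build the db.url value from a list of node IP addresses.
# (build_db_url is a hypothetical helper; port and database name follow the example above.)
build_db_url() {
  url=""
  for ip in "$@"; do
    url="${url}jdbc:postgresql://${ip}:5433/ycsb;"
  done
  echo "$url"
}

# Placeholder addresses; substitute the IPs of your own nodes.
echo "db.url=$(build_db_url 10.0.0.1 10.0.0.2 10.0.0.3)"
```

You can paste the printed line into db.properties in place of the existing db.url entry.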

4. Run the benchmark

The run_ysql.sh script loads and runs all the workloads.

$ ./run_ysql.sh --ip <ip>

The above command runs the workloads on a table with 1 million rows. To run the benchmark on a table with a different row count, use the --recordcount option:

$ ./run_ysql.sh --ip <ip> --recordcount <number of rows>

Note

To get the maximum performance out of the system, you need to tune the threadcount parameter in the script. As a reference, for a c5.4xlarge instance with 16 cores and 32 GB of RAM, we used a threadcount of 32 for the loading phase and 256 for the execution phase.

5. Verify results

The script creates two result files per workload, one for the loading phase and one for the execution phase, containing throughput and latency details. For example, for workloada it creates workloada-ysql-load.dat and workloada-ysql-transaction.dat.
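To compare workloads at a glance, you can pull the overall throughput out of each result file. The sketch below assumes the .dat files contain the standard YCSB summary output, in which throughput appears on a line of the form "[OVERALL], Throughput(ops/sec), <value>"; if the script writes a different format, adjust the pattern. The helper name ycsb_throughput is hypothetical.

```shell
# Print the overall throughput recorded in a YCSB result file.
# Assumes the standard YCSB summary line: "[OVERALL], Throughput(ops/sec), <value>"
# (ycsb_throughput is a hypothetical helper, not part of YCSB.)
ycsb_throughput() {
  awk -F', ' '$1 == "[OVERALL]" && $2 == "Throughput(ops/sec)" {print $3}' "$1"
}

# Summarize every execution-phase result file in the current directory.
for f in workload*-ysql-transaction.dat; do
  [ -e "$f" ] || continue     # skip if no result files exist yet
  echo "$f: $(ycsb_throughput "$f") ops/sec"
done
```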

6. Run individual workloads (optional)

Connect to the database using ysqlsh.

$ ./bin/ysqlsh -h <ip>

Create the ycsb database.

yugabyte=# CREATE DATABASE ycsb;

Connect to the created database.

yugabyte=# \c ycsb

Create the table.

ycsb=# CREATE TABLE usertable (
           YCSB_KEY VARCHAR(255) PRIMARY KEY,
           FIELD0 TEXT, FIELD1 TEXT, FIELD2 TEXT, FIELD3 TEXT,
           FIELD4 TEXT, FIELD5 TEXT, FIELD6 TEXT, FIELD7 TEXT,
           FIELD8 TEXT, FIELD9 TEXT);

Before running the yugabyteSQL workload, load the data:

$ ./bin/ycsb load yugabyteSQL -s \
      -P db.properties           \
      -P workloads/workloada     \
      -p recordcount=1000000     \
      -p operationcount=10000000 \
      -p threadcount=32          \
      -p maxexecutiontime=180

Then, you can run the workload:

$ ./bin/ycsb run yugabyteSQL -s  \
      -P db.properties           \
      -P workloads/workloada     \
      -p recordcount=1000000     \
      -p operationcount=10000000 \
      -p threadcount=256         \
      -p maxexecutiontime=180

To run another workload (for example, workloadb), change the workloads/ argument in the command above:

$ ./bin/ycsb run yugabyteSQL -s  \
      -P db.properties           \
      -P workloads/workloadb     \
      -p recordcount=1000000     \
      -p operationcount=10000000 \
      -p threadcount=256         \
      -p maxexecutiontime=180
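If you want to step through several workloads by hand, a small loop saves retyping. The following sketch only prints the run command for each workload, using the same parameters as the commands above; the helper name ycsb_run_cmd is hypothetical, and the list of workload files (workloada through workloadf) is assumed from the workloads named in this document.

```shell
# Print the ycsb run command for one workload, using the parameters shown above.
# (ycsb_run_cmd is a hypothetical helper, not part of YCSB.)
ycsb_run_cmd() {
  echo "./bin/ycsb run yugabyteSQL -s -P db.properties -P workloads/$1 -p recordcount=1000000 -p operationcount=10000000 -p threadcount=256 -p maxexecutiontime=180"
}

# Dry run: print one command per workload without executing anything.
for w in workloada workloadb workloadc workloadd workloade workloadf; do
  ycsb_run_cmd "$w"
done
```

Printing rather than executing lets you review each command first. To run one, copy it into your shell from the YCSB directory after loading the data.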

Expected results

Setup

The following results were obtained on a 3-node cluster with each node on a c5.4xlarge AWS instance (16 cores, 32 GB of RAM, and 2 EBS volumes), with all nodes and the client VM in the same AZ:

1 Million Rows

Workload    Throughput (ops/sec)  Read latency  Write latency
Workload A  37,377                1.5 ms        12 ms update
Workload B  66,875                4 ms          7.6 ms update
Workload C  77,068                3.5 ms read   Not applicable
Workload D  63,676                4 ms          7 ms insert
Workload E  63,686                3.8 ms scan   Not applicable
Workload F  29,500                2 ms          15 ms read-modify-write