Testing horizontal scalability with TPC-C

This page documents the preview version (v2.21). Preview includes features under active development and is for development and testing only. For production, use the stable version (v2024.1). To learn more, see Versioning.

YugabyteDB scales linearly as nodes are added to a cluster, with each node contributing its full processing power to overall throughput. At a TPC-C efficiency of around 99.7%, nearly all of the available capacity goes to transaction processing, with minimal overhead.

The following table describes how YugabyteDB horizontally scales with the TPC-C workload.

Nodes  vCPUs  Warehouses  TPMC      Efficiency (%)  Connections  New Order Latency
3      24     2000        25646.4   99.71           200          54.21 ms
4      32     2666        34212.57  99.79           266          53.92 ms
5      40     3333        42772.6   99.79           333          51.01 ms
6      48     4000        51296.9   99.72           400          62.09 ms
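
The efficiency column can be reproduced from the other figures. A minimal sketch, assuming efficiency is the measured TPMC divided by the theoretical TPC-C maximum of 12.86 TPMC per warehouse; that constant comes from the TPC-C specification rather than from this page, and the function name `tpcc_efficiency` is hypothetical. The example uses the 2000 warehouses loaded for the 3-node run and its reported TPMC.

```shell
# Hypothetical helper: TPC-C efficiency as a percentage.
# Assumption: the theoretical maximum is 12.86 TPMC per warehouse.
tpcc_efficiency() {
  awk -v tpmc="$1" -v warehouses="$2" \
    'BEGIN { printf "%.2f\n", tpmc / (warehouses * 12.86) * 100 }'
}

# 3-node run: 25646.4 TPMC on 2000 warehouses.
tpcc_efficiency 25646.4 2000
```

An efficiency close to 100% means the cluster is keeping up with nearly all the load that the configured number of warehouses can generate.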

Get TPC-C binaries

First, you need the benchmark binaries. To download the TPC-C binaries, run the following commands:

$ wget https://github.com/yugabyte/tpcc/releases/latest/download/tpcc.tar.gz
$ tar -zxvf tpcc.tar.gz
$ cd tpcc

Client machine

The benchmark runs from the client machine. An 8 vCPU machine with at least 16 GB of memory is recommended. The following instance types are recommended for the client machine.

vCPU  AWS         Azure            GCP
8     c5.2xlarge  Standard_F8s_v2  n2-highcpu-8

Cluster setup

This test uses 8 vCPU machines for each cluster node.

Set up a local cluster

If a local universe is currently running, first destroy it.

Start a local three-node universe with an RF of 3 by first creating a single node, as follows:

./bin/yugabyted start \
                --advertise_address=127.0.0.1 \
                --base_dir=${HOME}/var/node1 \
                --cloud_location=aws.us-east-2.us-east-2a

On macOS, the additional nodes need loopback addresses configured, as follows:

sudo ifconfig lo0 alias 127.0.0.2
sudo ifconfig lo0 alias 127.0.0.3
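
The nodes added later in this test (127.0.0.4 through 127.0.0.6) need the same kind of alias on macOS. The following is a small sketch that prints the alias commands for a range of addresses instead of running them, because they require sudo. The function name `print_loopback_aliases` is hypothetical.

```shell
# Hypothetical helper: print the macOS loopback alias command for each
# local node address from 127.0.0.<first> to 127.0.0.<last>.
print_loopback_aliases() {
  first=$1
  last=$2
  i=$first
  while [ "$i" -le "$last" ]; do
    echo "sudo ifconfig lo0 alias 127.0.0.${i}"
    i=$((i + 1))
  done
}

# Commands for the fourth, fifth, and sixth nodes.
print_loopback_aliases 4 6
```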

Next, join more nodes with the previous node as needed. yugabyted automatically applies a replication factor of 3 when a third node is added.

Start the second node as follows:

./bin/yugabyted start \
                --advertise_address=127.0.0.2 \
                --base_dir=${HOME}/var/node2 \
                --cloud_location=aws.us-east-2.us-east-2b \
                --join=127.0.0.1

Start the third node as follows:

./bin/yugabyted start \
                --advertise_address=127.0.0.3 \
                --base_dir=${HOME}/var/node3 \
                --cloud_location=aws.us-east-2.us-east-2c \
                --join=127.0.0.1

After starting the yugabyted processes on all the nodes, configure the data placement constraint of the universe, as follows:

./bin/yugabyted configure data_placement --base_dir=${HOME}/var/node1 --fault_tolerance=zone

This command can be executed on any node where you already started YugabyteDB.

To check the status of a running multi-node universe, run the following command:

./bin/yugabyted status --base_dir=${HOME}/var/node1

Store the IP addresses of the nodes in a shell variable for use in further commands.

IPS=127.0.0.1,127.0.0.2,127.0.0.3
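
Because the local nodes use consecutive loopback addresses, the same list can also be generated for any node count. A minimal sketch, assuming nodes are numbered from 127.0.0.1 upward as in the commands above; the function name `build_local_ips` is hypothetical.

```shell
# Hypothetical helper: build a comma-separated list of the first N
# loopback addresses (127.0.0.1 .. 127.0.0.N).
build_local_ips() {
  count=$1
  ips=""
  i=1
  while [ "$i" -le "$count" ]; do
    # Prepend a comma only when the list already has an entry.
    ips="${ips:+${ips},}127.0.0.${i}"
    i=$((i + 1))
  done
  echo "$ips"
}

IPS=$(build_local_ips 3)
echo "$IPS"
```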

Set up a YugabyteDB Aeon cluster

To run the test on YugabyteDB Aeon instead of a local cluster, refer to Set up a YugabyteDB Aeon cluster.

Adding nodes

For the horizontal scale test, set the fault tolerance level to None so that you can add a single node to the cluster.

Store the IP addresses or the public address of the cluster in a shell variable for use in further commands.

IPS=<cluster-name/IP>

Benchmark the 3-node cluster

To run the benchmark, do the following:

  1. Initialize the database needed for the benchmark by following the instructions specific to your cluster.

    Set up the TPC-C database schema with the following command:

    $ ./tpccbenchmark --create=true --nodes=${IPS}
    

    Populate the database with data needed for the benchmark using the following command:

    $ ./tpccbenchmark --load=true --nodes=${IPS} --warehouses=2000 --loaderthreads 20
    
  2. Run the benchmark using the following command:

    $ ./tpccbenchmark --execute=true --warmup-time-secs=300 --nodes=${IPS} --warehouses=2000 --num-connections=200
    
  3. Gather the results.

    Nodes  vCPUs  Warehouses  TPMC      Efficiency (%)  Connections  New Order Latency
    3      24     2000        25646.4   99.71           200          54.21 ms

  4. Clean up the test run using the following command:

    $ ./tpccbenchmark --clear=true --nodes=${IPS} --warehouses=2000
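
The warehouse and connection counts used in the larger runs that follow appear to scale with the node count from this 3-node baseline (2000 warehouses, 200 connections), rounded down. That pattern is inferred from the commands on this page and is not a documented rule. The following sketch reproduces it; the function name `scaled_params` is hypothetical.

```shell
# Hypothetical helper: derive benchmark parameters for a node count,
# assuming 2000 warehouses and 200 connections per 3 nodes, rounded down.
scaled_params() {
  nodes=$1
  echo "--warehouses=$((2000 * nodes / 3)) --num-connections=$((200 * nodes / 3))"
}

for n in 3 4 5 6; do
  scaled_params "$n"
done
```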
    

Add the 4th node

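For a local cluster, start the fourth node as follows. On macOS, first configure the 127.0.0.4 loopback address in the same way as for the earlier nodes.

./bin/yugabyted start \
                --advertise_address=127.0.0.4 \
                --base_dir=${HOME}/var/node4 \
                --cloud_location=aws.us-east-2.us-east-2a \
                --join=127.0.0.1

Add the new IP address to the existing variable.

IPS=${IPS},127.0.0.4

For a YugabyteDB Aeon cluster, add a node using the Edit Infrastructure option and increase the node count by 1.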

Re-run the test as follows:

  1. Initialize the database needed for the benchmark by following the instructions specific to your cluster.

    Set up the TPC-C database schema with the following command:

    $ ./tpccbenchmark --create=true --nodes=${IPS}
    

    Populate the database with data needed for the benchmark with the following command:

    $ ./tpccbenchmark --load=true --nodes=${IPS} --warehouses=2666 --loaderthreads 20
    
  2. Run the benchmark using the following command:

    $ ./tpccbenchmark --execute=true --warmup-time-secs=300 --nodes=${IPS} --warehouses=2666 --num-connections=266
    
  3. Gather the results.

    Nodes  vCPUs  Warehouses  TPMC      Efficiency (%)  Connections  New Order Latency
    4      32     2666        34212.57  99.79           266          53.92 ms

  4. Clean up the test run using the following command:

    $ ./tpccbenchmark --clear=true --nodes=${IPS} --warehouses=2666
    

Add the 5th node

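For a local cluster, start the fifth node as follows. On macOS, first configure the 127.0.0.5 loopback address in the same way as for the earlier nodes.

./bin/yugabyted start \
                --advertise_address=127.0.0.5 \
                --base_dir=${HOME}/var/node5 \
                --cloud_location=aws.us-east-2.us-east-2a \
                --join=127.0.0.1

Add the new IP address to the existing variable.

IPS=${IPS},127.0.0.5

For a YugabyteDB Aeon cluster, add a node using the Edit Infrastructure option and increase the node count by 1.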

Re-run the test as follows:

  1. Initialize the database needed for the benchmark by following the instructions specific to your cluster.

    Set up the TPC-C database schema with the following command:

    $ ./tpccbenchmark --create=true --nodes=${IPS}
    

    Populate the database with data needed for the benchmark with the following command:

    $ ./tpccbenchmark --load=true --nodes=${IPS} --warehouses=3333 --loaderthreads 20
    
  2. Run the benchmark from two clients as follows:

    On client 1, run the following command:

    $ ./tpccbenchmark --execute=true --warmup-time-secs=300 --nodes=${IPS} --warehouses=1500 --start-warehouse-id=1 --total-warehouses=3333 --num-connections=333
    

    On client 2, run the following command:

    $ ./tpccbenchmark --execute=true --warmup-time-secs=300 --nodes=${IPS} --warehouses=1833 --start-warehouse-id=1501 --total-warehouses=3333 --num-connections=333
    
  3. Gather the results.

    Nodes  vCPUs  Warehouses  TPMC      Efficiency (%)  Connections  New Order Latency
    5      40     3333        42772.6   99.79           333          51.01 ms

  4. Clean up the test run using the following command:

    $ ./tpccbenchmark --clear=true --nodes=${IPS} --warehouses=3333
    

Add the 6th node

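For a local cluster, start the sixth node as follows. On macOS, first configure the 127.0.0.6 loopback address in the same way as for the earlier nodes.

./bin/yugabyted start \
                --advertise_address=127.0.0.6 \
                --base_dir=${HOME}/var/node6 \
                --cloud_location=aws.us-east-2.us-east-2a \
                --join=127.0.0.1

Add the new IP address to the existing variable.

IPS=${IPS},127.0.0.6

For a YugabyteDB Aeon cluster, add a node using the Edit Infrastructure option and increase the node count by 1.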

Re-run the test as follows:

  1. Initialize the database needed for the benchmark by following the instructions specific to your cluster.

    Set up the TPC-C database schema with the following command:

    $ ./tpccbenchmark --create=true --nodes=${IPS}
    

    Populate the database with data needed for the benchmark with the following command:

    $ ./tpccbenchmark --load=true --nodes=${IPS} --warehouses=4000 --loaderthreads 20
    
  2. Run the benchmark from two clients as follows:

    On client 1, run the following command:

    $ ./tpccbenchmark --execute=true --warmup-time-secs=300 --nodes=${IPS} --warehouses=2000 --start-warehouse-id=1 --total-warehouses=4000 --num-connections=200
    

    On client 2, run the following command:

    $ ./tpccbenchmark --execute=true --warmup-time-secs=300 --nodes=${IPS} --warehouses=2000 --start-warehouse-id=2001 --total-warehouses=4000 --num-connections=200
    
  3. Gather the results.

    Nodes  vCPUs  Warehouses  TPMC      Efficiency (%)  Connections  New Order Latency
    6      48     4000        51296.9   99.72           400          62.09 ms

  4. Clean up the test run using the following command:

    $ ./tpccbenchmark --clear=true --nodes=${IPS} --warehouses=4000
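
When the run is split across clients, each client needs its own warehouse count and starting warehouse ID so that the ranges do not overlap. The following sketch computes an even split; the function name `split_warehouses` is hypothetical. For 4000 warehouses and two clients it yields the same values as the commands in step 2. An uneven split, such as the 1500/1833 split used in the 5-node run, has to be chosen by hand.

```shell
# Hypothetical helper: split a total warehouse count evenly across clients
# and print the matching tpccbenchmark flags for each client.
split_warehouses() {
  total=$1
  clients=$2
  base=$((total / clients))
  extra=$((total % clients))
  start=1
  i=1
  while [ "$i" -le "$clients" ]; do
    count=$base
    # The first <extra> clients take one additional warehouse each.
    if [ "$i" -le "$extra" ]; then
      count=$((count + 1))
    fi
    echo "client ${i}: --warehouses=${count} --start-warehouse-id=${start} --total-warehouses=${total}"
    start=$((start + count))
    i=$((i + 1))
  done
}

split_warehouses 4000 2
```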
    

Conclusion

Each node added to the cluster raised throughput by roughly 8,550 TPMC, while efficiency stayed above 99.7% at every cluster size. This linear scalability and high efficiency underscore YugabyteDB's architectural strengths: its ability to distribute workloads evenly across nodes, manage resources optimally, and handle the increased concurrency and data volume that come with cluster growth.
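
One way to check the linear-scaling claim is to divide each run's TPMC by its node count: if scaling is linear, the per-node figure stays roughly constant. The following sketch does that with the node counts and TPMC values from the results table; the function name `per_node_tpmc` is hypothetical.

```shell
# Hypothetical helper: TPMC per node for each run.
# Node counts and TPMC values are copied from the results table on this page.
per_node_tpmc() {
  printf '%s\n' '3 25646.4' '4 34212.57' '5 42772.6' '6 51296.9' |
    awk '{ printf "%d nodes: %.2f TPMC per node\n", $1, $2 / $1 }'
}

per_node_tpmc
```

Every run comes out at about 8,550 TPMC per node.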