
Scaling YSQL queries

  • Database cluster setup
  • Benchmark setup
  • Benchmarking the INSERT workload
  • Benchmarking the SELECT workload
  • Architecture for horizontal write scaling
  • Next steps

There are a number of well-known experiments in which eventually-consistent NoSQL databases were scaled out to perform millions of inserts and queries. Here, you do the same using YSQL, the PostgreSQL-compatible, strongly consistent, distributed SQL API of YugabyteDB. We created a 100-node YugabyteDB cluster, ran single-row INSERT and SELECT workloads with high concurrency, each for an hour, and measured the sustained throughput and latency. This topic details the results of that experiment and highlights the key aspects of the YugabyteDB architecture that make it a fit for such high-volume ingest workloads. Although this topic describes the results of benchmark tests performed by Yugabyte, you can follow the same steps to run your own benchmarks of query scalability on your YugabyteDB clusters.

Database cluster setup

While YugabyteDB can be deployed across multiple availability zones or regions, this benchmark focused on the aggregate performance of a 100-node cluster. Therefore, all 100 nodes were deployed on the Amazon Web Services (AWS) cloud in the US West (Oregon) region (us-west-2) and in a single availability zone (us-west-2a). Each instance was of type c5.4xlarge (16 vCPUs). This information is summarized below.

  • Cluster name: MillionOps
  • Cloud: Amazon Web Services
  • Region: Oregon (us-west-2)
  • Zone: us-west-2a
  • Number of nodes: 100
  • Instance type: c5.4xlarge (16 vCPUs)
  • Disk on each node: 1TB EBS SSD (gp2)
  • Replication Factor (RF): 3
  • Consistency level: Strong consistency for both writes and reads

Benchmark setup

The benchmark application was an open-source Java program. Its database workload performs multi-threaded, single-row INSERT and SELECT statements against a table that has a key column and a value column. The size of each row was 64 bytes. The INSERT and SELECT benchmarks were each run for one hour in order to measure the sustained throughput and latency.

The benchmark application was run on six instances of eight cores each. Note that we could not consolidate onto fewer, more powerful instances because we were hitting the maximum network bandwidth of those instances. Each of these benchmark instances was prepared as shown below.

Java 8 was installed using the following commands.

$ sudo apt update
$ sudo apt install default-jre

The YugabyteDB workload generator was downloaded onto these machines as shown below.

$ wget -P target https://github.com/YugaByte/yb-sample-apps/releases/download/1.3.1/yb-sample-apps.jar

The benchmark program takes a list of servers in the database cluster and performs random operations across those servers. To do this, we set an environment variable to the comma-separated list of host:port entries of the 100 database servers, as shown below.

$ export YSQL_NODES=node-1-ip-addr:5433,node-2-ip-addr:5433,...
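
To illustrate how a client can spread work across these servers, the following minimal sketch opens a JDBC connection to a randomly chosen node from YSQL_NODES. This is not the sample application's actual code; the PostgreSQL JDBC driver, the yugabyte database name, and the default credentials are assumptions based on YSQL being PostgreSQL-compatible.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

public class RandomNodeConnector {
    // Opens a JDBC connection to a randomly chosen node from the
    // comma-separated host:port list in the YSQL_NODES environment variable.
    public static Connection connectToRandomNode() throws SQLException {
        String[] nodes = System.getenv("YSQL_NODES").split(",");
        String node = nodes[ThreadLocalRandom.current().nextInt(nodes.length)];
        // YSQL speaks the PostgreSQL wire protocol, so the standard PostgreSQL
        // JDBC driver can be used; database name and credentials are assumed defaults.
        return DriverManager.getConnection(
            "jdbc:postgresql://" + node + "/yugabyte", "yugabyte", "yugabyte");
    }
}

Because every node in the cluster can accept queries for any row, clients can connect to any subset of nodes and the work is still served by the whole cluster.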

Benchmarking the INSERT workload

The first step was to run an INSERT benchmark (using the SqlInserts workload generator) on this 100-node cluster. The following command was run on each of the benchmark instances.

java -jar ~/yb-sample-apps-no-table-drop.jar    \
                   --workload SqlInserts        \
                   --nodes $YSQL_NODES          \
                   --num_unique_keys 5000000000 \
                   --num_threads_write 400      \
                   --num_threads_read 0         \
                   --uuid 00000000-0000-0000-0000-00000000000n

The table on which the benchmark was run had the following simple schema.

CREATE TABLE table_name (k varchar PRIMARY KEY, v varchar);

This workload performed a number of INSERTs using prepared statements, as shown below.

INSERT INTO table_name (k, v) VALUES (?, ?);
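
For reference, here is a minimal sketch of issuing this prepared insert from a Java client over JDBC. It is illustrative only and not the workload generator's exact code; the connection handling is assumed, and the table name matches the schema above.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PreparedInsertExample {
    // Inserts a single key/value row using a prepared statement.
    // In a benchmark loop, the statement would be prepared once and
    // reused with new bind values for each row.
    static void insertRow(Connection conn, String key, String value) throws SQLException {
        try (PreparedStatement insert =
                 conn.prepareStatement("INSERT INTO table_name (k, v) VALUES (?, ?)")) {
            insert.setString(1, key);   // row key
            insert.setString(2, value); // value payload (each row is roughly 64 bytes)
            insert.executeUpdate();
        }
    }
}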

Note a few points about the benchmark setup.

  • Each benchmark program writes a unique set of rows. The uuid parameter forms a prefix of the row key. It is set differently on each benchmark instance (by varying the value of n from 1 to 6) to ensure that each instance writes a separate set of keys.

  • A total of 30 billion unique rows would be inserted upon completion. Each benchmark program writes out 5 billion keys, and there are six such programs running in parallel.

  • There are 2,400 concurrent clients performing inserts. Each benchmark program uses 400 write threads, and there are six such programs running concurrently.

A screenshot of the write throughput on this cluster while the benchmark was in progress is shown below. The write throughput was 1.26 million inserts per second.

Total YSQL operations per second

The corresponding average insert latency across all 100 nodes was 1.66 milliseconds (ms), as shown below. Note that each insert is replicated three ways to make the cluster fault tolerant.

YSQL operations latency

The average CPU usage across the nodes in the cluster was about 78%, as shown in the graph below.

CPU usage

Benchmarking the SELECT workload

The following command was run for the SELECT workload.

java -jar ~/yb-sample-apps-no-table-drop.jar \
               --workload SqlInserts         \
               --nodes $YSQL_NODES           \
               --max_written_key 500000000   \
               --num_writes 0                \
               --num_reads 50000000000       \
               --num_threads_write 0         \
               --num_threads_read 400        \
               --read_only                   \
               --uuid 00000000-0000-0000-0000-00000000000n

The SELECT workload looks up random rows in the table that the INSERT workload (described in the previous section) populated. Each SELECT query is performed using a prepared statement, as shown below.

SELECT * FROM table_name WHERE k=?;
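
As with the insert path, the following minimal JDBC sketch shows how such a prepared point lookup could be issued from a Java client. It is illustrative only; the connection handling is assumed, and the table name matches the schema used above.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PreparedSelectExample {
    // Reads a single row by its primary key using a prepared statement.
    static String readValue(Connection conn, String key) throws SQLException {
        try (PreparedStatement select =
                 conn.prepareStatement("SELECT * FROM table_name WHERE k = ?")) {
            select.setString(1, key);
            try (ResultSet rs = select.executeQuery()) {
                // Return the value column if the key exists, otherwise null.
                return rs.next() ? rs.getString("v") : null;
            }
        }
    }
}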

There are 2,400 concurrent clients issuing SELECT statements: each benchmark program uses 400 read threads, and there are six programs running in parallel. Each read operation randomly selects one row from a total of 3 billion rows; each benchmark program queries its own set of 500 million rows, and there are six such programs running concurrently.

The read throughput on this cluster while the benchmark was in progress is shown below. The read throughput was 2.8 million selects per second. YugabyteDB reads are strongly consistent by default, and that is the setting used for this benchmark. Additional throughput can be achieved by allowing timeline-consistent reads from follower replicas (see Architecture for horizontal write scaling below).

Total YSQL operations per second

The corresponding average select latency across all 100 nodes was 0.56 ms, as shown below.

YSQL operations latency

The average CPU usage across the nodes in the cluster was about 64%, as shown in the graph below.

CPU usage

Architecture for horizontal write scaling

The architecture of a YugabyteDB cluster is shown in the figure below. The YB-TServer service is responsible for managing the data in the cluster, while the YB-Master service manages the system configuration of the cluster. YB-TServer automatically shards every table into a number of shards, also known as tablets. Given the replication factor (RF) of 3 for the cluster, each tablet is represented as a Raft group of three replicas, with one replica acting as the leader and the other two as followers. In a 100-node cluster, these three replicas are automatically stored on exactly three (out of 100) different nodes, where each node can be thought of as an independent fault domain. YB-Master automatically balances the total number of leader and follower replicas across all the nodes so that no single node becomes a bottleneck and every node contributes its fair share to incoming client requests. The end result is strong write consistency (by ensuring writes are committed at a majority of replicas) and tunable read consistency (by serving strong reads from leaders and timeline-consistent reads from followers), irrespective of the number of nodes in the cluster.

YugabyteDB cluster architecture

For those new to the Raft consensus protocol, the simplest explanation is that it is a protocol with which a cluster of nodes can agree on values. It is arguably the most popular distributed consensus protocol in use today. Business-critical, cloud-native systems like etcd (the configuration store for Kubernetes) and Consul (HashiCorp's popular service discovery solution) are built on Raft as a foundation. YugabyteDB uses Raft for both leader election and the actual data replication. The benefits of YugabyteDB's use of Raft, including rapid scaling with fully automatic rebalancing, are highlighted in the Yugabyte blog post "How Does the Raft Consensus-Based Replication Protocol Work in YugabyteDB?". Raft is tightly integrated with a high-performance document store (extended from RocksDB) to deliver on the promise of massive write scalability combined with strong consistency and low latency.

Next steps

You can visit the YugabyteDB workload generator GitHub repository to try out more experiments on your own local setups. After you set up a cluster and test your favorite application, share your feedback and suggestions with other users on the YugabyteDB Community Slack.
