Tune queries in YugabyteDB

A YugabyteDB cluster consists of multiple nodes running the YB-TServer server process. Due to its distributed nature, tuning a YugabyteDB database requires keeping in mind that, unlike most common databases where only a local machine influences a database query, data is stored across all the nodes of the cluster. This section provides an introduction to tuning YugabyteDB and the tools available.

Before you begin

Before trying to optimize individual statements, make sure the YugabyteDB cluster is running optimally:

  • Verify that all nodes that are supposed to be running are indeed running.
  • Verify that the YB-TServer and, where applicable, YB-Master processes are running on the nodes.

To view the nodes and servers that make up your cluster, use the yb-admin command to request the master and tablet servers. For example:

$ ./bin/yb-admin -init_master_addrs=$(hostname):7100 list_all_masters
Master UUID                        RPC Host/Port           State       Role
a637b88dfc0c4476862ca79872d763d7   172.158.22.229:7100     ALIVE       LEADER
968eb4a7a2e24c3ab95929d3a888ed05   172.158.39.23:7100      ALIVE       FOLLOWER
44e13e096955412a88c6ffd65a1e241c   172.158.50.212:7100     ALIVE       FOLLOWER
$ ./bin/yb-admin -init_master_addrs=$(hostname):7100 list_all_tablet_servers
Tablet Server UUID               RPC Host/Port       Heartbeat delay Status   Reads/s
d80150e0eeda4477a231968440dd89a9 172.158.50.212:9100 0.20s           ALIVE    ..
721661fd5d2044179e03707a862aa578 172.158.22.229:9100 0.39s           ALIVE    ..
54e17c914c0f4679aa7b07a9c7a8ddf5 172.158.39.23:9100  0.48s           ALIVE    ..

Next, make sure enough disk space is available, elementary components such as CPU, disk, and the network do not report errors, and the operating system does not report any malfunction.

For more information on troubleshooting cluster- and node-level issues, refer to Troubleshoot YugabyteDB.

After you know the entire cluster is running correctly, you can move to statement tuning.

Tuning in YSQL

To investigate a specific apparently slow running query, you first need to determine which node the query is running on. To identify the node on which the query is running, log on to the nodes and use the pg_stat_activity view to see running queries.

Note

Remember that even for a single query running on one node, the whole cluster can show activity, because the storage layer is distributed over all nodes.

After you identify the correct node, YSQL provides two views you can use to identify SQL statements and their performance characteristics:

  • pg_stat_activity provides an overview, including SQL, of current activity. Apart from the query text, it does not provide information about the SQL executed; rather it shows the current state of the PostgreSQL backend.

  • pg_stat_statements provides historic information about executed SQL statements. Currently, no 'block'/'blk' information is recorded (which is logical and physical I/O information), because the PostgreSQL I/O codepath is not used.

Which view you use for performance tuning depends on the performance issue and whether you can witness the performance problem happening live, or if it was a performance problem in the past.

Note that identical queries might not perform identically because of differences in data.

After you identify a problematic SQL statement, the next step is to look at the execution plan using EXPLAIN. An execution plan is the order and method of execution to perform the work requested in the SQL statement. This plan is generated by the database planner.

YugabyteDB uses PostgreSQL's cost-based optimizer, which estimates the costs of each possible execution plan for an SQL statement. The planner calculates the cheapest plan to execute a query, assuming that is also the fastest and the best plan, based on statistics derived from the database table and supporting structures like secondary indexes. The execution plan with the lowest cost finally is executed.

Note

Currently, YugabyteDB does not perform size-specific costing, and assumes every plan generates a thousand rows, which translates to a cost of a thousand.

You can use EXPLAIN in two modes:

  • The default mode shows the plan that the planner created and the predicted cost.

  • EXPLAIN ANALYZE mode executes the query, measures the time and actual work done, and adds these statistics to the execution plan as 'actual'.

Both the predicted and actual statistics have their uses. The predicted statistics are what the planner uses, and assuming the statistics don't change, the planner will come up with the same execution plan the next time the query is run, even if it caused performance issues in the past. The actual statistics show how much and what work was actually performed, and thus are beneficial to see the effectiveness of the execution plan.

Tip

Complex execution plans can be difficult to read or understand. Use online tools to calculate and visualize execution plan timelines. For example, https://explain.dalibo.com.