Query diagnostics TECH PREVIEW
Use query diagnostics to capture and export detailed diagnostic information across multiple dimensions, and use this information to identify and resolve database query problems.
Query diagnostics collects data about a particular query, over several executions of the query, over the duration of a specified diagnostics interval. It also manages in-flight requests during this interval, and transfers the collected data to the file system for storage and future analysis.
Query diagnostics collects the following information:
- Bind variables and constants
- PostgreSQL statement statistics
- Schema details
- Active Session History
- Explain plans
Enable query diagnostics
Query diagnostics is TP . To use query diagnostics, enable and configure the following flags for each node of your cluster.
Flag | Description |
---|---|
allowed_preview_flags_csv | Set the value of this flag to include ysql_yb_enable_query_diagnostics . |
ysql_yb_enable_query_diagnostics | Enable or disable query diagnostics. Default: false. Changing this flag requires a VM restart. |
yb_query_diagnostics_circular_buffer_size | Size (in KB) of query diagnostics circular buffer that stores statuses of bundles. Default: 64. Changing this flag requires a VM restart. |
Export diagnostics
Use the pg_stat_statements view to obtain the queryid
of a query of interest.
To initiate query diagnostics, you use the yb_query_diagnostics()
function, providing the queryid
.
For example:
SELECT yb_query_diagnostics(
query_id => 7317833532428971166,
explain_sample_rate => 5,
diagnostics_interval_sec => 120,
explain_analyze => true,
explain_dist => true,
bind_var_query_min_duration_ms => 10
);
yb_query_diagnostics
/Users/($username)/yugabyte-data/node-1/disk-1/query-diagnostics/7317833532428971166/<random-number>
(1 row)
--------------------------------------------------------------------------------
This exports diagnostics for the query with the specified ID, for a duration of 120 seconds, and includes EXPLAIN(ANALYZE, DIST) plans for 5% of all queries observed during this time frame. It also dumps constants/bind variables for queries that took more than 10ms.
To cancel a running query diagnostic, use the yb_cancel_query_diagnostics()
function.
For example:
SELECT yb_cancel_query_diagnostics(query_id => 7317833532428971166);
Options
Parameter | Description | Default |
---|---|---|
query_id | Required. A unique identifier for the query used to identify identical normalized queries. You can obtain query IDs using the pg_stat_statements view. This is also the same as the query_id in the yb_active_session_history. | |
diagnostics_interval_sec | The duration for which the bundle will run, in seconds. | 300 |
explain_sample_rate | You can export the output of the EXPLAIN command for a randomly selected percentage (1-100) of queries that are running during the diagnostics interval. | 1 |
explain_analyze | Enhance the EXPLAIN plan output with planning and execution data. Note that this data is gathered during query execution itself, and the query is not re-executed. | false |
explain_dist | Log distributed data in the EXPLAIN plan. explain_analyze must be set to true. | false |
Check diagnostic status
Use the yb_query_diagnostics_status
view to check the status of previously executed query diagnostic bundles.
Column | Description |
---|---|
query_id | Hash code to identify identical normalized queries. |
start_time | Timestamp when the bundle was triggered. |
diagnostics_interval_sec | The duration for which the bundle will run. |
explain_params | Flags used to gather EXPLAIN plans. |
bind_var_query_min_duration_ms | The minimum query duration for logging bind variables. |
folder_path | File system path where the data for this bundle has been dumped. |
state | The state of the diagnostic - Completed, In Progress, or Failed. |
status | In case of a failure, a short description of the type of failure. |
Output
When the specified diagnostic interval has elapsed, query diagnostics outputs the data to files in the following directory:
pg_data/query_diagnostics/query_id/<random-number>/
To locate your data directory, run SHOW data_directory
in ysqlsh.
Constants
Provides details of the executed bind variables when the specified query ID corresponds to a prepared statement. Otherwise, provides the constants embedded in the query. Additionally, it includes the time of each execution along with the bind variable. Using the bind_var_query_min_duration_ms
parameter, you can log only those bind variables that take more than a specified duration.
Output file: constants_and_bind_variables.csv
For example:
| var1 | var2 | var3 | query_time |
| 19146 | 9008 | 'text_1' | 12.323 |
pg_stat_statements
Presents aggregated pg_stat_statements data spanning the diagnostics interval of query diagnostics for the specified query ID.
Output file: pg_stat_statements.csv
For more information, see Get query statistics using pg_stat_statements.
Schema information
Provides information about the columns and indexes associated with unique tables mentioned in the query being diagnosed.
The output is the same as that obtained using the \d+ meta-command.
Output file: schema_details.txt
For example, the following query would generate schema details as follows:
SELECT
a,
b
FROM
test_table1
WHERE
b > 0;
Table name: test_table1
- Table information:
|Table Name |Table Groupname |Colocated |
+-------------------+----------------+----------+
|public.test_table1 | |false |
- Columns:
|Column |Type |Collation |Nullable |Default |Storage |Stats Target |Description |
+-------+-----------------+----------+---------+--------+---------+-------------+------------+
|a |text | | | |extended | | |
|b |integer | | | |plain | | |
|c |double precision | | | |plain | | |
Active session history
Outputs the data from the yb_active_session_history
view, for the diagnostics interval. You can later import the table in SQL and query it to get data for only the specified query ID.
Output file: active_session_history.csv
For more information on active session history, see Active Session History.
Explain plans
Provides the output of the EXPLAIN command for a randomly selected subset of queries.
Normally, all data is written to the file system when the diagnositcs interval has expired. However, if the explain data being gathered exceeds the internal buffer size, query diagnostics will flush the data to the file system to make room for data from further queries.
Output file: explain_plan.txt
For example:
QUERY PLAN:
--------------------------------------------------------------------------------
Insert on n (cost=0.00..0.01 rows=1 width=44) (actual time=9.114..9.114 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=44) (actual time=0.003..0.005 rows=1 loops=1)
Storage Table Write Requests: 1
Planning Time: 0.142 ms
Execution Time: 9.294 ms
Storage Read Requests: 0
Storage Rows Scanned: 0
Storage Write Requests: 1
Catalog Read Requests: 0
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 0.000 ms
Peak Memory Usage: 24 kB
(13 rows)
--------------------------------------------------------------------------------
Insert on n (cost=0.00..0.01 rows=1 width=44) (actual time=9.114..9.114 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=44) (actual time=0.003..0.005 rows=1 loops=1)
Storage Table Write Requests: 1
Planning Time: 0.142 ms
Execution Time: 9.294 ms
Storage Read Requests: 0
Storage Rows Scanned: 0
Storage Write Requests: 1
Catalog Read Requests: 0
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 0.000 ms
Peak Memory Usage: 24 kB
(13 rows)
...
--------------------------------------------------------------------------------
For more information about using EXPLAIN, see Analyze queries with EXPLAIN.
Limitations
-
Multiple diagnostic bundles can be triggered concurrently for different queries; however, only one bundle can run at a time for a single query ID.
-
Query diagnostics is available per node and is not aggregated across the cluster.
-
Schema details can export a maximum 10 unique schemas in the query.
-
You cannot have more than 100 simultaneous query diagnostic operations.
-
The feature is designed to minimize its impact on query latency. However, operations like EXPLAIN ANALYZE and I/O tasks may affect the data being analyzed.
-
Memory consumption may increase as data is stored in memory before being flushed to disk. Memory consumption while accumulating data is as follows:
num_queries_diagnosed * (mem_explain + mem_bind_var + mem_schema_details + mem_ash + mem_pg_stat_statements)
Where
- num_queries_diagnosed ≤ 100 bytes
- mem_explain ≤ 16 kb
- mem_bind_var ≤ 2 kb
- mem_pg_stat_statements ≤ 1 kb
-
Folders generated by query diagnostics are not cleaned up automatically.
-
If any query has more than 16384 chars in explain_plan, then it is not dumped. See issue 23720.