Active Session History TECH PREVIEW
Active Session History (ASH) provides a current and historical view of system activity by sampling session activity in the database. A database session or connection is considered active if it is consuming CPU, or has an active RPC call that is waiting on one of the wait events.
ASH exposes session activity in the form of SQL views so that you can run analytical queries, aggregations for analysis, and troubleshoot performance issues. To run ASH, you need to enable YSQL or YCQL for their respective sessions.
Currently, ASH is available for YSQL, YCQL, and YB-TServer processes. ASH facilitates analysis by recording wait events related to YSQL, YCQL, or YB-TServer requests while they are being executed. These wait events belong to the categories including but not limited to CPU, WaitOnCondition, Network, and Disk IO.
Analyzing the wait events and wait event types lets you troubleshoot, answer the following questions, and subsequently tune performance:
- Why is a query taking longer than usual to execute?
- Why is a particular application slow?
- What are the queries that are contributing significantly to database load and performance?
Key terminology
The following terms include definitions of some important column identifiers related to ASH.
-
root_request_id
: A unique ID for the top-level request. Typically this corresponds to a user issued DML query, (either via YSQL or YCQL APIs). Note that not all background operations are currently supported. -
rpc_request_id
: When processing a root or top-level request, many internal requests may be generated. Each of these requests will have a different ID. -
query_id
: A hash code to identify identical normalized queries as there may be many active requests in the system (that is, with differentroot_request_id
's) with the samequery_id
. -
top_level_node_id
: ID of the node where the top-level YSQL/YCQL query (corresponding toquery_id
) is being processed.
Configure ASH
To use ASH, enable and configure the following flags for each node of your cluster.
Flag | Description |
---|---|
allowed_preview_flags_csv | Pass the flags ysql_yb_ash_enable_infra and ysql_yb_enable_ash in this flag in CSV format. |
ysql_yb_ash_enable_infra | Enable or disable ASH infrastructure. Default: false. Changing this flag requires a system restart. |
ysql_yb_enable_ash | Works only in conjunction with the flag ysql_yb_ash_enable_infra . Start sampling and instrumentation (that is, periodically check and keep track) of YSQL and YCQL queries, and YB-TServer requests.Default: false. Changing this flag doesn't require a system restart. |
Additional flags
You can also use the following flags based on your requirements.
Flag | Description |
---|---|
ysql_yb_ash_circular_buffer_size | Size (in KBs) of circular buffer where the samples are stored. Default: 16000. Changing this flag requires a system restart. |
ysql_yb_ash_sampling_interval_ms | Time (in milliseconds) duration between two sampling events (ysql, ycql, yb-tserver). Default: 1000. Changing this flag doesn't require a system restart. |
ysql_yb_ash_sample_size | Maximum number of events captured per sampling interval. Default: 500. Changing this flag doesn't require a system restart. |
Limitations
Note that the following limitations are subject to change as the feature is in Tech Preview.
- ASH is available per node only. Aggregations need to be done by you.
- ASH is not available for YB-Master processes.
- ASH is available only for foreground activities or queries from customer applications.
- ASH does not capture start and end time of wait events.
YSQL/YCQL views
ASH exposes the following views in each node to analyze and troubleshoot performance issues.
yb_active_session_history
Get information on wait events for each normalized query, YSQL, or YCQL request.
Column | Type | Description |
---|---|---|
root_request_id | UUID | A 16-byte UUID that is generated per request. Generated for queries at YSQL/YCQL layer. |
rpc_request_id | integer | Request ID per RPC. This is not globally unique. However, it is a monotonically increasing number for the lifetime of a YB-TServer. If a YB-TServer restarts, the number starts from 0 again, so it may not be unique across time. However, if there are no restarts, the combination of the server/rpc_request_id is unique. This could be used for advanced use cases later. For example, understanding if the same RPC is being sampled multiple times. |
wait_event_component | text | There are three components: YSQL, YCQL, and YB-TServer. |
wait_event_class | text | Every wait event has a class associated with it. |
wait_event | text | Provides insight into what the RPC is waiting on. |
wait_event_type | text | Type of the wait event such as CPU, WaitOnCondition, Network, Disk IO, and so on. |
wait_event_aux | text | Additional information for the wait event. For example, tablet ID for YB-TServer wait events. |
top_level_node_id | UUID | 16-byte YB-TServer UUID of the YSQL/YCQL node where the query is being executed. |
query_id | bigint | Query ID as seen on the /statements endpoint. This can be used to join with pg_stat_statements/ycql_stat_statements. For background activities, query ID is a known constant (for example, log appender is 1, flush is 2, compaction is 3, consensus is 4, and so on). |
ysql_session_id | bigint | Same as PgClientSessionId (displayed as Session id in logs). This is 0 for YCQL and background activities, as it is a YSQL-specific field. |
client_node_ip | text | Client IP for the RPC. For YSQL, it is the client node from where the query is generated. For YB-TServer, the YSQL/TServer node from where the RPC originated. |
sample_weight | float | If in any sampling interval there are too many events, YugabyteDB only collects yb_ash_sample_size samples/events. Based on how many were sampled, weights are assigned to the collected events. For example, if there are 200 events, but only 100 events are collected, each of the collected samples will have a weight of (200 / 100) = 2.0 |
Wait events
List of wait events by the following request types.
YSQL
Class | Wait Event | Type | AUX | Description |
---|---|---|---|---|
TServer Wait | StorageRead | Network | Waiting for a DocDB read operation | |
TServer Wait | CatalogRead | Network | Waiting for a catalog read operation | |
TServer Wait | IndexRead | Network | Waiting for a secondary index read operation | |
TServer Wait | StorageFlush | Network | Waiting for a storage flush request | |
YSQLQuery | QueryProcessing | CPU | Doing CPU work | |
YSQLQuery | yb_ash_metadata | LWLock | Waiting to update ASH metadata for a query | |
Timeout | YBTxnConflictBackoff | Timeout | PG process sleeping due to conflict in DocDB | |
Timeout | PgSleep | Timeout | PG process sleeping due to pg_sleep(N) |
YB-TServer
Class | Wait Event | Type | AUX | Description |
---|---|---|---|---|
Common | OnCpu_Passive | CPU | Waiting for a thread to pick it up | |
Common | OnCpu_Active | CPU | RPC is being actively processed on a thread | |
Common | ResponseQueued | Network | Waiting for response to be transferred | |
Tablet | AcquiringLocks | Lock | <tablet‑id> | Taking row-wise locks. May need to wait for other rpcs to release the lock. |
Tablet | MVCC_WaitForSafeTime | Lock | <tablet-id> | Waiting for the SafeTime to be at least the desired read-time. |
Tablet | BackfillIndex_WaitForAFreeSlot | Lock | <tablet-id> | Waiting for a slot to open if there are too many backfill requests at the same time. |
Tablet | CreatingNewTablet | I/O | <tablet-id> | Creating a new tablet may involve writing metadata files, causing I/O wait. |
Tablet | WaitOnConflictingTxn | Lock | <tablet-id> | Waiting for the conflicting transactions to complete. |
Consensus | WAL_Append | I/O | <tablet-id> | Persisting Wal edits |
Consensus | WAL_Sync | I/O | <tablet-id> | Persisting Wal edits |
Consensus | Raft_WaitingForReplication | Network | <tablet-id> | Waiting for Raft replication |
Consensus | Raft_ApplyingEdits | Lock/CPU | <tablet-id> | Applying the edits locally |
RocksDB | BlockCacheReadFromDisk | I/O | <tablet-id> | Populating block cache from disk |
RocksDB | Flush | I/O | <tablet-id> | Doing RocksDB flush |
RocksDB | Compaction | I/O | <tablet-id> | Doing RocksDB compaction |
RocksDB | RateLimiter | I/O | Slow down due to rate limiter throttling access to disk |
YCQL
Class | Wait Event | Type | AUX | Description |
---|---|---|---|---|
YCQLQuery | YCQL_Parse | CPU | CQL call is being actively processed | |
YCQLQuery | YCQL_Read | Network | <table‑id> | Waiting for DocDB read operation |
YCQLQuery | YCQL_Write | Network | <table-id> | Waiting for DocDB write operation |
YBClient | LookingUpTablet | Network | Looking up tablet information | |
YBClient | YBCSyncLeaderMasterRpc | Network | Waiting on an RPC to the master/master-service | |
YBClient | YBCFindMasterProxy | Network | Waiting on establishing the proxy to master leader |
Examples
Before you start
The examples will run on any YugabyteDB universe.
To create a universe, see Set up YugabyteDB universe.
Make sure you have an active ysqlsh session (./bin/ysqlsh
) to run the following examples.
-
Distribution of wait events on each component for each query_id
SELECT query_id, wait_event_component, wait_event, wait_event_type, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY query_id, wait_event_component), 2) AS percentage FROM yb_active_session_history GROUP BY query_id, wait_event_component, wait_event, wait_event_type ORDER BY query_id, wait_event_component, wait_event_type;
query_id | wait_event_component | wait_event | wait_event_type | percentage ---------------------+----------------------+------------------------------------+-----------------+------------ -8743373368438010782 | YSQL | QueryProcessing | Cpu | 100.00 -6991556726170825105 | YSQL | QueryProcessing | Cpu | 100.00 -1970690938654296136 | TServer | Raft_ApplyingEdits | Cpu | 6.91 -1970690938654296136 | TServer | OnCpu_Active | Cpu | 13.29 -1970690938654296136 | TServer | OnCpu_Passive | Cpu | 16.59 -1970690938654296136 | TServer | RocksDB_NewIterator | DiskIO | 0.36 -1970690938654296136 | TServer | ConflictResolution_ResolveConficts | Network | 1.19 -1970690938654296136 | TServer | Raft_WaitingForReplication | Network | 60.89 -1970690938654296136 | TServer | MVCC_WaitForSafeTime | WaitOnCondition | 0.19 -1970690938654296136 | TServer | Rpc_Done | WaitOnCondition | 0.59 -1970690938654296136 | YSQL | QueryProcessing | Cpu | 100.00 0 | TServer | OnCpu_Active | Cpu | 90.75 0 | TServer | OnCpu_Passive | Cpu | 9.25 1 | TServer | WAL_Append | DiskIO | 43.23 1 | TServer | WAL_Sync | DiskIO | 56.66 1 | TServer | Idle | WaitOnCondition | 0.10 4 | TServer | OnCpu_Active | Cpu | 83.46 4 | TServer | OnCpu_Passive | Cpu | 15.46 4 | TServer | Raft_ApplyingEdits | Cpu | 0.17 4 | TServer | ReplicaState_TakeUpdateLock | WaitOnCondition | 0.91 2721988341380097743 | YSQL | QueryProcessing | Cpu | 100.00 6107501747146929242 | TServer | OnCpu_Active | Cpu | 87.76 6107501747146929242 | TServer | OnCpu_Passive | Cpu | 6.74 6107501747146929242 | TServer | RocksDB_NewIterator | DiskIO | 2.08 6107501747146929242 | TServer | Rpc_Done | WaitOnCondition | 1.97 6107501747146929242 | TServer | MVCC_WaitForSafeTime | WaitOnCondition | 1.46 6107501747146929242 | YSQL | QueryProcessing | Cpu | 24.58 6107501747146929242 | YSQL | yb_ash_metadata | LWLock | 0.01 6107501747146929242 | YSQL | StorageRead | Network | 75.41 6424016913938255253 | YSQL | QueryProcessing | Cpu | 100.00 (30 rows)
-
Distribution of wait events on each component for each query
SELECT SUBSTRING(query, 1, 50) AS query, wait_event_component, wait_event, wait_event_type, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY query, wait_event_component), 2) AS percentage FROM yb_active_session_history JOIN pg_stat_statements ON query_id = queryid GROUP BY query, wait_event_component, wait_event, wait_event_type ORDER BY query, wait_event_component, wait_event_type;
query | wait_event_component | wait_event | wait_event_type | percentage ---------------------------------------------------+----------------------+------------------------------------+-----------------+------------ INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | Raft_ApplyingEdits | Cpu | 6.85 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | OnCpu_Active | Cpu | 13.31 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | OnCpu_Passive | Cpu | 16.50 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | RocksDB_NewIterator | DiskIO | 0.34 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | ConflictResolution_ResolveConficts | Network | 1.22 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | Raft_WaitingForReplication | Network | 60.97 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | MVCC_WaitForSafeTime | WaitOnCondition | 0.20 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | TServer | Rpc_Done | WaitOnCondition | 0.61 INSERT INTO postgresqlkeyvalue (k, v) VALUES ($1, | YSQL | QueryProcessing | Cpu | 100.00 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | TServer | OnCpu_Active | Cpu | 87.69 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | TServer | OnCpu_Passive | Cpu | 6.76 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | TServer | RocksDB_NewIterator | DiskIO | 2.14 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | TServer | MVCC_WaitForSafeTime | WaitOnCondition | 1.45 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | TServer | Rpc_Done | WaitOnCondition | 1.97 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | YSQL | QueryProcessing | Cpu | 24.62 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | YSQL | yb_ash_metadata | LWLock | 0.01 SELECT k, v FROM postgresqlkeyvalue WHERE k = $1 | YSQL | StorageRead | Network | 75.38 (17 rows)
-
Distribution of requests to all the tablets (finding out a hot shard)
SELECT wait_event_aux, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) AS percentage FROM yb_active_session_history WHERE wait_event_component = 'TServer' AND wait_event_aux IS NOT NULL GROUP BY wait_event_aux ORDER BY percentage DESC;
wait_event_aux | percentage -----------------+------------ 1ee7c269d3a74cc | 67.68 7f6f448e2280406 | 14.67 90d9b0138a1c49d | 4.41 1a0da03e370b4d3 | 4.41 24db684f73c9479 | 4.24 49434b5d87ff4f3 | 4.14 f30265abd9764da | 0.14 090e43c5da98421 | 0.07 3d6615f3d97f4be | 0.07 55e32f4ad65649a | 0.03 7f2aaf8f96564bd | 0.03 c1f70242f8cf463 | 0.03 779602f85a6a466 | 0.03 fbfc62e86fa2444 | 0.03 (14 rows)
-
Distribution of type of wait events per component
SELECT wait_event_component, wait_event_type, ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY wait_event_component), 2) AS percentage FROM yb_active_session_history GROUP BY wait_event_component, wait_event_type ORDER BY wait_event_component;
wait_event_component | wait_event_type | percentage ----------------------+-----------------+------------ TServer | Cpu | 33.48 TServer | DiskIO | 55.95 TServer | WaitOnCondition | 0.53 TServer | Network | 10.04 YSQL | Network | 78.27 YSQL | Cpu | 21.73 (6 rows)