Raft and distributed systems
Raft operations, throughput, and latencies
YugabyteDB implements the Raft consensus protocol, with minor modifications. Replicas implement an RPC method called UpdateConsensus, which allows a tablet leader to replicate a batch of log entries to a follower. Replicas also implement an RPC method called RequestConsensusVote, which candidates invoke to gather votes. The count of ChangeConfig RPC invocations indicates the number of times a peer was added to or removed from the consensus group. An increase in configuration changes typically happens when YugabyteDB needs to move data around, for example because of a planned server addition or decommission, or because a server is crash looping. A high count of RequestConsensusVote invocations indicates that many replicas are starting elections because they have not received a heartbeat from the leader. This can be caused by high CPU usage or a network partition.
All handler latency metrics include additional attributes. Refer to Throughput and latency.
The following are key metrics for monitoring Raft processing.
| Metric | Unit | Type | Description |
| :--- | :--- | :--- | :--- |
| | microseconds | counter | Time to replicate a batch of log entries from the leader to the follower (UpdateConsensus). Includes the total count of the RPC method being invoked. |
| | microseconds | counter | Time taken by candidates to gather votes (RequestConsensusVote). Includes the total count of the RPC method being invoked. |
| | microseconds | counter | Time to add or remove a peer from the Raft group (ChangeConfig). Includes the total count of the RPC method being invoked. |
Throughput (ops/sec) can be calculated from these counters and aggregated across all nodes in the cluster.
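As a rough illustration of that calculation, the following Python sketch derives ops/sec from two scrapes of an invocation counter and sums the per-node rates for the cluster. The sample shape (node name mapped to a timestamp and a cumulative count), the node names, and the function names are assumptions made for this example; they are not part of any YugabyteDB API.

```python
from typing import Dict, Tuple

# Hypothetical sample shape: node name -> (timestamp in seconds, cumulative RPC count).
Sample = Dict[str, Tuple[float, int]]


def ops_per_sec(prev: Tuple[float, int], curr: Tuple[float, int]) -> float:
    """Rate of RPC invocations between two scrapes of one node's counter."""
    (t0, c0), (t1, c1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = c1 - c0
    # A counter that goes backwards usually means the process restarted,
    # so treat the current value as the count since the restart.
    if delta < 0:
        delta = c1
    return delta / elapsed


def cluster_ops_per_sec(prev: Sample, curr: Sample) -> float:
    """Sum per-node rates for nodes present in both scrapes."""
    return sum(ops_per_sec(prev[n], curr[n]) for n in curr if n in prev)


prev = {"node-1": (0.0, 1000), "node-2": (0.0, 400)}
curr = {"node-1": (10.0, 1500), "node-2": (10.0, 700)}
print(cluster_ops_per_sec(prev, curr))  # 80.0
```

In practice a monitoring system usually performs this rate and sum computation for you; the sketch only shows the arithmetic behind it.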
Clock skew is an important metric for performance and data consistency. It signals whether the Hybrid Logical Clock (HLC) used by YugabyteDB is out of sync, or whether your virtual machine was paused or migrated. If the skew is more than 500 milliseconds, it may impact the consistency guarantees of YugabyteDB. Unexplained, seemingly random latency in query responses combined with spikes in the clock skew metric could indicate that the virtual machine was migrated to another host or that the hypervisor is oversubscribed.
| Metric | Unit | Type | Description |
| :--- | :--- | :--- | :--- |
| | microseconds | gauge | Time for clock drift and skew. |
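A minimal sketch of how the 500 millisecond guidance might be turned into an alert check is shown below. It assumes the gauge value has already been read as an integer number of microseconds. The function name, the status labels, and the `warn_fraction` early-warning level are hypothetical choices for this example, not YugabyteDB settings.

```python
MAX_SKEW_US = 500_000  # 500 ms from the text, expressed in microseconds


def skew_status(skew_us: int, warn_fraction: float = 0.5) -> str:
    """Classify a clock skew reading (microseconds) against the 500 ms limit.

    warn_fraction is a hypothetical early-warning level, expressed as a
    fraction of the limit.
    """
    skew = abs(skew_us)
    if skew > MAX_SKEW_US:
        return "critical"  # may affect consistency guarantees
    if skew >= warn_fraction * MAX_SKEW_US:
        return "warn"      # approaching the limit
    return "ok"


for reading in (120_000, 300_000, 650_000):
    print(reading, skew_status(reading))
# 120000 ok
# 300000 warn
# 650000 critical
```

Correlating "warn" or "critical" readings with latency spikes on the same node is one way to spot the VM migration or oversubscription scenarios described above.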
When a Raft peer fails, YugabyteDB executes an automatic remote bootstrap to create a new peer from the remaining ones. Bootstrapping can also result from planned user activity when adding or decommissioning nodes.
| Metric | Unit | Type | Description |
| :--- | :--- | :--- | :--- |
| | microseconds | counter | Time to remote bootstrap a new Raft peer. Includes the total count of remote bootstrap connections. |
This metric can be aggregated across all nodes in the cluster.