Sharding and rebalancing
Horizontal scaling is a first-class feature in YugabyteDB. The database is designed to scale horizontally seamlessly when adding nodes. Data is moved transparently to the new nodes without any service interruption.
YugabyteDB stores table data in tablets. Sharding is the process of splitting and distributing table data into tablets. Horizontal scalability requires transparent sharding of data.
The following illustrates how sharding works.
Initial cluster setup
Suppose you have a 3-node cluster with a replication factor (RF) of 3, and you are going to store a basic table with an integer as primary key. The data is stored in a single tablet (say T1
). The table starts off with a single tablet with a tablet leader (node-2) and two followers (replicas on node-1 and node-3) for high availability.
Add more data
As you add more data to the table, the leaders and followers of tablet T1
start to grow.
Tablet splitting
Once a tablet reaches a threshold size, say 4 rows for the purpose of this illustration, the tablet T1
splits into two by creating a new tablet, T2
. The tablet splitting is almost instantaneous and is transparent to the application. The newly created tablet will have leaders and followers.
Rebalancing
In the previous illustration, T1
and T2
are in the same node (node-2
). YugabyteDB realizes that the leaders are not balanced and automatically distributes the leaders across different nodes. This ensures that the cluster is used optimally.
Completely scaled out
As more data is added to the table, the tablets split further and are rebalanced as needed. With more data added, you would get a distribution similar to the following illustration: