Sharding and rebalancing

Split tables into tablets seamlessly across nodes

Horizontal scaling is a first-class feature in YugabyteDB. The database is designed to scale horizontally seamlessly when adding nodes. Data is moved transparently to the new nodes without any service interruption.

YugabyteDB stores table data in tablets. Sharding is the process of splitting and distributing table data into tablets. Horizontal scalability requires transparent sharding of data.

The following illustrates how sharding works.

Initial cluster setup

Suppose you have a 3-node cluster with a replication factor (RF) of 3, and you are going to store a basic table with an integer as primary key. The data is stored in a single tablet (say T1). The table starts off with a single tablet with a tablet leader (node-2) and two followers (replicas on node-1 and node-3) for high availability.

Cluster and Table

Add more data

As you add more data to the table, the leaders and followers of tablet T1 start to grow.

Add more rows

Tablet splitting

Once a tablet reaches a threshold size, say 4 rows for the purpose of this illustration, the tablet T1 splits into two by creating a new tablet, T2. The tablet splitting is almost instantaneous and is transparent to the application. The newly created tablet will have leaders and followers.

Split into two

Rebalancing

In the previous illustration, T1 and T2 are in the same node (node-2). YugabyteDB realizes that the leaders are not balanced and automatically distributes the leaders across different nodes. This ensures that the cluster is used optimally.

Leader rebalancing

Rebalancing is done also for followers and not just for leaders.

Completely scaled out

As more data is added to the table, the tablets split further and are rebalanced as needed. With more data added, you would get a distribution similar to the following illustration:

completely scaled out

Learn more