Auto Rebalancing

YugaByte DB automatically rebalances data onto newly added nodes, so a cluster can easily be expanded when more capacity is needed. In this tutorial, we will look at how YugaByte DB rebalances data while a workload is running. We will run a read-write workload using a pre-packaged sample application against a 3-node local universe with a replication factor of 3, add nodes to it while the workload is running, and then observe how the cluster rebalances both its on-disk data and its memory footprint.

If you haven’t installed YugaByte DB yet, do so first by following the Quick Start guide.
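As a rough mental model of why adding nodes helps (the numbers below are illustrative, not produced by YugaByte DB): with a fixed number of tablets and a fixed replication factor, the balanced per-node replica count falls as nodes are added.

```python
# Toy model, not YugaByte DB's actual placement logic: with a fixed
# replication factor, the balanced per-node replica count falls as
# nodes are added, which is why adding nodes frees up capacity.

def replicas_per_node(num_tablets, replication_factor, num_nodes):
    """Average tablet replicas each node holds in a balanced universe."""
    return num_tablets * replication_factor / num_nodes

# 24 tablets at RF=3: each of 3 nodes holds 24 replicas;
# after growing to 4 nodes, each holds only 18.
print(replicas_per_node(24, 3, 3))  # 24.0
print(replicas_per_node(24, 3, 4))  # 18.0
```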

1. Setup - create universe

If you have a previously running local universe, destroy it using the following.

$ ./yb-docker-ctl destroy

Start a new local cluster - by default, this will create a 3-node universe with a replication factor of 3.

$ ./yb-docker-ctl create

2. Run sample key-value app

Run the Cassandra sample key-value app against the local universe by typing the following command.

$ java -jar ./yb-sample-apps.jar --workload CassandraKeyValue \
                                    --nodes localhost:9042 \
                                    --num_threads_write 1 \
                                    --num_threads_read 4 \
                                    --value_size 4096

3. Observe data sizes per node

You can check many of the per-node stats by browsing to the tablet-servers page. It should look like the screenshot below. The total data size per node, as well as the total memory used per node, are highlighted. Note that both metrics are roughly the same across all the nodes, indicating uniform usage across the cluster.

Tablet count, data and memory sizes with 3 nodes
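If you want to quantify "roughly the same", a quick skew check on hypothetical per-node readings looks like this (the sizes are made up for illustration; read the real values off the tablet-servers page):

```python
# Illustrative check with hypothetical numbers: per-node data sizes
# in a balanced universe should be close to one another.

def max_skew(sizes_mb):
    """Ratio of the largest to the smallest per-node data size."""
    return max(sizes_mb) / min(sizes_mb)

node_data_mb = [512, 498, 505]  # example readings for the 3 nodes
print(max_skew(node_data_mb) < 1.1)  # roughly uniform -> True
```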

4. Add a node and observe data rebalancing

Add a node to the universe.

$ ./yb-docker-ctl add_node

Now we should have 4 nodes. Refresh the tablet-servers page to see the stats update. As you refresh, you should see the new node getting more and more tablets, which increases both its data size and its memory footprint. Eventually, all four nodes should end up with a similar data distribution and memory usage.

Tablet count, data and memory sizes with 4 nodes
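The steady state that refreshing converges to can be sketched with a toy rebalancer (a simplified illustration, not YugaByte DB's actual placement algorithm):

```python
# Toy rebalancer sketch: repeatedly move one tablet replica from the
# most-loaded node to the least-loaded one until the per-node counts
# differ by at most 1. Real systems also balance leaders and rate-limit
# moves; this only illustrates the convergence to even counts.

def rebalance(counts):
    counts = list(counts)
    moves = 0
    while max(counts) - min(counts) > 1:
        src = counts.index(max(counts))  # most-loaded node
        dst = counts.index(min(counts))  # least-loaded node
        counts[src] -= 1
        counts[dst] += 1
        moves += 1
    return counts, moves

# 3 nodes hold 24 replicas each; a 4th empty node joins.
print(rebalance([24, 24, 24, 0]))  # ([18, 18, 18, 18], 18)
```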

5. Add another node and observe linear scale out

Add yet another node to the universe.

$ ./yb-docker-ctl add_node

Now we should have 5 nodes. Refresh the tablet-servers page to see the stats update. As before, you should see all five nodes end up with similar data sizes and memory footprints.

Tablet count, data and memory sizes with 5 nodes

YugaByte DB automatically balances the tablet leaders and followers of a universe by moving them in a rate-limited manner into the newly added nodes. This automatic balancing of the data is completely transparent to the application logic.
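The rate limiting mentioned above can be pictured as handing out pending replica moves in bounded batches (a toy sketch; the cap and tick model here are illustrative, not YugaByte DB internals):

```python
# Sketch of rate-limited data movement: cap the number of replica moves
# in flight per "tick" so rebalancing traffic never overwhelms
# foreground reads and writes.

def schedule_moves(pending_moves, max_inflight_per_tick):
    """Yield batches of moves, never exceeding the per-tick cap."""
    for i in range(0, len(pending_moves), max_inflight_per_tick):
        yield pending_moves[i:i + max_inflight_per_tick]

moves = [f"tablet-{n}" for n in range(7)]
batches = list(schedule_moves(moves, max_inflight_per_tick=3))
print([len(b) for b in batches])  # [3, 3, 1]
```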

6. Clean up (optional)

Optionally, you can shut down the local cluster created in Step 1.

$ ./yb-docker-ctl destroy

1. Setup - create universe and table

If you have a previously running local universe, destroy it using the following.

$ kubectl delete -f yugabyte-statefulset.yaml

Start a new local cluster - by default, this will create a 3-node universe with a replication factor of 3.

$ kubectl apply -f yugabyte-statefulset.yaml

Check the Kubernetes dashboard to see the 3 yb-tserver and 3 yb-master pods representing the 3 nodes of the cluster.

$ minikube dashboard

Kubernetes Dashboard

Connect to cqlsh on node 1.

$ kubectl exec -it yb-tserver-0 -- /home/yugabyte/bin/cqlsh
Connected to local cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9-SNAPSHOT | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>

Create a Cassandra keyspace and a table.

cqlsh> CREATE KEYSPACE users;
cqlsh> CREATE TABLE users.profile (id bigint PRIMARY KEY,
	                               email text,
	                               password text,
	                               profile frozen<map<text, text>>);
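To sanity-check the table, you can insert and read back a row from the same cqlsh session (the values here are purely illustrative):

```sql
cqlsh> INSERT INTO users.profile (id, email, password, profile)
       VALUES (1, 'user1@example.com', 'secret',
               {'firstname': 'James', 'lastname': 'Bond'});
cqlsh> SELECT email, profile FROM users.profile WHERE id = 1;
```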

2. Check cluster status with Admin UI

We need to access the [yb-master Admin UI](/admin/yb-master/#admin-ui) on port 7000 exposed by any of the pods in the yb-master service (one of yb-master-0, yb-master-1, or yb-master-2). Let us set up a network route to access yb-master-0 on port 7000 from our localhost by running the following command.

$ kubectl port-forward yb-master-0 7000

The yb-master-0 Admin UI is now available at http://localhost:7000.

3. Observe data sizes per node

You can check many of the per-node stats by browsing to the tablet-servers page. It should look like the screenshot below. The total data size per node (labeled Total SST File Sizes) and the total memory used per node (labeled RAM Used) are highlighted. Note that both metrics are roughly the same across all the nodes, indicating uniform usage across the cluster.

Tablet count, data and memory sizes with 3 nodes

4. Add a node and observe data rebalancing

Add a node to the universe.

$ kubectl scale statefulset yb-tserver --replicas=4

Now we should have 4 nodes. Refresh the tablet-servers page to see the stats update. As you refresh, you should see the new node getting more and more tablets, which increases both its data size and its memory footprint. Eventually, all four nodes should end up with a similar data distribution and memory usage.

Tablet count, data and memory sizes with 4 nodes

YugaByte DB automatically balances the tablet leaders and followers of a universe by moving them in a rate-limited manner into the newly added nodes. This automatic balancing of the data is completely transparent to the application logic.

5. Clean up (optional)

Optionally, you can shut down the local cluster created in Step 1.

$ kubectl delete -f yugabyte-statefulset.yaml

1. Setup - create universe

If you have a previously running local universe, destroy it using the following:

$ ./bin/yb-ctl destroy

Start a new local cluster - by default, this will create a 3-node universe with a replication factor of 3.

$ ./bin/yb-ctl create

2. Run sample key-value app

Run the Cassandra sample key-value app against the local universe by typing the following command.

$ java -jar ./java/yb-sample-apps.jar --workload CassandraKeyValue \
                                    --nodes 127.0.0.1:9042 \
                                    --num_threads_write 1 \
                                    --num_threads_read 4 \
                                    --value_size 4096

3. Observe data sizes per node

You can check many of the per-node stats by browsing to the tablet-servers page. It should look like the screenshot below. The total data size per node, as well as the total memory used per node, are highlighted. Note that both metrics are roughly the same across all the nodes, indicating uniform usage across the cluster.

Data and memory sizes with 3 nodes

4. Add a node and observe data rebalancing

Add a node to the universe.

$ ./bin/yb-ctl add_node

Now we should have 4 nodes. Refresh the tablet-servers page to see the stats update. As you refresh, you should see the new node getting more and more tablets, which increases both its data size and its memory footprint. Eventually, all four nodes should end up with a similar data distribution and memory usage.

Data and memory sizes with 4 nodes

5. Add another node and observe linear scale out

Add yet another node to the universe.

$ ./bin/yb-ctl add_node

Now we should have 5 nodes. Refresh the tablet-servers page to see the stats update. As before, you should see all five nodes end up with similar data sizes and memory footprints.

Data and memory sizes with 5 nodes

YugaByte DB automatically balances the tablet leaders and followers of a universe by moving them in a rate-limited manner into the newly added nodes. This automatic balancing of the data is completely transparent to the application logic.

6. Clean up (optional)

Optionally, you can shut down the local cluster created in Step 1.

$ ./bin/yb-ctl destroy
