You can perform deployment using unidirectional (master-follower) or bidirectional (multi-master) asynchronous replication between universes (also known as data centers).

For information on two data center (2DC) deployment architecture and replication scenarios, see Two data center (2DC) deployments.

Set up universes

You can create source and target universes as follows:

  1. Create the yugabyte-source universe by following the procedure from Manual deployment.
  2. Create tables for the APIs being used by the source universe.
  3. Create the yugabyte-target universe by following the procedure from Manual deployment.
  4. Create tables for the APIs being used by the target universe. These should be the same tables as you created for the source universe.
  5. Proceed to setting up unidirectional or bidirectional replication.

If you already have existing data in your tables, follow the bootstrap process described in Bootstrap a target universe.

Set up unidirectional replication

After you have created the required tables, you can set up asynchronous replication as follows:

  • Look up the source universe UUID and the table IDs for the two tables and the index table:

    • To find a universe's UUID, check http://yb-master-ip:7000/varz for --cluster_uuid. If it is not available in this location, check the same field in the universe configuration.

    • To find a table ID, execute the following command as an admin user:

      ./bin/yb-admin -master_addresses <source master ips comma separated> list_tables include_table_id
      

      The preceding command lists all the tables, including system tables. To locate a specific table, you can add grep, as follows:

      ./bin/yb-admin -master_addresses <source master ips comma separated> list_tables include_table_id | grep table_name
      

      Note that if there are multiple schemas with the same table name, you might need to contact Yugabyte Support for assistance.

  • Run the following yb-admin setup_universe_replication command from the YugabyteDB home directory in the source universe:

    ./bin/yb-admin \
      -master_addresses <target_universe_master_addresses> \
      setup_universe_replication <source_universe_uuid> \
        <source_universe_master_addresses> \
        <table_id>,[<table_id>..]
    

    For example:

    ./bin/yb-admin \
      -master_addresses 127.0.0.11:7100,127.0.0.12:7100,127.0.0.13:7100 \
      setup_universe_replication e260b8b6-e89f-4505-bb8e-b31f74aa29f3 \
        127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
        000030a5000030008000000000004000,000030a5000030008000000000004005,dfef757c415c4b2cacc9315b8acb539a
    

The preceding command contains three table IDs: the first two are the YSQL base table and index, and the third is the YCQL table. Make sure that all master addresses for the source and target universes are specified in the command.
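When assembling the comma-separated table ID list, you can extract the IDs from the list_tables output mechanically. The following sketch is illustrative only (table_ids_for is not a YugabyteDB tool) and assumes each matching output line of list_tables include_table_id ends with the table ID:

```sh
# Hypothetical helper, not part of yb-admin: given `list_tables
# include_table_id` output on stdin, print only the table IDs for a
# named table. Assumes each matching line ends with the table ID.
table_ids_for() {
  grep -w "$1" | awk '{print $NF}'
}

# Example usage (placeholders as in the commands above):
# ./bin/yb-admin -master_addresses <source master ips comma separated> \
#   list_tables include_table_id | table_ids_for table_name
```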

If you need to set up bidirectional replication, see instructions provided in Set up bidirectional replication. Otherwise, proceed to Load data into the source universe.

Set up bidirectional replication

To set up bidirectional replication, repeat the procedure described in Set up unidirectional replication, applying the steps to the yugabyte-target universe. That is, you also need to set up yugabyte-source to consume data from yugabyte-target.

When completed, proceed to Load data.

Load data into the source universe

Once you have set up replication, load data into the source universe, as follows:

  • Download the YugabyteDB workload generator JAR file yb-sample-apps.jar from GitHub.

  • Start loading data into yugabyte-source by following examples for YSQL or YCQL:

    • YSQL:

      java -jar yb-sample-apps.jar --workload SqlSecondaryIndex --nodes 127.0.0.1:5433
      
    • YCQL:

      java -jar yb-sample-apps.jar --workload CassandraBatchKeyValue --nodes 127.0.0.1:9042
      


    Note that the IP address needs to correspond to the IP of any T-Server in the universe.

  • For bidirectional replication, repeat the preceding step in the yugabyte-target universe.

When completed, proceed to Verify replication.

Verify replication

You can verify replication by stopping the workload and then using the COUNT(*) function to confirm that the row counts on yugabyte-target match those on yugabyte-source.

Unidirectional replication

For unidirectional replication, connect to the yugabyte-target universe using the YSQL shell (ysqlsh) or the YCQL shell (ycqlsh), and confirm that you can see the expected records.

Bidirectional replication

For bidirectional replication, repeat the procedure described in Unidirectional replication, but reverse the source and destination information, as follows:

  1. Run yb-admin setup_universe_replication on the yugabyte-target universe, pointing to yugabyte-source.
  2. Use the workload generator to start loading data into the yugabyte-target universe.
  3. Verify replication from yugabyte-target to yugabyte-source.

To avoid primary key conflict errors, keep the key ranges for the two universes separate. This is done automatically by the applications included in the yb-sample-apps.jar.
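If your own applications generate the keys, a pre-flight check along the following lines can confirm that the two universes write to disjoint integer key ranges. This is an illustrative sketch, not a YugabyteDB utility:

```sh
# Hypothetical check: succeeds when two integer key ranges [lo1, hi1]
# and [lo2, hi2] do not overlap, as required to avoid primary key
# conflicts in bidirectional replication.
ranges_disjoint() {
  # disjoint when one range ends before the other begins
  [ "$2" -lt "$3" ] || [ "$4" -lt "$1" ]
}

ranges_disjoint 0 999 1000 1999 && echo "safe to write"   # prints "safe to write"
```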

Replication lag

Replication lag is computed at the tablet level as follows:

replication lag = hybrid_clock_time - last_read_hybrid_time

hybrid_clock_time is the hybrid clock timestamp on the source's T-Server, and last_read_hybrid_time is the hybrid clock timestamp of the latest record pulled from the source.

To obtain information about the overall maximum lag, you should check /metrics or /prometheus-metrics for async_replication_sent_lag_micros or async_replication_committed_lag_micros and take the maximum of these values across all of the source's T-Servers. For information on how to set up the node exporter and Prometheus manually, see Prometheus integration.
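As a sketch of how such a check might look, the following helper takes Prometheus-format metric lines (as scraped from each T-Server's metrics endpoint, for example with curl) and reports the maximum lag value seen. The helper itself is hypothetical; only the metric names come from the text above, and the endpoint layout in the usage comment is an assumption:

```sh
# Hypothetical helper: reads Prometheus exposition-format lines on stdin
# and prints the largest async replication lag value seen.
max_lag_micros() {
  grep -E 'async_replication_(sent|committed)_lag_micros' |
    awk '{ if ($NF + 0 > max) max = $NF + 0 } END { print max + 0 }'
}

# Example (assumed T-Server addresses and port):
# for ts in 127.0.0.1 127.0.0.2 127.0.0.3; do
#   curl -s "http://$ts:9000/prometheus-metrics"
# done | max_lag_micros
```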

Set up replication with TLS

The setup process depends on whether the source and target universes have the same certificates.

If both universes use the same certificates, run yb-admin setup_universe_replication and include the -certs_dir_name flag. Setting that to the target universe's certificate directory will make replication use those certificates for connecting to both universes.

Consider the following example:

./bin/yb-admin -master_addresses 127.0.0.11:7100,127.0.0.12:7100,127.0.0.13:7100 \
  -certs_dir_name /home/yugabyte/yugabyte-tls-config \
  setup_universe_replication e260b8b6-e89f-4505-bb8e-b31f74aa29f3 \
  127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
  000030a5000030008000000000004000,000030a5000030008000000000004005,dfef757c415c4b2cacc9315b8acb539a

When universes use different certificates, you need to store the certificates for the source universe on the target universe, as follows:

  1. Ensure that use_node_to_node_encryption is set to true on all masters and t-servers on both the source and target.

  2. For each master and t-server on the target universe, set the gflag certs_for_cdc_dir to the parent directory where you want to store all the source universe's certificates for replication.

  3. Find the certificate authority file used by the source universe (ca.crt). This file should be stored in the directory specified by the --certs_dir flag.

  4. Copy this file to each node on the target. It needs to be copied to a directory named <certs_for_cdc_dir>/<source_universe_uuid>.

    For example, if you previously set certs_for_cdc_dir=/home/yugabyte/yugabyte_producer_certs, and the source universe's ID is 00000000-1111-2222-3333-444444444444, then you would need to copy the certificate file to /home/yugabyte/yugabyte_producer_certs/00000000-1111-2222-3333-444444444444/ca.crt.

  5. Set up replication using yb-admin setup_universe_replication, making sure to also set the -certs_dir_name flag to the directory with the target universe's certificates (this should be different from the directory used in the previous steps).

    For example, if you have the target universe's certificates in /home/yugabyte/yugabyte-tls-config, then you would run the following:

    ./bin/yb-admin -master_addresses 127.0.0.11:7100,127.0.0.12:7100,127.0.0.13:7100 \
      -certs_dir_name /home/yugabyte/yugabyte-tls-config \
      setup_universe_replication 00000000-1111-2222-3333-444444444444 \
      127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
      000030a5000030008000000000004000,000030a5000030008000000000004005,dfef757c415c4b2cacc9315b8acb539a
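Step 4 above (placing the source universe's ca.crt on each target node) can be sketched as follows. The helper is hypothetical, the directory layout matches the example values above, and you would still need to repeat the copy on every target node (for example, with scp):

```sh
# Hypothetical helper for one target node: place the source universe's
# CA certificate under <certs_for_cdc_dir>/<source_universe_uuid>/ca.crt.
install_source_ca() {
  local ca_file="$1" certs_for_cdc_dir="$2" source_universe_uuid="$3"
  mkdir -p "$certs_for_cdc_dir/$source_universe_uuid"
  cp "$ca_file" "$certs_for_cdc_dir/$source_universe_uuid/ca.crt"
}

# Using the example values from above:
# install_source_ca ca.crt /home/yugabyte/yugabyte_producer_certs \
#   00000000-1111-2222-3333-444444444444
```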
    

Set up replication with geo-partitioning

You start by creating a source and a target universe with the same configurations (the same regions and zones), as follows:

  • Regions: EU(Paris), Asia Pacific(Mumbai), US West(Oregon)

  • Zones: eu-west-3a, ap-south-1a, us-west-2a

    ./bin/yb-ctl --rf 3 create --placement_info "cloud1.region1.zone1,cloud2.region2.zone2,cloud3.region3.zone3"
    

    Consider the following example:

    ./bin/yb-ctl --rf 3 create --placement_info "aws.us-west-2.us-west-2a,aws.ap-south-1.ap-south-1a,aws.eu-west-3.eu-west-3a"
    

Create tables, tablespaces, and partition tables at both the source and target universes, as per the following example:

  • Main table: transactions

  • Tablespaces: eu_ts, ap_ts, us_ts

  • Partition tables: transactions_eu, transactions_in, transactions_us

    CREATE TABLE transactions (
        user_id   INTEGER NOT NULL,
        account_id INTEGER NOT NULL,
        geo_partition VARCHAR,
        amount NUMERIC NOT NULL,
        created_at TIMESTAMP DEFAULT NOW()
    ) PARTITION BY LIST (geo_partition);
    
    CREATE TABLESPACE eu_ts WITH(
        replica_placement='{"num_replicas": 1, "placement_blocks":
        [{"cloud": "aws", "region": "eu-west-3","zone":"eu-west-3a", "min_num_replicas":1}]}');
    
    CREATE TABLESPACE us_ts WITH(
        replica_placement='{"num_replicas": 1, "placement_blocks":
        [{"cloud": "aws", "region": "us-west-2","zone":"us-west-2a", "min_num_replicas":1}]}');
    
    CREATE TABLESPACE ap_ts WITH(
        replica_placement='{"num_replicas": 1, "placement_blocks":
        [{"cloud": "aws", "region": "ap-south-1","zone":"ap-south-1a", "min_num_replicas":1}]}');
    
    CREATE TABLE transactions_eu
                      PARTITION OF transactions
                      (user_id, account_id, geo_partition, amount, created_at,
                      PRIMARY KEY (user_id HASH, account_id, geo_partition))
                      FOR VALUES IN ('EU') TABLESPACE eu_ts;
    
    CREATE TABLE transactions_in
                      PARTITION OF transactions
                      (user_id, account_id, geo_partition, amount, created_at,
                      PRIMARY KEY (user_id HASH, account_id, geo_partition))
                      FOR VALUES IN ('IN') TABLESPACE ap_ts;
    
    CREATE TABLE transactions_us
                      PARTITION OF transactions
                      (user_id, account_id, geo_partition, amount, created_at,
                      PRIMARY KEY (user_id HASH, account_id, geo_partition))
                      DEFAULT TABLESPACE us_ts;
    

To create unidirectional replication, perform the following:

  1. Collect the partition table UUIDs from the source universe (transactions_eu, transactions_in, and transactions_us) by navigating to Tables in the Admin UI available at 127.0.0.1:7000. These UUIDs are used when setting up replication.

  2. Run the replication setup command for the source universe, as follows:

    ./bin/yb-admin -master_addresses <target_universe_master_addresses> \
    setup_universe_replication <source_universe_UUID>_<replication_stream_name> \
    <source_universe_master_addresses> <comma_separated_table_ids>
    

    Consider the following example:

    ./bin/yb-admin -master_addresses 127.0.0.11:7100,127.0.0.12:7100,127.0.0.13:7100 \
    setup_universe_replication 00000000-1111-2222-3333-444444444444_xClusterSetup1 \
    127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
    000033e1000030008000000000004007,000033e100003000800000000000400d,000033e1000030008000000000004013
    
  3. Optionally, if you have access to YugabyteDB Anywhere, you can observe the replication setup (xClusterSetup1) by navigating to Replication on the source and target universe.

Set up asynchronous replication in Kubernetes

In the Kubernetes environment, you can set up pod-to-pod connectivity, as follows:

  • Create a source and a target universe.

  • Create tables in both universes, as follows:

    • Execute the following commands for the source universe:

      kubectl exec -it -n <source_universe_namespace> -t <source_universe_master_leader> -c <source_universe_container> -- bash
      /home/yugabyte/bin/ysqlsh -h <source_universe_yqlserver>
      create table query
      

      Consider the following example:

      kubectl exec -it -n xcluster-source -t yb-master-2 -c yb-master -- bash
      /home/yugabyte/bin/ysqlsh -h yb-tserver-1.yb-tservers.xcluster-source
      create table employees(id int primary key, name text);
      
    • Execute the following commands for the target universe:

      kubectl exec -it -n <target_universe_namespace> -t <target_universe_master_leader> -c <target_universe_container> -- bash
      /home/yugabyte/bin/ysqlsh -h <target_universe_yqlserver>
      create table query
      

      Consider the following example:

      kubectl exec -it -n xcluster-target -t yb-master-2 -c yb-master -- bash
      /home/yugabyte/bin/ysqlsh -h yb-tserver-1.yb-tservers.xcluster-target
      create table employees(id int primary key, name text);
      
  • Collect table UUIDs by navigating to Tables in the Admin UI available at 127.0.0.1:7000. These UUIDs are to be used while setting up replication.

  • Set up replication from the source universe by executing the following command on the source universe:

    kubectl exec -it -n <source_universe_namespace> -t <source_universe_master_leader> -c \
    <source_universe_container> -- bash -c "/home/yugabyte/bin/yb-admin -master_addresses \
    <target_universe_master_addresses> setup_universe_replication \
    <source_universe_UUID>_<replication_stream_name> <source_universe_master_addresses> \
    <comma_separated_table_ids>"
    

    Consider the following example:

    kubectl exec -it -n xcluster-source -t yb-master-2 -c yb-master -- bash -c \
    "/home/yugabyte/bin/yb-admin -master_addresses yb-master-2.yb-masters.xcluster-target.svc.cluster.local, \
    yb-master-1.yb-masters.xcluster-target.svc.cluster.local,yb-master-0.yb-masters.xcluster-target.svc.cluster.local \
    setup_universe_replication ac39666d-c183-45d3-945a-475452deac9f_xCluster_1 \
    yb-master-2.yb-masters.xcluster-source.svc.cluster.local,yb-master-1.yb-masters.xcluster-source.svc.cluster.local, \
    yb-master-0.yb-masters.xcluster-source.svc.cluster.local 00004000000030008000000000004001"
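Because the command above interpolates several placeholders, it can help to assemble it in a variable-driven sketch and review the result before executing it. build_replication_cmd is a hypothetical dry-run helper, not a YugabyteDB or kubectl feature:

```sh
# Hypothetical dry-run helper: print the replication setup command built
# from its parts instead of executing it, so it can be reviewed first.
build_replication_cmd() {
  local ns="$1" pod="$2" container="$3"
  local target_masters="$4" stream_id="$5" source_masters="$6" table_ids="$7"
  printf 'kubectl exec -it -n %s -t %s -c %s -- bash -c "%s"\n' \
    "$ns" "$pod" "$container" \
    "/home/yugabyte/bin/yb-admin -master_addresses $target_masters setup_universe_replication $stream_id $source_masters $table_ids"
}

# Review, then copy and run the printed command:
# build_replication_cmd xcluster-source yb-master-2 yb-master \
#   <target_universe_master_addresses> \
#   <source_universe_UUID>_<replication_stream_name> \
#   <source_universe_master_addresses> <comma_separated_table_ids>
```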
    
  • Perform the following on the source universe and then observe replication on the target universe:

    kubectl exec -it -n <source_universe_namespace> -t <source_universe_master_leader> -c <source_universe_container> -- bash
    /home/yugabyte/bin/ysqlsh -h <source_universe_yqlserver>
    insert query
    select query
    

    Consider the following example:

    kubectl exec -it -n xcluster-source -t yb-master-2 -c yb-master -- bash
    /home/yugabyte/bin/ysqlsh -h yb-tserver-1.yb-tservers.xcluster-source
    INSERT INTO employees VALUES(1, 'name');
    SELECT * FROM employees;
    
  • Perform the following on the target universe:

    kubectl exec -it -n <target_universe_namespace> -t <target_universe_master_leader> -c <target_universe_container> -- bash
    /home/yugabyte/bin/ysqlsh -h <target_universe_yqlserver>
    select query
    

    Consider the following example:

    kubectl exec -it -n xcluster-target -t yb-master-2 -c yb-master -- bash
    /home/yugabyte/bin/ysqlsh -h yb-tserver-1.yb-tservers.xcluster-target
    SELECT * FROM employees;
    

Bootstrap a target universe

You can set up asynchronous replication for the following purposes:

  • Enabling replication on a table that has existing data.
  • Catching up an existing stream where the target has fallen too far behind.

To ensure that the WALs are still available, you need to perform the following steps within the cdc_wal_retention_time_secs gflag window. If the process is going to take more time than the value defined by this flag, you should increase the value.

Proceed as follows:

  1. Create a checkpoint on the source universe for all the tables you want to replicate by executing the following command:

    ./bin/yb-admin -master_addresses <source_universe_master_addresses> \
    bootstrap_cdc_producer <comma_separated_source_universe_table_ids>
    

    Consider the following example:

    ./bin/yb-admin -master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
    bootstrap_cdc_producer 000033e1000030008000000000004000,000033e1000030008000000000004003,000033e1000030008000000000004006
    

    The following output is a list of bootstrap IDs, one per table ID:

    table id: 000033e1000030008000000000004000, CDC bootstrap id: fb156717174941008e54fa958e613c10
    table id: 000033e1000030008000000000004003, CDC bootstrap id: a2a46f5cbf8446a3a5099b5ceeaac28b
    table id: 000033e1000030008000000000004006, CDC bootstrap id: c967967523eb4e03bcc201bb464e0679
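    Because step 3 below requires the bootstrap IDs in the same order as their table IDs, you can split output in this format into the two comma-separated lists mechanically. bootstrap_lists is a hypothetical helper that assumes the exact "table id: ..., CDC bootstrap id: ..." line format shown above:

```sh
# Hypothetical helper: reads "table id: <id>, CDC bootstrap id: <id>"
# lines on stdin and prints two comma-separated lists (table IDs on the
# first line, bootstrap IDs on the second), preserving line order.
bootstrap_lists() {
  awk -F': |, ' '{ tables = tables sep $2; ids = ids sep $4; sep = "," }
                 END { print tables; print ids }'
}
```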
    
  2. Back up the tables on the source universe and restore them on the target universe by following the instructions from Backup and restore.

  3. Execute the following command to set up the replication stream using the bootstrap IDs generated in step 1. Ensure that the bootstrap IDs are in the same order as their corresponding table IDs.

    ./bin/yb-admin -master_addresses <target_universe_master_addresses> setup_universe_replication \
    <source_universe_uuid>_<replication_stream_name> <source_universe_master_addresses> \
    <comma_separated_source_universe_table_ids> <comma_separated_bootstrap_ids>
    

    Consider the following example:

    ./bin/yb-admin -master_addresses 127.0.0.11:7100,127.0.0.12:7100,127.0.0.13:7100 setup_universe_replication \
    00000000-1111-2222-3333-444444444444_xCluster1 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
    000033e1000030008000000000004000,000033e1000030008000000000004003,000033e1000030008000000000004006 \
    fb156717174941008e54fa958e613c10,a2a46f5cbf8446a3a5099b5ceeaac28b,c967967523eb4e03bcc201bb464e0679
    

Modify the bootstrap

You can modify the bootstrap as follows:

  • To wipe the test setup, use the delete_universe_replication command.
  • After running the bootstrap_cdc_producer command on the source universe, you can verify that it works as expected by running the list_cdc_streams command to view the associated entries: the bootstrap IDs generated by bootstrap_cdc_producer should match the stream_id values reported by list_cdc_streams.
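A quick way to perform that comparison is to treat both ID lists as sets. ids_match is an illustrative sketch, not a YugabyteDB command; it expects one ID per line in each argument:

```sh
# Hypothetical helper: succeeds when two newline-separated ID lists
# contain the same entries, regardless of order.
ids_match() {
  [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}
```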

You can also perform the following modifications:

  • To add a table to the source and target universes, use the alter_universe_replication add_table command.
  • To remove an existing table from the source and target universes, use the alter_universe_replication remove_table command.
  • To change the master nodes on the source universe, execute the alter_universe_replication set_master_addresses command.
  • You can verify changes via the get_universe_config command.

Migrate schema

You can execute DDL operations after replication has already been configured for some tables.

Stop user writes

In certain cases, it is possible to temporarily stop incoming user writes. You would typically approach this as follows:

  • Stop any new incoming user writes.
  • Wait for all changes to get replicated to the target universe. This can be observed by replication lag dropping to 0.
  • Apply the DDL changes (for example, CREATE TABLE or CREATE INDEX statements) on both universes, and alter replication for any newly created tables.
  • Resume user writes.

Use backup and restore

If you cannot stop incoming user traffic, then the safest approach would be to apply DDLs on the source universe combined with bootstrapping. Specifically, you would need to do the following:

  • Stop replication before making any DDL changes.
  • Apply all your DDL changes to the source universe.
  • Back up the source universe and all the relevant tables whose changes you intend to replicate. Follow the instructions provided in Bootstrap a target universe.
  • Restore this backup on the target universe.
  • Set up replication again for all of the relevant tables. Ensure that you pass in all the bootstrap_id values.

Note that it is possible to add and remove columns of primitive data types without a bootstrap. However, the schema must remain identical on both sides, including any changes introduced by a faulty apply followed by a revert. Because this process is error-prone, it is recommended that you contact Yugabyte Support for assistance.