Change data capture (CDC) EARLY ACCESS

Capture changes made to data in the database

In databases, change data capture (CDC) is a set of software design patterns used to determine and track the data that has changed so that action can be taken using the changed data. CDC is beneficial in a number of scenarios:

  • Microservice-oriented architectures: Some microservices require a stream of data changes. CDC in YugabyteDB provides those changes in a consumable form to CDC subscribers.

  • Asynchronous replication to remote systems: Remote systems can subscribe to a stream of data changes, then transform and consume them. Keeping separate database instances for transactional and reporting workloads helps manage the performance of each.

  • Multiple data center strategies: Maintaining multiple data centers enables enterprises to provide high availability (HA).

  • Compliance and auditing: Auditing and compliance requirements can require you to use CDC to maintain records of data changes.
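To make the idea of consuming a change stream concrete, the following is a minimal sketch of a subscriber that applies change events to an in-memory replica. The event shape (`op`, `key`, `after`) and the function names are hypothetical and chosen for illustration only; they are not the format that YugabyteDB connectors emit.

```python
# Hypothetical event shape, for illustration only:
#   {"op": "c" | "u" | "d", "key": <primary key>, "after": <row dict or None>}
# "c" = create (insert), "u" = update, "d" = delete.


def apply_change(replica: dict, event: dict) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    if op in ("c", "u"):
        # Inserts and updates both carry the new row image.
        replica[event["key"]] = event["after"]
    elif op == "d":
        # Deletes remove the row; a missing key is ignored so replays are safe.
        replica.pop(event["key"], None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


def replay(events: list) -> dict:
    """Build a replica by applying events in the order they were committed."""
    replica: dict = {}
    for event in events:
        apply_change(replica, event)
    return replica


events = [
    {"op": "c", "key": 1, "after": {"id": 1, "status": "new"}},
    {"op": "c", "key": 2, "after": {"id": 2, "status": "new"}},
    {"op": "u", "key": 1, "after": {"id": 1, "status": "shipped"}},
    {"op": "d", "key": 2, "after": None},
]
print(replay(events))  # {1: {'id': 1, 'status': 'shipped'}}
```

Because each event is applied by primary key, replaying the same stream from the start yields the same final state, which is what makes a change stream usable for replication and auditing.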

YugabyteDB supports the following methods for reading change events.

PostgreSQL Replication Protocol

This method uses the PostgreSQL replication protocol, ensuring compatibility with PostgreSQL CDC systems. Logical replication operates through a publish-subscribe model. It replicates data objects and their changes based on their replication identity (usually a primary key).

It works as follows:

  1. Create publications in the YugabyteDB cluster, as you would in PostgreSQL.
  2. Deploy the YugabyteDB Connector in your preferred Kafka Connect environment.
  3. The connector uses replication slots to capture change events and publishes them directly to a Kafka topic.

This is the recommended approach for most CDC applications due to its compatibility with PostgreSQL.
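As a rough illustration of the publication and replication slot mentioned in the steps above, the sketch below builds the two SQL statements using standard PostgreSQL syntax (`CREATE PUBLICATION` and `pg_create_logical_replication_slot`). The helper names, the example object names, and the choice of `pgoutput` as the output plugin are assumptions. Check which plugin your YugabyteDB version and connector expect, and whether the connector creates the slot for you.

```python
import re

# Restrict names to simple lowercase identifiers so they can be embedded
# in SQL text without quoting. Schema-qualified names are not handled here.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def check_ident(name: str) -> str:
    """Return name unchanged if it is a simple identifier, else raise."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def publication_sql(publication: str, tables: list) -> str:
    """Build a CREATE PUBLICATION statement for the given tables."""
    if not tables:
        raise ValueError("at least one table is required")
    table_list = ", ".join(check_ident(t) for t in tables)
    return f"CREATE PUBLICATION {check_ident(publication)} FOR TABLE {table_list};"


def replication_slot_sql(slot: str, plugin: str = "pgoutput") -> str:
    """Build a statement that creates a logical replication slot.

    The default plugin is an assumption for illustration only.
    """
    return (
        "SELECT * FROM pg_create_logical_replication_slot("
        f"'{check_ident(slot)}', '{check_ident(plugin)}');"
    )


# Hypothetical object names.
print(publication_sql("orders_pub", ["orders", "customers"]))
print(replication_slot_sql("orders_slot"))
```

The generated statements can be run through any PostgreSQL-compatible client connected to the cluster. The connector configuration then refers to the publication and slot by name.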

YugabyteDB gRPC Replication Protocol

This method involves setting up a change stream in YugabyteDB that uses the native gRPC replication protocol to publish change events.

It works as follows:

  1. Create a change stream in the YugabyteDB cluster using the yb-admin CLI.
  2. Deploy the YugabyteDB gRPC Connector in your preferred Kafka Connect environment.
  3. The connector captures change events using YugabyteDB's native gRPC replication and directly publishes them to a Kafka topic.
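For step 1, the sketch below assembles a yb-admin invocation that uses the `create_change_data_stream` command. The helper name, the master address, and the database name are placeholders. The exact flag spelling and any optional arguments should be treated as assumptions and checked against the yb-admin reference for your release.

```python
import shlex


def create_stream_command(master_addresses: list, database: str) -> list:
    """Build the argument list for creating a CDC stream on a YSQL database."""
    if not master_addresses:
        raise ValueError("at least one master address is required")
    return [
        "yb-admin",
        "-master_addresses", ",".join(master_addresses),
        "create_change_data_stream", f"ysql.{database}",
    ]


# Placeholder address and database name.
cmd = create_stream_command(["127.0.0.1:7100"], "yugabyte")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string means it can be passed directly to `subprocess.run` without shell quoting concerns. The stream ID that the command reports is what the gRPC connector configuration refers to.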

To learn about gRPC Replication, see Using YugabyteDB gRPC Replication.