Change data capture (CDC) EARLY ACCESS

Capture changes made to data in the database

In databases, change data capture (CDC) is a set of software design patterns used to determine and track data that has changed so that action can be taken using the changed data. CDC is useful in several scenarios, including the following:

  • Microservice-oriented architectures: Some microservices require a stream of changes to the data, and CDC in YugabyteDB can provide consumable data changes to CDC subscribers.

  • Asynchronous replication to remote systems: Remote systems can subscribe to a stream of data changes, then transform and consume them. Maintaining separate database instances for transactional and reporting workloads can help manage workload performance.

  • Multiple data center strategies: Maintaining multiple data centers enables enterprises to provide high availability (HA).

  • Compliance and auditing: Auditing and compliance requirements can require you to use CDC to maintain records of data changes.

How CDC works

YugabyteDB CDC captures changes made to data in the database and streams those changes to external processes, applications, or other databases. CDC reads the database's write-ahead log (WAL) to track changes and propagate them to downstream consumers. YugabyteDB CDC uses Debezium to capture row-level changes resulting from INSERT, UPDATE, and DELETE operations in the upstream database, and publishes them as events to Kafka using Kafka Connect-compatible connectors.
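As an illustration of how a downstream consumer might act on row-level change events, the following minimal sketch applies a sequence of events to an in-memory copy of a table. It assumes a simplified Debezium-style envelope with `op`, `before`, and `after` fields and a primary-key column named `id`; the function name `apply_change_event` is hypothetical, and the payload emitted by the YugabyteDB connector may be shaped differently, so treat this as a sketch rather than a reference implementation.

```python
import json


def apply_change_event(replica, event, key_field="id"):
    """Apply one Debezium-style change event to a dict keyed by primary key.

    Assumed envelope (simplified): {"op": ..., "before": ..., "after": ...}
    where op is "c" (create), "r" (snapshot read), "u" (update), or "d" (delete).
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Inserts, snapshot reads, and updates carry the new row in "after".
        row = event["after"]
        replica[row[key_field]] = row
    elif op == "d":
        # Deletes carry the removed row (or at least its key) in "before".
        row = event["before"]
        replica.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return replica


# Hypothetical events, in the order a consumer would read them from a topic.
events = [
    '{"op": "c", "before": null, "after": {"id": 1, "name": "alice"}}',
    '{"op": "c", "before": null, "after": {"id": 2, "name": "bob"}}',
    '{"op": "u", "before": null, "after": {"id": 1, "name": "alicia"}}',
    '{"op": "d", "before": {"id": 2, "name": "bob"}, "after": null}',
]

replica = {}
for raw in events:
    apply_change_event(replica, json.loads(raw))
```

After the four events are applied, `replica` holds only the updated row with `id` 1, because the row with `id` 2 was inserted and then deleted.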

What is CDC

To learn more about the internals of CDC, see Overview.

Debezium connector

To capture changes in YugabyteDB and stream them to an external system, you need a connector that can read the changes and stream them out. For this, you can use the Debezium connector. Debezium is deployed as a set of Kafka Connect-compatible connectors, so you first need to define a YugabyteDB connector configuration and then start the connector by adding it to Kafka Connect.

To understand the various features and configuration options of the connector, see Debezium connector.
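The two steps described above (define a connector configuration, then add it to Kafka Connect) might look roughly like the following sketch. Kafka Connect accepts new connectors as a JSON document posted to its `/connectors` REST endpoint. The connector class name, the property names, the credentials, the stream ID, and the table names shown here are assumptions for illustration only; check them against the Debezium connector page before use. The helper names `build_connector_payload` and `register_connector` are hypothetical.

```python
import json
import urllib.request


def build_connector_payload(name, host, master_addresses, stream_id, tables):
    """Build the JSON body that Kafka Connect expects for a new connector.

    All property names and values below are assumptions, not verified defaults.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.yugabytedb.YugabyteDBConnector",
            "database.hostname": host,
            "database.port": "5433",
            "database.master.addresses": master_addresses,
            "database.streamid": stream_id,
            "database.user": "yugabyte",        # placeholder credentials
            "database.password": "yugabyte",    # placeholder credentials
            "database.dbname": "yugabyte",
            "database.server.name": "dbserver1",
            "table.include.list": ",".join(tables),
        },
    }


def register_connector(connect_url, payload):
    """POST the payload to Kafka Connect, for example http://localhost:8083."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build the payload only; registering requires a running Kafka Connect worker.
payload = build_connector_payload(
    name="ybconnector",
    host="127.0.0.1",
    master_addresses="127.0.0.1:7100",
    stream_id="<your-stream-id>",
    tables=["public.orders", "public.customers"],
)
```

Calling `register_connector("http://localhost:8083", payload)` would submit the configuration, assuming a Kafka Connect worker is listening at that address with the connector plugin installed.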

Monitoring

You can monitor the activity and status of the deployed connectors using the HTTP endpoints provided by YugabyteDB.

To learn more about how to monitor your CDC setup, see Monitor.
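As one possible complement to those endpoints, a connector deployed on Kafka Connect can also be checked through the Kafka Connect REST API, whose `/connectors/<name>/status` endpoint reports the state of the connector and each of its tasks. The sketch below uses that Kafka Connect endpoint, not a YugabyteDB-specific one; the helper names, the connector name, and the sample response are hypothetical.

```python
import json
import urllib.request


def fetch_status(connect_url, connector_name):
    """GET the connector status document from a Kafka Connect worker."""
    url = f"{connect_url}/connectors/{connector_name}/status"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def summarize_status(status):
    """Reduce a Kafka Connect status document to a small health summary."""
    connector_state = status["connector"]["state"]
    # Collect the IDs of tasks that are in any state other than RUNNING.
    not_running = [
        task["id"] for task in status.get("tasks", []) if task["state"] != "RUNNING"
    ]
    return {
        "connector_state": connector_state,
        "not_running_tasks": not_running,
        "healthy": connector_state == "RUNNING" and not not_running,
    }


# Hypothetical response body, shaped like a Kafka Connect status document.
sample_status = {
    "name": "ybconnector",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    "tasks": [
        {"id": 0, "state": "RUNNING", "worker_id": "10.0.0.1:8083"},
        {"id": 1, "state": "FAILED", "worker_id": "10.0.0.1:8083"},
    ],
    "type": "source",
}

summary = summarize_status(sample_status)
```

With the sample document, `summary["healthy"]` is `False` because task 1 is not running. Against a live worker, `summarize_status(fetch_status("http://localhost:8083", "ybconnector"))` would produce the same kind of summary.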

For tutorials on streaming data to Kafka environments, including Amazon MSK, Azure Event Hubs, and Confluent Cloud, see Kafka environments.
