Distributed snapshots for YSQL
The most efficient way to back up the data stored in YugabyteDB is to create a distributed snapshot. A snapshot is a consistent cut of the data taken across all the nodes in the cluster. For YSQL, snapshots are created at the database level. Backing up individual tables is currently not supported.
When YugabyteDB creates a snapshot, it doesn't physically copy the data; instead, it creates hard links to all the relevant files. These links reside on the same storage volumes where the data itself is stored, which makes both backup and restore operations nearly instantaneous.
Note on space consumption: There are no technical limitations on how many snapshots you can create. However, increasing the number of snapshots stored also increases the amount of space required for the database. The actual overhead depends on the workload, but you can estimate it by running tests based on your applications.
Create a snapshot
The distributed snapshots feature allows you to back up a database, and then restore it in case of a software or operational error, with minimal RTO and overhead.
To back up a database, create a snapshot using the create_database_snapshot command:
yb-admin create_database_snapshot my_database
When you run the command, it returns a unique ID for the snapshot:
Started snapshot creation: 0d4b4935-2c95-4523-95ab-9ead1e95e794
The create_database_snapshot command exits immediately, but the snapshot may take some time to complete. Before using the snapshot, verify its status with the list_snapshots command:
This command lists the snapshots in the cluster, along with their states. Locate the ID of the new snapshot and make sure its state is COMPLETE:
Snapshot UUID                        State
0d4b4935-2c95-4523-95ab-9ead1e95e794 COMPLETE
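Because create_database_snapshot returns before the snapshot finishes, scripted backups usually poll list_snapshots until the state becomes COMPLETE. A minimal sketch, assuming yb-admin is on the PATH and that list_snapshots prints the UUID and state in the first two columns; snapshot_state and wait_for_snapshot are illustrative helper names, not yb-admin commands:

```shell
# Parse the state of one snapshot out of "yb-admin list_snapshots" output.
# Reads the listing on stdin; $1 is the snapshot UUID to look for.
snapshot_state() {
  awk -v id="$1" '$1 == id { print $2 }'
}

# Poll until the snapshot reaches COMPLETE. Assumes yb-admin is on the PATH
# and already configured with your master addresses.
wait_for_snapshot() {
  until [ "$(yb-admin list_snapshots | snapshot_state "$1")" = "COMPLETE" ]; do
    echo "snapshot $1 not complete yet, waiting..."
    sleep 5
  done
}
```

Adapt the column parsing if your build's list_snapshots output differs.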
Delete a snapshot
Snapshots never expire and are retained as long as the cluster exists. If you no longer need a snapshot, you can delete it with the delete_snapshot command:
yb-admin delete_snapshot 0d4b4935-2c95-4523-95ab-9ead1e95e794
Restore a snapshot
To restore the data backed up in one of the previously created snapshots, run the restore_snapshot command:
yb-admin restore_snapshot 0d4b4935-2c95-4523-95ab-9ead1e95e794
This command rolls back the database to the state it was in when the snapshot was created. The restore happens in place: it changes the state of the existing database within the same cluster.
Move a snapshot to external storage
Storing snapshots in-cluster is extremely efficient, but it comes with downsides. It can increase the cost of the cluster by inflating space consumption on the storage volumes. Also, in-cluster snapshots don't protect you from disaster scenarios such as filesystem corruption or hardware failure.
To mitigate these issues, consider storing backups outside of the cluster, in cheaper storage that is also geographically separated from the cluster. This approach helps you reduce cost, and also lets you restore your databases into a different cluster, potentially in a different location.
To move a snapshot to external storage, gather all the relevant files from all the nodes, and copy them along with the additional metadata required for restores on a different cluster:
Get the current YSQL schema catalog version by running the ysql_catalog_version command:
yb-admin ysql_catalog_version
Back up the YSQL metadata using the ysql_dump tool:
ysql_dump --include-yb-metadata --serializable-deferrable --create --schema-only --dbname my_database --file my_database_schema.sql
Verify that the catalog version is the same as it was prior to creating the snapshot. If it isn't, you're not guaranteed to get a consistent restorable snapshot, and should restart the process.
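The version check around the dump can be scripted. A sketch under the assumption that ysql_catalog_version prints the version as the last whitespace-separated token of its output; catalog_version and dump_schema_consistently are illustrative names, and you should verify the output format on your build:

```shell
# Hypothetical helper: extract the catalog version number from
# "yb-admin ysql_catalog_version" output (assumes the version is the last token).
catalog_version() {
  yb-admin ysql_catalog_version | awk '{ print $NF }'
}

# Dump the schema and fail if the catalog version changed while dumping,
# which would mean the dump may not be consistent.
# $1 = database name, $2 = output file.
dump_schema_consistently() {
  before=$(catalog_version)
  ysql_dump --include-yb-metadata --serializable-deferrable --create --schema-only \
    --dbname "$1" --file "$2"
  after=$(catalog_version)
  if [ "$before" != "$after" ]; then
    echo "catalog version changed ($before -> $after); restart the backup" >&2
    return 1
  fi
}
```

On failure, rerun the dump until a pass completes with an unchanged catalog version.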
Create the snapshot metadata file by running the export_snapshot command and providing the ID of the snapshot:
yb-admin export_snapshot 0d4b4935-2c95-4523-95ab-9ead1e95e794 my_database.snapshot
Copy the newly created YSQL metadata file (my_database_schema.sql) and the snapshot metadata file (my_database.snapshot) to the external storage.
Copy the tablet snapshot data into the external storage directory. Do this for all tablets of all tables in the database.
cp -r ~/yugabyte-data/node-1/disk-1/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004003/tablet-b0de9bc6a4cb46d4aaacf4a03bcaf6be.snapshots snapshot/
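Copying every tablet directory by hand is error-prone, so you may want to script the collection step. A sketch assuming the directory layout shown in this section (table-&lt;id&gt;/tablet-&lt;id&gt;.snapshots/&lt;snapshot_id&gt;); collect_snapshot and its arguments are illustrative names, not part of yb-admin:

```shell
# Copy every tablet's data for one snapshot into a destination directory,
# preserving the table-<id>/tablet-<id>.snapshots layout for later restore.
# $1 = tserver rocksdb data dir, $2 = snapshot UUID, $3 = destination dir.
collect_snapshot() {
  find "$1" -type d -path "*.snapshots/$2" | while read -r src; do
    rel=${src#"$1/"}          # e.g. table-<id>/tablet-<id>.snapshots/<snapshot_id>
    mkdir -p "$3/$(dirname "$rel")"
    cp -r "$src" "$3/$(dirname "$rel")/"
  done
}
```

Run it once per data directory on each node, then upload the destination directory to external storage.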
The file path structure is:
<yb_data_dir>/node-<node_number>/disk-<disk_number>/yb-data/tserver/data/rocksdb/table-<table_id>/tablet-<tablet_id>.snapshots/<snapshot_id>
<yb_data_dir> is the directory where YugabyteDB data is stored. The default value is ~/yugabyte-data.
<node_number> is used when multiple nodes are running on the same server (for testing, QA, and development). The default value is 1.
<disk_number> is used when running YugabyteDB on multiple disks with the --fs_data_dirs flag. The default value is 1.
<table_id> is the UUID of the table. You can get it from the http://<yb-master-ip>:7000/tables URL in the Admin UI.
<tablet_id>: each table contains a list of tablets, and each tablet has a <tablet_id>.snapshots directory that you need to copy.
<snapshot_id>: there is a directory for each snapshot, since you can have multiple completed snapshots on each server.
In practice, for each server, you will use the --fs_data_dirs flag, which is a comma-separated list of paths where the data is stored (normally, different paths should be on different disks).
Tip: To get a snapshot of a multi-node cluster, you need to go to each node and copy the folders of only the leader tablets on that node. Because each tablet replica has a copy of the same data, you do not need to keep a copy for each replica.
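If you want to script the leader-only selection, yb-admin list_tablets lists a table's tablets together with the current leader. The exact output format varies between versions, so the parsing below (leader address assumed to appear in the last column, tablet ID in the first) is an assumption to verify against your build; leader_tablets_on is an illustrative helper name:

```shell
# Print the tablet IDs whose leader is on this node.
# $1 = this node's host:port; reads "yb-admin list_tablets ..." output on stdin.
# Skips the header line; assumes the leader address is in the last column.
leader_tablets_on() {
  awk -v me="$1" 'NR > 1 && index($NF, me) { print $1 }'
}
```

For example, yb-admin list_tablets ysql.my_database t1 | leader_tablets_on 10.0.0.1:9100 would then feed the copy step for that node.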
If you don't want to keep the in-cluster snapshot, it's now safe to delete it.
Restore a snapshot from external storage
To restore a snapshot that you've moved to external storage, do the following:
Fetch the YSQL metadata file from the external storage and apply it using the ysqlsh tool:
ysqlsh -h 127.0.0.1 --echo-all --file=my_database_schema.sql
Fetch the snapshot metadata file from the external storage and apply it by running the import_snapshot command:
yb-admin import_snapshot my_database.snapshot my_database
The output contains the mapping between the old tablet IDs and the new tablet IDs:
Read snapshot meta file my_database.snapshot
Importing snapshot 0d4b4935-2c95-4523-95ab-9ead1e95e794 (COMPLETE)
Table type: table
Target imported table name: test.t1
Table being imported: test.t1
Table type: table
Target imported table name: test.t2
Table being imported: test.t2
Successfully applied snapshot.
Object         Old ID                               New ID
Keyspace       00004000000030008000000000000000     00004000000030008000000000000000
Table          00004000000030008000000000004003     00004000000030008000000000004001
Tablet 0       b0de9bc6a4cb46d4aaacf4a03bcaf6be     50046f422aa6450ca82538e919581048
Tablet 1       27ce76cade8e4894a4f7ffa154b33c3b     111ab9d046d449d995ee9759bf32e028
Snapshot       0d4b4935-2c95-4523-95ab-9ead1e95e794 6beb9c0e-52ea-4f61-89bd-c160ec02c729
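For scripted restores, it helps to turn this output into a plain old-to-new tablet mapping. The sketch below assumes the table format shown above, where mapping rows start with the word Tablet followed by an index and the two IDs; tablet_mapping is an illustrative helper name:

```shell
# Extract "old_tablet_id new_tablet_id" pairs from the output of
# "yb-admin import_snapshot". Reads that output on stdin.
tablet_mapping() {
  awk '$1 == "Tablet" { print $3, $4 }'
}
```

For example, yb-admin import_snapshot my_database.snapshot my_database | tablet_mapping > tablet_map.txt saves the pairs for the copy step that follows.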
Copy the tablet snapshots.
Use the tablet mappings to copy the tablet snapshot files from the external storage to the appropriate location.
In this example, the commands are:
cp -r snapshot/tablet-b0de9bc6a4cb46d4aaacf4a03bcaf6be.snapshots/0d4b4935-2c95-4523-95ab-9ead1e95e794 \ ~/yugabyte-data-restore/node-1/disk-1/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-50046f422aa6450ca82538e919581048.snapshots/6beb9c0e-52ea-4f61-89bd-c160ec02c729
cp -r snapshot/tablet-27ce76cade8e4894a4f7ffa154b33c3b.snapshots/0d4b4935-2c95-4523-95ab-9ead1e95e794 \ ~/yugabyte-data-restore/node-1/disk-1/yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-111ab9d046d449d995ee9759bf32e028.snapshots/6beb9c0e-52ea-4f61-89bd-c160ec02c729
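With many tablets, the per-tablet cp commands are easier to generate from a file of old_tablet_id new_tablet_id pairs (one pair per line). A sketch; the target table directory and the two snapshot IDs come from the import_snapshot output, and restore_tablets with its arguments is an illustrative name:

```shell
# Copy exported tablet snapshot data into the new tablets' .snapshots directories.
# $1 = export dir, $2 = old snapshot ID,
# $3 = target table dir (e.g. .../rocksdb/table-<new_table_id>),
# $4 = new snapshot ID. Reads "old_tablet new_tablet" pairs on stdin.
restore_tablets() {
  while read -r old new; do
    dest="$3/tablet-$new.snapshots/$4"
    mkdir -p "$(dirname "$dest")"
    cp -r "$1/tablet-$old.snapshots/$2" "$dest"
  done
}
```

Run it once per node, feeding it the pairs extracted from the import_snapshot output.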
Note: For each tablet, you need to copy the snapshots folder on all tablet peers and in any configured read replica cluster.
Automated backups for YugabyteDB Anywhere: YugabyteDB Anywhere provides an API and UI for backup and restore, which automate most of the steps described here. Consider using them, especially if you have many databases and snapshots to manage.