Distributed transactions
This example shows how a distributed transaction works in YugabyteDB.
Setup
Any YugabyteDB cluster, including local single-node cluster or Sandbox cluster in YugabyteDB Managed. See Set up YugabyteDB universe.Create a table
Create the following table:
CREATE TABLE accounts (
account_name varchar,
account_type varchar,
balance float,
PRIMARY KEY (account_name, account_type)
);
Insert some sample data into the table:
INSERT INTO accounts VALUES ('John', 'savings', 1000);
INSERT INTO accounts VALUES ('John', 'checking', 100);
INSERT INTO accounts VALUES ('Smith', 'savings', 2000);
INSERT INTO accounts VALUES ('Smith', 'checking', 50);
Display the contents of the table:
yugabyte=# SELECT * FROM accounts;
account_name | account_type | balance
--------------+--------------+---------
John | checking | 100
John | savings | 1000
Smith | checking | 50
Smith | savings | 2000
(4 rows)
Run a transaction
Run the following transaction:
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 200
WHERE account_name='John' AND account_type='savings';
UPDATE accounts SET balance = balance + 200
WHERE account_name='John' AND account_type='checking';
COMMIT;
After the transaction succeeds, display the contents of the table:
yugabyte=# SELECT * FROM accounts;
account_name | account_type | balance
--------------+--------------+---------
John | checking | 300
John | savings | 800
Smith | checking | 50
Smith | savings | 2000
(4 rows)
The transaction can be broken down as follows:
-
Begin:
BEGIN TRANSACTION;
The node that receives this statement becomes the transaction coordinator. A new transaction record is created in the
transaction status
table for the current transaction. It has a unique transaction id with the statePENDING
. Note that in practice, these records are pre-created to achieve high performance. -
Update:
UPDATE accounts SET balance = balance - 200 WHERE account_name='John' AND account_type='savings';
The transaction coordinator writes a provisional record to the tablet that contains this row. The provisional record consists of the transaction ID, so the state of the transaction can be determined. If a provisional record written by another transaction already exists, then the current transaction would use the transaction ID that is present in the provisional record to fetch details and check if there is a potential conflict.
-
Second update:
UPDATE accounts SET balance = balance + 200 WHERE account_name='John' AND account_type='checking';
This step is largely the same as the previous step. Note that the rows being accessed can live on different nodes. The transaction coordinator would need to perform a provisional write RPC to the appropriate node for each row.
-
Commit:
COMMIT;
To commit, all the provisional writes must have successfully completed. The
COMMIT
statement causes the transaction coordinator to update the transaction status in thetransaction status
table toCOMMITED
, at which point it is assigned the commit timestamp (which is a hybrid timestamp to be precise). At this point, the transaction is completed. In the background, theCOMMIT
record along with the commit timestamp is applied to each of the rows that participated to make future lookups of these rows efficient.
The following diagram shows the sequence of events that occur when a transaction is running:
Scalability
Because all nodes of the cluster can process transactions by becoming transaction coordinators, horizontal scalability can be achieved by distributing the queries evenly across the nodes of the cluster.
Resilience
Each update performed as a part of the transaction is replicated across multiple nodes for resilience and persisted on disk. This ensures that the updates made as a part of a transaction that has been acknowledged as successful to the end user is resilient even if failures occur.
Concurrency control
Concurrency control in databases ensures that multiple transactions can execute concurrently while preserving data integrity. Concurrency control is essential for correctness in environments where two or more transactions can access the same data at the same time. The two primary mechanisms to achieve concurrency control are optimistic and pessimistic.
YugabyteDB currently supports optimistic concurrency control, with pessimistic concurrency control being worked on actively.
Transaction options
You can see the various options supported by transactions by running the following \h BEGIN
meta-command:
yugabyte=# \h BEGIN
Command: BEGIN
Description: start a transaction block
Syntax:
BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ]
where transaction_mode is one of:
ISOLATION LEVEL { SERIALIZABLE | REPEATABLE READ | READ COMMITTED | READ UNCOMMITTED }
READ WRITE | READ ONLY
[ NOT ] DEFERRABLE
transaction_mode
The transaction_mode
can be set to one of the following options:
READ WRITE
– You can perform read or write operations.READ ONLY
– You can only perform read operations.
For example, trying to do a write operation such as creating a table or inserting a row in a READ ONLY
transaction would result in an error:
yugabyte=# BEGIN READ ONLY;
BEGIN
yugabyte=# CREATE TABLE example(k INT PRIMARY KEY);
ERROR: cannot execute CREATE TABLE in a read-only transaction
DEFERRABLE transactions
As in PostgreSQL, the DEFERRABLE
transaction property is not used unless the transaction is also both SERIALIZABLE
and READ ONLY
.
When all three of these properties (SERIALIZABLE
, DEFERRABLE
, and READ ONLY
) are set for a transaction, the transaction may block when first acquiring its snapshot, after which it is able to run without the typical overhead of a SERIALIZABLE
transaction and without any risk of contributing to or being canceled by a serialization failure.
Tip
This mode is well-suited for long-running reports or backups without impacting or being impacted by other transactions.