The transaction model for top-level SQL statements
This section describes how top-level SQL statements, sent to the PostgreSQL server using TCP/IP, execute. The same model, of course, applies for how YugabyteDB's YSQL subsystem executes such statements. The chapter Frontend/Backend Protocol in the PostgreSQL documentation describes how the server processes, and responds to, SQL statements that are sent this way.
Client-side libraries and tools
Various client-side libraries are available to allow you to send and receive requests that use the low-level protocol from an ordinary programming language—like, for example:
- libpq — C Library (documented as part of the overall PostgreSQL documentation set);
- pgJDBC — the PostgreSQL JDBC Driver from postgresql.org;
- psycopg — for Python;
- and, of course, psql, the so-called PostgreSQL interactive terminal, together with the YugabyteDB equivalent, ysqlsh.
There are many others.
However, because each of these client libraries and tools adds its own notion called "autocommit" (or a similar term) as a purely client-side construct, it's best to understand the transaction model first just in terms of the server's response to the TCP/IP protocol as you'd see it (as a thought experiment) if you chose your own favorite low-level programming language to write a tool similar to psql to send and receive the messages that the protocol specifies. This way, you'd observe the native, unadulterated, server behavior.
Transaction control statements
PostgreSQL and YSQL support these transaction control SQL statements:
- begin (alternatively spelled start transaction)
- commit (alternatively spelled end)
- rollback (alternatively spelled abort)
The implication is that a session must always be in one of two possible states:
- A transaction is ongoing.
- No transaction is ongoing.
Notice that commit (or end) and rollback (or abort) are meaningful only when the session has an ongoing transaction. If you issue one of these while there is no ongoing transaction, it has no effect and causes the 25P01 (no_active_sql_transaction) error/warning. (The error/warning status depends on the value of the session's client_min_messages setting.) Similarly, begin (or start transaction) are meaningful only when there is no ongoing transaction. If you issue one of these while a transaction is already ongoing, then you get the 25001 (active_sql_transaction) error/warning. Notice, too, that commit (or end) have the same effect as rollback (or abort) if an ongoing transaction has suffered an error. In this state, any SQL statement apart from rollback (or abort) or commit (or end) causes the 25P02 (in_failed_sql_transaction) error.
(There are also transaction control statements to set and rollback to savepoints; and to set the mode of an ongoing transaction—to read committed, repeatable read, or serializable. These can be used only during an ongoing transaction. They are not interesting in the present discussion.)
Semantics of issuing non-transaction-control SQL statements during an ongoing transaction
The effects of ordinary non-transaction-control SQL statements that are issued during an ongoing transaction are essentially private to the current session and invisible to any other concurrent sessions. If the current session issues rollback, then from the point of view of other sessions, it's the same as if the current session had done nothing. Only if the current session issues commit will the effects of its transaction become visible in other concurrent sessions.
The behavior of concurrent sessions, each of which has an ongoing transaction, needs very careful description—and it depends critically on which so-called isolation level was specified when each transaction was started. See the section Isolation levels. The isolation level of each session affects, for example:
- how sessions might wait on each other in contention scenarios;
- what kinds of conflict errors might occur, and when they occur;
- what one session sees, during its ongoing transaction, when another concurrent session issues commit.
The pedagogy that this page addresses does not depend on understanding the semantics of isolation levels.
Semantics of issuing non-transaction-control SQL statements when no transaction is ongoing
Different RDBMSs behave differently in this scenario. Of course, YugabyteDB's YSQL behaves identically to PostgreSQL—and only this behavior will be described here.
The server implicitly, and automatically, starts a transaction (using the session's current value of the default_transaction_isolation run-time parameter). Then it executes the current non-transaction-control SQL statement. And then it implicitly and automatically issues commit. (As mentioned, this will have the same effect as rollback if the current non-transaction-control SQL statement caused an error.) This leaves the session back in its starting state with no transaction ongoing.
Notice that this behavior is defined and implemented entirely server-side. The PostgreSQL documentation does not describe this mode explicitly and does not, therefore, name the mode. (It describes only the mode where you start and end a multi-statement transaction explicitly. But it doesn't name this mode either.) The YSQL documentation coins these two terms of art for the two modes:
- single statement automatic transaction mode; or
- multistatement manual transaction mode.
Critically, you choose between these modes simply by issuing, or not issuing, begin (or start transaction) when there is currently no ongoing transaction. And you do this simply by submitting the appropriate SQL statements (in the context of the thought experiment that set the stage for the present discussion) by using your own program that sends the SQL statements using TCP/IP and that receives the responses over the same protocol.
There is no dedicated built-in SQL function for this purpose. Nor is it possible to write a deterministically reliable user-defined boolean function, say in_ongoing_txn(), to do this because, if no transaction were ongoing at the moment that you issue select txn_is_ongoing(), then this statement would anyway execute in single statement automatic transaction mode so that, tautologically, the function could only be deemed correct if it always returned true—and this would render it useless. Of course, by extension of this thinking, a built-in SQL function would suffer in the identical way.
Here's a reasonably reliable expedient approach. It relies on the inherent uncertainty of reading the time from a clock. Simply do this:
select (statement_timestamp() > transaction_timestamp())::text as txn_is_ongoing;
You hope, here, to rely on the fact that the timestamptz value at which the select statement starts will be identical, within measurement limits, to the timestamptz value at which the automatically started enclosing single statement transaction began. And, indeed, you're very likely to see that this does reliably return false. On the other hand, you're very likely to see that this:
begin;
select (statement_timestamp() > transaction_timestamp())::text as txn_is_ongoing;
reliably returns true because the typical client-server round trip time for the begin statement is long enough for the clock measurements to detect. You can subvert it thus:
begin\;
select (statement_timestamp() > transaction_timestamp())::text as txn_is_ongoing;
The \;
psql locution asks to send the two SQL statements that it separates in a single round-trip so that you're very likely to see that this always, but wrongly, returns false.
Notice that you can, by hand, ensure that client_min_messages is set to a severity lower than, or equal to, warning and issue begin. If this completes silently, then you know that you were not in an ongoing transaction (and that you are in one now). And if it causes the 25001 warning, then you know that you were in an ongoing transaction (and that you are still in one now). But it's impossible to encapsulate this manual approach into a function.
Observing single statement automatic transaction mode by setting « log_statement = 'all' »
Observe the log when you issue, say, this insert when no transaction is ongoing:
insert into s.t(v) values(42);
The log shows exactly and only this statement—in other words, you don't see begin before the insert statement and you don't see commit after it. You must simply understand, from this documentation, that the server executes the code that implements begin (taking account of the current default_transaction_isolation setting) before the insert statement, that you do see in the log, executes. And then it executes the code that implements commit (turning this into rollback if the insert caused an error) when the insert statement finishes.
Observing multistatement manual transaction mode by setting « log_statement = 'all' »
Observe the log when you issue, say, this, again starting when no transaction is ongoing:
begin;
insert into s.t(v) values(42;
commit;
Here, the log shows these three statements exactly as you entered them.
What is the so-called "autocommit mode" that client-side libraries and tools implement?
This is most definitely an invented, client-side-only, notion that has no direct server implementation. Most PostgreSQL experts consider it to be nothing more than entirely unnecessary and potentially hugely confusing and dangerous. See the Cybertech post Disabling autocommit in PostgreSQL can damage your health by Laurenz Albe. (Here, Laurenz uses "autocommit mode" to denote what, above, was denoted by the term of art "single statement automatic transaction mode". And he emphasizes this:
PostgreSQL operates in autocommit mode, and there is no way to change that behavior on the server side.
However, it seems that all client-side libraries and tools have implemented a notion that they call "autocommit". For example, libpq has a so-called embedded SQL command thus:
SET AUTOCOMMIT { = | TO } { ON | OFF }
JDBC has this:
java.sql.Connection.setAutoCommit(boolean)
psycopg2 has this:
connection.set_session(autocommit=True)
and psql (and therefore ysqlsh) have the meta-command:
\set AUTOCOMMIT { 'on' | 'off' }
The PostgreSQL documentation for psql describes it thus:
The autocommit-off mode works by issuing an implicit BEGIN for you, just before any command that is not already in a transaction block and is not itself a BEGIN or other transaction-control command...
You can see this client-side implementation by observing the server log. Thereafter, it's just the same as if you had issued the begin with autocommit set to off using your own SQL—so it's up to you eventually to issue commit or rollback to end the transaction.
Yugabyte recommends that you adopt the practice that the Cybertech post by Laurentz Albe recommends: use the native PostgreSQL behavior "as is"; and choose between single statement automatic transaction mode and multistatement manual transaction mode explicitly by whether or not you issue begin explicitly before the non-transaction-control statement(s) that you want to execute within a single enclosing transaction.
Notice that you can achieve this implicitly by always implementing every multistatement transaction in a user-defined subprogram. See the section Restrictions that govern the use of "commit" in user-defined subprograms and in particular the tip Avoid invoking user-defined subprograms in multistatement manual transaction mode.