Doing SQL from PL/pgSQL

Syntax

See these sections for other dedicated ways to loop over the rows that a select statement defines:

Here are the remaining basic PL/pgSQL statements for doing SQL:

Grammar
Diagram

plpgsql_static_bare_sql_stmt ::= sql_stmt

plpgsql_static_dml_returning_stmt ::= { insert | update | delete } 
                                      returning_clause INTO [ STRICT ] 
                                      plpgsql_into_target [ , ... ]

plpgsql_static_select_into_stmt ::= [ with_clause ] SELECT select_list 
                                    INTO [ STRICT ] 
                                    plpgsql_into_target [ , ... ] 
                                    [ trailing_select_clauses ]

plpgsql_dynamic_sql_stmt ::= EXECUTE text_expression 
                             [ INTO [ STRICT ] plpgsql_into_target 
                               [ , ... ] ] 
                             [ USING expression [ , ... ] ]

plpgsql_perform_stmt ::= PERFORM { select_list 
                                   [ trailing_select_clauses ]
                                   | ( select ) }

plpgsql_static_bare_sql_stmt

plpgsql_static_dml_returning_stmt

plpgsql_static_select_into_stmt

plpgsql_dynamic_sql_stmt

plpgsql_perform_stmt

Semantics

The bare SQL statement

See the syntax rule above

This example shows four embedded SQL statements encapsulated within a single PL/pgSQL procedure. The first three are DDL statements. And the fourth is a DML statement—i.e. a statement that can be used by a prepare statement. The three DDL statements are written exactly as they would be as top-level SQL statements. The insert statement is also syntactically correct as a top-level SQL statement. But if you do this, you get a 42703 semantic error:

insert into s.t(v) select generate_series(lo, hi);

This is the error text:

column "lo" does not exist

This is what the term of art embedded means: the SQL statement text is simply part of the program text, every bit as much as is, say, an assignment statement like this:

a := b + c;

In other words, the SQL statement is syntactically analyzed at the time that the subprogram is created—and a syntax error turns the attempt into a no-op. (A corresponding account holds for a do statement. An attempt to execute it isn't made if its defining PL/pgSQL block statement has a syntax error.)

The term of art dynamic (see the section "execute" statement below) is used when SQL statement text is presented as a text value. (Some people prefer the term of art static, instead of embedded, to contrast with dynamic.)

Here is the example. Not only does it serve the purpose of illustrating the plpgsql_static_bare_sql_stmt; but also it is a useful tool for re-creating a clean start before running each of the remaining examples on this page.

drop schema if exists admin cascade;
create schema admin;

create procedure admin.init(lo in int, hi in int)
  set search_path = pg_catalog, pg_temp
  set client_min_messages = error
  language plpgsql
as $body$
begin
  drop schema if exists s cascade;
  create schema s;
  create table s.t(k serial primary key, v int);
  insert into s.t(v) select generate_series(lo, hi);
end;
$body$;

It finishes silently without error—leaving you set up for the remaining examples.

The "select into" statement

See the syntax rule above

The following example illustrates:

both the "select into" statement
and the "insert, update, delete into" statement

Do this:

call admin.init(lo=>11, hi=>20);

create procedure s.p(key in int)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  v_into  int not null := 0;
  v_ret   int not null := 0;
begin
  select v into strict v_into from s.t where k = key;
  delete from s.t where k = key returning v into strict v_ret;
  assert v_ret = v_into;
end;
$body$;

call s.p(5);

It finishes silently, showing that the assert holds.

Notice the use of the strict keyword in both of the embedded SQL statements. This is the common case when you do a single row select into or delete returning where the row is identified by the value of the primary key. The effect of strict is that the SQL statement causes a dedicated P0002 error with the message query returned no rows.

The "insert, update, delete into" statement

See the syntax rule above

The previous "select into" section illustrates the "insert, update, delete into" statement. Look for this:

delete from s.t where k = 5 returning v into strict v_ret;

You might wonder how to handle the case that the delete restriction identifies more than one row. The clue is the common_table_expression hereinafter CTE. It lets you use the delete statement that you want in the definition of a CTE in a with clause in an overall select statement. Then you use that in a "query for loop" with a subquery.

Do this:

call admin.init(lo=>11, hi=>20);

create function s.f(lo in int, hi in int)
  returns table(del int)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  v_into  int not null := 0;
  v_ret   int not null := 0;
begin
  for del in
    (
      with c as (delete from s.t where k between lo and hi returning v)
      select v from c
    )
  loop
    return next;
  end loop;
end;
$body$;

select del as "Deleted rows" from s.f(5, 7);

This is the result:

 Deleted rows 
--------------
           15
           16
           17

The "execute" statement

See the syntax rule above

Simply transform each of the embedded SQL statements from the "select into" statement section above into a dynamic SQL statement.

Do this:

call admin.init(lo=>11, hi=>20);

create procedure s.p(key in int)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  select_stmt constant text not null := 'select v from s.t where k = $1';
  delete_stmt constant text not null := 'delete from s.t where k = $1 returning v';
  v_into  int not null := 0;
  v_ret   int not null := 0;
begin
  execute select_stmt into strict v_into using key;
  execute delete_stmt into strict v_ret  using key;
  assert v_ret = v_into;
end;
$body$;

call s.p(5);

It finishes silently, showing that the assert holds.

Of course, the SQL statement text can be a DDL, too. Dynamic DDLs are especially useful when you don't know identifier names until run-time. Indeed, this is the canonical, and overwhelmingly common, use of Dynamic DDLs.

Dynamic DDLs are best illustrated by deriving the function admin.init_dynamic() from the function admin.init() shown in the bare (static) SQL statement section above. Do this:

create procedure admin.init_dynamic(
  schema_name  in text,
  table_name   in text,
  lo           in int,
  hi           in int)
  set search_path = pg_catalog, pg_temp
  set client_min_messages = error
  language plpgsql
as $body$
declare
  drop_schema    constant text not null :=
    'drop schema if exists %I cascade';
  create_schema  constant text not null :=
    'create schema %I';
  create_table   constant text not null :=
    'create table %I.%I(k serial primary key, v int)';
  insert_rows    constant text not null :=
    'insert into %I.%I(v) select generate_series(%L::int, %L::int)';
begin
  execute format(drop_schema,   schema_name);
  execute format(create_schema, schema_name);
  execute format(create_table,  schema_name, table_name);
  execute format(insert_rows,   schema_name, table_name, lo, hi);
end;
$body$;

Test it like this:

call admin.init_dynamic('My Schema', 'My Table', lo=>11, hi=>20);

It finishes silently without error. Check that you can see the expected new schema in the catalog:

select n.nspname as "Schema name"
from
  pg_namespace as n
  inner
  join pg_roles as r
  on n.nspowner = r.oid
where
  r.rolname = :quoted_u and
  n.nspname ~'^My';

This is the result:

 Schema name 
-------------
 My Schema

Now check that the expected table is found in the schema My Schema:

select c.relname as "Table name"
from
  pg_class as c
  inner join
  pg_namespace as n
  on c.relnamespace = n.oid
  inner join
  pg_roles as r
  on n.nspowner = r.oid
where
  c.relkind = 'r' and
  r.rolname = :quoted_u and
  n.nspname ~'^My';

This is the result:

 Table name 
------------
 My Table

Notice that the procedure admin.init_dynamic() adds these text formal arguments to admin.init():

schema_name
table_name

And notice how all of the formal arguments are used, in the execute statement, as actual arguments to the format() built-in SQL function.

The placeholder %I means that the actual argument will be taken as the text of an object name—and then, according to the emergent status of the name as common or exotic, this actual argument will replace the %I placeholder, respectively, either as is or tightly surrounded with double-quotes. The section Names and identifiers explains the common name and the exotic name notions and what these notions imply for the need for double-quoting.
The placeholder %L means that the actual argument will be taken as the text of a literal value. The replaced %L placeholder will always be surrounded by single-quotes. You must therefore take account of the data type that you know that you want by writing the appropriate typecast after %L.

Try these simple tests—first using %I:

select
  (select format('select * from %I',  'employees')  as "common name"),
  (select format('select * from %I',  'Pay Grade')  as "exotic name");

This is the result:

       common name       |        exotic name        
-------------------------+---------------------------
 select * from employees | select * from "Pay Grade"

And next using %L:

select
  (select format('select %L::int',           42)    as "int literal"),
  (select format('select %L::boolean',       true)  as "boolean literal"),
  (select format('select %L',         'My Schema')  as "text literal");

This is the result:

   int literal    |   boolean literal   |    text literal    
------------------+---------------------+--------------------
 select '42'::int | select 't'::boolean | select 'My Schema'

Always use the 'format()' function to add object names or values into the text of a dynamic SQL statement

An Internet search for "SQL injection" reports that it finds about 7 million hits. The risk of SQL injection is considered to be a huge security threat. You'll soon find articles with code examples that illustrate how the risk can occur. This Wikipedia piece is a good place to start.

Briefly, the risk occurs when you want to construct the text of a SQL statement programmatically and you want it to contain identifiers and values that become known only at run-time. Naïve programmers simply assemble the text using the ordinary text concatenation operator—but you must never do this. The examples that Internet search finds show you what can go wrong if you do this.

(You can reduce the size of the hit-list to about 300 thousand by searching for PostgreSQL "format()" "SQL injection". )

You must always use the format() built-in SQL function when you need to define the text of a dynamic SQL statement at run-time. The example above shows you how to do this.

The "perform" statement

See the syntax rule above

The perform statement lets you execute a select statement that, as a top-level SQL statement, would produce one or many rows. But, critically, it lets you avoid using the select into statement for one row or the query for loop (or the infinite loop) for many rows. It's the moral equivalent of this:

select ... into null;

analogous, in Unix, to sending output to /Dev/Null. Try this top-level SQL statement:

with s(val) as (select g.val from generate_series(1, 11) as g(val))
select s.val as k, s.val*10 as v from s;

This, of course, is the result:

 k | v  
---+----
 1 | 10
 2 | 20
 ......
 8 | 80
 9 | 90

To use this as the argument for perform, you must encapsulate as a subquery in an outer select statement, thus:

select k, v from (
    with s(val) as (select g.val from generate_series(1, 11) as g(val))
    select s.val as k, s.val*10 as v from s
  ) as subqry;

You can now use this in a PL/pgSQL block statement simply by replacing the starting select with perform. Do this:

do $body$
begin
  perform k, v from (
      with s(val) as (select g.val from generate_series(1, 11) as g(val))
      select s.val as k, s.val*10 as v from s
    ) as subqry;
end;
$body$;

It finishes silently without error.

This do statement immediately above shows the general case, where the select statement whose results you want to throw away must use a with clause.

This need arises, for example, when you perform multi-table insert, update or delete by encapsulating these statements each with a pro-forma returning clause, and in its own CTE. Then you cause them to execute by selecting the union of these pro-forma results from the various CTEs. But the results are of no interest. They were simply a necessary device.

Coming soon

A self-contained, working example will follow.

By far the most common use of perform is when you need to use a SQL built-in, like pg_sleep() that is morally a procedure but that, for implementation reasons, has to be a function. Try this:

do $body$
begin
  perform pg_sleep(2.0);
end;
$body$;

It finishes silently after two seconds.