The "query for loop"

This page documents the preview version (v2.23). Preview includes features under active development and is for development and testing only. For production, use the stable version (v2024.1). To learn more, see Versioning.

The query for loop has three variants according to how the query is defined:

  • with a subquery
  • with a so-called bound ref cursor variable
  • with a dynamic SQL statement

"Query for loop" with a subquery

Do this:

\c :db :u
drop schema if exists s cascade;
create schema s;
create table s.t(k serial primary key, v int not null);
insert into s.t(v) select generate_series(0, 24, 5);

create function s.f(k_lo in int, k_hi in int)
  returns table(z text)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  rec record;
begin
  for rec in (
    select k, v
    from s.t
    order by k)
  loop
    z := rpad(rec.k::text, 10)||rec.v::text;      return next;
  end loop;
  z := '';                                        return next;
  for rec in (
    select v
    from s.t
    where k between k_lo and k_hi
    order by k)
  loop
    z := rec.v::text;                             return next;
  end loop;
end;
$body$;

Test it thus:

select s.f(2, 4);

This is the result:

 1         0         
 2         5         
 3         10        
 4         15        
 5         20        
 
 5
 10
 15

Look at the first query for loop:

for rec in (
  select k, v
  from s.t
  order by k)
loop
  z := rpad(rec.k::text, 10)||rec.v::text;        return next;
end loop;

Notice these points:

  • The loop variable for a query for loop must be explicitly declared.
  • The loop variable, rec, has the data type record. This is a so-called pseudo-type. While you can use it as the data type of a subprogram's formal argument or of a PL/pgSQL variable, you cannot use it as the data type of, for example, a table's column. The attempt "create table s.x(k serial primary key, v record)" causes the 42P16 error:
    column "v" has pseudo-type record
    
  • The loop variable, rec, can therefore accommodate, at runtime, whatever shape the loop's defining query happens to establish—i.e. it is polymorphic.

The polymorphism is emphasized by the second query for loop:

for rec in (
  select v
  from s.t
  where k between k_lo and k_hi
  order by k)
loop
  z := rec.v::text;                               return next;
end loop;

The select list in the first loop is k, v. But in the second loop, it's just v alone.

In most cases, you'll simply use ordinary variables or formal arguments (i.e. not a record) for the loop variable(s) like this:

create function s.g(v_lo in int, v_hi in int)
  returns table(k_out int, v_out int)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
begin
  for k_out, v_out in (
    with
      c as (update s.t set v = v + 3 where k between v_lo and v_hi returning k, v)
    select k, v from c
    order by k
    )
  loop
    return next;
  end loop;
end;
$body$;

select k_out, v_out from s.g(3, 7);

This is the result:

 k_out | v_out 
-------+-------
     3 |    13
     4 |    18
     5 |    23

This example makes another point. The query for loop cannot use an insert... returning, an update... returning, or a delete... returning statement directly. (The attempt causes a syntax error.) However, the rules of top-level SQL do allow a select statement to use a common table expression (a.k.a. CTE) that is defined with such a DML statement that uses a returning clause. See the section WITH clause—SQL syntax and semantics. You can therefore take advantage of this to use a query for loop to iterate over the result set that the returning clause of a DML statement defines.

"Query for loop" with a bound refcursor variable

See the subsection Semantics for the "plpgsql_bound_refcursor_declaration" rule on the declaration section page.

Try this:

\c :db :u
drop schema if exists s cascade;
create schema s;
create table s.t(k serial primary key, v int not null);
insert into s.t(v) select generate_series(0, 49, 5);

create function s.f()
  returns table(z text)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  cur cursor(k_lo int, k_hi int) for (
    select k, v
    from s.t
    where k between k_lo and k_hi
    order by k);
begin
  assert pg_typeof(cur)::text = 'refcursor';
  for rec in cur(2, 4) loop
    assert pg_typeof(rec)::text = 'record';
    z := rpad(rec.k::text, 10)||rec.v::text;      return next;
  end loop;
  z := '';                                        return next;
  for rec in cur(7, 9) loop
    z := rpad(rec.k::text, 10)||rec.v::text;      return next;
  end loop;
end;
$body$;

select s.f();

This is the result:

 2         5
 3         10
 4         15
 
 7         30
 8         35
 9         40

Notice these points:

  • With this form of the query for loop, the loop variable is an implicitly declared record. (The first assert confirms this.) You cannot use, say, two ordinarily declared int variables, like k_out and v_out, even if this would serve your purposes better.
  • The cursor is implicitly opened just before the loop's first iteration. And it's implicitly closed when the loop exits.
  • The parameterization of the bound cursor variable does not allow the use of, say, (k_lo in int, k_hi in int). The in/out mode can only be in—and this is simply implied.
  • Even though the spelling of the declaration of the bound refcursor variable, cur, seems to declare it as plain cursor, the second assert shows that the data type of cur is in fact refcursor. The declaration syntax is simply a historical quirk.

Finally, try this:

create function s.g()
  returns refcursor
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  cur cursor for (
    select k, v
    from s.t
    where k between 2 and 4
    order by k);
begin
  return cur;
end;
$body$;

create function s.h()
  returns table(z text)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  cur refcursor := s.g();
begin
  assert pg_typeof(cur)::text = 'refcursor';
  /*
  for rec in cur loop

  end loop;
  */
end;
$body$;

select s.h();

The functions are created without error. And s.h(), as written, executes without error. Of course, it's useless because it does nothing. Try to re-create s.h() with the query for loop uncommented. The attempt causes the 42601 syntax error with this message:

 cursor FOR loop must use a bound cursor variable

The status as bound of a variable whose data type is refcursor is established by the special syntax in the declaration section that's used in the function s.g():

declare
  cur cursor for (
    select k, v
    from s.t
    where k between 2 and 4
    order by k);

But this status is retained only within the block statement that the declaration section introduces. The data type of the return value of s.g() is simply an ordinary unbound refcursor—and, as such, this declaration in s.h():

cur refcursor := s.g();

has the identical effect to just this:

cur refcursor;

This means that a query for loop with a bound refcursor variable brings very limited added value beyond what a query for loop with a subquery brings. The function s.f(), above, shows what you might find to be useful added value: the same bound refcursor variable is used, but with different parameterizations, twice within the same block statement whose declaration section defines the cursor.

"Query for loop" with a dynamic SQL statement

Try this:

\c :db :u
drop schema if exists s cascade;
create schema s;
create table s.t(k serial primary key, v int not null);
insert into s.t(v) select generate_series(0, 49, 5);

create function s.f(k_lo in int, k_hi in int)
  returns table(k int, v int)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  qry constant text not null := '
    select k, v 
    from s.t
    where k between $1 and $2
    order by k;';
begin
  for k, v in
    execute qry using k_lo, k_hi
  loop
    return next;
  end loop;
end;
$body$;

select k, v from s.f(5, 8);

This is the result:

 k | v  
---+----
 5 | 20
 6 | 25
 7 | 30
 8 | 35

You might think that it could help readability by surrounding the dynamically defined query with parentheses like this:

for k, v in (execute qry using k_lo, k_hi) loop

But the attempt causes the 42601 syntax error and complains about the opening parenthesis.

Finally, try this

create function s.g(k_lo in int, k_hi in int)
  returns table(k int, v int)
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  stmt constant text not null := '
    update s.t
    set v = v + 3
    where k between $1 and $2 returning k, v;';
begin
  for k, v in
    execute stmt using k_lo, k_hi
  loop
    return next;
  end loop;
end;
$body$;

select k, v from s.g(5, 8);

This is the result:

 k | v  
---+----
 5 | 23
 6 | 28
 7 | 33
 8 | 38

In other words, when the loop's SQL statement is defined dynamically, you can use an insert... returning statement directly in the loop's definition. (The same holds for update... returning and delete... returning.) Notice, though, that the syntax of such a DML statement with a returning clause does not allow an order by clause. You might, therefore, want to use the same construct where the DML statement is defined within a CTE for a query for loop with a dynamic SQL statement that was shown in the example for a query for loop with a subquery, above.