Variadic and polymorphic subprograms

A variadic subprogram allows one of its formal arguments to have any number of actual values, provided as a comma-separated list, just like the built-in functions least() and greatest() allow.

A polymorphic subprogram allows both its formal arguments and, for a function, its return data type to be declared using so-called polymorphic data types (for example, anyelement or anyarry) so that actual arguments with various different data types can be used, in different invocations of the subprogram, for the same formal arguments. Correspondingly, the data type(s) of the actual arguments will determine the data type of a function's return value.

You can declare variadic or polymorphic arguments, or polymorphic return values, for both language sql and language plpgsql subprograms.

Variadic subprograms

You mark a subprogram's formal argument with the keyword variadic as an alternative to marking it with in, out, or inout. See the arg_mode syntax rule. A subprogram can have at most just one variadic formal argument; and it must be last in the list of arguments. Breaking this rule causes this syntax error:

42P13: VARIADIC parameter must be the last input parameter

User-defined mean() variadic function

Though there are built-in least() and greatest() variadic functions (corresponding to the built-in min() and max() aggregate functions), there is no built-in mean() variadic function to correspond to the built-in avg() aggregate function. It therefore provides a nice example of how to implement your own. Try this:

create function mean(
  arr variadic numeric[] = array[1.1, 2.1, 3.1])
  returns numeric
  immutable
  set search_path = pg_catalog, pg_temp
  language sql
as $body$
  with c(v) as (
    select unnest(arr))
  select avg(v) from c;
$body$;

select to_char(mean(17.1, 6.5, 3.4), '99.99') as "mean";

The result is 9.00, as is expected. You can see immediately that each actual value that you provide for the variadic formal argument must have (or be typecastable to) the data type of the array with which this argument is declared.

You can transform the language sql version trivially into a language plpgsql function simply by re-writing the body thus:

  ...
  language plpgsql
as $body$
begin
  return (
    with c(v) as (
      select unnest(arr))
    select avg(v) from c);
end;
$body$;

The example defines a default value for the variadic formal argument simply to make the point that this is legal. It's probably unlikely that you'll have a use case that benefits from this.

Choosing between the variadic form and the non-variadic equivalent form

The example makes the point that declaring a formal argument as variadic is simply syntax sugar for what you could achieve without this device by declaring the argument ordinarily as an array. Try this:

create function mean(arr numeric[])
  returns numeric
  immutable
  set search_path = pg_catalog, pg_temp
  language sql
as $body$
  with c(v) as (
    select unnest(arr))
  select avg(v) from c;
$body$;

select to_char(mean(array[17.1, 6.5, 3.4]), '99.99') as "mean";

Its body is identical to that of the variadic language sql version—and it produces the same result. The choice between the two approaches is determined entirely by usability: it's less cluttered to invoke the function with a bare list of values than to surround them with the array[] constructor. On the other hand, if the common case is that the values are already available as an array, then you may just as well not choose the variadic implementation. Notice, though, that syntax is defined that lets you invoke a variadic function using an array of values as the actual argument.

Re-create the variadic form of mean(). (It doesn't matter whether you use the language sql or the language plpgsql version.) Now invoke it thus:

select to_char(mean(variadic array[17.1, 6.5, 3.4]), '99.99') as "mean";

This runs without error and produces the expected 9.00 result.

Polymorphic subprograms

These are the polymorphic data types:

  • anyelement
  • anyarray
  • anynonarray
  • anyenum
  • anyrange

See the PostgreSQL documentation section 38.2.5. Polymorphic Types. These data types are a subset of the so-called pseudo-Types. See the PostgreSQL documentation section 8.21. Pseudo-Types. Notice this from the Polymorphic Types account:

Note that anynonarray and anyenum do not represent separate type variables; they are the same type as anyelement, just with an additional constraint. For example, declaring a function as f(anyelement, anyenum) is equivalent to declaring it as f(anyenum, anyenum): both actual arguments have to be the same enum type.

This section illustrates the use of just anyelement and anyarray. See the PostgreSQL documentation section 38.5.10. Polymorphic SQL Functions for more information on this topic.

Proof-of-concept example #1

Try this. It simply shows an example of how to use the feature.

create function my_typeof(v in text)
  returns text
  immutable
  set search_path = pg_catalog, pg_temp
  language sql
as $body$
  select 'non-polymorphic text: '||pg_typeof(v)::text;
$body$;

create function my_typeof(v in int)
  returns text
  immutable
  set search_path = pg_catalog, pg_temp
  language sql
as $body$
  select 'non-polymorphic int: '||pg_typeof(v)::text;
$body$;

create function my_typeof(v in anyelement)
  returns text
  immutable
  set search_path = pg_catalog, pg_temp
  language sql
as $body$
  select 'polymorphic: '||pg_typeof(v)::text;
$body$;

create type s.ct as (b boolean, i int);

\x on
\t on
select
  (select my_typeof('dog'))             as "dog",
  (select my_typeof(42))                as "42",
  (select my_typeof(true))              as "true",
  (select my_typeof((true, 99)::s.ct))  as "(true, 99)::s.ct";
\t off
\x off

It runs without error and produces this result:

dog              | non-polymorphic text: text
42               | non-polymorphic int: integer
true             | polymorphic: boolean
(true, 99)::s.ct | polymorphic: s.ct

The function my_typeof() adds no functionality beyond what the pg_typeof() built-in function exposes. So the code example has no value beyond this pedagogy:

  • The built-in function pg_typeof() is itself polymorphic. The \df meta-command shows that its input formal argument has the data type "any". (The double quotes are used because any is a SQL reserved word.) Notice that the designer of this built-in could just as well have defined it with an input formal argument of data type anyelement.
  • A user-defined subprogram with an input formal argument of data type anyelement can be among a set of distinguishable overloads where others in the set have input formal arguments of data type, for example, text or int.

Proof-of-concept example #2

Try this. It simply shows another example of how to use the feature.

create function array_from_two_elements(
  e1 in anyelement, e2 in anyelement)
  returns anyarray
  immutable
  set search_path = pg_catalog, pg_temp
  language sql
as $body$
  select array[e1, e2];
$body$;

create type s.ct as (k int, t text);

create view pg_temp.v(arr) as
select array_from_two_elements((17, 'dog')::ct, (42, 'cat')::ct);

with c(e) as (
  select unnest(arr) from pg_temp.v)
select (e).k, (e).t from c;

It produces this result:

 k  |  t  
----+-----
 17 | dog
 42 | cat

Here, too, the function array_from_two_elements() adds no functionality beyond what the native array constructor exposes. So the code example has no value beyond this pedagogy:

  • It shows that when two elements (i.e. scalar values) of data type ct are listed in an array constructor, the emergent data type of the result is ct[].
  • In other words, when the input formal arguments have the data type anyelement, the returns anyarray clause automatically accommodates this correctly.

This makes the same point more explicitly:

select pg_typeof(arr) from pg_temp.v;

This is the result:

 pg_typeof 
-----------
 ct[]

Combining variadic and polymorphic functionality in a single user-defined function

A variadic function can be polymorphic too. Simply declare its last formal argument as variadic anyarray. Argument matching, and the determination of the actual result type, behave as you'd expect.

Example: mean() for numeric values or single character values

Don't be concerned about the artificiality of this example. Its aim is simply to illustrate how the low-level functionality works. If the variadic input is a list of numeric values, then the body code detects this (using the pg_typeof() built-in function) and branches to use the code that the non-polymorphic implementation shown in the section User-defined mean() variadic function above uses. But if the variadic input is a list of s.one_character values (where s.one_character is a user-defined domain based on the native text data type whose constraint ensures that the length is exactly one character), then the implementation uses a different method:

  • It converts each single character text value to a numeric value with the ascii() built in function.
  • It calculates the mean of these numeric values in the same way that the implementation for the mean of numeric input values does.
  • It uses the round() built in function, together with the ::int typecast to convert this mean to the nearest int value.
  • It converts this int value back to a single character text value using the chr() built in function.

Create the constraint function, the domain, and the variadic polymorphic function thus:

create schema s;

create function s.is_one_character(t in text)
  returns boolean
  set search_path = pg_catalog, pg_text
  language plpgsql
as $body$
declare
  msg constant text not null := '«%s» is not exactly one character character';
begin
  assert length(t) = 1, format(msg, t);
  return true;
end;
$body$;

create domain s.one_character as text
constraint is_one_character check(s.is_one_character(value));

create function s.mean(arr variadic anyarray)
  returns anyelement
  immutable
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  n  numeric         not null := 0.0;
  t  s.one_character not null := 'a';
begin
  assert cardinality(arr) > 0, 'cardinality of arr < 1';
  declare
    d_type constant regtype not null := pg_typeof(arr[1]);
  begin
    case d_type
      when pg_typeof(n) then
        with c(v) as (select unnest(arr))
          select avg(v) from c into n;
        return n;

      when pg_typeof(t) then
        with c(v) as (select unnest(arr))
          select chr(round(avg(ascii(v)))::int) from c into t;
        return t;
    end case;
  end;
end;
$body$;

Test it with a list of numeric values.

select to_char(s.mean(17.1, 6.5, 3.4), '99.99') as "mean";

This is the result:

  mean  
--------
   9.00

Now test it with a list of s.one_character values:

select s.mean(
  'c'::s.one_character,
  'e'::s.one_character,
  'g'::s.one_character);

This is the result:

 mean 
------
 e

Notice that the argument of the returns keyword in the function header is anyelement. This has a strictly defined meaning. The internal implementation detects the data type of the input array's elements. And then it checks that, however the user-defined implementation manages this, the data type of the computed to-be-returned value matches the data type of the input array's elements. If this requirement isn't met, then you're likely to get a run-time error when the run-time interpretation of your code attempts to typecast the value that you attempt to return to the expected actual data type for anyelement in the present execution. For example, if you simply use this hard-coded return:

return true::boolean;

then you'll get this run-time error:

P0004: «true» is not exactly one character character

because, of course, the text typecast of the boolean value has four characters. You can experiment by adding a leg to the case statement to handle a variadic input list of some user-defined composite type values—and then hard-code a non-composite value for the return argument. You'll see this run-time error:

42804: cannot return non-composite value from function returning composite type

Example: mean() for numeric values or rectangle area values

This example emphasizes the meaning of writing returns anyelement by specifying that the returned mean should always be numeric—both when the variadic input is a list of scalar numeric values and when it's a list of rectangle composite type values. Here, you simply write what you want: returns numeric. Create the function thus:

create schema s;
create type s.rectangle as (len numeric, wid numeric);

create function s.mean(arr variadic anyarray)
  returns numeric
  immutable
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  c    constant int         not null :=  cardinality(arr);
  n             numeric     not null := 0.0;
  r             s.rectangle not null := (1.0, 1.0);
  tot           numeric     not null  := 0.0;
begin
  assert c > 0, 'cardinality of arr < 1';
  declare
    d_type constant regtype not null := pg_typeof(arr[1]);
  begin
    case d_type
      when pg_typeof(n) then
        with c(v) as (select unnest(arr))
          select avg(v) from c into n;
        return n;

      when pg_typeof(r) then
        foreach r in array arr loop
          tot := tot + r.len*r.wid;
        end loop;
        return tot/c::numeric;
    end case;
  end;
end;
$body$;

Test it with the same list of numeric values that you used to test the previous version of mean():

select to_char(s.mean(17.1, 6.5, 3.4), '99.99') as "mean";

Of course, the result is the same for this version of mean() as for the previous version:

  mean  
--------
   9.00

Now test it with a list of rectangle values:

select to_char(s.mean(
    (2.0, 3.0)::s.rectangle,
    (3.0, 9.0)::s.rectangle,
    (5.0, 7.0)::s.rectangle),
  '99.99') as "mean";

This is the result:

  mean  
--------
  22.67

One polymorphic subprogram versus an overload set of non-polymorphic subprograms

Recast the previous polymorphic implementation of the variadic mean() as two separate non-polymorphic implementations, thus:

drop function if exists s.mean(anyarray) cascade;

create function s.mean(arr variadic numeric[])
  returns numeric
  immutable
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
begin
  assert cardinality(arr) > 0, 'cardinality of arr < 1';
  return (
    with c(v) as (select unnest(arr))
      select avg(v) from c);
end;
$body$;

create function s.mean(arr variadic s.rectangle[])
  returns numeric
  immutable
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  c    constant int         not null :=  cardinality(arr);
  r             s.rectangle not null := (1.0, 1.0);
  tot           numeric     not null  := 0.0;
begin
  assert cardinality(arr) > 0, 'cardinality of arr < 1';
  foreach r in array arr loop
    tot := tot + r.len*r.wid;
  end loop;
  return tot/c::numeric;
end;
$body$;

Test it with the same two select statements that you used to test the polymorphic version. The results are identical.

Choosing between the different approaches can only be a judgement call—informed by intuition born of experience.

  • The total source code size of the two non-polymorphic implementations, here, is greater than that of the single polymorphic implementation. But this is largely explained by the repetition, with the non-polymorphic implementations, of the boilerplate source text of the function headers.
  • The implementation of each non-polymorphic implementation is simpler than that of the polymorphic version because no user-written code is needed to detect the data type of the variadic argument list; rather, this is done behind the scenes by the run-time system when it picks the appropriate overload.
  • The polymorphic implementation encapsulates all the implementation variants in a single code unit. This is beneficial because (short of using dedicated schemas to group functionally related subprogram overloads) you'd have to rely on external documentation to explain that the different name() variants belonged together as a set.