Sensitivity of converting between timestamptz and plain timestamp to the UTC offset

The semantic rules for the conversion, in each direction, rest on a common-sense convention. Each conversion uses a value for the UTC offset. And this value is always known: either because the at time zone operator specifies it (either explicitly or implicitly via a timezone name); or from the session's current TimeZone setting. This is explained in the section Four ways to specify the UTC offset.

When a timestamptz value is converted to a plain timestamp value, the target is assigned to the date-and-time-of-day component of the source value, when this is expressed with respect to the inevitably known reigning UTC offset.
When a plain timestamp value is converted to a timestamptz value, the target is assigned by normalizing to UTC according to the inevitably known reigning UTC offset.

The rules that this section explains underly the rules for the text to timestamptz conversion.

The philosophy of the demonstration's design

The demonstration uses the table function plain_timestamp_to_from_timestamp_tz(). The overall shape of this is very similar to that of these two functions:

The table function interval_arithmetic_results() presented in the "sensitivity of timestamptz-interval arithmetic to the current timezone" section. That section is the present section's peer under the parent section "Scenarios that are sensitive to the UTC offset and possibly, additionally, to the timezone".
The table function timestamptz_vs_plain_timestamp() presented in the "plain timestamp and timestamptz data types" section.

The demonstration sets these three goals:

Goal one: to use assert statements to test the identity of effect of the typecast operator and the at time zone current('TimeZone') operator in the two directions of interest: the plain timestamp to timestamptz direction and the timestamptz to plain timestamp direction:
Goal two: to use assert statements to test the assumed rules for the conversions in each direction, as empirical observations suggest they might be..
Goal three: to let you vizualize the rules for the conversions by outputting the "from-to" value pairs for each conversion direction on a single line, doing the conversions at each of a set of representative timezones, each of which has a different UTC offset. (The visualization is enriched by showing the conversion outcomes first with UTC as the session's current TimeZone setting and then with, for each outcome, the timezone at which the conversions are done as the session's current TimeZone setting.

The demonstration that follows is designed like this:

A table function, plain_timestamp_to_from_timestamp_tz(at_utc in boolean), is used to enable a convenient "running commentary" visualization. It has one formal parameter: a boolean to let you specify the visualization mode.
- One choice for at_utc asks to see the plain timestamp and timestamptz values for each result row with the session's TimeZone set to UTC. (Of course, only the timestamptz values are affected by the setting.)
- The other choice asks to see the values for each result row with the session's TimeZone set to what was reigning when the plain timestamp to timestamptz and the timestamptz to plain timestamp conversions were done for that row.

Two constants, one with data type plain timestamp and one with data type timestamptz are initialized so that the internal representations (as opposed to the metadata) are identical. Look:

ts_plain    constant timestamp   not null := make_timestamp  (yyyy, mm, dd, hh, mi, ss);
ts_with_tz  constant timestamptz not null := make_timestamptz(yyyy, mm, dd, hh, mi, ss, 'UTC');

Each uses the same constant int values, yyyy, mm, dd, hh, mi, and ss, to define the identical date-and-time part for each of the two moments. The fact that UTC is used for the timezone argument of the make_timestamptz() invocation ensures the required identity of the internal representations of the two moments—actually, both as plain timestamp values.
The extract(epoch from ... ) function is used to get the numbers of seconds, as constant double precision values, from the start of the epoch for the two moment values. These two numbers of seconds are actually identical. But two distinct names (ts_plain_epoch and ts_with_tz_epoch) are used for these to help the reader see the symmetry of the two tests—one in each direction.

A constant array, timezones, is initialized thus:

timezones constant text[] not null := array[
                                              'Pacific/Pago_Pago',
                                              'America/Porto_Velho',
                                              'Atlantic/South_Georgia',
                                              'UTC',
                                              'Africa/Tripoli',
                                              'Asia/Dubai',
                                              'Pacific/Kiritimati'
                                            ];

A foreach loop is run thus:
```
foreach z in array timezones loop
```
At each loop iteration:
- The session's TimeZone setting is set to the value that the iterand, z, specifies.
- These assignments are made:
```
ts_with_tz_1       := ts_plain::timestamptz;
ts_with_tz_2       := ts_plain at time zone current_setting('timezone');
ts_with_tz_1_epoch := extract(epoch from ts_with_tz_1);

ts_plain_1         := ts_with_tz::timestamp;
ts_plain_2         := ts_with_tz at time zone current_setting('timezone');
ts_plain_1_epoch   := extract(epoch from ts_plain_1);

z_epoch            := extract(epoch from utc_offset(z));
```
- These assert statements are executed to show that the ::timestamp and ::timestamptz typecasts are identical to at time zone current_setting('timezone'):
```
assert (ts_with_tz_2 = ts_with_tz_1), 'Assert #1 failed.';
assert (ts_plain_2   = ts_plain_1  ), 'Assert #2 failed.';
```
- These assert statements are executed to show that the expected rules for the conversion of the internal representations hold, in both directions between plain timestamp and timestamptz:
```
assert ( ts_with_tz_1_epoch = (ts_plain_epoch   - z_epoch) ), 'Assert #3 failed.';
assert ( ts_plain_1_epoch   = (ts_with_tz_epoch + z_epoch) ), 'Assert #4 failed.';
```
- According to the choice for at_utc: either the timezone is set to UTC; or it is simply left at what the loop iterand, z, set it to. Then these values are formatted as a text line using the to_char() built-in function: ts_plain, ts_with_tz_1, ts_with_tz, and ts_plain_1. The row is labeled with the value of the loop iterand, z.
Finally, after the loop completes and before exiting, the session's TimeZone setting is restored to the value that it had on entry to the function. (It's always good practice to do this for any settings that your programs need, temporarily, to change.)

The demonstration

First, create a trivial function to return the pg_timezone_names.utc_offset value for a specified value of pg_timezonenames.name:

drop function if exists utc_offset(text) cascade;
create function utc_offset(tz_name in text)
  returns interval
  language plpgsql
as $body$
declare
  i constant interval not null := (
                                    select a.utc_offset from
                                    pg_timezone_names a
                                    where a.name = tz_name
                                  );
begin
  return i;
end;
$body$;

Now create a wrapper to return the utc_offset()'s returned interval value as text in exactly the format that the literal for a timestamptz value uses.

drop function if exists utc_offset_display(text) cascade;
create function utc_offset_display(tz_name in text)
  returns text
  language plpgsql
as $body$
declare
  i constant interval not null := utc_offset(tz_name);
  hh constant int     not null := extract(hours   from i);
  mm constant int     not null := extract(minutes from i);
  t  constant text    not null := to_char(hh, 's00')||
                                    case
                                      when mm <> 0 then ':'||ltrim(to_char(mm, '00'))
                                      else               ''
                                    end;
begin
  return t;
end;
$body$;

Now create two formatting functions:

one to format the results for maximum readability
and another to format the column headers.

It's dull code, and you needn't read it. You can understand what it does by its effect. It's nice to package away the dull code so that it doesn't clutter the interesting logic of the main plain_timestamp_to_from_timestamp_tz() table function.

create function report_line(
  z             in text,
  ts_plain      in timestamp,
  ts_with_tz_1  in timestamptz,
  ts_with_tz    in timestamptz,
  ts_plain_1    in timestamp)
  returns text
  language plpgsql
as $body$
declare
  t constant text not null :=
      '['||rpad(z, 23)||lpad(utc_offset_display(z), 3)||']'     ||'   '||
           rpad(to_char(ts_plain,     'Dy hh24:mi'        ),  9)||'   '||
           rpad(to_char(ts_with_tz_1, 'Dy hh24:mi TZH:TZM'), 16)||'      '||
           rpad(to_char(ts_with_tz,   'Dy hh24:mi TZH:TZM'), 16)||'   '||
           rpad(to_char(ts_plain_1,   'Dy hh24:mi'        ),  9);
begin
  return t;
end;
$body$;

drop function if exists headers() cascade;
create function headers()
  returns text[]
  language plpgsql
as $body$
declare
  t text[] := array['x', 'x', 'x'];
begin
  t[1] :=
         rpad(' ',           28)           ||'   '||
         rpad('From',         9)           ||'   '||
         rpad('To',  16)                   ||'      '||
         rpad('From',  16)                 ||'   '||
              'To';

  t[2] :=
    '['||rpad('Timezone',    20)||'Offset]'||'   '||
         rpad('ts_plain',     9)           ||'   '||
         rpad('ts_with_tz',  16)           ||'      '||
         rpad('ts_with_tz',  16)           ||'   '||
              'ts_plain';

  t[3] :=
         rpad('-',      28, '-')           ||'   '||
         rpad('-',       9, '-')           ||'   '||
         rpad('-',      16, '-')           ||'      '||
         rpad('-',      16, '-')           ||'   '||
         rpad('-', 9, '-');
  return t;
end;
$body$;

Now create and execute the plain_timestamp_to_from_timestamp_tz() table function thus.

drop function if exists plain_timestamp_to_from_timestamp_tz(boolean) cascade;

create function plain_timestamp_to_from_timestamp_tz(at_utc in boolean)
  returns table(t text)
  language plpgsql
as $body$
declare
  set_timezone      constant text             not null := $$set timezone = '%s'$$;
  tz_on_entry       constant text             not null := current_setting('timezone');

  yyyy              constant int              not null := 2000;
  mm                constant int              not null := 1;
  dd                constant int              not null := 1;
  hh                constant int              not null := 10;
  mi                constant int              not null := 15;
  ss                constant int              not null := 0;
  ts_plain          constant timestamp        not null := make_timestamp  (yyyy, mm, dd, hh, mi, ss);
  ts_with_tz        constant timestamptz      not null := make_timestamptz(yyyy, mm, dd, hh, mi, ss, 'UTC');
  ts_plain_epoch    constant double precision not null := extract(epoch from ts_plain);
  ts_with_tz_epoch  constant double precision not null := extract(epoch from ts_with_tz);
begin
  t := '---------------------------------------------------------------------------------------------';     return next;
  if at_utc then
    t := 'Displaying all results using UTC.';                                                               return next;
  else
    t := 'Displaying each set of results using the timezone in which they were computed.';                  return next;
  end if;
  t := '---------------------------------------------------------------------------------------------';     return next;
  t := '';                                                                                                  return next;

  declare
    ts constant text[] not null := headers();
  begin
    foreach t in array ts loop
                                                                                                            return next;
    end loop;
  end;

  declare
    z                  text   not null := '';
    timezones constant text[] not null := array[
                                                 'Pacific/Pago_Pago',
                                                 'America/Porto_Velho',
                                                 'Atlantic/South_Georgia',
                                                 'UTC',
                                                 'Africa/Tripoli',
                                                 'Asia/Dubai',
                                                 'Pacific/Kiritimati'
                                               ];
  begin
    foreach z in array timezones loop
      execute format(set_timezone, z);

      declare
        ts_with_tz_1        constant timestamptz      not null := ts_plain::timestamptz;
        ts_with_tz_2        constant timestamptz      not null := ts_plain at time zone current_setting('timezone');
        ts_with_tz_1_epoch  constant double precision not null := extract(epoch from ts_with_tz_1);

        ts_plain_1          constant timestamp        not null := ts_with_tz::timestamp;
        ts_plain_2          constant timestamp        not null := ts_with_tz at time zone current_setting('timezone');
        ts_plain_1_epoch    constant double precision not null := extract(epoch from ts_plain_1);

        z_epoch             constant double precision not null := extract(epoch from utc_offset(z));
      begin
        -- Show that "::timestamp" is identical to "at time zone current_setting('timezone')".
        assert (ts_with_tz_2 = ts_with_tz_1), 'Assert #1 failed.';
        assert (ts_plain_2   = ts_plain_1  ), 'Assert #2 failed.';

        -- Show that the expected rules for the conversion of the internal representations,
        -- in both plain timestamp to/from timestamptz directions, hold.
        assert ( ts_with_tz_1_epoch = (ts_plain_epoch   - z_epoch) ), 'Assert #3 failed.';
        assert ( ts_plain_1_epoch   = (ts_with_tz_epoch + z_epoch) ), 'Assert #4 failed.';

        /* Display the internally represented values:
             EITHER: using 'UTC' to show what they "really" are
             OR:     using the timezone in which they were computed to show
                     the intended usability benefit for the local observer. */
        if at_utc then
          execute format(set_timezone, 'UTC');
          -- Else, leave the timezone set to "z".
        end if;
        t := report_line(z, ts_plain, ts_with_tz_1, ts_with_tz, ts_plain_1);                                return next;
      end;
    end loop;
  end;

  execute format(set_timezone, tz_on_entry);
end;
$body$;

select t from plain_timestamp_to_from_timestamp_tz(true);

This is the result:

---------------------------------------------------------------------------------------------
Displaying all results using UTC.
---------------------------------------------------------------------------------------------

                              From        To                    From               To
[Timezone            Offset]   ts_plain    ts_with_tz            ts_with_tz         ts_plain
----------------------------   ---------   ----------------      ----------------   ---------
[Pacific/Pago_Pago      -11]   Sat 10:15   Sat 21:15 +00:00      Sat 10:15 +00:00   Fri 23:15
[America/Porto_Velho    -04]   Sat 10:15   Sat 14:15 +00:00      Sat 10:15 +00:00   Sat 06:15
[Atlantic/South_Georgia -02]   Sat 10:15   Sat 12:15 +00:00      Sat 10:15 +00:00   Sat 08:15
[UTC                    +00]   Sat 10:15   Sat 10:15 +00:00      Sat 10:15 +00:00   Sat 10:15
[Africa/Tripoli         +02]   Sat 10:15   Sat 08:15 +00:00      Sat 10:15 +00:00   Sat 12:15
[Asia/Dubai             +04]   Sat 10:15   Sat 06:15 +00:00      Sat 10:15 +00:00   Sat 14:15
[Pacific/Kiritimati     +14]   Sat 10:15   Fri 20:15 +00:00      Sat 10:15 +00:00   Sun 00:15

The execution finishes without error, confirming that the assertions hold.

Interpretation and statement of the rules

Underlying axiom.

The interpretation of the demonstration's outcome depends on this fact, stated and empirically confirmed in the plain timestamp and timestamptz data types section:

Both a plain timestamp datum and a timestamptz datum have the identical internal representation as the number of seconds from a reference moment. The extract(epoch from t) function, where t is either a plain timestamp value or a timestamptz value, returns this number. Moreover, the result is independent of the session's current TimeZone setting.

The demonstration meets the goals set out in the "The philosophy of the demonstration's design" section:

Goal one is met because there are no assert violations: these two properties of the mutual conversion shown in the paragraph that defines this goal are seen to hold.

IF:
  ts_with_tz_1 ◄— ts_plain::timestamptz
AND:
  ts_with_tz_2 ◄— ts_plain at time zone current_setting('timezone')
THEN:
  ts_with_tz_2 == ts_with_tz_1

and:

IF:
  ts_plain_1 ◄— ts_with_tz::timestamp
AND:
  ts_plain_2 ◄— ts_with_tz at time zone current_setting('timezone')
THEN:
  ts_plain_2 == ts_plain_1

Goal two is met because there are no assert violations: these rules for the mutual conversions are seen to hold.

ts-with-tz-internal-seconds ◄— ts-plain-internal-seconds - specified-utc-offset-in-seconds

and:

ts-plain-internal-seconds ◄— ts-with-tz-internal-seconds + specified-utc-offset-in-seconds

Goal three is met by inspecting the output immediately above and by Invoking the table function again to show each result row using the timezone in which it was computed.

select t from plain_timestamp_to_from_timestamp_tz(false);

This is the new result:

---------------------------------------------------------------------------------------------
Displaying each set of results using the timezone in which they were computed.
---------------------------------------------------------------------------------------------

                              From        To                    From               To
[Timezone            Offset]   ts_plain    ts_with_tz            ts_with_tz         ts_plain
----------------------------   ---------   ----------------      ----------------   ---------
[Pacific/Pago_Pago      -11]   Sat 10:15   Sat 10:15 -11:00      Fri 23:15 -11:00   Fri 23:15
[America/Porto_Velho    -04]   Sat 10:15   Sat 10:15 -04:00      Sat 06:15 -04:00   Sat 06:15
[Atlantic/South_Georgia -02]   Sat 10:15   Sat 10:15 -02:00      Sat 08:15 -02:00   Sat 08:15
[UTC                    +00]   Sat 10:15   Sat 10:15 +00:00      Sat 10:15 +00:00   Sat 10:15
[Africa/Tripoli         +02]   Sat 10:15   Sat 10:15 +02:00      Sat 12:15 +02:00   Sat 12:15
[Asia/Dubai             +04]   Sat 10:15   Sat 10:15 +04:00      Sat 14:15 +04:00   Sat 14:15
[Pacific/Kiritimati     +14]   Sat 10:15   Sat 10:15 +14:00      Sun 00:15 +14:00   Sun 00:15

The results at "Displaying all results using UTC"

These show what is really happening at the level of the internal representation—albeit that you have to deal with a degree of circularity of logic to accept this claim. It's equivalent to looking at the numbers of seconds from 12:00 on 1-Jan-1070. (You could write your own formatter, using the trunc() and mod() built-in functions, to produce the same display.) Of course, only the display of timestamptz values is sensitive to the current value of the session's TimeZone setting.

The "From ts_plain" column shows the same value in each row—as is to be expected. The "To ts_with_tz" column shows a different value in each row, supporting this informal statement of the rule:

The timestamptz value is produced by assuming that the to-be-converted plain timestamp value represents local wall-clock time in the timezone in which the conversion is done.

The "From ts_with_tz" column shows the same value in each row—again as is to be expected. The "To ts_plain" column shows a different value in each row, supporting this informal statement of the rule:

The plain timestamp value is produced by assuming that the to-be-converted timestamptz's displayed date and time of day, as these would be shown in the timezone at which the conversion is done, jointly represent the target plain timestamp value.

The results at "Displaying each set of results using the timezone in which they were computed"

The displayed text values, of course, represent the same underlying internally represented values as do the displayed values in the results at "Displaying all results using UTC". And, again of course, the text values in the columns for the plain timestamp values are pairwise identical, row by row, in the two sets of results. But the text values in the columns for the timestamptz values are pairwise different, row by row, in the two sets of results—just as they should be to bring the intended usability value of the data type. The rules are (arguably) easier to understand in the second presentation.

The "To ts_with_tz" column shows the same date-and-time value in each row, but the displayed UTC offset at which this is to be understood is different in each row (and equal to the offset that the reigning timezone specifies). The "To ts_with_tz" column shows a different value in each row, supporting this informal statement of the rule:

The to-be-converted plain timestamp value is simply decorated with the offset of the timezone in which the conversion is done.

The "To ts_plain" column shows a different value in each row. But this value is identical to the date-and-time value of the to-be-converted timestamptz value, supporting this informal statement of the rule:

The offset of the displayed to-be-converted timestamptz value is simply ignored and the target plain timestamp value is set to the to-be-converted value's displayed date-and-time component.

You should aim to be comfortable with these three different ways to state the rules:

First in terms of the internally represented seconds and the magnitude of the reigning UTC offset in seconds.
Second in terms of the text display of the results using UTC.
Third in terms of the text display of the results using the timezones at which the conversions are done.

Notice that the first way to state the rules is by far the most terse and precise—and therefore the most reliable. The other two ways are subject to the limitations of composing, and interpreting, tortuous natural language prose.