The justify() and extract(epoch ...) functions for interval values

Download and install the date-time utilities code.

The code on this page depends on the code presented in the section User-defined interval utility functions. This is included in the larger code kit that includes all of the reusable code that the overall date-time section describes and uses.

The justify_hours(), justify_days(), and justify_interval() built-in functions

Consider these interval values:

i1: '5 days    1 hour  '::interval
i2: '4 days   25 hours' ::interval
i3: '5 months  1 day'   ::interval
i4: '4 months 31 days ' ::interval

The values i2 and i4 are noteworthy:

because 25 hours is more than the number of hours in at least a typical 1 day period (but notice that 1 day will be 23 hours or 25 hours at the Daylight Savings Time "spring forward" and "fall back" transitions);
and because 31 days is more than the number of days in at least some 1 month durations.

The justify built-in functions have a single interval formal in parameter and return an interval value. They normalize the fields of the [mm, dd, ss] internal representation of the input by decreasing the value of a finer-granularity time unit field and correspondingly increasing the value of its greater granularity neighbor.

justify_hours() returns an interval whose ss value doesn't exceed the number of seconds in one day (24*60*60).
justify_days() returns an interval whose dd value doesn't exceed the nominal number of days in one month (30).
justify_interval() returns an interval both whose ss value doesn't exceed the number of seconds in one day and whose dd value doesn't exceed the nominal number of days in one month.

In general, justifying an interval value changes the semantics of adding or subtracting the result to a plain timestamp value or a timestamptz value with respect to using the input, unjustified, value. In general, too, justifying a pure interval value will produce a hybrid interval value. (A pure value is one where only one of the three fields of the [mm, dd, ss] internal representation is non-zero; And a hybrid value has two or more of these fields non-zero.)

justify_hours()

The justify_hours() built-in function "normalizes" the value of the ss field of the internal [mm, dd, ss] representation by subtracting an appropriate integral number of 24 hour periods so that the resulting ss value is less than 24 hours (but not less than zero). The subtracted 24 hour periods are converted to days, using the rule that one 24 hour period is always the same as 1 day, and added to the value of the dd field. (Daylight Savings Time regimes are ignored by the implementation of this rule of thumb.) Try this:

select justify_hours('4 days 25 hours'::interval);

This is the result:

 5 days 01:00:00

In general, justify_hours() changes the semantics of the interval-timestamptz overloads of the + and - operators. Try this:

set timezone = 'America/Los_Angeles';
with
  c as (
    select
      '2021-03-13 19:00:00 America/Los_Angeles'::timestamptz as d,
      '25 hours'::interval                                   as i)
select
  d +               i  as "d + i",
  d + justify_hours(i) as "d + justify_hours(i)"
from c;

This is the result:

         d + i          |  d + justify_hours(i)
------------------------+------------------------
 2021-03-14 21:00:00-07 | 2021-03-14 20:00:00-07

Notice that the result of adding the interval value i "as is" is changed (it becomes one hour earlier) when justify_hours(i) is used.

See the section Sensitivity of timestamptz-interval arithmetic to the current timezone.

justify_days()

In a corresponding way, the justify_days() built-in function "normalizes" the value of the dd field of the internal [mm, dd, ss] representation by subtracting an appropriate integral number of 30 day periods so that the resulting dd value is less than 30 days (but not less than zero). The subtracted 30 day periods are converted to months, using the rule that one 30 day period is the same as 1 month, and added to the value of the mm field. Try this:

select justify_days('4 months 31 days'::interval);

This is the result:

 5 mons 1 day

In general, justify_days() changes the semantics of the interval-timestamp and the interval-timestamptz overloads of the + and - operators. This has nothing to do with Daylight Savings Time and how the reigning timezone specifies the relevant rules. Rather, it has simply to do with the calendar-time convention for the meaning of adding one month: it aims to take you to the same date in the next month. Try this:

-- Here, the effect is seen even with plain timestamp (of course).
with
  c as (
    select
      '2021-02-20 12:00:00'::timestamp as d,
      '33 days'::interval              as i)
select
  d +              i  as "d + i",
  d + justify_days(i) as "d + justify_days(i)"
from c;

This is the result:

        d + i        | d + justify_days(i)
---------------------+---------------------
 2021-03-25 12:00:00 | 2021-03-23 12:00:00

Notice that the result of adding the interval value i "as is" is changed (it becomes two days earlier) when justify_days(i) is used. The rules are explained in the moment-interval overloads of the "+" and "-" operators for timestamptz, timestamp, and time section.

justify_interval()

The justify_interval() built-in function simply applies first the justify_hours() function and then the justify_days() function to produce what the YSQL documentation refers to as a "normalized interval value". (The PostgreSQL documentation doesn't define such a term.) A normalized interval value is one where the extracted hours value is less than 24 (but not less than zero) and the extracted days value is less than 30 (but not less than zero).

Try this:

select justify_interval('4 months 31 days 25 hours'::interval);

This is the result:

 5 mons 2 days 01:00:00

Of course, justify_interval() affects the semantics of moment-interval arithmetic too by combining the semantic effects of both justify_hours() and justify_days().

Modeling the implementations of justify_hours(), justify_days(), and justify_interval()

First, function justify_outcomes() tests these properties of the justify functions:

justify_days(justify_hours(i)) == justify_interval(i) # user-defined "strict" equals operator

and:

     justify_hours(justify_days (i)) =  justify_interval(i)  # natve equals operator

not (justify_hours(justify_days (i)) == justify_interval(i)) # "strict" equals operator

Then procedure test_justify_model() implements the algorithms that the three justify functions use (as described in prose above) and tests that each produces the same result as the native implementation that it models.

The helper function i() defines an interval value so that this can be used in both the implementation of function justify_outcomes() and to invoke procedure test_justify_model() without cluttering the interesting code.

Helper function i()

Create function i() and inspect the value that it returns:

drop function if exists i() cascade;

create function i()
  returns interval
  language plpgsql
as $body$
begin
  return make_interval(
    years  => 516,
    months => 317,
    days   => 977,
    hours  => 473,
    mins   => 853,
    secs   => 417.5);
end;
$body$;

select i();

This is the result:

 542 years 5 mons 977 days 487:19:57.5

function justify_outcomes()

Demonstrate the outcomes of justify_hours(), justify_days(), and justify_interval(). And test the fact the effect of justify_interval() is the same as the effect of justify_hours() followed by justify_days()—and different from the effect of justify_days() followed by justify_hours().

drop function if exists justify_outcomes() cascade;

create function justify_outcomes()
  returns table(z text)
  language plpgsql
as $body$
declare
  i  constant interval not null := i();
  i1 constant interval not null := justify_hours(justify_days (i));
  i2 constant interval not null := justify_days (justify_hours(i));
  i3 constant interval not null := justify_interval(i);
begin
  -- Native equality test.
  assert     i2 =   i1,  'Assert #1 failed';

  -- Strict equality tests.
  assert not(i2 ==  i1), 'Assert #2 failed';
  assert     i3 ==  i2,  'Assert #3 failed';

  z := 'i:  '||i ::text;  return next;
  z := 'i1: '||i1::text;  return next;
  z := 'i2: '||i2::text;  return next;
end;
$body$;

select z from justify_outcomes();

Notice that the function justify_outcomes() uses both the native interval-interval equality operator, =, and the user-defined so-called strict interval-interval equality operator, ==. See the section Comparing two interval values. Briefly, the native = operator uses a loose definition that judges two interval values to be equal if their internal representations, after applying justify_interval() to each, are identical; but the strict == operator judges two interval values to be equal only if the internal representations of the two "as is" values are identical.

This is the result:

 i:  542 years  5 mons  977 days  487:19:57.5
 i1: 545 years  1 mon    37 days   07:19:57.5
 i2: 545 years  2 mons    7 days   07:19:57.5

(Whitespace was added manually to improve the readability by vertically align the corresponding fields.)

procedure test_justify_model()

drop procedure if exists test_justify_model(interval) cascade;

create procedure test_justify_model(i in interval)
  language plpgsql
as $body$
declare
  secs_pr_day    constant double precision not null := 24*60*60;
  days_pr_month  constant int              not null := 30;
begin
  -- justify_hours()
  declare
    r_native  constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(justify_hours(i));

    r         constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i);
    ss        constant double precision    not null := mod(r.ss::numeric, secs_pr_day::numeric);
    dd        constant int                 not null := r.dd + trunc(r.ss/secs_pr_day);
    r_model   constant interval_mm_dd_ss_t not null := (r.mm, dd, ss);
  begin
    assert r_model = r_native, 'Assert #1 failed';
  end;

  -- justify_days()
  declare
    r_native  constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(justify_days(i));

    r         constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i);
    dd        constant int                 not null := mod(r.dd, days_pr_month);
    mm        constant int                 not null := r.mm + trunc(r.dd/days_pr_month);
    r_model   constant interval_mm_dd_ss_t not null := (mm, dd, r.ss);
  begin
    assert r_model = r_native, 'Assert #2 failed';
  end;

  -- justify_interval()
  declare
    r_native  constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(justify_interval(i));

    r         constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i);
    ss        constant double precision    not null := mod(r.ss::numeric, secs_pr_day::numeric);
    dd1       constant int                 not null := r.dd + trunc(r.ss/secs_pr_day);

    dd        constant int                 not null := mod(dd1, days_pr_month);
    mm        constant int                 not null := r.mm + trunc(dd1/days_pr_month);

    r_model   constant interval_mm_dd_ss_t not null := (mm, dd, ss);
  begin
    assert r_model = r_native, 'Assert #3 failed';
  end;
end;
$body$;

Test it thus:

call test_justify_model(i());

The procedure finishes without error showing that the conditions tested by the assert statements hold.

The justified_seconds() user-defined function

The section Comparing two interval values relies on this function to model the implementation of the comparison algorithm. It's therefore included here in the User-defined interval utility functions section.

Here's a simple test:

drop function if exists justified_seconds_demo() cascade;

create function justified_seconds_demo()
  returns table(z text)
  language plpgsql
as $body$
begin
  z := 'secs  =>       86000: '||justified_seconds(make_interval(secs  =>       86000)); return next;
  z := 'secs  => 31234567891: '||justified_seconds(make_interval(secs  => 31234567891)); return next;
  z := 'hours =>  2147483647: '||justified_seconds(make_interval(hours =>  2147483647)); return next;
  z := 'days  =>  2147483647: '||justified_seconds(make_interval(days  =>  2147483647)); return next;
end;
$body$;

select z from justified_seconds_demo();

The function finishes without error, showing that, for the interval values used for the test, the assertion holds. This is the result:

 secs  =>       86000: 86000
 secs  => 31234567891: 31234567891
 hours =>  2147483647: 7730941129200
 days  =>  2147483647: 185542587100800

The extract(epoch from interval_value) built-in function

The PostgreSQL documentation specifies the semantics of extract(epoch from interval_value) thus:

[the function returns] the total number of seconds in the interval [value].

However, it does not explain what this means.

The discussion and demonstration of the user-defined justified_seconds() function shows that the notion of the total number of seconds in an interval value can be defined only by using a rule of thumb. Try this:

drop function if exists seconds_in_two_years() cascade;

create function seconds_in_two_years()
  returns table(z text)
  language plpgsql
as $body$
declare
  e_1999 constant double precision not null := extract(epoch from '1999-07-15 12:00:00'::timestamptz);
  e_2001 constant double precision not null := extract(epoch from '2001-07-15 12:00:00'::timestamptz);
  e_2003 constant double precision not null := extract(epoch from '2003-07-15 12:00:00'::timestamptz);

  s1     constant double precision not null := e_2001 - e_1999;
  s2     constant double precision not null := e_2003 - e_2001;
  s3     constant double precision not null := justified_seconds(make_interval(years=>2));
  s4     constant double precision not null := extract(epoch from make_interval(years=>2));
begin
  z := 's1: '||s1::text;  return next;
  z := 's2: '||s2::text;  return next;
  z := 's3: '||s3::text;  return next;
  z := 's4: '||s4::text;  return next;
end;
$body$;

select z from seconds_in_two_years();

The number of seconds in a nominal two year period is defined in four different ways:

s1 ::= from noon on Midsummer's Day 1999 to noon on Midsummer's Day 2001.
s2 ::= from noon on Midsummer's Day 2001 to noon on Midsummer's Day 2003.
s3 ::= justified_seconds('2 years'::interval).
s4 ::= the total number of seconds in '2 years'::interval (quoting the text from the PostgreSQL documentation)

This is the result:

 s1: 63158400
 s2: 63072000
 s3: 62208000
 s4: 63115200

Notice that each reported number of seconds differs from the others.

It's unremarkable that the duration from Midsummer 1999 to Midsummer 2001 is longer than that from Midsummer 2001 to Midsummer 2003 because the former includes a leap year and the latter does not. The difference (63158400 - 63072000) is exactly equal to the number of seconds in one day (24*60*60).
It's easy to see why justified_seconds('2 years'::interval) is less than both 63158400 and 63072000: it's because it simply uses the rule that twelve months is (12*30) days—five days less than a non-leap year. So (63072000 - 62208000) is equal to (2*5*24*60*60).
But why is the fourth result, from extract(epoch from interval_value) different from the third result?

It turns out that the result from extract(epoch from interval_value) aims to give a sensible number of seconds for durations of many years. So it uses the (semantics of the) trunc() and mod() built-in functions to transform the value of the mm field of the interval representation to years, yy, and a months remainder, mm. Then the yy value is multiplied by the number of days in a Julian year. This is greater than 12*30.

How many days are there in a year?

Internet search quickly finds lots of articles on this topic—for example, Gregorian year in Wikipedia. Two subtly different answers are in common use.

The Julian year is defined to be 365.25 days. This is calculated using the fact that over a four year period, there are usually three normal years and one leap year—so the average number of days per year is 365.25. The International Astronomical Union uses the Julian year to define the size of the light year.
The Gregorian year is defined to be 365.2425 days—just a little shorter than the Julian year. This is because the Julian calendar assumed incorrectly that the average solar year is exactly 365.25 days long—an overestimate of a little under one day per century. The Gregorian reform shortened the average (calendar) year by 0.0075 days to stop the drift of the calendar with respect to the equinoxes. It uses the following rule to say when leap years occur in order to produce the required shorter average over a sufficiently long period: a year that is evenly divisible by 100 is a leap year only if it is also evenly divisible by 400.

function modeled_extract_epoch_from_interval()

This function models the rule described in prose above. And it uses an _assert_statement to confirm that the model produces the same result as the native built-in function. Notice that it uses these scaling factors:

secs_pr_day
secs_pr_month
secs_pr_year (calculated as secs_pr_day*avg_days_pr_year where avg_days_pr_year is the length of the Julian year)

Create it thus.

drop function if exists modeled_extract_epoch_from_interval(interval) cascade;

create function modeled_extract_epoch_from_interval(i in interval)
  returns double precision
  language plpgsql
as $body$
declare
  e                 constant double precision    not null := extract(epoch from i);

  -- Scaling factors
  secs_pr_day       constant double precision    not null := 24*60*60;
  secs_pr_month     constant double precision    not null := secs_pr_day*30;
  avg_days_pr_year  constant double precision    not null := 365.25;
  secs_pr_year      constant double precision    not null := secs_pr_day*avg_days_pr_year;

  r                 constant interval_mm_dd_ss_t not null := interval_mm_dd_ss(i);
  mm                constant double precision    not null := mod(r.mm, 12::int);
  yy                constant double precision    not null := trunc(r.mm::numeric/12::int);

  modeled_e         constant double precision    not null := r.ss +
                                                             r.dd*secs_pr_day +
                                                             mm*secs_pr_month +
                                                             yy*secs_pr_year;
begin
  assert
    (extract(epoch from make_interval(days  =>1)) = secs_pr_day) and
    (extract(epoch from make_interval(months=>1)) = secs_pr_month) and
    (extract(epoch from make_interval(years =>1)) = secs_pr_year)
    ,
    'Assert scale factors OK failed';

  assert
    e = modeled_e, 'Assert "e = modeled_e" failed';

  return e;
end;
$body$;

function modeled_extract_epoch_from_interval_demo()

This function invokes modeled_extract_epoch_from_interval() using the same set of input interval values as was used (above) to demonstrate the justified_seconds() user-defined function.

drop function if exists modeled_extract_epoch_from_interval_demo() cascade;

create function modeled_extract_epoch_from_interval_demo()
  returns table(z text)
  language plpgsql
as $body$
begin
  z := 'secs  =>       86000: '||modeled_extract_epoch_from_interval(make_interval(secs  =>       86000)); return next;
  z := 'secs  => 31234567891: '||modeled_extract_epoch_from_interval(make_interval(secs  => 31234567891)); return next;
  z := 'hours =>  2147483647: '||modeled_extract_epoch_from_interval(make_interval(hours =>  2147483647)); return next;
  z := 'days  =>  2147483647: '||modeled_extract_epoch_from_interval(make_interval(days  =>  2147483647)); return next;
end;
$body$;

select z from modeled_extract_epoch_from_interval_demo();

This is the result:

 secs  =>       86000: 86000
 secs  => 31234567891: 31234567891
 hours =>  2147483647: 7730941129200
 days  =>  2147483647: 185542587100800

Compare this with the results from invoking the justified_seconds_demo() function (above):

 secs  =>       86000: 86000
 secs  => 31234567891: 31234567891
 hours =>  2147483647: 7730941129200
 days  =>  2147483647: 185542587100800

They happen to be identical. But this is just a fortuitous outcome due to how the input values happen to be defined as pure seconds and pure days interval values. Try a test that uses a hybrid interval value:

with c as (
  select make_interval(months=>987, days=>876, hours=>765) as i)
select
  justified_seconds(i)                   as "justified_seconds()",
  modeled_extract_epoch_from_interval(i) as "extracted epoch"
from c;

This is the result:

 justified_seconds() | extracted epoch
---------------------+-----------------
          2636744400 |      2673939600

Now the results from justified_seconds() and extract(epoch ...) are different. This is the rule for the behavior difference:

The results from justified_seconds() and extract(epoch ...) are the same for a pure interval value—i.e. one where only one of the three fields of the [mm, dd, ss] internal representation is non-zero. (The section Custom domain types for specializing the native interval functionality shows how to ensure declaratively that an interval value is pure in this way.)
The results from the two approaches will differ for a hybrid interval value when the value's duration is long enough that the scaling of years to seconds, in extract(epoch ...), uses the Julian year definition.

As a consequence, the total number of seconds in two interval values i1 and i2, (using the PostgreSQL documentation's wording) will differ when:

(i1 = i2) and not (i1 == i2)

In other words, the extracted seconds will differ when the native equality comparison shows i1 and i2 to be the same but the user-defined strict equality comparison shows i1 and i2 to be different.

Of course, therefore, extract(epoch from interval_value) is no help for understanding the semantics of the native interval value equality and inequality comparisons.