One of the features we are most excited about in recently released PostgreSQL 14 is the introduction of Multirange types. In a nutshell Multirange types are sets of non-overlapping ranges. Unlike array of ranges, they prevent overlaps and thus allow you to effectively model ranges with gaps.
One of the use cases we have for them is modeling time. For example if you wanted to keep track of the cumulative periods and how many days someone is in a hospital, you could store this as a datemultirange type.
There are quite a few operators and functions available in PostgreSQL 14, but some glaring ones we'd need include aggregates such as a union aggregate. That currently doesn't exist. There are however your standard operators like + (union two ranges) and * for intersection, and - for difference as well as your common containment boolean operators.
The canonical form of a multirange type is composed of an outer {} followed by a comma separated list of ranges.
SELECT '{[2021-05-01, 2021-06-01), [2020-09-01, 2020-10-01)
, [2021-09-01, 2021-09-13)}'::datemultirange;
outputs:
{[2020-09-01,2020-10-01),[2021-05-01,2021-06-01),[2021-09-01,2021-09-13)}Observe how it changed the sorting to chronological order.
Multiranges can't have overlapping ranges, but YOU CAN cast an overlapping range set into a multirange without getting an error. Let's see what happens here:
SELECT '{[2021-05-01, 2021-06-01), [2021-09-02, 2021-09-15)
, [2021-09-01, 2021-09-13)}'::datemultirange;
outputs:
{[2021-05-01,2021-06-01),[2021-09-01,2021-09-15)}
Observe how it collapsed the last two date ranges into one that contains the union of them.
Imagine you had a regular table that just tracks stays for patients and the period of time of the stay (as a daterange), one record per stay. We don't care if stays overlap because they might represent going in for a particular procedure one day, going back same day for another procedure and so forth. If it's on the same day the two sets together for some of our calcs would be treated as one day, but you still need to keep the reason information for billing.
Your table would look something like this:
CREATE TABLE stays(id bigint GENERATED ALWAYS AS IDENTITY,
id_patient bigint,
period_stay daterange, reason text,
CONSTRAINT pk_stays PRIMARY KEY (id) );
and you would insert data into it as follows:
INSERT INTO stays(id_patient, period_stay, reason)
VALUES (1, daterange('2021-05-10', '2021-06-01'), 'Operation and healing' ),
(2, daterange('2021-05-12', '2021-05-13'), 'X-Ray' ),
(2, daterange('2021-05-13', '2021-05-14'), 'Blood' ),
(2, daterange('2021-05-13', '2021-05-14'), 'MRI' ),
(2, daterange('2021-06-13', '2021-06-14'), 'Spinal Tap' );
If you wanted to create a query that aggregates these into a single record per patient and period of stay is represented as a multirange, then you can use the range_agg function, which returns a multirange.
This example demonstrates how to do it using array_agg to aggregate the ranges into an array and then using the canonical text form to cast to datemultirange.
SELECT id_patient,
range_agg(period_stay) AS period_total_stay
FROM stays
GROUP BY id_patient
ORDER BY id_patient;
The output of the above query looks like this:
id_patient | period_total_stay ------------+--------------------------------------------------- 1 | {[2021-05-10,2021-06-01)} 2 | {[2021-05-12,2021-05-14),[2021-06-13,2021-06-14)}
One of our favorite functions is unnest and unnest is supported for multiranges and returns back a set of ranges. With the combination of range_agg and unnest you can create a set of non-overlapping ranges from a set of overlapping ones as follows:
SELECT id_patient,
unnest(range_agg(period_stay)) AS period_deduped_stay
FROM stays
GROUP BY id_patient
ORDER BY id_patient;
yields:
id_patient | period_deduped_stay ------------+------------------------- 1 | [2021-05-10,2021-06-01) 2 | [2021-05-12,2021-05-14) 2 | [2021-06-13,2021-06-14) (3 rows)