When I work with other relational databases I am reminded about how I can't use my favorite PostgreSQL hacks in them. I call these hacks PostgreSQLisms. A PostgreSQLism is a pattern of SQL unique to PostgreSQL or descendants of PostgreSQL. In this post I'm going to cover some of my favorite ones.
For these exercises I'm going to use these basic tables, so you can follow along. Be warned I'm using arrays here just to say I'm using arrays.
Arrays are so fundamental to my suite of tools that I can't even acknowledge I am using them except when I get stuck working in another relational database that doesn't support them which is pretty much every other non-PostgreSQL database I have the misfortune of working with. But I'm going to save my array love fest for another article, and just lightly use them here to create my test tables.
CREATE TABLE members(id bigint GENERATED ALWAYS AS IDENTITY,
name_first varchar(50), name_last varchar(50) );
INSERT INTO members(name_first, name_last)
VALUES ('Jill', 'Hill'), ('Jack', 'Hill');
CREATE TABLE orders(id bigint GENERATED ALWAYS AS IDENTITY, id_member bigint, date_add timestamptz DEFAULT CURRENT_TIMESTAMP);
INSERT INTO orders(id_member)
SELECT i % 2
FROM generate_series(1,10) AS i;
CREATE TABLE orders_items(id bigint GENERATED ALWAYS AS IDENTITY, id_order bigint, item_code varchar(100), quantity integer);
INSERT INTO orders_items(id_order, item_code, quantity)
SELECT i, (ARRAY['orange', 'apple', 'banana', 'chocolate'])[ j ], (random()*6)::integer
FROM generate_series(1,5) AS i CROSS JOIN generate_series(1, (random()*i)::integer + 1 ) AS j;
I have never seen another database offer the DISTINCT ON clause. Sure every relational database has a DISTINCT clause, but when was the last time you saw a database support the DISTINCT ON clause. For those who are unfamiliar with this gem of a clause. What differentiates a DISTINCT ON from a DISTINCT is you have a choice of what columns should be considered distinct and your ORDER BY controls your preferred row to return. The only caveat is if you have an ORDER BY then the beginning list of columns should include your DISTINCT ON columns. Here is such an example:
SELECT DISTINCT ON(m.name_last, m.name_first, m.id) m.name_last, m.name_first, o.*, oi.item_code
FROM members AS m INNER JOIN orders AS o ON m.id = o.id_member INNER JOIN orders_items AS oi ON o.id = oi.id_order
ORDER BY m.name_last, m.name_first, m.id, o.date_add DESC;
What does this statement do, it returns each members first and last name, the last order they made, and an item_code from an item on that order.
This next PostgreSQLism I consider the foundational piece that makes PostgreSQL object-relational. It's the trick that you can stuff a whole row into a column. This feature has so many uses you can't imagine. In its classic form, it looks like this:
SELECT m
FROM members AS m;
Which outputs:
m --------------- (1,Jill,Hill) (2,Jack,Hill) (2 rows)
You might be thinking, so what? How is this useful? Its true value only shows itself when you dare pass it to functions that understand composite types which is more functions than you realize. Such as to_jsonb, array_agg, and even window functions etc.
For example
SELECT to_jsonb(m) FROM members AS m;
OR more usefully combining with another PostgreSQLism
SELECT jsonb_agg(m ORDER BY m.name_first, m.name_last) AS output
FROM members AS m;
Output:
output -------------------------------------------------------------------------------------------------------------- [{"id": 2, "name_last": "Hill", "name_first": "Jack"} , {"id": 1, "name_last": "Hill", "name_first": "Jill"}]
Yes most PostgreSQL aggregates can have an ORDER BY in the clause. This feature is not limited to built-in aggregates, but any aggregate you create or you see in an extension. Of course it's useless unless the answer changes depending on order in which you feed items.
Sure other relational databases have mastered window functions, just as PostgreSQL has mastered window functions, but can you pass a whole row into a window function in any other relational database you have come across. Try this trick in another relational database and be prepared to be very disappointed. Sometimes I get a little lazy, like for example, I'd like to see Jill's previous order in her current order, but I don't have time to go around and itemize all the fields in that last order, and hell am I going to write a bunch of lead function calls to get all the pieces I need. So I do this instead.
WITH a AS (SELECT m.name_last, m.name_first, o,
lead(o) OVER(PARTITION BY m.id ORDER BY o.date_add) AS lead_o
FROM members AS m INNER JOIN orders AS o ON m.id = o.id_member
WHERE m.name_first = 'Jill'
)
SELECT a.name_last, a.name_first, (o).id AS orig_order_id, (lead_o).*
FROM a;
Admittedly that (subtable).field syntax takes a little bit of getting used to but it gets the job done.
I have longed for an SQL syntax
SELECT m.* EXCEPT COLUMNS(name_last)
FROM members AS m
But my wish has not been granted. However there is the next best thing, which is generally good enough, cause most of my time is spent slugging data at web apis and applications who only want to be fed JSON for some reason.
So luckily I can do this
SELECT jsonb_agg(to_jsonb(m) - 'name_last' - 'id' ORDER BY name_first)
FROM members AS m;
Or slightly shorter if I've got a bunch of fields to remove
SELECT jsonb_agg(to_jsonb(m) - ARRAY['name_last','id'] ORDER BY name_first)
FROM members AS m;