Most of our use-cases for the built-in json support in PostgreSQL is not to implement schemaless design storage, but instead to remold data.
Remolding can take the form of restructuring data into json documents suitable for web maps, javascript charting web apps, or datagrids. It also has uses beyond just outputting data in json form. In addition the functions are useful for unraveling json data into a more meaningful relational form.
One of the common cases we use json support is what we call UNPIVOTING data.
We demonstrated this in Postgres Vision 2018 presentation in slide 23.
This trick won't work in other relational databases that support JSON because
it also uses a long existing feature of PostgreSQL to be able to treat a row as a data field.
Unpivoting data is the process of converting columns to rows. It's the reverse of a cross tabluation (aka PIVOT).
So why would you ever need to unpivot data?
Take for example if you had a spreadsheet where your HR person decided to input each employees hours in columns
where each column represents a separate week.
The spreadsheet data might look something like this:
Employee | 2018-05-04 | 2018-05-11 | 2018-05-18 | 2018-05-25 |
Regina | 70 | 40 | 35 | 65 |
Leo | 50 | 80 | 45 | 65 |
Caitlin | 40 | 25 | 35 | 35 |
Jane | 20 | 25 | 15 | 18 |
If you are using a spreadsheet as your data entry and presentation tool, then this is a great setup, but for storing the data in a relational database and being able to do arbitrary date summary aggregations,
it's easier to work with the data in a format such as below because it's more flexible how you rollup date ranges and you don't have to add a new column to your table every time a new week comes along.
Employee | PPE | hours |
Regina | 2018-05-04 | 70 |
Regina | 2018-05-11 | 40 |
Regina | 2018-05-18 | 35 |
Regina | 2018-05-18 | 65 |
Note this trick you can accomplish using the hstore extension as well as we covered a while back in Unpivoting data in PostgreSQL. Since JSON/JSONB functions are built-in no CREATE EXTENSION is required to use them.
INSERT INTO staff_hours(employee, ppe, hours)
SELECT hours_raw."Employee" As employee, j.key::date AS ppe, j.val::numeric As hours
FROM hours_raw, LATERAL jsonb_each_text(to_jsonb(hours_raw)) AS j(key,val)
WHERE j.key NOT IN('Employee');
The above query is composed for 4 steps
- A row in a query (not just tables, though we are only showing a table here), can be output as a single thing. In this case we are going to use our hours_raw table as a set of elements of type hours_raw and feed it to the to_jsonb function.
- to_jsonb (and it's companion to_json) can convert any element (including row elements) into json representations.
- The jsonb_each_text function returns a set of key / value pairs. When used in conjunction with LATERAL (which is the default for functions when used in from), it generates a new set of rows for each row of hours_raw
- For this example we assume all columns except for Employee are hour columns where the header is a pay period ending (PPE). We only want to return the Key/Values that do not correspond to employee name.