Postgres OnLine Journal: March / April / May 2015
An in-depth Exploration of the PostgreSQL Open Source Database
 

Table Of Contents

What's new and upcoming in PostgreSQL
PostgreSQL Q & A

What's new and upcoming in PostgreSQL

 

PostGIS 2.2 leveraging power of PostgreSQL 9.5



Things are shaping up nicely in PostGIS 2.2 development. We are going to hit feature freeze around June 30th 2015, and plan to ship late August or early September to be in line with PostgreSQL 9.5 release. So far we have committed a couple of neat features most itemized in PostGIS 2.2 New Functions. Many of the really sort after ones will require PostgreSQL 9.5 and GEOS 3.5. The geography measurement enhancements will require Proj 4.9.0+ to take advantage of. Things I'd like to highlight and then later dedicate full-length articles in our BostonGIS Waiting for PostGIS 2.2 series once they've been stress tested.

  • ST_SubDivide - chops up your geometry into smaller ones and returns as a set of geometries. You can divide your lines, polygons whatever specifying the max number of vertices you want in each resultant geometry. It's really fast and easy to use. This will come in handy for those long roads (pgRouting I'm looking at you) or you need to chop up large polygons you need to divide to improve intersect performance among other things. Stay tuned for an article. Warning: You can't use this unless you built PostGIS 2.2 with Geos 3.5.0 (still in development).
  • ST_ClipByBox2D - ST_SubDivide borrows some logic from this one. This one is generally harder to use, but you do get a bit more control on where you want to make your cuts. Already covered this in ST_ClipByBox2D Map Dicing redux. again you don't get this unless you are running GEOS 3.5+
  • KNN distance operator for geography (with true distance, not just bounding box) - We've had KNN for geometry 2D since 2.0 (though it was only limited to bounding box centroid) and now we have it for geography as well and in addition it is a true distance check, not just bounding box. Requries PostgreSQL 9.5 running PostGIS 2.2 to get this feature.
  • KNN distance operator for geometry 3D - KNN for geometry 3D - we've had this for geometry 2D since 2.0 and now we have a set of new operators for geometry 3D. Note that since both 2D geometries and 3D geometries share the same type we needed to define a new operator construct for these. So for 3D distance use <<->> and <<#>> instead of <->. If you are running on PostGIS 2.2, you also automagically get true distance check. You should also use an ND index to get full benefit from these.
  • True KNN distance for geometry - (no longer limited to bounding box). This is both an enhancement and a possible breaking change for some. In a nutshell what this means is instead of doing:
     WITH cte (SELECT a.field1, a.geom 
             FROM a  ORDER BY a.geom <-> ST_GeomFromText(...) LIMIT 1000) 
                SELECT field1, geom FROM cte ORDER BY ST_Distance(geom, ST_GeomFromText(...)) LIMIT 100; 
          
    Doing the following is sufficient for 9.5+ PostGIS 2.2:
     SELECT a.field1, a.geom 
             FROM a  ORDER BY a.geom <-> ST_GeomFromText(...) LIMIT 100; 
          
    No more need for that ugly hack of WITH cte(... ORDER BY <-> .. LIMIT 1000) SELECT ... ORDER BY ST_Distance(..) LIMIT 100 or guessing how big to make the first hop if you have linestrings. This feature utilizes the new KNN with RECHECK logic which was recently committed on both the PostgreSQL 9.5 code base and the PostGIS 2.2 codebase. I'll be doing some performance benchmarks of this with real workloads in coming weeks and user testing would also be greatly appreciated.
  • ST_AsTWKB: New compressed binary output vector format optimized for web mapping. Basic details of specification here. Why do we need yet another format when we've got GeoJSON. Because when you've got big fat geometries, GeoJSON is just too bulky. Expect a companion Leaflet plugin (and possibly an OpenLayers 3 one to take advantage of this new feature).
  • Enhancements to many of the Geography measurement functions ST_Area, ST_Distance: requires Proj 4.9.0. I haven't setup winnie with the new Proj 4.9, so can't speak much about these yet. Stayed tuned.
  • Raster -- yap raster continues to innovate- In 2.2. you can build overviews right in the db with ST_CreateOverview and also Retile right in db as well with ST_Retile.
  • Lots of stuff going on in SFCGAL. For sure, SFCGAL will be installable as an extension in PostGIS 2.2 and that has been committed. We expect to see many new functions in PostGIS 2.2, but they have not been committed yet. Things like ST_3DUnion I'm really excited about.
  • ST_AsX3D now supports GeoCoordinates and flipping x/y coordinates. Currently only GD WE is supported. I've only tested with X3dom.js and very minimally
  • Tiger Geocoder -- updated for Tiger 2014, but I expect Census Tiger data 2015 to come before 2.2 release and hope to update to support that.
  • Address Standardizer -- Now a part of PostGIS. More on these later.

PostgreSQL Q & A

 

LATERAL WITH ORDINALITY - numbering sets



One of the neat little features that arrived at PostgreSQL 9.4 is the WITH ORDINALITY ANSI-SQL construct. What this construct does is to tack an additional column called ordinality as an additional column when you use a set returning function in the FROM part of an SQL Statement.

Basic WITH ORDINALITY

Here is the simple example of its use:

SELECT *
FROM unnest('{my,dog, eats, dog food}'::text[] ) 
    WITH ordinality;
which outputs:
  unnest  | ordinality
----------+------------
 my       |          1
 dog      |          2
 eats     |          3
 dog food |          4

If you aren't happy with the default column names, you can change them by revising your query like this

 
SELECT f.*
FROM unnest('{my,dog, eats, dog food}'::text[] ) 
    WITH ordinality As f(phrase, sort_order);

Which would output this:

  phrase  | sort_order
----------+------------
 my       |          1
 dog      |          2
 eats     |          3
 dog food |          4

LATERAL WITH ORDINALITY

The greatest value of WITH ORDINALITY comes when you apply it to rows of data. How do you do that when you need to use WITH ORDINALITY. This is where one of our favorite constructs, the LATERAL construct comes to the rescue.

First let's construct our table with a text array column for demonstration. Note that the fish has no tags.

CREATE TABLE pets(pet varchar(100) PRIMARY KEY, tags text[]);
INSERT INTO pets(pet, tags)
    VALUES ('dog', '{big, furry, friendly, eats steak}'::text[]),
        ('cat', '{small, snob, eats greenbeans, plays with mouse}'::text[]),
        ('mouse', '{very small, fits in pocket, eat peanuts, watches cat}'::text[]),
        ('fish', NULL);

If you do a cross join, you'll leave out fish because he's got no tags

SELECT pet, sort_order, tag
FROM pets, unnest(tags) 
    WITH ORDINALITY As f(tag, sort_order) ;
 pet  | sort_order |       tag
------+------------+------------------
dog   |          1 | big
dog   |          2 | furry
dog   |          3 | friendly
dog   |          4 | eats steak
cat   |          1 | small
cat   |          2 | snob
cat   |          3 | eats greenbeans
cat   |          4 | plays with mouse
mouse |          1 | very small
mouse |          2 | fits in pocket
mouse |          3 | eat peanuts
mouse |          4 | watches cat

In order to include pets that have no tags, you need to do a LEFT JOIN like so

SELECT pet, sort_order, tag
FROM pets LEFT JOIN 
    LATERAL unnest(tags) 
        WITH ORDINALITY As f(tag, sort_order) ON true;

And voila, all our pets are accounted for:

 pet  | sort_order |       tag
------+------------+------------------
dog   |          1 | big
dog   |          2 | furry
dog   |          3 | friendly
dog   |          4 | eats steak
cat   |          1 | small
cat   |          2 | snob
cat   |          3 | eats greenbeans
cat   |          4 | plays with mouse
mouse |          1 | very small
mouse |          2 | fits in pocket
mouse |          3 | eat peanuts
mouse |          4 | watches cat
fish  |            |

PostgreSQL Q & A

 

DELETE all data really fast with TRUNCATE TABLE CASCADE



Though it is a rare occurrence, we have had occasions where we need to purge ALL data from a table. Our preferred is the TRUNCATE TABLE approach because it's orders of magnitude faster than the DELETE FROM construct. You however can't use TRUNCATE TABLE unqualified, if the table you are truncating has foreign key references from other tables. In comes its extended form, the TRUNCATE TABLE .. CASCADE construct which was introduced in PostgreSQL 8.2, which will not only delete all data from the main table, but will CASCADE to all the referenced tables.

Here is a quick demonstration. Suppose you had two tables orders and order_items where the order_items table references the orders.order_id column. If you do this:

TRUNCATE TABLE orders;

You'll get notice

ERROR:  cannot truncate a table referenced in a foreign key constraint
DETAIL:  Table "order_items" references "orders".
HINT:  Truncate table "order_items" at the same time, or use TRUNCATE ... CASCADE.

It doesn't matter if you have CASCADE UPDATE or DELETE rule in place, the CASCADE approach will still be able to purge all data in the referenced tables by changing your statement to:

TUNCATE TABLE orders CASCADE;

You'll get a notice of the form:

NOTICE:  truncate cascades to table "order_items"

PostgreSQL Q & A

 

Adding properties to existing JSON object with PLV8



Lately I've been experimenting with building semi-schemaless apps. These are apps where much of the data may never be used for reporting aside from story telling and also that as time goes by some of these may be revisited and converted to more structured fields for easier roll-up and reporting. For the front-end UI, I'm using AngularJS which naturally spits out data as JSON and can autobind to JSON data of any complexity. My stored functions in PostgreSQL take JSON blobs as inputs spit it out into various tables and throws the whole thing in a jsonb field for later consumption (it's a bit redundant). Similarly they return JSON back. One of the things I wanted to be able to do was take this jsonb blob and tack on additional properties from well-structured fields or even a whole set of data like sub recordsets to feed back to my app in JSON. While there are lots of functions in PostgreSQL 9.3/9.4 that can easily build json objects from records, aggregate rows, etc. I couldn't find a function that allowed me to just add a property to an existing JSON object, so I went to my tried and true old-pal PL/V8 for some comfort. Here is a quickie function I created in PL/V8 that did what I needed. Hopefully it will be of use to others or others might have other ideas of doing this that I missed.

ujson_add_prop function: Add/Update properties of a JSON object

I used PL/V8 1.4.3, binaries for windows users here. I settled on taking json and outputting json from this function instead of jsonb, because when I tried to swap out json with jsonb, I got back the same json object as what I put in. Not sure why.

So here is the function

CREATE OR REPLACE FUNCTION ujson_add_prop(param_o json, 
        param_key text, param_value anyelement) 
    RETURNS json AS $$
    var result = param_o;
    result[param_key] = param_value;
    return (result)
$$ LANGUAGE plv8 IMMUTABLE STRICT;

Taking the function for a test drive

-- just tack on taker name 
SELECT ujson_add_prop(e_form::json, 
    'Taker Name', first_name || ' ' || last_name)::jsonb
FROM survey;

Which outputs

           ujson_add_prop
-------------------------------------------------------------------------------
 {"qa": [{"Favorite Color": "red"}, {"Significant Other": "Minnie"}], 
 "Taker Name": "Micky Mouse"}
 {"qa": [{"Favorite Color": "white"}, {"Significant Other": "Mick"}], 
 "Taker Name": "Minnie Mouse"}
(2 rows)
-- since it's defined as anyelement you can do this too
SELECT ujson_add_prop(e_form::json, 'Header', h)
FROM survey , LATERAL (SELECT id, first_name,last_name) As h;

Which outputs:

    ujson_add_prop
-----------------------------------------------------------------------------
 {"qa":[{"Favorite Color":"red"},{"Significant Other":"Minnie"}],
 "Header":{"id": 1,"first_name":"Micky","last_name":"Mouse"}}
 {"qa":[{"Favorite Color":"white"},{"Significant Other":"Mick"}],
 "Header":{"id": 2,"first_name":"Minnie","last_name":"Mouse"}}

The function also doubles as a setter. If I had for example a key already defined called "Header" in root of my JSON object, it would replace the old value with the new value.

On a side-note, I also discovered that my other tried and true friend hstore is a great companion to JSON allowing me to easily remorph JSON arrays such as survey questions into key-value for easier querying. More on that later.