Postgres OnLine Journal: September 2012 / October 2012
An in-depth Exploration of the PostgreSQL Open Source Database
 

Table Of Contents

From the Editors
Basics
PL Programming
Product Showcase

From the Editors

 

PostGIS 2.0.1 for windows PostgreSQL 9.2 and PostGIS Project changes



A couple of people have expressed concern that we have not released a PostGIS 2.0.1 for Windows 9.2. I guess 9.2 really makes people want to jump and scream for joy. Even if they don't deploy on Windows, Windows is a popular development platform to kick the tires. We originally were planning to not release one and were hoping people would not notice and just start using the PostGIS 2.1.0SVN. That plan evidently did not work. So Yes, we will be releasing a PostGIS 2.0.1 for 9.2 both 32-bit and 64-bit probably late this week. We've got some regression kinks showing in the 9.2 64-bit chain that we are troubleshooting before we release to the masses.

We've been busy beefing up the PostGIS testing and interim build infrastructure to replace the old PostGIS Hudson CentOS build bot. If you are in a super rush and really need a 2.0 micro, you can use the PostGIS 2.0.2SVN binaries, generated by Winnie the PostGIS windows build bot. If you hang around the PostGIS IRC channel, you may have stumbled on Debbie, Winnie's younger and more vocal sister (Debian 64-bit hosted on GoGrid) that does all the document builds and tar balls currently. We are hoping to have Debbie eventually pull PostgreSQL source directly from Git PostgreSQL 9.3, so we can make sure we don't introduce issues for 9.3 along the way and can catch them early.


From the Editors

 

Writing and other Happenings



This year has been action packed for us both for good and bad. On the positive side we've gotten more involved in PostGIS and PostgreSQL work on many fronts in writing, consulting, and general project involvement. On the more somber side, we were stricken with personal tragedies this year.

A little before our PostgreSQL: Up and Running book went to print, my dad died unexpectedly of a brain hemorrage. My dad was my first hero. In terms of goals, interests, and personality we were very much alike and I owe my excitement and eventual settling on engineering,data analysis, and programming to him. Although he used his time well, his death was still an ominous reminder of the little time each of us has to make a dent on existence. More recently my little sis, Vicki, was mugged outside her Brooklyn residence and suffered some broken bones and lacerations when fighting off and then chasing down her assailant. Luckily she came out fine.

Book writing: PostGIS in Action, Second Edition

On a more positive note we've started to write PostGIS In Action, Second Edition. The experience is vastly different from last time and is looking to be much smoother. The new model Manning is offering to authors is their Agile Author model which is DocBook/Subversion mix with a chat/annotation interface that allows you and your reviewers/editors to comment on each paragraph of the manuscript online. They call it their LiveBook model. Hopefully it will come in handy as we get into the review process. The successor to the current LiveBook is supposed to be similar but using Git instead of Subversion and a couple of usablity enhancements to the online interface. This is a way more pleasurable interface to work with than the old Word/OpenOffice model we had to go with in the last edition. It's easier to hyperlink, make mass changes, and because we use our source editor to edit the book, our writing workbench just becomes an extension of our coding workbench.

It's really great to see how technology is impacting the publishing industry. We haven't worked with any publishers besides Manning and O'Reilly yet so not sure how other pulbishers manage their system. For O'Reilly, we wrote a short-book so we didn't fully experience their publishing platform. They too used a DocBook/Subversion model but if they have something similar to LiveBook for interacting with Editors and Reviewers, we weren't exposed to it.

PostgreSQL book, will their be another?

The reviews so far of PostgreSQL: Up and Running of what little reviews there have been, have been fairly positive. We just really wish more people would coment on the book. You can comment here. Many people would have liked a longer book, but liked what they saw. Are we planning on writing another PostgreSQL book. We hope to, but it depends and if we do it will probably be more of a full length book of 500 or so pages, but we'll still focus on the neat things that PostgreSQL offers like how to use full text, PL/V8, JSON, R, a watered down PostGIS intro making your app location aware, etc. and the various other popular extensions. It really helps to have more people reviewing the books and commenting on Amazon and other avenues. As Baron noted in What's your opinion of High Performance MySQL, it's really important that people review and comment on a book whether they liked it or not because:

  • Potential book buyers consider how many reviewers there are before buying a book. More reviewers means more people are using the techology and thought the book valuable enough to read.
  • Increase in sales means, if we ask to write a bigger book, O'Reilly and other publishers will see it as less of a risk and sponsor our work.

Then there is the question of why don't you just self-publish? We've always wanted to do that, and perhaps in the future, but right now we think it's best to go with well-known publishers especialy as we are still getting our feet wet. Because:

  • People take a technology/Book more seriously if they see a well-known publisher behind it.
  • A good publishing company really does a lot for you: they proof-edit, they help you focus your style, they keep you on track, they index, they gather the reviewers, they handle the packaging, they provide you a platform, and their brand-name helps you sell more books. So though they take a good chunk of the profit, I think for a good publisher it's well worth the money. More importantly especially if you are writing a new topic, they take on some of the risk, by guaranteeing you at least X amount for your efforts.

From the Editors

 

Waiting for PostGIS 2.1 series and Autobuilds



In the spirit of Depesz, Waiting for PostgreSQL 9.3 series, we've started our own little PostGIS series on our Boston GIS blog called Waiting for PostGIS 2.1 that showcases all the new cool features coming in PostGIS 2.1 as they get committed to the code base. Check it out.

If you are on windows, the PostGIS 2.1 window binary builds get built whenever there is a change in PostGIS 2.1 code base by Winnie our windows PostGIS build bot, for PostgreSQL 9.0-9.2 (64-bit) and 9.2 (32-bit), so you can't use the I can't compile excuse if you are on windows, not to test out the new changes :). You can download these from Windows Experimental Builds. We are hoping to do the same for Debian Squeeze (6) soon. The main reason we built our bot Debbie on Debian (which makes the source tarballs), is because Debian is a very popular deployment platform for PostGIS, that is currently underserved so we have many Debian users frustrated at not having a readily available PostGIS 2.0 release for example.


Basics

 

Object Oriented PostGIS Syntax



Chris Travers has an interesting series going on about the Object-Oriented and Polymorphic like features that PostgreSQL has built-in. It reminded me of this syntax I have rarely seen used with PostGIS, that is perfectly valid and may be more familiar looking to Oracle and SQL Server spatial folks. It looks something like this, though sadly will only work on unary functions.

SELECT ('LINESTRING(1 2, 3 4)'::geometry).ST_Length;

More common though, if you were to have a table of say geography objects:

CREATE TABLE pois(gid serial primary key, geog geography(LINESTRING,4326));
INSERT INTO pois(geog)
    VALUES ('LINESTRING(-164.2559 54.0558,-162.0943 54.33243)'::geography)
    , ('LINESTRING(-46.2559 54.0558,-46.0943 54.33243, -47.1005 55.33243)'::geography);

SELECT (geog).ST_Length As len, (geog::geometry).ST_NPoints As n_pt
FROM pois;

If you notice though, no keystrokes were saved. We've simply changed the order of the parenthesis. Damn those ().


PL Programming

 

Windows Binaries and Installers Up for PostgreSQL 9.2: PLV8 and PostGIS 2.0.1



We've recompiled our PL/V8 for windows against PostgreSQL 9.2.0 (both 32-bit and 64-bit) with latest PL/V8 code. This includes the additional LiveScript language, which you can use by doing:

CREATE EXTENSION plls;

New release also includes some bug fixes such as Crash case with non-array in array returning function.

We have also put up binaries and installers for PostGIS 2.0.1 for PostgreSQL 9.2 (32-bit and 64-bit) downloadable from PostGIS website. We haven't deployed the installers to Application StackBuilder yet since we are doing some testing before we release to the masses.


PL Programming

 

Using PLV8 to build JSON selectors



Unfortunately we missed Postgres Open this year, but we did catch some of the slides. One of them was Embracing the web with JSON and PLV8 by Will Leinweber of Heroku. He had a great slide deck with a lot of interesting points. One surprising for us was that even in tasks that both PL/PgSQL and PL/V8 can do, PL/V8 is sometimes faster as demonstrated in his slides: #51 thru #54

Another interesting point he covered which is the topic of this article is the ease with which you can build PLV8 functions from javascript libs on the web. In particular JSON:Select library. In one article we demonstrated a JQuery app with PLV8 and one of JQuery's foundations is the CSS like selector syntax it provides for JSON and HTML document elements which allows you to drill down a document using CSS3 style referencing, much like what xpath does for xml. One of the glaring features missing in PostgreSQL 9.2 basic JSON support is a function to navigate a JSON document comparable to the PostgreSQL built-in xpath function for xml. So how do we get this json selector goodness available to us in the database? Like all good monkeys, we copy/emulate it.

In this article we'll demonstrate how to install JSON:select and conclude with how to use it to manipulate the GeoJSON Feature Collection we discussed in Creating GeoJSON Feature Collections with JSON and PostGIS.

Creating an in-db JSON selector function

We're stealing this example from Will's presentation described in his slide 71 and the documented json selector code from: https://github.com/lloyd/JSONSelect/blob/master/src/jsonselect.js. For client side javascript, you'd want to use the minified version, but since we are using this for server-side in-db code we don't need to worry about code weight and prefer readability over size. You'll need to install PL/V8 in your database before you can install the select code which means you have to have the binaries installed and then in database to plv8 enable:
CREATE EXTENSION plv8;

Remember if you are on Windows, we just made fresh compiled binaries for windows PostgreSQL 9.2, and if you are not, you can compile your own or beg your package maintainer to offer PL/V8 extension as an option if they don't have it already.

The JSON selector code is pretty long, but here is the basic structure of the function.

CREATE OR REPLACE FUNCTION
json_select(selector text, data json)
returns json as $$
  exports = {};
  //copy the contents of jsonselect.js here
  return JSON.stringify(
    exports.match(selector,
                  JSON.parse(data)));
$$ LANGUAGE plv8 IMMUTABLE STRICT;

Now let's test this out with our GeoJSON example:

SELECT json_select('.properties .loc_name', '{"type":"FeatureCollection",
"features":[
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.400469999999999,-71.2577]},
     "properties":{"loc_id":1,"loc_name":"Waltham, MA"}},
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.990189999999998,-71.462590000000006]},
    "properties":{"loc_id":2,"loc_name":"Manchester, NH"}},
 {"type":"Feature","geometry":{"type":"Point","coordinates":[-96.757239999999996,32.909770000000002]},
    "properties":{"loc_id":3,"loc_name":"TI Blvd, TX"}}
]
}');
--- output is huh JSON array object? ---
["Waltham, MA","Manchester, NH","TI Blvd, TX"]

If you are like us you prefer sets over JSON thingies, so how do we go from this JSON thing to a row set we are more familiar with working with?

Calling PLV8 functions from other PL/V8 functions

It turns out calling PL/V8 (javascript) functions from others is easy to do as documented in PLV8 Wiki. Soooo:

CREATE OR REPLACE FUNCTION json_select_multi(selector text, data json)
returns SETOF text as $$
    var func = plv8.find_function("json_select");
    var o = JSON.parse(func(selector,data));
    for(var i=0; i < o.length; i++){
        if (typeof o[i] == 'object') {
            /** we only want to stringify objects and leave numbers etc. alone **/
            if (o[i].length > 0){
                for (var j=0; j < o[i].length; j++) {
                /** if the object is an array of objects we want to expand it **/
                    plv8.return_next(JSON.stringify(o[i][j]) )
                }
            }
            else {
                plv8.return_next(JSON.stringify(o[i]));
            }
        }
        else {
            plv8.return_next(o[i]);
        }
    }
$$ LANGUAGE plv8 IMMUTABLE STRICT;

Testing the functions

Test our new set function.

SELECT json_select_multi('.properties .loc_name', '{"type":"FeatureCollection",
"features":[
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.400469999999999,-71.2577]},
     "properties":{"loc_id":1,"loc_name":"Waltham, MA"}},
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.990189999999998,-71.462590000000006]},
    "properties":{"loc_id":2,"loc_name":"Manchester, NH"}},
 {"type":"Feature","geometry":{"type":"Point","coordinates":[-96.757239999999996,32.909770000000002]},
    "properties":{"loc_id":3,"loc_name":"TI Blvd, TX"}}
]
}');
--output is a set of text much better :) ---
 json_select_multi
-------------------
 Waltham, MA
 Manchester, NH
 TI Blvd, TX

Repurposing relational technology

Now if I were a blasphemous document loving person, with little respect for relational databases, I may not be satisfied with just being able to parse JSON documents, but would also want to store them in the database?

CREATE TABLE myfeatures(id serial primary key, data json);
INSERT INTO myfeatures(data)
SELECT json_select_multi('.features',  '{"type":"FeatureCollection",
"features":[
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.400469999999999,-71.2577]},
     "properties":{"loc_id":1,"loc_name":"Waltham, MA"}},
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.990189999999998,-71.462590000000006]},
    "properties":{"loc_id":2,"loc_name":"Manchester, NH"}},
 {"type":"Feature","geometry":{"type":"Point","coordinates":[-96.757239999999996,32.909770000000002]},
    "properties":{"loc_id":3,"loc_name":"TI Blvd, TX"}}
]
}')::json;

Which would result in 3 table rows of features. There are varying degrees of blaphemy one can achieve, we can only stand this much.

Querying Spatial feature collection

If we had PostGIS installed in our database as we could get with:

CREATE EXTENSION postgis;

We could even do something as insane as this:

CREATE OR REPLACE FUNCTION json_select_geog(selector text, data json)
returns geography  as $$
    SELECT ST_Union(geom)::geography 
        FROM (SELECT ST_GeomFromGeoJSON(json_select_multi(selector, data)) As geom) As foo ;
$$ LANGUAGE sql IMMUTABLE COST 1;


CREATE INDEX idx_myfeatures_gist_geog
   ON myfeatures USING gist (json_select_geog('.geometry',data));

If we had more than a paltry sampling of data, our spatial json soup index would be used when we write a query something like:

SELECT json_select_multi('.loc_id', data)::integer As loc_id
    , json_select_multi('.loc_name', data)  As loc_name
    FROM myfeatures 
WHERE ST_DWithin(ST_GeogFromText('POINT(42.40 -71.257)')
    , json_select_geog('.geometry', data),100000 );

If you had a lot of spatial data and bigger geometry types like long linestrings or polygons, we suspect, it would probably be more efficient to not go crazier than:

DROP TABLE IF EXISTS myfeatures;
CREATE TABLE myfeatures(id serial primary key, properties json, geog geography(POINT,4326));
INSERT INTO myfeatures(properties, geog)
SELECT json_select_multi('.properties .loc_id',feat)::integer As id, json_select_multi('.properties', feat)::json As properties 
    , json_select_geog('.geometry', feat) As geog
FROM ( SELECT json_select_multi('.features',  '{"type":"FeatureCollection",
"features":[
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.400469999999999,-71.2577]},
     "properties":{"loc_id":1,"loc_name":"Waltham, MA"}},
  {"type":"Feature","geometry":{"type":"Point","coordinates":[42.990189999999998,-71.462590000000006]},
    "properties":{"loc_id":2,"loc_name":"Manchester, NH"}},
 {"type":"Feature","geometry":{"type":"Point","coordinates":[-96.757239999999996,32.909770000000002]},
    "properties":{"loc_id":3,"loc_name":"TI Blvd, TX"}}
]
}')::json As feat ) AS foo;



When we check the table:

SELECT * FROM myfeatures;
 id |                properties                |            geog
----+------------------------------------------+--------------------------------
  1 | {"loc_id":1,"loc_name":"Waltham, MA"}    | 0101000020E6100000B...
  2 | {"loc_id":2,"loc_name":"Manchester, NH"} | 0101000020E6100000C...
  3 | {"loc_id":3,"loc_name":"TI Blvd, TX"}    | 0101000020E61000004...

In short, for good or bad, PostgreSQL 9.2 provides more food for thought as to how you can organize your data. It's hard not to be a glutton about it and easy to lose sight of the fundamentals.


Product Showcase

 

Adminer web-based database administration



We were recently looking for an open source light-weight database web administration tool that would be preferably database cross-platform and would work on either ASP.NET or PHP. Adminer fit the bill. Adminer is written in PHP. It supports, out of the box PostgreSQL, MySQL, SQLite2 and SQLite3, and even SQL Server and Oracle. This covers most of the databases we commonly use. Here we'll go over some other features we liked about it that are rare in other web database administration tools.

Sinlge File Deployment

This feature really puzzled us because we'd never seen it in a php web application before. Adminer can be deployed as a single php file. You can deploy it as a folder of php files if you wish and want to add more plugins, but if you just care about using it, you can go with the default single compiled php file. Just copy and go.

Relational Diagram View

Never seen this before in a web based database application tool. In fact we usually look for tools that have just this. I used open office diagrammer to look at one of my PostGIS topology schemas a while back and was really disappointed it left out some relationships. Topology has a bunch of self-joins which tripped up LibreOffice/OpenOffice. Not sure if it's still the case with latest LibreOffice. Here is a snapshot generated by Adminer from sample topology schema from our upcoming PostGIS in Action Second Edition.

Postgis_topology_schema

Of course this feature is not perfect. For one it is not a mind-reader and doesn't lay things out the way you want it from the out-set so for large databases there is a lot of manual moving things around. It does seem to remember where you moved things though Haven't quite figured out where it stores this information since it doesn't seem to be in the database or the application folder. Perhaps it stores in browser cache. The permalink does seem to include positioning of tables.

It also seems to work schema by schema rather than covering the whole database. This works generally for us since we rarely relate tables across schemas.

Clean layout for schema view

We really liked the schema dashboard screen, very clean and minimalist but covers all the key areas - table, sequences, types and even shows a description for each table if you bothered to put one in.

Schema view

The table / view display even shows descriptions of the table/view and columns if you have them.

adminer table view description

The SQL Command window is similarly nice and minimalist. What I like most is the show of history of queries run, similar to what pgAdmin has.

SQL window

It lacks some features like User Management you'd find in phpPgAdmin or pgAdmin. It does have GUI screens to edit data, export data, import data, add tables, foreign keys, indexes, types, and triggers and of course the SQL windows lets you do anything else you need to. Haven't really explored all those features to see how good they are but on spot check they seem decent. We particularly like the crispness of the edit/filter screen:

Adminer data editor

Server configs with link to help

If you have admin rights and go to server level (select no database), you get to see a lisitng of all the active postgresql.conf variables, and each variable is hyperlinked to the relevant section of the PostgreSQL on-line manual.

It had a process list link too, which unfortunately did not work with my PostgreSQL 9.2 database since it attempted to run the query:

SELECT * FROM pg_stat_activity ORDER BY procpid;

Instead of the equivalent 9.2 version

SELECT * FROM pg_stat_activity ORDER BY pid;

But I'm sure it's probably not the only admin tool that got a little burned by that 9.2 change.