Postgres OnLine Journal: November 2012 / December 2012
An in-depth Exploration of the PostgreSQL Open Source Database
 

Table Of Contents

From the Editors
PostgreSQL Q & A
Basics
PL Programming

From the Editors

 

PostGIS in Action 2nd Edition reaching MEAP soon



We just finished the first draft of the first 5 chapters of the second edition of PostGIS in Action and is slated to be added to Manning's Early Action Program (MEAP) in the next 2-3 weeks. Some people have asked us about this when they can start purchasing the new edition. The new edition is purchaseable as soon as it hits MEAP phase. With a MEAP purchase you get the E-Book drafts as soon as they are available and if you buy the MEAP with hard-copy option, you also get the final hard-copy when released. MEAP is the same price as the regular book except it can only be bought direct thru Manning and it gives you access to early content so you can see all our mistakes and cross outs as things change.

We shuffled some of the chapters a bit from our earlier table of contents, but in these first 5 chapters you'll be exposed to new features in PostGIS 2.0, the more modern way of creating spatial tables, utilizing the new raster and topology types, and also find out about the new great stuff coming in PostGIS 2.1 that is already available in PostGIS 2.1 pre-release. More on that in the coming weeks.

What is coming in PostGIS 2.1 that you don't want to miss? Lots. Check out our Waiting for PostGIS 2.1 series and also a list of Duncan Golicher's highlights, which Pierre Racine has kindly outlined in Duncan Golicher's series of PostGIS articles


PostgreSQL Q & A

 

Searching for phrases embedded in stored function and procedures Intermediate



Issue: Ever had the situation where you decided to delete a table or some other function or type but you weren't absolutely sure if other functions in the database depended on these. For things like tables and views that depend on a function, type or other table/view, PostgreSQL won't allow you to delete the dependency object without doing a drop cascade. For stored procedures though, while it will prevent you from deleting a dependency type if the function returns or takes as input the object to be deleted, it doesn't save you if the body of the function references these objects. This dependency information is not always known and in fact may be dynamic with dynamically generated sql or schema path settings. So how do you inspect functions for usage of other items?

Solution: PostgreSQL has a table called pg_catalog.pg_proc which stores the source code of functions (non-C) in the prosrc column.

So lets say you had a table called employees you want to get rid of or simply rename, to find out all functions that reference the term "employees", you would do something like this:

SELECT proname, proargnames, prosrc 
 FROM pg_proc
 WHERE prosrc ILIKE '%employees%';

Basics

 

Biggest Obstacle to PostgreSQL Adoption: It is not Database X



We've been fighting to get clients we have thinking of upgrading or creating new apps to also choose PostgreSQL in the process. Here I'll just itemize some of the obstacles we've run into in making the sale. All of these fall under the It is not Database X line item. By database X I mean SQL Server, MySQL, and Oracle and for us in exactly that order. Our obstacle focus is probably a bit different from others since we are consultants to mostly Windows shops or consultants to ISVs who have to sell their applications to U.S. government agencies or units of agencies.

  1. Can our IT Staff support PostgreSQL?

    This is actually more of a concern with MySQL and SQL Server folks because they've been trained to click thru wizards for most things or they use such little of the database that the extra knobs of PostgreSQL scare them off. For the SQL Server folks, we can make things less painful by pointing them at pgAdmin which is enough like SQL Server Management Studio not to be too frightening. SQL Server folks are already used to things like schemas, very granular permissioning, and more advanced SQL constructs that the additional constructs like regex are more of a sell than a detraction.

    One thing that PostgreSQL doesn't have out of the box are Database tuning and Scheduled Job Agent wizards which SQL Server folks have come to depend on a lot. For Database Tuning wizard you give it a set of workload queries,and it makes recommendations on what tables could benefit from more indexes and even creates the index script for you. This missing feature means a PostgreSQL developer needs to be more attune at reading explain plans than a SQL Server developer does. All in all it's a very beneficial skill to have regardless of which database you use.

    For Schedule Job Agent -- pgAgent often fits the bill nicely as a replacement, though it lacks the backup job wizard SQL Server provides.

  2. If I don't like PostgreSQL, can I switch back?

    By this we mean you can write the same SQL statement for PostgreSQL and it will work the same in Database X. This is much easier of a task with SQL Server and Oracle than it is for MySQL since both support many of the advanced ANSI-SQL constructs PostgreSQL supports. With MySQL for an advanced app, you'll be doing so much dumbing down that you'll wonder why you bothered. In the MySQL case, we just respond, your new app requires advanced querying features that MySQL simply lacks and if you insist multiply your development cost by at least a factor of 2.

    Other things you can do to ensure this:

    • Don't create column names with spaces or mixed case and if you have a SQL Server app that you are thinking of migrating to PostgreSQL, get rid of use of [] around column names.
    • stay away from text type when you know a data field should never have more than 255 characters. This screws you not only with database data sharing but also lots of underlying tools that make assumptions about what you can do with text. E.g. in SQL Server you can't group by a text field, nor can you in MS Access if you are using MS Access as a front-end client to your PostgreSQL/Oracle/MySQL database. Not sure about Oracle though I suspect it has similar limitations.
  3. Can I share data with Database X?

    This is a similar issue to the Cross Compatibility. The more you can stay in the confines of what the other databases you have to be neighbors with support, the better your chances of being adopted. So first is at least for the end points you need to share data with don't use unsupported types like ARRAY or Avant Garde types like JSON and be cautious when using timestamp with time zone. This is not to say you can't use these internally, but you should also have a view or something that makes them more palpable to less sophisticated databases.

    Foreign Data Wrappers (FDW) -- though PostgreSQL implementation of FDW has still got some serious wrinkles like not being able to push updates to other databases, query planner smartness, packaging of FDW binaries for various OS, it still serves a lot of this need.

  4. Will this integrate with our Existing User Management? -

    This issue is pretty non-existent for a web application, but more of an issue if your database needs to be accessed by other means. One big selling point that SQL Server has for Windows shops is that it is intimately integrated with Active Directory - you can choose standard security or NT Security. You can define permissions for an Active Directory group and leave the day to day user/add stuff to the less skilled dedicated IT user management group. I believe Oracle IDM does something similar for Oracle.

    Although you can use Active Directory in PostgreSQL using SSPI or GSSAPI Authentication, in practice it ends up not saving any time since you still need to setup the user in PostgreSQL and PostgreSQL user groups so you still need the support from the PostgreSQL Db Admin. This is a hugy for a windows department with hundreds of users and where IT Management roles are very granularly defined.


Basics

 

Schema and search_path surprises Beginner



This was an article we were meaning to write and got reminded by Szymon Guz latest PostgreSQL search_path behavior. The PostgreSQL search_path variable allows you to control what order schemas are searched and which schemas do not require schema qualification to use tables/views/functions in the schema. The pg_catalog schema is one that never requires schema qualification even if you don't add it to your schema search_path. Schemas are searched in the order they are listed and when creating new objects (e.g. tables, views, function, types) that are not schema qualified, they are always created in the first schema of the search path. Any objects not in a schema listed in the search_path must be schema qualified.

Great feature but confusing

The search_path behavior in PostgreSQL while a great feature, causes many newbies and some experienced folk some major confusion. We exploit / abuse it a lot for building multi-tenant web applications where by we control the tables being hit by the application account we are using and even what tables stored functions are hitting. There's a whole art to exploiting/abusing search_paths which we'll leave as a discussion for another day.

The main confusing thing with search_path is there are so many levels at which you can change search_path and you have to remember the order of precedence. Case in point, we've had a few number of PostGIS newbie users puzzled why their tables require schema qualification and why the search_path settings they have set do not work. This often happens for users using topology or people who decide to put postgis in its own schema. Issues arise for the following reasons:

  • They set the search_path at the database or server level but did not start a new connection after setting it. Unless if set at the function or session level, the new search_path only takes effect with new connections.
  • the search_path being used is at the user level specified search_path and the database default search_path is something different
  • The user only set the search_path for the session and assumed the session set search_path is permanently set
  • The user neglected to set the database search_path and assumed all schemas are automatically searched

So many levels search_path can be set

You can have search_path set at the following levels and this is the order PostgreSQL decides which search path setting to use:

  • At the function level - only lasts for life of execution of function within function: ALTER FUNCTION some_func() SET search_path=public,utility;
  • At the session level - only lasts for the life of the session: set search_path=public,utility;
  • At the database user level - only takes affect for new sessions: ALTER ROLE postgres IN DATABASE mydb SET search_path = public, utility;
  • At the server user level -- only takes affect for new sessions: ALTER ROLE postgres SET search_path = public,utility;
  • At the database level -- only takes affect for new sessions: ALTER DATABASE mydb SET search_path = public, utility;

Views always have schema qualified tables

One other caveat, while for functions, you can get them to utilize different tables with same names in different schemas based on the current search_path, views always have the tables schema qualified. So you can't really abuse search paths in views or at least not directly.

Restore changes search_path during load: big gotcha for some indexes

When you restore data, you may have noticed search_paths are always set. This can on occasion cause certain indexes not to be recreated. For example let's say you have a function schema1.func1, which references another function in public.func2. In normal operation this works just fine except when you try to restore. You have a table in schema3 that uses said function schema1.func1. When your schema3 table is loaded the search_path is temporarily changed to schema3,pg_catalog and your index that utilizes function in schema1.func1 fails to be created because, schema1.func1 can't find it's companion public.func2 unless You schema qualify public.func2 in the definition of your schema1.func1OR You set the search_path of schema1.func1 to include public.


Basics

 

Unknown object type 84 in default privileges Beginner



We recently started upgrading some of our databases on windows 64-bit to 9.2, in doing so we ran into a nasty issue we discovered when some of our backups were failing. The first time it happened, I chucked it up to a dirty PostgreSQL 8.4 database being restored to PostgreSQL 9.2.1. The second time it happened restoring a 9.1 database to 9.2.2, I thought, better look into this to see if there is a known issue. Low and behold I found this: http://archives.postgresql.org/pgsql-bugs/2012-12/msg00091.php (Bug #7741). Apparently something to do with granting rights on Types.

As a workaround for this problem so our backups would work again was to delete the offending permissions from system tables. It's probably not the best way but only way we could think of, we delete the bad record in pg_default_acl and after that backup works without complaint.

   -- there is no such thing as T for default priviledges that pg_dump understands based on 
-- https://github.com/adunstan/postgresql-dev/blob/master/src/bin/pg_dump/pg_dump.c#L11838
-- backup bad records just in case we need them again --
SELECT * into zz_bad_pg_default_acl FROM pg_default_acl  WHERE defaclobjtype = 'T'; 
-- delete unknown records --
DELETE from pg_default_acl WHERE defaclobjtype = 'T'; 

If anyone else has further input on this, I'd be interested.


PL Programming

 

PLV8 1.3 windows binaries for PostgreSQL 9.2



As Hitoshi Hirada mentioned already in PL/v8 is now the richest procedural language , PL/V8 1.3 has been recently released and some of the great new features are automatic conversion of json objects, and ability to write window functions. Not only does PL/V8 allow you to make the most of the native JSON support in PostgreSQL 9.2, but in many cases particularly numeric processing, the speed is much better than what you get with plpgsql. Someone asked us a while back about this and we are hoping to do a demonstration of pl/v8 for numeric processing where you can see the marked difference in speed and perhaps compare with something like PL/R that is also often used for array numeric processing.

For those who are on windows we've compiled PLv8 1.3 for windows 32-bit and windows 64-bit that work with PostgreSQL EDB installs. We haven't had a chance to thoroughly test them yet, so let us know if you find issues. We've saved the last versions we compiled under a different name since some things are not backward compatible between the PLv8 1.1 and 1.3 releases.

We are also hoping to making our windows binary download links more visible. For stop gap fix, we are tagging all the articles concerning things like FDW extensions, PostGIS, plv8 we've compiled for windows under winextensions and accessible from http://www.postgresonline.com/winextensions.php

Download Links