One of the big changes in PostGIS 2.0 was that the geometry_columns table became a read only view. This change was not without sacrifice. On the plus
it meant a table no longer needed to be maintained which was a relief for those who had to constantly ensure the sideline table was up to date. Also on the plus it meant we could base the views on the current user so that if a user connected, they would only see tables they had access to. The sacrifice made was a bit of speed. In most use cases, the speed difference is hardly noticeable
and only noticeable if you have a query constantly polling this table. However if you have a database of 15,000 geometry tables it could take as long as 450ms as noted in this ticket.
So just playing with ideas, could the new PostgreSQL 9.3 event triggers and materialized views provide a solution. Keep in mind this is just food
for thought. We're not condoning people go out and hack their PostGIS install. We suspect if we implement such a thing in PostGIS core it will change from what we propose here.
If you saw our prior article Caching data with materialized views and statement level triggers you know where this is going. We'll add yet another concept to this cocktail and that is what we'll call schema hiding which by that we mean counting on the order of a search_path to hide a named table/view you don't want.
Step 1: Install PostGIS in its own schema
Step one is to install PostGIS in its own schema and make sure it comes after the public schema. This insures that if you do for some reason create
a geometry_columns view or table or what have you in public, it will be used before the postgis packaged one.
CREATE SCHEMA postgis;
CREATE EXTENSION postgis SCHEMA postgis;
ALTER DATABASE test_gis SET search_path = "$user",public,postgis;
Step 2: Create materialized geometry_columns
Now we'll create a geometry_columns materialized view in the public scheam. Note that because in step 1 we set the search path
such that the geometry_columns view provide by postgis would be used after one in the user's schema or public schema. So now we are in position
to put forth an alternative geometry_columns and one that will be used instead of the installed one.
The view we build is almost identical to the one provided with postgis except we took out the logic that limits the list of tables to just those the current user has select rights to. We don't want the materialized view to be user dependent.
CREATE MATERIALIZED VIEW public.geometry_columns AS
SELECT current_database()::character varying(256) AS f_table_catalog,
n.nspname::character varying(256) AS f_table_schema,
c.relname::character varying(256) AS f_table_name,
a.attname::character varying(256) AS f_geometry_column,
COALESCE(NULLIF(postgis.postgis_typmod_dims(a.atttypmod), 2)
, postgis.postgis_constraint_dims(n.nspname::text, c.relname::text, a.attname::text), 2) AS coord_dimension,
COALESCE(NULLIF(postgis.postgis_typmod_srid(a.atttypmod), 0)
, postgis.postgis_constraint_srid(n.nspname::text, c.relname::text, a.attname::text), 0) AS srid,
replace(replace(COALESCE(NULLIF(upper(postgis.postgis_typmod_type(a.atttypmod)), 'GEOMETRY'::text)
, postgis.postgis_constraint_type(n.nspname::text, c.relname::text, a.attname::text)::text
, 'GEOMETRY'::text), 'ZM'::text, ''::text), 'Z'::text, ''::text)::character varying(30) AS type
FROM pg_class c,
pg_attribute a,
pg_type t,
pg_namespace n
WHERE t.typname = 'geometry'::name
AND a.attisdropped = false
AND a.atttypid = t.oid
AND a.attrelid = c.oid
AND c.relnamespace = n.oid
AND (c.relkind = 'r'::"char" OR c.relkind = 'v'::"char")
AND NOT pg_is_other_temp_schema(c.relnamespace)
AND NOT (c.relname = 'raster_columns'::name) ;
Step 3: Create Event trigger on table CREATE, ALTER, DROP, CONSTRAINT ..
For this magic step, we are going to define a data definition language trigger (DDL) that refreshes our view whenever a new table is created
or altered or we add constraints. We ned the constraint one to catch constraint based geometry tables.
CREATE OR REPLACE FUNCTION trig_refresh_geometry_columns() RETURNS event_trigger AS $$
BEGIN
IF tg_tag IN('CREATE TABLE','CREATE TABLE AS'
, 'CREATE FOREIGH TABLE', 'DROP TABLE', 'ALTER TABLE'
, 'CREATE VIEW', 'ALTER VIEW') THEN
REFRESH MATERIALIZED VIEW public.geometry_columns;
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER trig_01_refresh_geometry_columns ON ddl_command_end
EXECUTE PROCEDURE trig_refresh_geometry_columns();
Step 4: Test Drive
CREATE TABLE testgeom(gid serial primary key, geom geometry(POINT,4326) );
ALTER TABLE testgeom ADD COLUMN geom2 geometry(POLYGON, 4326);
SELECT f_table_name As tname, f_geometry_column As gc, srid, type
FROM public.geometry_columns;
tname | gc | srid | type
----------+-------+------+---------
testgeom | geom | 4326 | POINT
testgeom | geom2 | 4326 | POLYGON
Test 5: Speed test
The script below generates 1000 geometry tables.
DO LANGUAGE 'plpgsql'
$$
DECLARE i int;
BEGIN
FOR i IN 1 .. 1000 LOOP
EXECUTE 'CREATE TABLE gtest' || i::text || '(gid serial primary key, geom geometry(POINT,4326))';
END LOOP;
END
$$
If run with the trigger in place, it takes 63062 ms (a little over a minute to finish). Without the trigger it takes 3,831ms ~ 3seconds). This is because each call to CREATE TABLE causes a REFRESH VIEW. So moral of that story, if you plan to
create 1000 tables in one go, then disable the trigger and manually refresh the view yourself.
ALTER EVENT TRIGGER trig_01_refresh_geometry_columns DISABLE;
ALTER EVENT TRIGGER trig_01_refresh_geometry_columns ENABLE;
REFRESH MATERIALIZED VIEW geometry_columns;
Now to test speed of querying. Since we put the public first in our search_path, when we connect to database, the public.geometry_columns one will be used if we don't schema qualify.
SELECT count(*) from geometry_columns;
SELECT count(*) from postgis.geometry_columns;
SELECT srid, type, count(*) as tot
FROM geometry_columns
GROUP BY srid,type;
SELECT srid, type, count(*) as tot
FROM postgis.geometry_columns
GROUP BY srid,type;