Friday, May 30. 2014
A while back, we discussed using pgAdmin pgScript as a quick way to run a repetitive update script where you want each loop to commit right away. Since stored functions have to commit as a whole, you can't use stored functions alone for this kind of processing.
Question: Can you do something similar easily with psql?
Answer: Yes, with the \watch command, described nicely by Michael Paquier a while back.
If you are using the psql client packaged with PostgreSQL 9.3 or above, you can take advantage of the \watch command that was introduced in that version of psql. We'll demonstrate it by doing a batch geocoding exercise with the PostGIS tiger geocoder, and also revise our example from the prior article to use the more efficient and terser LATERAL construct introduced in PostgreSQL 9.3.
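To give a taste of where this is going, here is a minimal sketch (the addr_to_geocode table and its columns are hypothetical placeholders, not the article's exact setup): the UPDATE geocodes one small batch of unprocessed rows, and \watch re-runs the buffer every 10 seconds, with each run committing on its own since psql autocommits each statement.

-- geocode 50 not-yet-rated rows per pass
UPDATE addr_to_geocode AS a
   SET rating = g.rating,
       new_address = pprint_addy(g.addy),
       the_geom = g.geomout
  FROM (SELECT addid, (geocode(address, 1)).*
          FROM addr_to_geocode
         WHERE rating IS NULL
         LIMIT 50) AS g
 WHERE g.addid = a.addid
\watch 10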
Continue reading "psql watch for batch processing"
Friday, April 25. 2014
I'm not a big proponent of schemaless designs, but they have their place. One place where I think they are useful is for archiving data: even though the underlying table structure of the data you need to archive changes over time, you want the archived record to keep the same fields it had back then.
This is a case where I think hstore, the way PostgreSQL has it implemented, works pretty nicely.
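A minimal sketch of the idea (the archive_log table, the employees source table, and its status column are hypothetical): hstore(row) turns a whole record into key/value pairs, so each archived row keeps whatever columns existed at archive time.

CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE archive_log (
    archive_id bigserial PRIMARY KEY,
    table_name text NOT NULL,
    archived_at timestamptz NOT NULL DEFAULT now(),
    rec hstore NOT NULL
);

-- hstore(e) converts the entire employees row to key/value pairs
INSERT INTO archive_log(table_name, rec)
SELECT 'employees', hstore(e)
  FROM employees AS e
 WHERE e.status = 'terminated';  -- hypothetical archiving condition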
Side note: one of the new features of PostgreSQL 9.4 is improved GIN indexes (faster and smaller), which are very often used with hstore data (and the new jsonb type). We're really looking forward to the GIN improvements, even more so than the jsonb feature. We're hoping to test this improved index functionality with OpenStreetMap data soon and compare it with our existing PostgreSQL 9.3. OpenStreetMap pbf and osm extract loaders (osm2pgsql, imposm) provide options for loading tagged data into PostgreSQL hstore fields, in addition to PostGIS geometry and other attribute fields, so the 9.4 enhancements should be a nice gift for OSM data users. More on that later.
Continue reading "Using HStore for Archiving"
Saturday, December 21. 2013
One of the features new to PostgreSQL 9.3 is the COPY FROM PROGRAM SQL construct and, in psql, the equivalent \copy from program. Michael Paquier covered COPY TO/FROM PROGRAM in his Postgres 9.3 feature highlight: COPY TO/FROM PROGRAM. Depesz covered the companion psql construct in Support for piping copy to/from an external program.
Michael demonstrated an example using curl. I wanted to try something similar using wget, since I have wget readily available on all my Linux and Unix boxes. For this example I'll demonstrate doing it on Windows, but doing it on Linux is much the same and simpler, since wget and curl are generally already in the default path on Linux.
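For a flavor of the construct, a minimal sketch (the staging table and URL are hypothetical): wget -O - streams the download to stdout, which COPY reads as its input.

-- one text column; COPY's default text format loads each line as a row
CREATE TABLE staging(line text);
-- on Windows, wget must be on the path of the PostgreSQL service account
COPY staging FROM PROGRAM 'wget -q -O - http://example.com/data.txt';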
Continue reading "Using wget directly from PostgreSQL using COPY FROM PROGRAM"
Thursday, August 15. 2013
One of the big changes in PostGIS 2.0 was that the geometry_columns table became a read-only view. This change was not without sacrifice. On the plus side, it meant a table no longer needed to be maintained, which was a relief for those who had to constantly ensure the sideline table was up to date. Also on the plus side, it meant we could base the view on the current user, so that when a user connects they only see the tables they have access to. The sacrifice was a bit of speed. In most use cases the speed difference is hardly noticeable, and only matters if you have a query constantly polling this table. However, if you have a database of 15,000 geometry tables, it could take as long as 450ms, as noted in this ticket.
So, just playing with ideas: could the new PostgreSQL 9.3 event triggers and materialized views provide a solution? Keep in mind this is just food for thought. We're not condoning that people go out and hack their PostGIS install, and we suspect that if we implement such a thing in PostGIS core it will change from what we propose here.
If you saw our prior article, Caching data with materialized views and statement level triggers, you know where this is going. We'll add yet another concept to this cocktail, which we'll call schema hiding: counting on the order of the search_path to hide a named table/view you don't want.
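A minimal sketch of the combination (the postgis_cache schema is a hypothetical name, and this is food for thought, not what PostGIS core would ship): list postgis_cache before public in the search_path so the materialized copy hides the regular view, and let an event trigger keep the copy fresh.

CREATE SCHEMA postgis_cache;

CREATE MATERIALIZED VIEW postgis_cache.geometry_columns AS
SELECT * FROM public.geometry_columns;

CREATE FUNCTION postgis_cache.refresh_geometry_columns()
RETURNS event_trigger AS $$
BEGIN
  REFRESH MATERIALIZED VIEW postgis_cache.geometry_columns;
END;
$$ LANGUAGE plpgsql;

-- only fire on commands that can change the set of geometry tables,
-- which also keeps the REFRESH itself from re-firing the trigger
CREATE EVENT TRIGGER refresh_geom_cols
ON ddl_command_end
WHEN TAG IN ('CREATE TABLE', 'ALTER TABLE', 'DROP TABLE')
EXECUTE PROCEDURE postgis_cache.refresh_geometry_columns();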
Continue reading "Materialized geometry_columns using Event Triggers"
Tuesday, August 13. 2013
One exciting new feature coming in PostgreSQL 9.3 is materialized views. A materialized view is really a mechanism for caching the data of a query. It is especially useful if you have long-running queries where the answers change infrequently. Unfortunately, there are many things materialized views won't do, where you are still better off with regular views:
- They don't refresh themselves automatically, thus requiring a cron/pgAgent job or a trigger on something to refresh them.
- They can't be user dependent or time dependent. For example, if you have a view that does something like WHERE user = current_user, then a materialized view is out of the question.
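For the basic mechanics, a minimal sketch with hypothetical sales tables:

CREATE MATERIALIZED VIEW sales_summary AS
SELECT product_id, sum(amount) AS total_amount, count(*) AS num_sales
  FROM sales
 GROUP BY product_id;

-- since it won't refresh itself, schedule this from cron/pgAgent,
-- or fire it from a statement-level trigger on the underlying table:
REFRESH MATERIALIZED VIEW sales_summary;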
Continue reading "Caching data with materialized views and statement level triggers"
Saturday, April 13. 2013
One of the features of PostGIS (or pains, however you look at it) is that the PostGIS library file is versioned by minor version. The library will be named, for example, postgis-2.0 or postgis-1.5 (.dll / .so) to denote the version. Each version of PostGIS can usually be compiled to run on about 3 or 4 versions of PostgreSQL.
Since PostGIS is not part of PostgreSQL proper and has to be installed separately, it is possible to run a different version of PostGIS in each database of a cluster. While this is a great feature for PostGIS developers, and also great for users who want to keep their old legacy PostGIS apps while testing or creating new apps with PostGIS 2.0 or experimenting with the 2.1 development series, it does pose some obvious challenges.
For example, you can't simply upgrade your cluster to a new version of PostgreSQL; you need to make sure the new cluster has the various versions of PostGIS compiled and available. One step to that end is figuring out exactly what version of PostGIS each database in your cluster is running. Here is a quick psql script I wrote up to help with that.
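The general shape of the approach (a sketch, not the exact script from the full article) is to generate per-database \c and version-check commands, save the output to a file, and run that file back through psql. For 2.0+ extension installs, the pg_extension catalog has what we need:

SELECT '\c ' || quote_ident(datname) || E'\n' ||
       $$SELECT current_database() AS db, extversion
           FROM pg_extension WHERE extname = 'postgis';$$
  FROM pg_database
 WHERE datallowconn;

-- generate the script:  psql -t -A -f gen_checks.sql > run_me.sql
-- then run it:          psql -f run_me.sql

Pre-2.0 non-extension installs would need a call to postgis_lib_version() instead of the pg_extension lookup.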
Continue reading "Determine which version of PostGIS each database is running"
Monday, January 28. 2013
This exercise is a continuation of our How to bulk export tables from MS Access and How to recreate MS Access primary keys and indexes in PostgreSQL. As mentioned in the first article, when you use the built-in export feature of MS Access, it exports autonumber fields as integers instead of the more appropriate PostgreSQL pseudo-type serial.
The serial type in PostgreSQL is not really a type, though, so this behavior is expected. The serial type is really shorthand for defining an integer column and a corresponding sequence object with a specific naming convention, and setting the column's default value to the next value of the sequence. So this is what we will do in this exercise, after we have already exported our data to PostgreSQL. The script we are about to demonstrate will generate an SQL script containing all the CREATE SEQUENCE and ALTER TABLE ALTER COLUMN SET DEFAULT .. statements needed to convert our integer columns to serial columns.
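For a single hypothetical orders.order_id column, the generated statements amount to this:

CREATE SEQUENCE orders_order_id_seq;
ALTER TABLE orders
  ALTER COLUMN order_id SET DEFAULT nextval('orders_order_id_seq');
-- sync the sequence with the data already imported
SELECT setval('orders_order_id_seq', COALESCE(MAX(order_id), 1)) FROM orders;
-- tie the sequence's lifetime to the column, as a true serial would be
ALTER SEQUENCE orders_order_id_seq OWNED BY orders.order_id;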
As a side note, Mark mentioned a similar approach to what we've been describing here, but one that builds all the table and key structures using Python. You can check out the Python script
at http://code.activestate.com/recipes/52267-reverse-engineer-ms-accessjet-databases
Continue reading "How to map MS Access auto number to PostgreSQL serial"
Sunday, January 27. 2013
This exercise is a continuation of our How to bulk export tables from MS Access. While this approach will work for other databases besides PostgreSQL, you'll probably need to fiddle with the subroutine to make it work for some of them. PostgreSQL is fairly ANSI-SQL standard, so not too much fiddling should be required to port it to SQL Server, MySQL, Oracle, etc.
Unlike the prior Visual Basic subroutine we showed that exports the tables, this one just creates an SQL script that you run on the already-created PostgreSQL database containing the exported data. We didn't test the quote option, though we coded it into the subroutine, since as we said we hate having to quote fields. If perchance you are one of those folks who likes to put spaces in your field names to make them more English-like, then you'll need to quote, or revise the other routine to convert your spaces to _ or some other character.
Continue reading "How to recreate MS Access primary keys and indexes in PG"
Despite all the nasty things people say about MS Access, it does have fairly database-agnostic, easy-to-use import and export tools, and a not-too-shabby query wizard and query tool. But of course it's not a server-side database, so at a certain point it won't scale for database storage. You can still use it as a front-end to a server-side database such as PostgreSQL or SQL Server. So once you outgrow it for data storage, you'll probably want a quick way to export your data out. Unfortunately, at least in the 2003 version, while you can bulk-link tables from an ODBC connection, you can only export one table at a time using the "select table" right-click export. In this article we'll show a quick-and-dirty MS Access Visual Basic subroutine we use to export all tables. We've only tested this on MS Access 2003, so if you are using a later version you might need to customize it a bit. This script should work fine for exporting to any database connection with modification of the connection string.
Continue reading "How to bulk export tables from MS Access"
Sunday, January 06. 2013
In the last article we said you can't have a LEFT JOIN with LATERAL. It turns out we were mistaken: YES indeed you can, and when you do it is equivalent to, or more powerful than, SQL Server's OUTER APPLY.
Let us say that in our query we wanted all our zips returned, but for the ones that have keys whose value falls in a certain range, we also want those keys returned. We'd do this. The fact that we need all rows, even those with no such keys, necessitates putting the condition in the ON clause rather than the WHERE.
SELECT zip, (h).key, (h).value As val
FROM zcta5 AS foo
LEFT JOIN LATERAL each(hstore(foo) - 'zip'::text) As h
ON ((h).value BETWEEN '12345' and '14567')
ORDER BY zip
LIMIT 5;
Output would be:
  zip  |  key   |   val
-------+--------+----------
 00601 |        |
 00602 |        |
 00603 |        |
 00606 | awater | 12487.00
 00610 | hu10   | 12618
Monday, November 12. 2012
Issue: Ever had the situation where you decided to delete a table or some other function or type, but you weren't absolutely sure if other functions in the database depended on it?
For things like tables and views that depend on a function, type, or other table/view, PostgreSQL won't allow you to delete the depended-upon object without doing a DROP ... CASCADE.
For stored functions, though, while it will prevent you from deleting a type that the function returns or takes as input, it doesn't save you if the body of the function references those objects. This dependency information is not always known, and in fact may be dynamic, with dynamically generated SQL or schema path settings.
So how do you inspect functions for usage of other items?
Solution:
PostgreSQL has a table called pg_catalog.pg_proc which stores the source code of (non-C) functions in the prosrc column.
So let's say you had a table called employees that you want to get rid of or simply rename. To find all functions that reference the term "employees", you would do something like this:
SELECT proname, proargnames, prosrc
FROM pg_proc
WHERE prosrc ILIKE '%employees%';
Friday, August 10. 2012
In our last article, PL/V8JS and PL/Coffee JSON search requests, we demonstrated how to create a PostgreSQL PL/JavaScript stored function that takes as input a JSON-wrapped search request. We generated the search request using PostgreSQL.
As mentioned, in practice the JSON search request would be generated by a client-side JavaScript API such as JQuery. This time we'll put our stored function to use in a real web app built using PHP and JQuery. The PHP part is fairly minimalistic, just involving a call to the database and returning a single row back. Normally we use a database abstraction layer such as ADODB or PearDB, but this is so simple that we are just going to use the raw PHP PostgreSQL connection library directly. This example requires PHP 5.1+ since it uses the pg_query_params function introduced in PHP 5.1.
Most of the work is happening in the JQuery client-side tier, and the database part we already saw. That said, the PHP part is fairly trivial to swap out with something like ASP.NET or most other web server-side languages.
Continue reading "PLV8JS and PLCoffee Part 2B: PHP JQuery App"
Thursday, August 09. 2012
PostgreSQL 9.2 beta3 was released this week, and so we inch ever closer to final in another 2 months or so. One of the great new features is the built-in JSON type and the companion PLV8/PLCoffee languages that allow for easy processing of JSON objects.
One of the use cases we had in mind is to take as input a JSON search request that in turn returns a JSON dataset.
We'll use our table from PLV8 and PLCoffee Upserting.
Keep in mind that in practice the JSON search request would be generated by a client-side JavaScript API such as our favorite JQuery, but for quick prototyping we'll generate the request in the database with some SQL.
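For example, a mock request can be built with 9.2's row_to_json (the shape of the request object here is hypothetical, not the exact format the full article settles on):

SELECT row_to_json(req) AS search_request
  FROM (SELECT 'zcta5'::text AS table_name,
               ARRAY['zip','awater'] AS return_fields,
               '006%'::text AS zip_like) AS req;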
If you are on Windows and don't have plv8 available, we have PostgreSQL 9.2 64-bit and 32-bit plv8/plcoffee experimental binaries and instructions. We haven't recompiled against 9.2beta3, but our existing binaries seem to work fine on our beta3 install.
Continue reading "PLV8JS and PLCoffee Part 2: JSON search requests"
Saturday, July 21. 2012
Today's modern web application workflow in its simplest form looks something like this:
- Get a dataset as a JSON object, usually using yet another JSON query object to pass the request, with a JavaScript framework like JQuery/ExtJS/OpenLayers/Leaflet etc.
- Make changes to the JSON dataset object and send it back to the web server.
- On the web server, unravel the JSON object and save it to the respective database tables. This part is really yucky, as it often involves the web application server-side language doing the unraveling, and then yet another step of setting up stored procedures or other update logic to consume it.
We hate the way people build tiers for the same reason Cartman hates lines at the amusement park. Sure, tiers are great for certain things, like building connected microcosms, but most of the time they are overkill, and if applied too early they make your application needlessly complicated. In the end, all we care about is data: serving data, analyzing data, getting good data; everything else is just peacock feathers.
The introduction of JSON type support in PostgreSQL 9.2, and the languages PL/V8 (PL/JavaScript) and its Python-esque twin PL/Coffee, provides several options for bringing your data and application closer together, since they have native support for JSON.
In this first part we'll demonstrate one: an upsert stored procedure that takes a single JSON object instead of separate args, updates existing records, and adds missing ones.
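As a rough sketch of the shape such a function takes (the inventory table, its columns, and the exact plv8 argument handling here are hypothetical placeholders, not the article's actual function):

CREATE OR REPLACE FUNCTION upsert_inventory(doc json)
RETURNS void AS $$
  // depending on the plv8 build, a json arg may arrive as a string
  var o = (typeof doc === 'string') ? JSON.parse(doc) : doc;
  // 9.2 has no ON CONFLICT, so try the UPDATE first, INSERT if nothing matched
  var n = plv8.execute(
    'UPDATE inventory SET qty = $1, descr = $2 WHERE item_id = $3',
    [o.qty, o.descr, o.item_id]);
  if (n === 0) {
    plv8.execute(
      'INSERT INTO inventory(item_id, qty, descr) VALUES ($1, $2, $3)',
      [o.item_id, o.qty, o.descr]);
  }
$$ LANGUAGE plv8;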
In later articles we'll show you the front-end app, and also add a sprinkle of PostGIS in there to demonstrate working with custom types.
Continue reading "PLV8JS and PLCoffee Part 1: Upserting"
Tuesday, July 10. 2012
UPDATE TO UPDATE: Bruce Momjian suggested replacing the dynamic SET LOCAL SQL with set_config. We've revised further to incorporate this suggestion; that got rid of our last pet peeve about this function. Thanks all.
UPDATE: Simon Bertrang proposed using SET LOCAL, which seems much nicer. We've updated our function using his revision.
One of PostgreSQL's nice features is its great support for temporal data. In fact, it probably has better support for temporal data than any other database. We'll see more of this power in PostgreSQL 9.2 with the introduction of date/time range types.
One of the features we've appreciated and leveraged quite a bit in our applications is its numerous time-zone-aware functions. In PostgreSQL, the timestamp with time zone data type always stores the time in UTC, but by default displays it in the time zone of the server, session, or user. Now, one of the helper functions we've grown to depend on is to_char(), which supports timestamp and timestamp with time zone among many other types, and allows you to format the pieces of a timestamp any way you like. This function is great except for one small problem: it doesn't allow you to designate the output time zone, and always defaults to the TimeZone setting of the currently running session.
This is normally just fine (since you can combine it with AT TIME ZONE to get a timestamp whose date parts come out right), except for the case when you want your display to output the time zone -- e.g. EDT, EST, PST, PDT, etc. (timestamp without time zone is time-zone unaware). In this article we'll demonstrate a quick hack to get around this issue. First let's take to_char for a spin.
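The hack, sketched here as a hypothetical wrapper (the full version is in the article), temporarily switches the session TimeZone via set_config so that to_char()'s TZ format code renders the zone you asked for:

CREATE OR REPLACE FUNCTION to_char_tz(ts timestamptz, fmt text, tz text)
RETURNS text AS $$
DECLARE
  old_tz text := current_setting('TimeZone');
  result text;
BEGIN
  -- third arg true makes the change local to the current transaction
  PERFORM set_config('TimeZone', tz, true);
  result := to_char(ts, fmt);
  PERFORM set_config('TimeZone', old_tz, true);
  RETURN result;
END;
$$ LANGUAGE plpgsql VOLATILE;

-- example: SELECT to_char_tz(now(), 'YYYY-MM-DD HH24:MI:SS TZ', 'US/Eastern');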
Continue reading "Working with Timezones"