In our PLPython Part 4: PLPython meets aggregates we demonstrated the power of PostgreSQL in combination with PLPython to create a sequence of ascii plots by using the power of aggregation. Our friend Simon Greener over at Spatial Db Advisor told me the example was clever but ascii plots was so 70ish and ugly, why didn't I do SVG plots? He felt people being predominantly visual would be caught up in the ugliness of Ascii plots and miss the point. At least Bob found them neat and can appreciate the elegance in simplicity, thanks Bob..
To make it up to Simon, we shall demonstrate 2 features we left out in our prior article.
If you are into spatial processing in the database or how many ways you can play with XML particularly Oracle Locator, Oracle Spatial, PostGIS and SQL Server 2008, we highly recommend Simon's articles.
In this article we shall demonstrate how to create multi-column aggregates with PL/Python. Note you can do this with any PL or SQL language in PostgreSQL, but Python happens to have a lot of nifty libraries we wanted to put to use. For this exercise, We shall create a function agg_svgschedule. What it will do is plot each grouping of data and return an svg schedule plot for each grouping. This aggregate will take 3 columns - a schedule item, a start date and an end date. The steps we covered in How to create multi-column aggregates, can be applied here.
Our basic approach is as follows:
Installing Py-SVG is pretty easy. At the command-line run the following:
C:\Python25\python /path/to/pysvg/setup.py install
/usr/bin/python2.5/python /path/to/pysvg/setup.py install
Code is shown below and final output
CREATE OR REPLACE FUNCTION svgplot_schedule(param_dataset text)
RETURNS text AS
$$
from SVG import Schedule
title = "Schedule"
g = Schedule.Schedule(dict(
width = 640,
height = 480,
graph_title = title,
show_graph_title = True,
key = False,
scale_x_integers = True,
scale_y_integers = True,
show_data_labels = True,
show_y_guidelines = True,
show_x_guidelines = True,
# show_x_title = True, # not yet implemented
x_title = "Time",
show_y_title = False,
rotate_x_labels = True,
rotate_y_labels = False,
x_label_format = "%m/%d",
timescale_divisions = "1 day",
add_popups = True,
popup_format = "%m/%d/%y",
area_fill = True,
min_y_value = 0,
))
#convert our string of | delimited values to an array
adataset = param_dataset.split('|')
#pipe in converting each string to number
g.add_data(dict(data=adataset, title="Data"))
return g.burn()
$$ LANGUAGE plpythonu;
CREATE OR REPLACE FUNCTION plot_svgschedule_databuild(param_dataset text, param_schitem text, param_schstart date, param_schend date)
RETURNS text AS
$$
if param_dataset is None :
return str(param_schitem) + '|' + str(param_schstart) + '|' + str(param_schend)
return param_dataset + '|' + str(param_schitem) + '|' + str(param_schstart) + '|' + str(param_schend)
$$ LANGUAGE plpythonu;
CREATE AGGREGATE agg_svgschedule(text,date, date) (
SFUNC=plot_svgschedule_databuild,
STYPE=text,
FINALFUNC=svgplot_schedule
);
##Plotting fake data but lets pretend it is real
CREATE TABLE mysvg_scheduledata(pt_id serial, pt_person text, pt_schitem text, pt_start date, pt_end date);
INSERT INTO mysvg_scheduledata(pt_person, pt_schitem, pt_start, pt_end)
SELECT DISTINCT ON(pers, activity, startdate) pers, activity, startdate, enddate
FROM
(SELECT (ARRAY['Jack', 'Jill', 'Tom'])[i] As pers,
(ARRAY['Gym', 'Crew', 'Programming', 'Skiing'])[CAST(random()*10*j As integer) % 3 + 1] As activity,
CURRENT_DATE + CAST(CAST(k As text) || ' day' as interval) As startdate,
CURRENT_DATE + CAST(CAST(k + (CAST(random()*10*j As integer) % 3 + 1) As text) || ' day' as interval) As enddate
FROM generate_series(1,3) As i
CROSS JOIN generate_series(1,4) As j
CROSS JOIN generate_series(1,3) As k
) As foo
ORDER BY pers, activity, startdate;
#and now to test
SELECT pt_person, agg_svgschedule(pt_schitem, pt_start, pt_end) As svggraph
FROM (SELECT pt_person, pt_schitem, pt_start, pt_end
FROM mysvg_scheduledata
ORDER BY pt_person, pt_schitem, pt_start, pt_end) As foo
GROUP BY pt_person
ORDER BY pt_person;
Which gives us this:
pt_person | svggraph |
---|---|
Jack | |
Jill | |
Tom |