In Part 2 of PL/R we covered how to build PL/R functions that take arrays and output textual outputs of generated R objects. We then used this in an aggregate SQL query using array_agg. Often when you are building PL/R functions
you'll have R functions that you want to reuse many times either inside a single PL/R function or across various PL/R functions.
Unfortunately, if you wanted to call a PL/R function from another PL/R function, this is not possible unless you are doing it from spi.execute call.
There is another way to embed reusable R code in a PostgreSQL database.
In order to be able to share databases stored R code across various PL/R functions, PL/R has a feature called a plr_module. In this tutorial
we'll learn how to create and register shareable R functions with plr_module. In the next part of this series we'll start to explore generating graphs with PL/R.
Creating and Using PL/R modules
In R you define a function much like you do data. Functions are stored in variables. In fact PL/R functions
are stored as anonymous R functions.
In our last examples, we had a snippet of code that would output the contents of the screen. In this
next example, we'll rewrite our function to define this snippet as a plr_module and then reuse this in our pl/r function.
The plr_module model is described in Loading R Modules at Startup.
PL/R modules are always installed on the startup of the server, and are named R functions. If you install new modules or change existing, you can force a reload by calling the reload_plr_modules() PL/R function. In this next example, we'll define a module function called print_variable
that we will then reuse in our functions.
When you install PL/R, it doesn't automatically create a plr_module table. This you do by running the below command:
CREATE TABLE plr_modules (
modseq int4 primary key,
modsrc text
);
The modseq will defin the order in which plr_modules are registered.
Next we register our new function, but before we do, we do a:
SET standard_conforming_strings='on';
So we don't have to worry about back slashing our quotes etc in the text.
Then we register our new R function as a module in the plr_modules table.
INSERT INTO plr_modules
VALUES (1, 'postgis_tests.print_variable <-function(param_variable){
print(sprintf("%s",
paste(capture.output(param_variable) , sep ="", collapse="\r") )
)
}');
Next we do a:
SELECT reload_plr_modules();
To force a registration of our new function. Now we are able to use this new function in our PL/R functions. The below function is exactly like the other tally_report we created except we are also outputting the summary
and have encapsulated all our capture output logic in a reusuable R function.
CREATE OR REPLACE FUNCTION tally_report2(param_row_label text, param_col_label text,
param_row_data text[], param_col_data text[])
RETURNS text AS
$$
p_break <- table(param_row_data, param_col_data)
names(dimnames(p_break)) <- list(param_row_label,param_col_label)
return (sprintf("%s \r%s",postgis_tests.print_variable(ftable(p_break))
,postgis_tests.print_variable(summary(p_break))) )
$$ language plr;
Then we use as normal:
SELECT tally_report2('geometry type', func || ' timings(ms)',
array_agg(g1),
array_agg(
COALESCE(
CAST(date_part('epoch', age(log_end,log_start))*1000 As text),
'F')
)
) As basic_tally
FROM postgis_garden_log
WHERE func LIKE 'ST_3D%'
AND g1 NOT LIKE '%Empt%'
AND g2 NOT LIKE '%Empt%'
AND g1 NOT LIKE '%NULL%'AND g2 NOT LIKE '%NULL%'
GROUP BY func
ORDER BY func LIMIT 1;
Which gives us the below - note that we put in a limit 1, but if we left that out, we'd get a report for each function:
ST_3DClosestPoint timings(ms) 0 15 16 31 32 F
geometry type
CIRCULARSTRING 0 0 0 0 0 28
CURVEPOLYGON 0 0 0 0 0 28
CURVEPOLYGON2 0 0 0 0 0 28
GEOMETRYCOLLECTIONM 0 0 0 0 0 28
GEOMETRYCOLLECTIONZ 0 0 0 0 0 28
LINESTRING 7 1 4 0 0 16
LINESTRINGM 9 1 2 0 0 16
LINESTRINGZ 8 1 3 0 0 16
MULTILINESTRING 10 0 2 0 0 16
MULTILINESTRINGM 9 2 1 0 0 16
MULTILINESTRINGZ 2 5 4 0 1 16
MULTIPOINT 7 2 3 0 0 16
MULTIPOINTM 7 1 4 0 0 16
MULTIPOINTZ 9 2 1 0 0 16
MULTIPOLYGON 0 0 0 0 0 28
MULTIPOLYGONM 0 0 0 0 0 28
MULTIPOLYGONZ 0 0 0 0 0 28
MULTISURFACE 0 0 0 0 0 28
POINT 8 1 2 1 0 16
POINTM 11 0 1 0 0 16
POINTZ 8 2 2 0 0 16
POLYGON 0 0 0 0 0 28
POLYGONM 0 0 0 0 0 28
POLYGONZ 0 0 0 0 0 28
POLYGONZM 0 0 0 0 0 28
POLYHEDRALSURFACE 0 0 0 0 0 28
TIN 0 0 0 0 0 28
TRIANGLE 0 0 0 0 0 28
Number of cases in table: 784
Number of factors: 2
Test for independence of all factors:
Chisq = 346.7, df = 135, p-value = 1.466e-20
Chi-squared approximation may be incorrect
Which tells us there are 784 tests for ST_ClosestPoint executed (all the ones with 28 F means no test involving that
type returns an answer).