In Part 2 of PL/R we covered how to build PL/R functions that take arrays and output textual outputs of generated R objects. We then used this in an aggregate SQL query using array_agg. Often when you are building PL/R functions you'll have R functions that you want to reuse many times either inside a single PL/R function or across various PL/R functions.
Unfortunately, if you wanted to call a PL/R function from another PL/R function, this is not possible unless you are doing it from spi.execute call. There is another way to embed reusable R code in a PostgreSQL database. In order to be able to share databases stored R code across various PL/R functions, PL/R has a feature called a plr_module. In this tutorial we'll learn how to create and register shareable R functions with plr_module. In the next part of this series we'll start to explore generating graphs with PL/R.
In R you define a function much like you do data. Functions are stored in variables. In fact PL/R functions are stored as anonymous R functions. In our last examples, we had a snippet of code that would output the contents of the screen. In this next example, we'll rewrite our function to define this snippet as a plr_module and then reuse this in our pl/r function.
The plr_module model is described in Loading R Modules at Startup.
PL/R modules are always installed on the startup of the server, and are named R functions. If you install new modules or change existing, you can force a reload by calling the reload_plr_modules() PL/R function. In this next example, we'll define a module function called print_variable that we will then reuse in our functions.
When you install PL/R, it doesn't automatically create a plr_module table. This you do by running the below command:
CREATE TABLE plr_modules (
modseq int4 primary key,
modsrc text
);
The modseq will defin the order in which plr_modules are registered.
Next we register our new function, but before we do, we do a:
SET standard_conforming_strings='on';
Then we register our new R function as a module in the plr_modules table.
INSERT INTO plr_modules
VALUES (1, 'postgis_tests.print_variable <-function(param_variable){
print(sprintf("%s",
paste(capture.output(param_variable) , sep ="", collapse="\r") )
)
}');
Next we do a:
SELECT reload_plr_modules();
To force a registration of our new function. Now we are able to use this new function in our PL/R functions. The below function is exactly like the other tally_report we created except we are also outputting the summary and have encapsulated all our capture output logic in a reusuable R function.
CREATE OR REPLACE FUNCTION tally_report2(param_row_label text, param_col_label text,
param_row_data text[], param_col_data text[])
RETURNS text AS
$$
#create the tally table
p_break <- table(param_row_data, param_col_data)
#rename the headings to be what was passed in
names(dimnames(p_break)) <- list(param_row_label,param_col_label)
return (sprintf("%s \r%s",postgis_tests.print_variable(ftable(p_break))
,postgis_tests.print_variable(summary(p_break))) )
$$ language plr;
Then we use as normal:
-- report on only geometries that are not of some variant of empty or null
-- put in a column F and count those that did not complete
SELECT tally_report2('geometry type', func || ' timings(ms)',
array_agg(g1),
array_agg(
COALESCE(
CAST(date_part('epoch', age(log_end,log_start))*1000 As text),
'F')
)
) As basic_tally
FROM postgis_garden_log
WHERE func LIKE 'ST_3D%'
AND g1 NOT LIKE '%Empt%'
AND g2 NOT LIKE '%Empt%'
AND g1 NOT LIKE '%NULL%'AND g2 NOT LIKE '%NULL%'
GROUP BY func
ORDER BY func LIMIT 1;
Which gives us the below - note that we put in a limit 1, but if we left that out, we'd get a report for each function:
ST_3DClosestPoint timings(ms) 0 15 16 31 32 F geometry type CIRCULARSTRING 0 0 0 0 0 28 CURVEPOLYGON 0 0 0 0 0 28 CURVEPOLYGON2 0 0 0 0 0 28 GEOMETRYCOLLECTIONM 0 0 0 0 0 28 GEOMETRYCOLLECTIONZ 0 0 0 0 0 28 LINESTRING 7 1 4 0 0 16 LINESTRINGM 9 1 2 0 0 16 LINESTRINGZ 8 1 3 0 0 16 MULTILINESTRING 10 0 2 0 0 16 MULTILINESTRINGM 9 2 1 0 0 16 MULTILINESTRINGZ 2 5 4 0 1 16 MULTIPOINT 7 2 3 0 0 16 MULTIPOINTM 7 1 4 0 0 16 MULTIPOINTZ 9 2 1 0 0 16 MULTIPOLYGON 0 0 0 0 0 28 MULTIPOLYGONM 0 0 0 0 0 28 MULTIPOLYGONZ 0 0 0 0 0 28 MULTISURFACE 0 0 0 0 0 28 POINT 8 1 2 1 0 16 POINTM 11 0 1 0 0 16 POINTZ 8 2 2 0 0 16 POLYGON 0 0 0 0 0 28 POLYGONM 0 0 0 0 0 28 POLYGONZ 0 0 0 0 0 28 POLYGONZM 0 0 0 0 0 28 POLYHEDRALSURFACE 0 0 0 0 0 28 TIN 0 0 0 0 0 28 TRIANGLE 0 0 0 0 0 28 Number of cases in table: 784 Number of factors: 2 Test for independence of all factors: Chisq = 346.7, df = 135, p-value = 1.466e-20 Chi-squared approximation may be incorrect
Which tells us there are 784 tests for ST_ClosestPoint executed (all the ones with 28 F means no test involving that type returns an answer).