PL/R Part 3: Sharing Functions across PL/R functions with plr_module

In Part 2 of PL/R we covered how to build PL/R functions that take arrays and output textual outputs of generated R objects. We then used this in an aggregate SQL query using array_agg. Often when you are building PL/R functions you'll have R functions that you want to reuse many times either inside a single PL/R function or across various PL/R functions.

Unfortunately, if you wanted to call a PL/R function from another PL/R function, this is not possible unless you are doing it from spi.execute call. There is another way to embed reusable R code in a PostgreSQL database. In order to be able to share databases stored R code across various PL/R functions, PL/R has a feature called a plr_module. In this tutorial we'll learn how to create and register shareable R functions with plr_module. In the next part of this series we'll start to explore generating graphs with PL/R.

Creating and Using PL/R modules

In R you define a function much like you do data. Functions are stored in variables. In fact PL/R functions are stored as anonymous R functions. In our last examples, we had a snippet of code that would output the contents of the screen. In this next example, we'll rewrite our function to define this snippet as a plr_module and then reuse this in our pl/r function.

The plr_module model is described in Loading R Modules at Startup.

PL/R modules are always installed on the startup of the server, and are named R functions. If you install new modules or change existing, you can force a reload by calling the reload_plr_modules() PL/R function. In this next example, we'll define a module function called print_variable that we will then reuse in our functions.

When you install PL/R, it doesn't automatically create a plr_module table. This you do by running the below command:

CREATE TABLE plr_modules (
  modseq int4 primary key,
  modsrc text
);

The modseq will defin the order in which plr_modules are registered.

Next we register our new function, but before we do, we do a:

SET standard_conforming_strings='on';
So we don't have to worry about back slashing our quotes etc in the text.

Then we register our new R function as a module in the plr_modules table.

INSERT INTO plr_modules
  VALUES (1, 'postgis_tests.print_variable <-function(param_variable){
  print(sprintf("%s",  
        paste(capture.output(param_variable) , sep ="", collapse="\r") )
        )
    }');

Next we do a:

SELECT reload_plr_modules();

To force a registration of our new function. Now we are able to use this new function in our PL/R functions. The below function is exactly like the other tally_report we created except we are also outputting the summary and have encapsulated all our capture output logic in a reusuable R function.


CREATE OR REPLACE FUNCTION tally_report2(param_row_label text, param_col_label text,
    param_row_data text[], param_col_data text[]) 
  RETURNS text AS
$$
  #create the tally table
  p_break <- table(param_row_data, param_col_data)
  
  #rename the headings to be what was passed in
  names(dimnames(p_break)) <- list(param_row_label,param_col_label)

  return (sprintf("%s \r%s",postgis_tests.print_variable(ftable(p_break)) 
        ,postgis_tests.print_variable(summary(p_break))) )
$$ language plr;

Then we use as normal:

-- report on only geometries that are not of some variant of empty or null
-- put in a column F and count those that did not complete
SELECT tally_report2('geometry type',  func || ' timings(ms)',
    array_agg(g1), 
    array_agg(
        COALESCE(
            CAST(date_part('epoch', age(log_end,log_start))*1000 As text), 
            'F')
        ) 
    ) As basic_tally 
FROM postgis_garden_log
WHERE func LIKE 'ST_3D%'
AND g1 NOT LIKE '%Empt%' 
    AND g2 NOT LIKE '%Empt%' 
    AND g1 NOT LIKE '%NULL%'AND g2 NOT LIKE '%NULL%'
GROUP BY func
ORDER BY func LIMIT 1;

Which gives us the below - note that we put in a limit 1, but if we left that out, we'd get a report for each function:

                    ST_3DClosestPoint timings(ms)  0 15 16 31 32  F
geometry type                                                      
CIRCULARSTRING                                     0  0  0  0  0 28
CURVEPOLYGON                                       0  0  0  0  0 28
CURVEPOLYGON2                                      0  0  0  0  0 28
GEOMETRYCOLLECTIONM                                0  0  0  0  0 28
GEOMETRYCOLLECTIONZ                                0  0  0  0  0 28
LINESTRING                                         7  1  4  0  0 16
LINESTRINGM                                        9  1  2  0  0 16
LINESTRINGZ                                        8  1  3  0  0 16
MULTILINESTRING                                   10  0  2  0  0 16
MULTILINESTRINGM                                   9  2  1  0  0 16
MULTILINESTRINGZ                                   2  5  4  0  1 16
MULTIPOINT                                         7  2  3  0  0 16
MULTIPOINTM                                        7  1  4  0  0 16
MULTIPOINTZ                                        9  2  1  0  0 16
MULTIPOLYGON                                       0  0  0  0  0 28
MULTIPOLYGONM                                      0  0  0  0  0 28
MULTIPOLYGONZ                                      0  0  0  0  0 28
MULTISURFACE                                       0  0  0  0  0 28
POINT                                              8  1  2  1  0 16
POINTM                                            11  0  1  0  0 16
POINTZ                                             8  2  2  0  0 16
POLYGON                                            0  0  0  0  0 28
POLYGONM                                           0  0  0  0  0 28
POLYGONZ                                           0  0  0  0  0 28
POLYGONZM                                          0  0  0  0  0 28
POLYHEDRALSURFACE                                  0  0  0  0  0 28
TIN                                                0  0  0  0  0 28
TRIANGLE                                           0  0  0  0  0 28 
Number of cases in table: 784 
Number of factors: 2 
Test for independence of all factors:
	Chisq = 346.7, df = 135, p-value = 1.466e-20
	Chi-squared approximation may be incorrect

Which tells us there are 784 tests for ST_ClosestPoint executed (all the ones with 28 F means no test involving that type returns an answer).