Categories: Programming
Overview
I recently needed to generate basic reports over usage of a cluster of computer systems. Something like Prometheus would have been overkill; instead I wrote:
- a simple Python program that saved statistics into a Postgres database
- and some database stored procedures in the
plpgsql
language to generate the desired output
It worked quite well. This (very brief) article looks at the PL/pgSQL language and its integration with Postgres, particularly a few things that were not initially clear to me from the online documentation.
Note that “stored procedure” and “custom function” are the same thing, as far as I can dell.
Why Custom Functions
A SQL expression can apply logic to a parameter. There are many standard/built-in functions, eg:
select upper(name), left(team, 10), round(score, 4) from players; # per-record functions
select team, max(score), avg(score) from players group by team; # aggregate function
A custom function (aka stored procedure) allows a developer to provide their own functions that can be invoked in a similar way.
Invoking a Function
A function can be declared as returning a single value, or as returning a table.
When the function returns a value then it can be invoked within a SQL statement as:
-
select myfunction(args);
– when no input table is needed -
select a, b, myfunction(args) from sometable;
– when input-table is needed
When the function returns a table, then it can be invoked as:
select * from myfunction(args);
Invoking via DO
Code can be invoked “inline” in an interactive postgres session without defining a custom function, via:
DO PGSTRING;
where PLSTRING is a string that contains program code in whatever LANGUAGE was specified.
Embedding a program in a string can be tricky - a program quite often has quoted strings of its own. Fortunately, postgres has a generic “raw string quoting” mechanism called “dollar quoting”:
-
$foo$anytext$foo$
quotes the string “anytext”. -
$$anytext$$
also quotes the string “anytext” - as long as$$
does not occur within it.
Sadly some GUI editors (Squirrel in particular) are not dollar-quote aware, and try to split on semicolons within the string.
This is particularly useful for testing.
Defining a Function
The general format for defining a custom function is:
create function function_name(p1 type, p2 type DEFAULT defval, OUT o1 type, OUT o2 type) returns type as
PGSTRING
LANGUAGE plpgsql;
where PLSTRING is a string that contains program code in whatever LANGUAGE was specified.
Embedding a program in a string can be tricky - a program quite often has quoted strings of its own. Fortunately, postgres has a generic “raw string quoting” mechanism called “dollar quoting”:
-
$foo$anytext$foo$
quotes the string “anytext”. -
$$anytext$$
also quotes the string “anytext” - as long as$$
does not occur within it.
Sadly some GUI editors (Squirrel in particular) are not dollar-quote aware, and try to split on semicolons within the string.
Function Input and Output Parameters
The returntype can be TABLE(columndefs)
, as in the following example which uses the plpgsql language:
CREATE OR REPLACE FUNCTION get_film (p_pattern VARCHAR) RETURNS TABLE (film_title VARCHAR, film_release_year INT)
AS $$
BEGIN
RETURN QUERY SELECT
title,
cast( release_year as integer)
FROM
film
WHERE
title ILIKE p_pattern ;
END; $$
LANGUAGE 'plpgsql';
Custom Functions in Python
Python is a supported language for postgres stored procedures but it has some security concerns. It is enabled for a whole database via the command:
CREATE PROCEDURAL LANGUAGE 'plpythonu' HANDLER plpython_call_handler;
The “u” suffix on “plpythonu” indicates it is an “untrusted” runtime, ie code running in this environment within postgres simply executes as the postgres db user without any other constraints on what it can do. Such code can read and write any files the associated user could access, make network connections, start external processes, etc. Unlike Java, CPython currently has no “secure execution mechanism” that allows code to be restricted to just specific subsets of the python standard library.
Custom Functions in PL/PGSQL
Language Overview
PL/PGSQL is a language that is basically a clone of the original PL/SQL embedded language used by Oracle databases. Its syntax initially looks a little odd, but I found it actually quite reasonable to use - particularly given that the “programs” I was writing were basically two or three SQL statements combined with a little logic.
The greatest benefit of PL/PGSQL is that it is a very limited language (compared to Python for example) which is “safe” for the database to execute; there is nothing that a PL/PGSQL “program” executing as a low-privileged user can do to harm the host server, database, or other database users.
Language Structure
DECLARE
..variable..;
..variable..;
BEGIN
..statement..;
..statement..;
END;
Nested blocks can occur within BEGIN/END, just like nested scopes in other programming languages.
Variables Example
DECLARE
counter integer;
payment numeric(11,2) := 20.5;
name varchar := 'foo';
created_at CONSTANT time := now();
BEGIN
counter := counter + 1;
RAISE NOTICE 'counter is %', counter;
RAISE EXCEPTION 'counter invalid: %';
END;