Procedural language

AgensGraph can also write user-defined functions in languages other than SQL and C. These other languages are generally referred to as procedural languages (PLs). In the case of functions written in procedural languages, the database server cannot interpret the function’s source text. Thus, the task is passed to a special handler that knows the details of the language. The handler performs all the tasks, including parsing, syntax analysis, and execution. The handler itself is a C language function that, like any C other functions, is compiled into a shared object and loaded on demand. Current AgensGraph has four procedural languages: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python.

Installation

Procedural languages must be installed in each database to be used. However, the procedural languages installed in the template1 database will be automatically available in all subsequent databases created, as the entries in template1 are copied by CREATE DATABASE. The database administrator can determine which languages can be used in which databases and can make only certain languages available, if needed.

In the case of languages provided with the standard distribution, you only need to run CREATE EXTENSION language_name to install them in the current database. Alternatively, you may use the program createlang to do this from the shell command line. For example, to install a language called PL/Python in the template1 database, see the following example:

createlang plpython template1

The manual procedure described below is recommended only if you are installing a language that is not packaged into extension.

Manual Procedural Language Installation
Procedural languages can be installed in a database in five steps and must be done by the database superuser. In most cases, the necessary SQL commands can be packaged into an extension’s install script and executed using CREATE EXTENSION.

  1. The shared object of the language handler should be compiled and installed in an appropriate library directory. It works the same way as building and installing modules with regular user-defined C functions. Often the language handler relies on external libraries that provide the actual programming language engine; in such a case, the share object must be installed.

  2. The handler must be declared as a command.

    CREATE FUNCTION handler_function_name()
        RETURNS language_handler
        AS 'path-to-shared-object'
        LANGUAGE C;
    

    The special return type of language_handler tells the database system that this function does not return one of the defined SQL data types and cannot be used directly in the SQL statement.

  3. Optionally, the language handler can provide an “inline” handler function that executes anonymous code blocks (DO commands) written in this language. If an inline handler function is provided by the language, it must be declared with the following command:

    CREATE FUNCTION inline_function_name(internal)
        RETURNS void
        AS 'path-to-shared-object'
        LANGUAGE C;
    
  4. Optionally, the language handler can provide a “validator” function that checks accuracy of the function definition without actually executing it. The validator function is called by CREATE FUNCTION. If the validator function is provided by the language, you must declare it with the following command:

    CREATE FUNCTION validator_function_name(oid)
        RETURNS void
        AS 'path-to-shared-object'
        LANGUAGE C STRICT;
    
  5. Finally, PL must be declared with the following command:

    CREATE [TRUSTED] [PROCEDURAL] LANGUAGE language-name
        HANDLER handler_function_name
        [INLINE inline_function_name]
        [VALIDATOR validator_function_name] ;
    

    The optional keyword TRUSTED specifies that the language does not grant access rights to data that will not be used by the user. Trusted languages are designed for general database users (i.e. users without superuser privileges) and can safely create functions and trigger procedures. As the PL function is executed within the database server, the TRUSTED flag should only be provided for languages that do not allow access to the database server or file system. PL/pgSQL, PL/Tcl, and PL/Perl languages are considered to be trusted. Languages PL/TclU, PL/PerlU, and PL/PythonU are designed to provide unlimited functionality and should not be marked as trusted.

    In the default AgensGraph installation, a handler for the PL/pgSQL language is built and installed in the “library” directory. The PL/pgSQL language itself is installed in every database. Although Tcl support is configured and handlers for PL/Tcl and PL/TclU are built and installed in the library directory, the languages themselves are not installed by default in the database. Likewise, even if Perl support is configured, PL/Perl and PL/PerlU handlers are built and installed, Python support is configured, and a PL/PythonU handler is installed, these languages are not installed by default.

PL/pgSQL

Introduction

PL/pgSQL is a loadable procedural language in AgensGraph. The design objective of PL/pgSQL is to be a loadable procedural language with the following features:

  • Can be used to create functions and trigger procedures;

  • Adds control structures to the SQL language;

  • Can perform complex computations;

  • Inherits all user defined types, functions and operators;

  • Can be defined to be trusted by the server;

  • Is easy to use.

Functions created in PL/pgSQL can be used wherever built-in functions can be used. For example, you can create a function that processes complex conditions, and later define the function as an operator or use it in index expression.

In AgensGraph, PL/pgSQL is installed by default. However, as it is still a loadable module, administrators who are strictly security-conscious can remove PL/pgSQL.

Advantages of PL/pgSQL
SQL is a query language used in databases. Although SQL is easy to learn, all SQL statements must be executed separately per statement in the database server.
In other words, a client application sends a query to the database server individually, waits until each query is processed, takes the result, computes it, and then sends the next query to the server. These processes generate internal processing and cause network load if the database and client are on different machines.
As PL/pgSQL is easy to use in procedural languages and SQL is easy to use, you can group queries and operations within a database server and save on client/server communications loads.

  • Eliminate unnecessary communications between client and server.

  • Clients do not have to hold unnecessary intermediate results or to transfer them between client and server.

  • You do not need to do repeated query parsing.

Because of these factors, you can expect a noticeable performance improvement over applications that do not use stored functions.
In addition, PL/pgSQL may use all data types, operations, and functions of SQL.

Supported argument and result data types
Functions written in PL/pgSQL can accept scalar or array data types supported by the server as arguments and can return results. You can also use or return a specified complex type (row type). PL/pgSQL functions can also return records.

PL/pgSQL functions can be declared using the VARIADIC marker to allow arguments of varying numbers. This works in exactly the same way as SQL functions.

PL/pgSQL functions can be declared to accept and return various types, such as anyelement, anyarray, anynonarray, anyenum, and anyrange. The actual data types handled by the polymorphic function may vary from call to call.

Structure of PL/pgSQL

Functions written in PL/pgSQL are defined in the server by executing CREATE FUNCTION (command) as follows:

CREATE FUNCTION somefunc(integer, text) RETURNS integer
AS 'function body text'
LANGUAGE plpgsql;

The function body is simply a literal string associated with CREATE FUNCTION. It is more helpful to use dollar ($) quotes to write function bodies than to use a usual single quotes syntax. If there is no dollar citation mark, you should escape it by doubling single quotes or backslashes of the function body. Almost all examples in this section use dollar quote literals in function bodies.

PL/pgSQL is a block-structured language. The full text of a function definition should be a block. The block is defined as follows:

[ <<label>> ]
[ DECLARE
    declarations ]
BEGIN
    statements
END [ label ];

Each declaration and each statement within a block ends with a semicolon. A block appearing within other block, as shown above, should have a semicolon after END. However, a semicolon is not required at the end of the function body.

Tip : In general, they mistakenly use a semicolon immediately after BEGIN very often. This is incorrect and causes a syntax error.

Label is needed only if you want to identify a block to use in an EXIT statement or to restrict the name of a variable declared in a block. If label is positioned after END, it must match the label at the beginning of the block. Identifiers, like regular SQL commands, are converted to lower case by default if they do not have double quotes.

Annotations work the same way in PL/pgSQL code as in regular SQL. Two dashes (–) are recognized as a comment (from the beginning to the end of the line). To process a comment as a block, include the comment between /* and */.

Every statement in a statement section of a block can be a sub-block. Subblocks may be used for logical grouping or localization of a small group of variables. A variable declared in a subblock masks a variable of a similar name in an outer block for the duration of the subblock. By specifying a block name, you may access an external variable.

CREATE FUNCTION somefunc() RETURNS integer AS $$

DECLARE
    quantity integer := 30;
BEGIN
    RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 30
    quantity := 50;
    --
    -- Create a subblock
    --
    DECLARE
        quantity integer := 80;
    BEGIN
        RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 80
        RAISE NOTICE 'Outer quantity here is %', outerblock.quantity;  -- Prints 50
    END;

    RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 50

    RETURN quantity;
END;
$$ LANGUAGE plpgsql;

Note : Actually there is a hidden “outer block” surrounding the body of the PL/pgSQL function. This block provides the function’s parameter declarations (if any) and special variables such as FOUND. The outer block is labeled with the name of the function. That is, you can limit parameters and special variables to the name of the function.

It is important that using BEGIN/END to group statements in PL/pgSQL is not to be confused with similarly-named SQL commands for transaction control. BEGIN/END in PL/pgSQL is only used for grouping; it does not start/end any transaction. Functions and trigger procedures are always executed within a transaction set by an outer query. You cannot start or commit a transaction because there is no context to start the transaction. However, you may construct a sub-transaction that can be rolled back without affecting external transactions since a block contains an EXCEPTION clause.

Declarations

All variables used in a block should be declared in the declaration section of the block. (The only exception is that a loop variable in a FOR loop that repeats a range of integers is automatically declared as an integer variable, and a loop variable in a FOR loop that queries the result of cursor is automatically declared as a record variable.) PL/pgSQL variables can have any SQL data type, such as integer, varchar, and char.

Below are some examples of variable declarations:

user_id integer;
quantity numeric(5);
url varchar;
myrow tablename%ROWTYPE;
myfield tablename.columnname%TYPE;
arow RECORD;

The following is a general syntax of a variable declaration:

name [ CONSTANT ] type [ COLLATE collation_name ] [ NOT NULL ] [ { DEFAULT | := | = } expression ];

If a DEFAULT clause is given, it specifies the initial value assigned to the variable at block input. If a DEFAULT clause is not provided, the variable is initialized to the SQL null value. The CONSTANT option prevents variables from being allocated after initialization so that the value remains constant for the duration of the block. The COLLATE option specifies collation to use for variables. If NNOT NULL is specified, a runtime error occurs if a null value is assigned at execution. All variables declared as NOT NULL should specify a non-null default value. The equal sign (=) can be used in place of “:=”, which is compatible with PL/SQL.

The default value of a variable is evaluated and assigned to a variable whenever a block is entered (not once per function call). For example, if you assign now() to a variable of type timestamp, the function will have the current function call time, not the precompiled time.

quantity integer DEFAULT 32;
url varchar := 'http://mysite.com';
user_id CONSTANT integer := 10;

Function parameter declaration
The parameters passed to the function are named identifiers $1, $2, and so on. Optionally, you can declare an alias for the $n parameter name to improve readability. You may then use the alias or numeric identifier to refer to the parameter value.

There are two ways to create an alias. Naming a parameter in CREATE FUNCTION command is a preferred way.

CREATE FUNCTION sales_tax(subtotal real) RETURNS real AS $$
BEGIN
    RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;

Another way is to explicitly declare the alias using declaration syntax.

name ALIAS FOR $n;

Below is an example of this style:

CREATE FUNCTION sales_tax(real) RETURNS real AS $$
DECLARE
    subtotal ALIAS FOR $1;
BEGIN
    RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;

Note: These two examples are not exactly the same. In the first case, subtotal can be referenced as sales_tax.subtotal, but in the second case it cannot be marked as subtotal. (If you attach label to an inner block, the partial label can be confined to that label instead.

When a PL/pgSQL function is declared as an output parameter, the output parameter has a $n name and an optional alias in the same way as a normal input parameter. The output parameter effectively represents a variable that starts with NULL. This should be assigned during execution of the function. The final value of the parameter is the value to be returned. For example, the sales-tax example could be:

CREATE FUNCTION sales_tax(subtotal real, OUT tax real) AS $$
BEGIN
    tax := subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;

Output parameters are most useful when returning multiple values. Here is a simple example:

CREATE FUNCTION sum_n_product(x int, y int, OUT sum int, OUT prod int) AS $$
BEGIN
    sum := x + y;
    prod := x * y;
END;
$$ LANGUAGE plpgsql;

This effectively creates an anonymous record type for the result of the function. If a RETURNS clause is provided, a RETURNS record must be specified.

Another way to declare a PL/pgSQL function is to use RETURNS TABLE. For example:

CREATE FUNCTION extended_sales(p_itemno int)
RETURNS TABLE(quantity int, total numeric) AS $$
BEGIN
    RETURN QUERY SELECT s.quantity, s.quantity * s.price FROM sales AS s
                 WHERE s.itemno = p_itemno;
END;
$$ LANGUAGE plpgsql;

This is exactly the same as declaring one or more OUT parameters and specifying RETURNS SETOF sometype.

Alias
You can declare alias for all variables as well as function parameters. Alias is used, in an actual case, to assign a different name to a variable with a predefined name, such as NEW or OLD, in the trigger procedure. For example:

DECLARE
  prior ALIAS FOR old;
  updated ALIAS FOR new;

Since ALIAS creates two methods of naming the same object, its unrestricted use can be confusing. It is best to use ALIAS only for the purpose of ignoring predefined names.

Expressions

All expressions used in PL/pgSQL statements are processed using the server’s main SQL launcher. The PL/pgSQL statements can be written as follows:

IF expression THEN ...

PL/pgSQL evaluates expressions by providing the same query as the main SQL engine. During configuration of SELECT command, the PL/pgSQL variable name is changed to a parameter. This allows you to prepare a query plan for SELECT once and then reuse it for subsequent evaluations using different values of the variable. For example, if you write two integer variables x and y as

IF x < y THEN ...

it is used as follows:

PREPARE statement_name(integer, integer) AS SELECT $1 < $2;

This prepared statement is performed with the value of the PL/pgSQL variable provided each time the IF statement is executed. In general, specific information like this is not that useful for PL/pgSQL users. However, it is worth knowing in case of a problem diagnosis.

Basic Statements

This section and the subsequent sections describe all statement types that PL/pgSQL explicitly understands. Those that are not recognized as part of these statement types are considered SQL commands and are sent to the underlying database engine for execution.

Assignment
To assign a value to a PL/pgSQL variable:

variable { := | = } expression;

As mentioned earlier, the expressions in these statements are obtained using SQL SELECT (command) sent to the underlying database engine. An expression should have a single value (it can be a row value if the variable is a row or record variable). The target variable can be a simple variable (optionally defined by a block name), a row or record variable field, or an array element that is a simple variable or field. The equal sign (=) can be used in place of “:=”, which is compatible with PL/SQL.

If the result data type of the expression does not match the data type of the variable or does not match a specific length or precision, the PL/pgSQL interpreter attempts to implicitly convert the result value using the result type of the output function and the variable type of the input function. If the string form of the result value is of a type that is not allowed in the input function, then the input function may generate a run-time error.

For example:

tax := subtotal * 0.06;
my_record.user_id := 20;

Executing a command without a result
For SQL commands that do not return a row (for example, an INSERT without a RETURNING clause), you can execute the commands within a PL/pgSQL function simply by writing them.

The PL/pgSQL variable names that appear in the command text are treated as parameters, and the current values of the variables are provided as the parameter values at runtime. This is exactly the same as the process described hereinabove for expressions.

When you execute an SQL command in this way, PL/pgSQL can cache and reuse the command execution plan.

Calling a function without a useful result value can be sometimes useful for evaluating expressions or SELECT queries, and the results can be ignored. To do this in PL/pgSQL, we recommend you to use the PERFORM statement.

PERFORM query;

The query is then executed and the results are discarded. You should write your query the same way you would write an SQL SELECT command, and replace the initial keyword SELECT with PERFORM. For WITH queries, use PERFORM and put the query into parentheses (in this case, the query can only return one row). As with commands that do not return results, the PL/pgSQL variable is replaced by the query, and the plan is cached in the same manner. The special variable FOUND is set to true if the query generated at least one row, and set to false if no rows were generated.

Note: You may expect to get this result by writing your own SELECT, but for now, PERFORM is the only way to do it. An SQL command that can return a row such as SELECT is treated as an error and rejected if it is without an INTO clause described in the next section.

For example:

PERFORM create_mv('cs_session_page_requests_mv', my_query);

Run a query as a single row result
A single row (possibly multiple columns) resulting from an SQL command can be assigned to a record variable, row-type variable, or scalar variable list that you create. This is done by writing a basic SQL command and adding an INTO clause. For example, target can be a record variable, a row variable, or a comma-delimited simple variable, and a list of record/row fields.

SELECT select_expressions INTO [STRICT] target FROM ...;
INSERT ... RETURNING expressions INTO [STRICT] target;
UPDATE ... RETURNING expressions INTO [STRICT] target;
DELETE ... RETURNING expressions INTO [STRICT] target;

For commands that do not return rows, the PL/pgSQL variable is replaced with the rest of the query, as described above, and the plan is cached. It works in SELECT (INSERT/UPDATE/DELETE that uses RETURNING) and works according to utility commands that return row set results (e.g. EXPLAIN). Except for the INTO clause, SQL commands are the same as those written outside of PL/pgSQL.

Tip: To create a table as a result of SELECT within a PL/pgSQL function, you should use the CREATE TABLE ... AS SELECT statement.

When using a row or variable list as a target, the result column of a query should exactly match the target structure for the numbers and data types. Otherwise, a runtime error will occur. When a record variable is a target, it automatically configures row types of the query result columns.

The INTO clause may appear at almost any position in an SQL command. It is usually written immediately before or after the select_expressions list in SELECT command, or at the end of commands for other command types.

If STRICT is not specified in an INTO clause, the target is set to the first row returned by the query, or set to null if the query did not return a row. (The “first row” is not well defined unless you use ORDER BY.) The result row after the first row is discarded. You can check the special FOUND variable to see if the row has been returned.

SELECT * INTO myrec FROM emp WHERE empname = myname;
IF NOT FOUND THEN
    RAISE EXCEPTION 'employee % not found', myname;
END IF;

In the case where a STRICT option is specified, the query should return exactly a single row.
Otherwise, either NO_DATA_FOUND (no rows) or TOO_MANY_ROWS (two or more rows) is reported as a run-time error. If you want to find certain errors as follows, you may use an exception block.

BEGIN
    SELECT * INTO STRICT myrec FROM emp WHERE empname = myname;
    EXCEPTION
        WHEN NO_DATA_FOUND THEN
            RAISE EXCEPTION 'employee % not found', myname;
        WHEN TOO_MANY_ROWS THEN
            RAISE EXCEPTION 'employee % not unique', myname;
END;

Successful execution of STRICT always sets FOUND to true.

In the case of INSERT/UPDATE/DELETE with RETURNING, PL/pgSQL will report an error for more than one returned row, even if STRICT is not specified. This is because there is no such option as ORDER BY to determine if an affected row should be returned.

In the case where print_strict_params is enabled for a function and an error occurs due to unmet requirements of STRICT, DETAIL of the error message contains information about the parameters passed to the query. You can change the print_strict_params setting for all functions by setting plpgsql.print_strict_params; it is, however, affected only by editing of the subsequence function. It is also possible to enable it on a function-by-function basis using compiler options. For example:

CREATE FUNCTION get_userid(username text) RETURNS int
AS $$
#print_strict_params on
DECLARE
userid int;
BEGIN
    SELECT users.userid INTO STRICT userid
        FROM users WHERE users.username = get_userid.username;
    RETURN userid;
END
$$ LANGUAGE plpgsql;

If unsuccessful, this function may generate an error message as follows:

ERROR:  query returned no rows
DETAIL:  parameters: $1 = 'nosuchuser'
CONTEXT:  PL/pgSQL function get_userid(text) line 6 at SQL statement

Note: The STRICT option matches behavior of SELECT INTO and related statements in Oracle PL/SQL.

Executing dynamic commands
You will often need to create a dynamic command (i.e. a command containing different tables or different data types each time it is executed) within a PL/pgSQL function. The general attempt by PL/pgSQL to cache plans for commands does not work in this scenario. An EXECUTE statement is provided to handle this kind of problem.

EXECUTE command-string [ INTO [STRICT] target ] [ USING expression [, ... ] ];

Here, command-string is an expression that creates a string (type text) containing a command to execute. An optional target is a record variable, a row variable, or a comma-delimited simple variable, and a record/line field list where the result of the command will be stored. The optional USING expression provides a value to be inserted into the command.

In a calculated command string, PL/pgSQL variables cannot be replaced. All necessary variable values should be inserted into the command string as configured. Alternatively, you can use parameters as described below.

There is also no caching plan for commands executed via EXECUTE. Instead, commands are always scheduled each time the statement is executed. Therefore, the command strings can be generated dynamically within a function to perform actions in other tables and columns.

The INTO clause specifies the result of an SQL command that assigns the results of returned rows. When a row or variable list is provided, it should exactly match the structure of the query result (if a record variable is used, it is automatically configured to match the result structure). If multiple rows are returned, only the first row is assigned to the INTO variable; NULL is assigned to the INTO variable, if no rows are returned. If the INTO clause is not specified, the query result is discarded.

Given a STRICT option, an error is reported if the query does not generate exactly a single row.

Command strings may use parameter values referenced in commands as $1, $2, and so on. These symbols refer to values provided in the USING clause. This method is often preferred over inserting data values into a command string as text. That is, the runtime overhead of converting values into text and back can be avoided, and SQL-injection attacks occur less often as it does not require quotes or escapes. For example:

EXECUTE 'SELECT count(*) FROM mytable WHERE inserted_by = $1 AND inserted <= $2'
   INTO c
   USING checked_user, checked_date;

Parameter symbols can only be used for data values. To use dynamically-determined table or column names, you must insert the corresponding characters into the command string. For example, if you need to perform the preceding query on a dynamically selected table, you can do the following:

EXECUTE 'SELECT count(*) FROM '
    || quote_ident(tabname)
    || ' WHERE inserted_by = $1 AND inserted <= $2'
   INTO c
   USING checked_user, checked_date;

A simpler approach will be to use %I of format() for the table or column name (newline-delimited, concatenated strings).

EXECUTE format('SELECT count(*) FROM %I '
   'WHERE inserted_by = $1 AND inserted <= $2', tabname)
   INTO c
   USING checked_user, checked_date;

Another limitation on parameter symbols is that they only work with SELECT, INSERT, UPDATE and DELETE commands. In other syntax types (commonly referred to as utility syntax), you should insert values in text, even if they are data values.

As in the first example above, EXECUTE with a simple constant command string and USING parameters is functionally equivalent to writing commands directly in PL/pgSQL and allowing PL/pgSQL variables to be automatically replaced. The important difference is that EXECUTE re-plans the commands at each run to generate a plan associated with the current parameter values. On the other hand, PL/pgSQL may create a general plan and cache it for reuse. In situations where the best planning depends heavily on parameter values, it is a good idea to use EXECUTE to make sure that the general plan is not selected.

SELECT INTO is not currently supported within EXECUTE. Instead, you should issue a generic SELECT command and specify INTO as part of EXECUTE.

Control Structures

The control structure is probably the most useful and important part of PL/pgSQL. The PL/pgSQL control structure allows you to manipulate AgensGraph data in a very flexible and powerful way.

Return from a function
There are two commands that can return data from functions: RETURN and RETURN NEXT.

  1. RETURN

RETURN expression;

RETURN with an expression terminates the function and returns the value of the expression to the caller. This format is used for PL/pgSQL functions that do not return sets.
In a function that returns a scalar type, the result of the expression is automatically converted to the return type of the function as described for the assignment. However, to return a composite (row) value, you should write an expression that correctly conveys the set of columns requested. This may require explicit type conversion.
If you declare a function using an output parameter, you need to write only RETURN without an expression. The current value of the output parameter variable is returned.
If you declare a function that returns void, you can use the RETURN statement to terminate the function early. Do not write expressions after RETURN.
The return value of a function cannot remain undefined. If control reaches the end of the top-level block of the function without using a RETURN statement, a run-time error occurs. However, this restriction does not apply to functions with output parameters and functions that return void. In such a case, the RETURN statement is automatically executed when the top-level block completes.
For example:

-- functions returning a scalar type
RETURN 1 + 2;
RETURN scalar_var;

-- functions returning a composite type
RETURN composite_type_var;
RETURN (1, 2, 'three'::text);  -- must cast columns to correct types
  1. RETURN NEXT and RETURN QUERY

RETURN NEXT expression;
RETURN QUERY query;
RETURN QUERY EXECUTE command-string [ USING expression [, ... ] ];

If a PL/pgSQL function is declared to return SETOF sometype, you should follow a slightly-different procedure. In such a case, the individual items to be returned are specified in the order of RETURN NEXT or RETURN QUERY, and the final RETURN command without arguments is used to indicate that execution of the function is complete. RETURN NEXT can be used with both scalar and complex data types. The entire “table” of the result is returned as a composite result type. RETURN QUERY adds the result of executing the query to the result set of the function. RETURN NEXT and RETURN QUERY can be mixed freely in a single set-returning function, in which case the results are concatenated.

RETURN NEXT and RETURN QUERY are not actually returned by a function. You can simply add zero or more rows to the result set of the function. Execution continues with the next statement of the PL/pgSQL function. The result set is built when a sequential RETURN NEXT or RETURN QUERY command is executed. The last return that should have no argument causes control to terminate the function (or control may reach the end of the function).

RETURN QUERY has a variant RETURN QUERY EXECUTE that specifies the queries to be executed dynamically. Parameter expressions, like EXECUTE, can be inserted into a query string computed via USING.

If you declare a function using an output parameter, you should write RETURN NEXT without an expression. At each run, the current value of the output parameter variable is stored for the final return of the result row. If there are multiple output parameters, declare a function that returns a SETOF record; if there is only one output parameter of type sometype, then you should declare SETOF sometype to create a set-returning function using the output parameter.

The following is an example of a function that uses RETURN NEXT.

CREATE TABLE foo (fooid INT, foosubid INT, fooname TEXT);
INSERT INTO foo VALUES (1, 2, 'three');
INSERT INTO foo VALUES (4, 5, 'six');

CREATE OR REPLACE FUNCTION get_all_foo() RETURNS SETOF foo AS
$BODY$
DECLARE
    r foo%rowtype;
BEGIN
    FOR r IN
        SELECT * FROM foo WHERE fooid > 0
    LOOP
        -- can do some processing here
        RETURN NEXT r; -- return current row of SELECT
    END LOOP;
    RETURN;
END
$BODY$
LANGUAGE plpgsql;

SELECT * FROM get_all_foo();

Here is an example of a function that uses RETURN QUERY.

CREATE FUNCTION get_available_flightid(date) RETURNS SETOF integer AS
$BODY$
BEGIN
    RETURN QUERY SELECT flightid
                   FROM flight
                  WHERE flightdate >= $1
                    AND flightdate < ($1 + 1);

    -- Since execution is not finished, we can check whether rows were returned
    -- and raise exception if not.
    IF NOT FOUND THEN
        RAISE EXCEPTION 'No flight at %.', $1;
    END IF;

    RETURN;
 END
$BODY$
LANGUAGE plpgsql;

-- Returns available flights or raises exception if there are no
-- available flights.
SELECT * FROM get_available_flightid(CURRENT_DATE);

Note: As described hereinabove, the current implementation of RETURN NEXT and RETURN QUERY stores the entire result set before the function is returned. In other words, performance can be degraded if a PL/pgSQL function generates a very large result set. Data is written to disk to avoid memory depletion, but the function itself is not returned until the entire result set is created. The point at which the current data begins to be written to disk is controlled by the work_mem configuration variable. Administrators having memory large enough to store a bigger volume of result sets should increase this parameter.

Simple loop
By using the LOOP, EXIT, CONTINUE, WHILE, FOR, and FOREACH statements, you can collate a series of commands so that the PL/pgSQL function can repeat them.

  1. LOOP

[ <<label>> ]
LOOP
    statements
END LOOP [ label ];

LOOP defines an unconditional loop that repeats indefinitely until terminated by an EXIT or RETURN statement. Optional label is used by EXIT and CONTINUE statements within the nested loop to specify the loop to which the statement refers.

  1. EXIT

EXIT [ label ] [ WHEN boolean-expression ];

If label is not specified, the innermost loop is terminated and the statement following END LOOP is executed next. The given label should be a label of the current or some outer level of the nested loop or block. Then the named loop or block is terminated and the statement continues after the corresponding END of the loop/block.

If WHEN is specified, loop terminates only if boolean-expression is true. Otherwise, control passes to the statement after EXIT.

EXIT can be used in any type of loop. It is not limited to the use with an unconditional loop.

When used with a BEGIN block, EXIT passes control to the next statement after the block terminates. Label should be used for this purpose. EXIT without label is not considered to match the BEGIN block.

Here is an example:

LOOP
    -- some computations
    IF count > 0 THEN
        EXIT;  -- exit loop
    END IF;
END LOOP;

LOOP
    -- some computations
    EXIT WHEN count > 0;  -- same result as previous example
END LOOP;

<<ablock>>
BEGIN
    -- some computations
    IF stocks > 100000 THEN
        EXIT ablock;  -- causes exit from the BEGIN block
    END IF;
    -- computations here will be skipped when stocks > 100000
END;
  1. CONTINUE

CONTINUE [ label ] [ WHEN boolean-expression ];

If label is not given, the next iteration of the innermost loop begins. That is, loop control is returned (if any) to skip all remaining statements in the loop body and to determine if another loop iteration is needed. If label is present, specify label of the loop to be continued.

With WHEN specified, the next iteration of the loop starts only if boolean-expression is true. Otherwise, control passes to the statement following CONTINUE.

CONTINUE can be used in any type of loop. It is not limited to the use with an unconditional loop.

Here is an example:

LOOP
    -- some computations
    EXIT WHEN count > 100;
    CONTINUE WHEN count < 50;
    -- some computations for count IN [50 .. 100]
END LOOP;
  1. WHILE

[ <<label>> ]
WHILE boolean-expression LOOP
    statements
END LOOP [ label ];

The WHILE statement repeats a series of statements as long as boolean-expression is evaluated to be true. Expressions are checked just before each item in the loop body.

Here is an example:

WHILE amount_owed > 0 AND gift_certificate_balance > 0 LOOP
    -- some computations here
END LOOP;

WHILE NOT done LOOP
    -- some computations here
END LOOP;
  1. FOR(Integer Variant)

[ <<label>> ]
FOR name IN [ REVERSE ] expression .. expression [ BY expression ] LOOP
    statements
END LOOP [ label ];

This type of FOR creates a loop that iterates over a range of integers. Variable names are automatically defined as integer types and exist only inside a loop (existing definitions of variable names are ignored in the loop). Two expressions representing the upper and lower bounds of the range are evaluated once when they enter the loop. If a BY clause is not specified, the iteration step is 1; if specified, the value specified in the BY clause is evaluated once in the loop entry. If REVERSE is specified, step value is not added but subtracted after each time of iteration.

Some examples of integer FOR loops:

FOR i IN 1..10 LOOP
    -- i will take on the values 1,2,3,4,5,6,7,8,9,10 within the loop
END LOOP;

FOR i IN REVERSE 10..1 LOOP
    -- i will take on the values 10,9,8,7,6,5,4,3,2,1 within the loop
END LOOP;

FOR i IN REVERSE 10..1 BY 2 LOOP
    -- i will take on the values 10,8,6,4,2 within the loop
END LOOP;

If the lower bound is greater than the upper bound (or less than REVERSE), the loop body is never executed. No error occurs.

When a label is connected to a FOR loop, the integer loop variable can be referenced with the specified name using that label.

Loop through query results
Other types of FOR loops allow you to iterate over the query results and manipulate the data accordingly. The syntax is as follows:

[ <<label>> ]
FOR target IN query LOOP
    statements
END LOOP [ label ];

target is a record variable, a row variable, or a comma-delimited list of scalar variables. target is assigned to each row of the query result successively, and each row of the loop body is executed. Here is an example:

CREATE FUNCTION cs_refresh_mviews() RETURNS integer AS $$
DECLARE
    mviews RECORD;
BEGIN
    RAISE NOTICE 'Refreshing materialized views...';

    FOR mviews IN SELECT * FROM cs_materialized_views ORDER BY sort_key LOOP

        -- Now "mviews" has one record from cs_materialized_views

        RAISE NOTICE 'Refreshing materialized view %s ...', quote_ident(mviews.mv_name);
        EXECUTE format('TRUNCATE TABLE %I', mviews.mv_name);
        EXECUTE format('INSERT INTO %I %s', mviews.mv_name, mviews.mv_query);
    END LOOP;

    RAISE NOTICE 'Done refreshing materialized views.';
    RETURN 1;
END;
$$ LANGUAGE plpgsql;

If the loop is terminated by an EXIT statement, you can still access the last-assigned row value after the loop.

The query used in the FOR statement type can be any SQL command that returns a row to the caller. SELECT is the most common case, but you can also use INSERT, UPDATE, or DELETE with RETURNING. Some utility commands such as EXPLAIN also work.

PL/pgSQL variables are replaced by query text and the query plan is cached for possible reuse.

The FOR-IN-EXECUTE statement is another way to iterate through rows.

[ <<label>> ]
FOR target IN EXECUTE text_expression [ USING expression [, ... ] ] LOOP
    statements
END LOOP [ label ];

This statement is the same as the previous form, except that the source query is specified in a string expression that is evaluated and rescored in each item of the FOR loop. This allows the programmer to choose the speed of pre-planned queries or flexibility of dynamic queries, just like a normal EXECUTE statement. Like EXECUTE, parameter values can be inserted into dynamic commands via USING.

Another way to specify a query that needs to repeat the result is to declare it as cursor.

Loop through array
The FOREACH loop is very similar to the FOR loop, but it iterates through the elements of the array value, instead of repeating the rows returned by the SQL query. Typically, FOREACH is used to iterate through the components of a composite value expression. Variants that repeat complexes other than arrays can be added later.

A FOREACH statement that iterates through the array:

[ <<label>> ]
FOREACH target [ SLICE number ] IN ARRAY expression LOOP
    statements
END LOOP [ label ];

If SLICE is not present or SLICE0 is specified, the loop iterates over the individual elements of the array created by evaluating expression. In the target variable, each element value is specified in order, and the loop body is executed for each element. The following is an example of iterating over an array element.

CREATE FUNCTION sum(int[]) RETURNS int8 AS $$
DECLARE
  s int8 := 0;
  x int;
BEGIN
  FOREACH x IN ARRAY $1
  LOOP
    s := s + x;
  END LOOP;
  RETURN s;
END;
$$ LANGUAGE plpgsql;

Elements are stored in storage order regardless of the number of dimensions of the array. target is usually a single variable, but it can be a list of variables when repeating composite value arrays. In such a case, for each array element, the variables are assigned in consecutive columns of composite values.

If the SLICE value is positive, FOREACH repeats the slice of the array rather than a single element. The SLICE value should be an integer constant that is not greater than the number of dimensions of the array. target variable should be an array and receive successive slices of the array value. Each slice is the number of dimensions specified by SLICE. The following is an example of repeating a one-dimensional slice.

CREATE FUNCTION scan_rows(int[]) RETURNS void AS $$
DECLARE
  x int[];
BEGIN
  FOREACH x SLICE 1 IN ARRAY $1
  LOOP
    RAISE NOTICE 'row = %', x;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT scan_rows(ARRAY[[1,2,3],[4,5,6],[7,8,9],[10,11,12]]);

NOTICE:  row = {1,2,3}
NOTICE:  row = {4,5,6}
NOTICE:  row = {7,8,9}
NOTICE:  row = {10,11,12}

Cursors

Instead of running the entire query at once, you can set cursors to encapsulate the query and then read a few lines of the query results at a time. A reason for doing this is to avoid memory overrun when the result contains a large number of rows. (However, PL/pgSQL users generally do not have to worry about this as FOR loops use cursors internally to avoid memory problems.) More interesting way of use is to return a reference to the cursor where a function was created so that the caller can read the line. By doing so, it provides an efficient way to return a large row set from a function.

Declaring cursor variables
All access to PL/pgSQL cursors is always performed via cursor variables, which are special data type refcursor. One way to create a cursor variable is to declare it as a variable of type refcursor. Another way to do this is to use the following cursor declaration syntax:

name [ [ NO ] SCROLL ] CURSOR [ ( arguments ) ] FOR query;

When SCROLL is specified, the cursor may scroll backward. If NO SCROLL is specified, the reverse fetch is denied. If this option is not specified, it depends on the query whether reverse fetching is allowed or not. If argument is specified, it is a comma-separated list of name data type pairs and defines the name to be substituted for the parameter value in the specified query. The actual value to replace this name is specified later when the cursor is opened.

Here is an example:

DECLARE
    curs1 refcursor;
    curs2 CURSOR FOR SELECT * FROM tenk1;
    curs3 CURSOR (key integer) FOR SELECT * FROM tenk1 WHERE unique1 = key;

All three of these variables have a refcursor data type, but the first query can be used with all queries, the second query bounds a fully specified query, and the last query has a parameterized query bound. (The key is replaced with the integer parameter value when the cursor is opened.) The variable curs1 is called unbound because it is not bound to a particular query.

Opening cursors
To use a cursor to search for a row, the cursor should be already open. PL/pgSQL supports three types of OPEN statements, two of which use unbound cursor variables and the third use bound cursor variables.

Note: Bound cursor variables can be used without explicitly opening cursors.

  1. OPEN FOR query

OPEN unbound_cursorvar [ [ NO ] SCROLL ] FOR query;

The cursor variable is opened and the specified query is executed. The cursor can no longer be opened, and it should be declared as an unbound cursor variable (i.e. a simple refcursor variable). The query should be SELECT or something else that returns a row (such as EXPLAIN). Queries are handled in the same way as other PL/pgSQL SQL commands. The PL/pgSQL variable name is replaced, and the query plan is cached for possible reuse. When the PL/pgSQL variable is replaced with a cursor query, the value to be replaced is the value at the time of OPEN. Subsequent changes to the variable do not affect the behavior of the cursor. The SCROLL and NO SCROLL options have the same meaning as in the bound cursors.

Here is an example:

OPEN curs1 FOR SELECT * FROM foo WHERE key = mykey;
  1. OPEN FOR EXECUTE

OPEN unbound_cursorvar [ [ NO ] SCROLL ] FOR EXECUTE query_string
                                     [ USING expression [, ... ] ];

The cursor variable is opened and the specified query is executed. The cursor can no longer be opened, and it should be declared as an unbound cursor variable (i.e. a simple refcursor variable). The query is specified in a string expression in the same way as in EXECUTE. As always, this provides flexibility; this means that the query plan can vary each time of execution and variable substitution is not performed in the command string. Like EXECUTE, parameter values can be inserted into dynamic commands via format() and USING. The SCROLL and NO SCROLL options have the same meaning as bound cursors.

Here is an example:

OPEN curs1 FOR EXECUTE format('SELECT * FROM %I WHERE col1 = $1',tabname) USING keyvalue;

In this example, the table name is inserted into the query via format(). As the comparison value of col1 is inserted via USING, no quotes are required.

  1. Opening bounding cursors

OPEN bound_cursorvar [ ( [ argument_name := ] argument_value [, ...] ) ];

This OPEN format is used when a query is declared to open a cursor variable to which the query is bound. The cursor can no longer be opened. The list of actual argument value expressions can only be used if a cursor is declared to take an argument. These values are replaced in the query.

Query plans for bound cursors are always considered cacheable. There is no EXECUTE in this case. SCROLL and NO SCROLL cannot be specified for OPEN because the cursor’s scroll behavior has already been determined.

Argument values can be passed using positional or named notation. In position notation, all arguments are specified in order. In named notation, the names of each argument are specified and separated from the argument expression using :=. Like the calling function, you can use both positional notation and named notation together.

Here’s an example (the cursor declaration example above is used):

OPEN curs2;
OPEN curs3(42);
OPEN curs3(key := 42);

Variable substitution is performed in the query of a bound cursor, which means there are two ways to pass the value to the cursor: explicitly using the OPEN argument, or implicitly passing the value by referencing the PL/pgSQL variable in the query. However, only variables declared before declaration of the bound cursor can be replaced with variables. In both cases, the value to pass is determined at the time of OPEN. For instance, another way to get the same effect as the curs3 example above is:

DECLARE
    key integer;
    curs4 CURSOR FOR SELECT * FROM tenk1 WHERE unique1 = key;
BEGIN
    key := 42;
    OPEN curs4;

Using cursors
Once a cursor is open, you can manipulate it using the statements described below.

You do not need to perform this manipulation since it works in the same manner as when the cursor is first opened. A function can return a refcursor value and allow the caller to work on the cursor. (Internally, the refcursor value is simply a string name of so-called a portal containing the active query for the cursor, and can be assigned to another refcursor variable).

All portals are closed implicitly at the end of the transaction. Thus, the value of refcursor can be used to reference the open cursor until the end of the transaction.

  1. FETCH

    FETCH [ direction { FROM | IN } ] cursor INTO target;
    

    FETCH fetches the next row as the target of cursor in a row variable, a record variable, or a comma-separated list of simple variables, such as SELECT INTO. If there is no next row, the target is set to NULL(s). As with SELECT INTO, you can check the special variable FOUND to see if you obtained the row.

    The direction clause can be one of the variations allowed by SQL FETCH, except that it can fetch a single row (i.e. NEXT, PRIOR, FIRST, LAST, ABSOLUTE count, RELATIVE count, FORWARD, or BACKWARD). Omitting direction is equivalent to specifying NEXT. If the cursor is not declared or open with SCROLL option, the direction value that moves backward may fail.

    The cursor must be the name of a refcursor variable that refers to the open cursor portal.

    Here is an example:

    FETCH curs1 INTO rowvar;
    FETCH curs2 INTO foo, bar, baz;
    FETCH LAST FROM curs3 INTO x, y;
    FETCH RELATIVE -2 FROM curs4 INTO x;
    
  2. MOVE

    MOVE [ direction { FROM | IN } ] cursor;
    

    MOVE relocates a cursor without retrieving the data. MOVE works the same as FETCH command except that it only relocates the cursor and does not return the moved rows. As with SELECT INTO, you can check the special variable FOUND to see if there is a next row to move.

    The direction clause is used in the same manner as it is used in SQL FETCH command (e.g. NEXT, PRIOR, FIRST, LAST, ABSOLUTE count, RELATIVE count, ALL, FORWARD [count|ALL] or BACKWARD [count|ALL]). Omitting direction is equivalent to specifying NEXT. If the cursor is not declared or open with SCROLL option, the direction value that moves backward may fail.
    Here is an example:

    MOVE curs1;
    MOVE LAST FROM curs3;
    MOVE RELATIVE -2 FROM curs4;
    MOVE FORWARD 2 FROM curs4;
    
  3. UPDATE/DELETE WHERE CURRENT OF

    UPDATE table SET ... WHERE CURRENT OF cursor;
    DELETE FROM table WHERE CURRENT OF cursor;
    

    When a cursor is positioned on a table row, it can be identified to update or delete the row with the cursor. There is a restriction on the cursor’s queries (especially not grouped queries) and it is best to use FOR UPDATE on the cursor.

    Here is an example:

    UPDATE foo SET dataval = myval WHERE CURRENT OF curs1;
    
  4. CLOSE

    CLOSE cursor;
    

    CLOSE closes the default portal of the open cursor. It can be used to release cursor variables that can release or reopen resources before the end of a transaction.

    Here is an example:

    CLOSE curs1;
    
  5. Returning Cursors
    The PL/pgSQL function can return a cursor to the caller. This method is useful when returning multiple rows or columns, especially when returning a very large result set. To do this, the function opens a cursor and returns the cursor name to the caller (or simply opens a cursor using the name of the caller or the name of the portal). The caller can fetch rows from the cursor. The cursor is closed by the caller or automatically closed when it is closed.

    The name of the portal used for a cursor can be specified by the programmer or automatically generated. To specify a portal name, you should first assign a string before the refcursor variable is opened. The string value of the refcursor variable is used as the name of the default portal in OPEN. However, if the refcursor variable is null, OPEN automatically creates a name that does not conflict with the existing portal and assigns it to the refcursor variable.

Note: Since the bound cursor variable is initialized to a string value representing the name, the portal name is the same as the cursor variable name, unless the programmer ignores it through assignment before opening a cursor. However, as the default value of the unbound cursor variable is initially set to null, it gets a unique, automatically generated name unless ignored.

The following example is a way a caller can supply a cursor name.

CREATE TABLE test (col text);
INSERT INTO test VALUES ('123');

CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
BEGIN
    OPEN $1 FOR SELECT col FROM test;
    RETURN $1;
END;
' LANGUAGE plpgsql;

BEGIN;
SELECT reffunc('funccursor');
FETCH ALL IN funccursor;
COMMIT;

The following example uses automatic cursor name generation.

CREATE FUNCTION reffunc2() RETURNS refcursor AS '
DECLARE
    ref refcursor;
BEGIN
    OPEN ref FOR SELECT col FROM test;
    RETURN ref;
END;
' LANGUAGE plpgsql;

-- need to be in a transaction to use cursors.
BEGIN;
SELECT reffunc2();

      reffunc2
--------------------
 <unnamed cursor 1>
(1 row)

FETCH ALL IN "<unnamed cursor 1>";
COMMIT;

The following example is a way to return multiple cursors from a single function.

CREATE FUNCTION myfunc(refcursor, refcursor) RETURNS SETOF refcursor AS $$
BEGIN
    OPEN $1 FOR SELECT * FROM table_1;
    RETURN NEXT $1;
    OPEN $2 FOR SELECT * FROM table_2;
    RETURN NEXT $2;
END;
$$ LANGUAGE plpgsql;

-- need to be in a transaction to use cursors.
BEGIN;

SELECT * FROM myfunc('a', 'b');

FETCH ALL FROM a;
FETCH ALL FROM b;
COMMIT;

Loop through cursor results
There is a FOR statement (example below) that can iterate through the rows returned by the cursor.

[ <<label>> ]
FOR recordvar IN bound_cursorvar [ ( [ argument_name := ] argument_value [, ...] ) ] LOOP
    statements
END LOOP [ label ];

The cursor variable should be bound to some query when it is declared and should not be already open. The FOR statement automatically opens a cursor and closes it when the loop ends. The list of actual argument values should appear only if the cursor is declared to take an argument. This value is replaced in the query in the same way as OPEN.

The recordvar variable is automatically defined as type record and exists only in the loop (the existing definition of the variable name is ignored in the loop). Each row returned by the cursor is assigned to this record variable and the loop body is executed.

Errors and Messages

Reporting errors and messages
You may use the RAISE statement to report messages and causes errors.

RAISE [ level ] 'format' [, expression [, ... ]] [ USING option = expression [, ... ] ];
RAISE [ level ] condition_name [ USING option = expression [, ... ] ];
RAISE [ level ] SQLSTATE 'sqlstate' [ USING option = expression [, ... ] ];
RAISE [ level ] USING option = expression [, ... ];
RAISE ;

The level option specifies the severity of the error. Allowable levels are DEBUG, LOG, INFO, NOTICE, WARNING, and EXCEPTION (EXCEPTION is the default). EXCEPTION causes an error (usually aborting the current transaction). The other levels generate only messages of different priority levels. Whether messages of a particular priority are reported to the client, or recorded in the server log can be controlled by the log_min_messages and client_min_messages configuration variables.

If there is level, you can create a format (it must be a simple string literal, not an expression). A format string specifies the error message text to report. Following the format string, an optional argument expression can be inserted after the message. Within the format string, % is replaced by the string expressions of the following optional argument values: To print a literal %, you should write %%. The number of arguments should match the number of % placeholders in the format string; if not, an error will occur while compiling the function.

In this example, the value of v_job_id replaces % of the string.

RAISE NOTICE 'Calling cs_create_job(%)', v_job_id;

You can attach additional information to the error report by writing an option = expression entry after USING. Each expression can be a string value expression. The allowable option keywords include:

  • MESSAGE
    Sets the error message text. This option cannot be used in a RAISE format that includes format strings before USING.

  • DETAIL
    Provides a detailed error message.

  • HINT
    Provides a hint message.

  • ERRCODE
    Specifies the error code (SQLSTATE) to be reported directly by condition name or with a 5-digit SQLSTATE code.

  • COLUMN, CONSTRAINT, DATATYPE, TABLE, SCHEMA
    Provides the names of the related objects.

This example stops a transaction with a given error message and a hint.

RAISE EXCEPTION 'Nonexistent ID --> %', user_id
      USING HINT = 'Please check your user ID';

These two examples show an identical way of setting SQLSTATE.

RAISE 'Duplicate user ID: %', user_id USING ERRCODE = 'unique_violation';
RAISE 'Duplicate user ID: %', user_id USING ERRCODE = '23505';

There is a second RAISE statement for which the main argument is the condition name or SQLSTATE to be reported. For example:

RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;

Another way is to create RAISE USING or RAISE level USING and put everything else in the USING list. The last variant of RAISE has no parameters at all. This format can be used only within the EXCEPTION clause of a BEGIN block. Re-generate the error currently being processed.

If no condition name or SQLSTATE is specified in RAISE_EXCEPTION (P0001) command, RAISE_EXCEPTION (P0001) is used by default. If the message text is not specified, the message name becomes the condition name or SQLSTATE by default.

Note: If you specify a SQLSTATE code as the error code, you can select an error code that is not limited to the predefined error code(s) but consists of a five-digit number and/or uppercase ASCII characters (except 00000). As the three zero-terminated error codes are category codes, error codes are generated when they can only be trapped by trapping the entire category.

Verifying assertion
The ASSERT statement is a convenient shorthand for inserting debugging checks into PL/pgSQL functions.

ASSERT condition [ , message ];

condition is a Boolean expression that is always expected to be asserted to true. If true, no ASSERT statement is executed; if false or null, an ASSERT_FAILURE exception is thrown. (If an error occurs while asserting the condition, it is reported as a normal error.)

With an optional message given, if condition fails, the result (if not null) replaces the default error message text “assertion failed.” If the assertion is successful, the *message* expression is not performed.

ASSERT tests can be enabled or disabled via the configuration parameter plpgsql.check_asserts, which uses boolean values; the default is on. If this parameter is off, the ASSERT statement does nothing.

ASSERT is intended to detect program bugs, not to report common error conditions. Use the RAISE statement described above for reporting errors.

Trigger Procedures

PL/pgSQL can be used to define trigger procedures for data changes or database events. A trigger procedure is created with CREATE FUNCTION (command); it does not have any argument and is declared with a return type of trigger (in the case of data change triggers) or event_trigger (in the case of database event triggers). A special local variable called PG_something is automatically defined to describe the condition that triggered the call.

Data change triggers
A data change trigger is declared as a function with a return type of trigger without arguments. The function should be declared without arguments, even if it is expected to receive the arguments specified in CREATE TRIGGER. These arguments are passed through TG_ARGV as described below.

When the PL/pgSQL function is called by trigger, several special variables are automatically created in the top-level block; the created items are as follows:

  • NEW
    Data type RECORD; variable that holds new database rows on INSERT/UPDATE operations on row-level triggers. This variable is not specified for statement-level triggers and DELETE operation.

  • OLD
    Data type RECORD; variable that holds old database row on UPDATE/DELETE operation on the row level triggers. This variable is not specified for statement-level triggers and INSERT operation.

  • TG_NAME
    Data type name; variable that contains the name of the trigger actually fired.

  • TG_WHEN
    Data type text; a string of BEFORE, AFTER, or INSTEAD OF, depending on the trigger definition.

  • TG_LEVEL
    Data type text; a string of ROW or STATEMENT, depending on the trigger definition.

  • TG_OP
    Data type text; a string of INSERT, UPDATE, DELETE, or TRUNCATE telling for which operation the trigger is actually fired.

  • TG_RELID
    Data type oid; the object ID of the table that caused the trigger invocation.

  • TG_TABLE_NAME
    Data type name; the name of the table that caused the trigger invocation.

  • TG_TABLE_SCHEMA
    Data type name; the schema name of the table that caused the trigger call.

  • TG_NARGS
    Data type integer; the number of arguments given to the trigger procedure in the CREATE TRIGGER statement.

  • TG_ARGV[]
    Data type text; the argument index of the CREATE TRIGGER statement is zero-based. Invalid indices (less than 0 or greater than or equal to tg_nargs) result in a NULL value.

A trigger function must return NULL or a record/row value that exactly matches the structure of the table on which the trigger was executed.

A row-level trigger triggered by BEFORE can return null to signal the trigger manager to skip the rest of the work for this row (i.e. no subsequent trigger is executed and INSERT/UPDATE/DELETE of this row will not occur). If nonnull is returned, the job advances to the corresponding row value. Returning a row value different from the original value of NEW changes the row to be inserted or updated. Accordingly, if the trigger function does not change the row value and you want the triggering action to succeed normally, NEW (or its corresponding value) should be returned. It is possible to change the row to be saved by modifying a single value directly from NEW and returning a modified NEW or by creating a completely-new record/row to be returned. In the case of before-trigger of DELETE, the return value does not have a direct effect, but should not be null to continue the triggering operation. As NEW is null in DELETE trigger, returning is meaningless in general. A common idiom for DELETE trigger is to return OLD.

An INSTEAD OF trigger (which is always a row-level trigger and can only be used in a view) can return null to indicate that no update has been made; skipping the rest of the work for this row should be possible (e.g. the trigger will not start and the row will not be calculated when it is affected by surrounding INSERT/UPDATE/DELETE). Otherwise, a non-null value should be returned to indicate that the trigger has performed the requested operation. The return value should be NEW in the case of INSERT and UPDATE operations, and the trigger function can be modified to support INSERT RETURNING and UPDATE RETURNING (this affects the row values passed to the subsequent trigger, or is passed by the specific EXCLUDED alias reference contained in the ON CONFLICT DO UPDATE clause of the INSERT statement). For DELETE operation, the return value should be OLD.

The return value of a row-level trigger executed after BEFORE or AFTER is always ignored. It may be null. However, if any of these types of triggers fail, the entire operation can be suspended.

The following example shows an example of a trigger procedure in PL/pgSQL.
This trigger in the example below allows the current user name and time to be stamped on a row whenever a row is inserted or updated in the table. Make sure the employee’s name is given and the salary is positive (number).

CREATE TABLE emp (
    empname text,
    salary integer,
    last_date timestamp,
    last_user text
);

CREATE FUNCTION emp_stamp() RETURNS trigger AS $emp_stamp$
    BEGIN
        -- Check that empname and salary are given
        IF NEW.empname IS NULL THEN
            RAISE EXCEPTION 'empname cannot be null';
        END IF;
        IF NEW.salary IS NULL THEN
            RAISE EXCEPTION '% cannot have null salary', NEW.empname;
        END IF;

        -- Who works for us when they must pay for it?
        IF NEW.salary < 0 THEN
            RAISE EXCEPTION '% cannot have a negative salary', NEW.empname;
        END IF;

        -- Remember who changed the payroll when
        NEW.last_date := current_timestamp;
        NEW.last_user := current_user;
        RETURN NEW;
    END;
$emp_stamp$ LANGUAGE plpgsql;

CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW EXECUTE PROCEDURE emp_stamp();

Another way to log changes to a table is to create a new table that holds rows for each insert, update, or delete. This approach can be thought of as auditing any changes of the table. The following example shows an example of an audit trigger procedure in PL/pgSQL.
This example trigger causes the row insert, update, or delete in emp table to be written to the emp_audit table. The current time and user name are stamped on the row together with the type of the operation performed.

CREATE TABLE emp (
    empname           text NOT NULL,
    salary            integer
);

CREATE TABLE emp_audit(
    operation         char(1)   NOT NULL,
    stamp             timestamp NOT NULL,
    userid            text      NOT NULL,
    empname           text      NOT NULL,
    salary integer
);

CREATE OR REPLACE FUNCTION process_emp_audit() RETURNS TRIGGER AS $emp_audit$
    BEGIN
        --
        -- Create a row in emp_audit to reflect the operation performed on emp,
        -- make use of the special variable TG_OP to work out the operation.
        --
        IF (TG_OP = 'DELETE') THEN
            INSERT INTO emp_audit SELECT 'D', now(), user, OLD.*;
            RETURN OLD;
        ELSIF (TG_OP = 'UPDATE') THEN
            INSERT INTO emp_audit SELECT 'U', now(), user, NEW.*;
            RETURN NEW;
        ELSIF (TG_OP = 'INSERT') THEN
            INSERT INTO emp_audit SELECT 'I', now(), user, NEW.*;
            RETURN NEW;
        END IF;
        RETURN NULL; -- result is ignored since this is an AFTER trigger
    END;
$emp_audit$ LANGUAGE plpgsql;

CREATE TRIGGER emp_audit
AFTER INSERT OR UPDATE OR DELETE ON emp
    FOR EACH ROW EXECUTE PROCEDURE process_emp_audit();

A variation on the previous example indicates when each entry was last modified by using view that joins the main table to the audit table. This approach still logs the entire audit trail of changes to the table, but provides a brief view of the audit trail, showing only the last modified timestamp derived from the audit trail for each entry. The following example shows an example of an audit trigger on view in PL/pgSQL.
This example allows view to use a trigger to update the view and to make inserting, updating, or deleting rows of the view to be written to the emp_audit table. The current time and user name are recorded with the type of operation performed, and the view shows the last modification time of each row.

CREATE TABLE emp (
    empname           text PRIMARY KEY,
    salary            integer
);

CREATE TABLE emp_audit(
    operation         char(1)   NOT NULL,
    userid            text      NOT NULL,
    empname           text      NOT NULL,
    salary            integer,
    stamp             timestamp NOT NULL
);

CREATE VIEW emp_view AS
    SELECT e.empname,
           e.salary,
           max(ea.stamp) AS last_updated
      FROM emp e
      LEFT JOIN emp_audit ea ON ea.empname = e.empname
     GROUP BY 1, 2;

CREATE OR REPLACE FUNCTION update_emp_view() RETURNS TRIGGER AS $$
    BEGIN
        --
        -- Perform the required operation on emp, and create a row in emp_audit
        -- to reflect the change made to emp.
        --
        IF (TG_OP = 'DELETE') THEN
            DELETE FROM emp WHERE empname = OLD.empname;
            IF NOT FOUND THEN RETURN NULL; END IF;

            OLD.last_updated = now();
            INSERT INTO emp_audit VALUES('D', user, OLD.*);
            RETURN OLD;
        ELSIF (TG_OP = 'UPDATE') THEN
            UPDATE emp SET salary = NEW.salary WHERE empname = OLD.empname;
            IF NOT FOUND THEN RETURN NULL; END IF;

            NEW.last_updated = now();
            INSERT INTO emp_audit VALUES('U', user, NEW.*);
            RETURN NEW;
        ELSIF (TG_OP = 'INSERT') THEN
            INSERT INTO emp VALUES(NEW.empname, NEW.salary);

            NEW.last_updated = now();
            INSERT INTO emp_audit VALUES('I', user, NEW.*);
            RETURN NEW;
        END IF;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER emp_audit
INSTEAD OF INSERT OR UPDATE OR DELETE ON emp_view
    FOR EACH ROW EXECUTE PROCEDURE update_emp_view();

One of the uses of triggers is to maintain a summary table of other tables. The result summary can be used in place of the original table for a particular query (usually it requires a much shorter execution time). This technique is commonly used in data warehousing where measured or observed data tables (called fact tables) can be very large. The following example shows an example of a PL/pgSQL trigger procedure that maintains a summary table for the data warehouse’s fact tables.

The schemas described here are based in part on a grocery store example included in The Data Warehouse Toolkit by Ralph Kimball.

--
-- Main tables - time dimension and sales fact.
--
CREATE TABLE time_dimension (
    time_key                    integer NOT NULL,
    day_of_week                 integer NOT NULL,
    day_of_month                integer NOT NULL,
    month                       integer NOT NULL,
    quarter                     integer NOT NULL,
    year                        integer NOT NULL
);
CREATE UNIQUE INDEX time_dimension_key ON time_dimension(time_key);

CREATE TABLE sales_fact (
    time_key                    integer NOT NULL,
    product_key                 integer NOT NULL,
    store_key                   integer NOT NULL,
    amount_sold                 numeric(12,2) NOT NULL,
    units_sold                  integer NOT NULL,
    amount_cost                 numeric(12,2) NOT NULL
);
CREATE INDEX sales_fact_time ON sales_fact(time_key);

--
-- Summary table - sales by time.
--
CREATE TABLE sales_summary_bytime (
    time_key                    integer NOT NULL,
    amount_sold                 numeric(15,2) NOT NULL,
    units_sold                  numeric(12) NOT NULL,
    amount_cost                 numeric(15,2) NOT NULL
);
CREATE UNIQUE INDEX sales_summary_bytime_key ON sales_summary_bytime(time_key);

--
-- Function and trigger to amend summarized column(s) on UPDATE, INSERT, DELETE.
--
CREATE OR REPLACE FUNCTION maint_sales_summary_bytime() RETURNS TRIGGER
AS $maint_sales_summary_bytime$
    DECLARE
        delta_time_key          integer;
        delta_amount_sold       numeric(15,2);
        delta_units_sold        numeric(12);
        delta_amount_cost       numeric(15,2);
    BEGIN

        -- Work out the increment/decrement amount(s).
        IF (TG_OP = 'DELETE') THEN

            delta_time_key = OLD.time_key;
            delta_amount_sold = -1 * OLD.amount_sold;
            delta_units_sold = -1 * OLD.units_sold;
            delta_amount_cost = -1 * OLD.amount_cost;

        ELSIF (TG_OP = 'UPDATE') THEN

            -- forbid updates that change the time_key -
            -- (probably not too onerous, as DELETE + INSERT is how most
            -- changes will be made).
            IF ( OLD.time_key != NEW.time_key) THEN
                RAISE EXCEPTION 'Update of time_key : % -> % not allowed',
                                                      OLD.time_key, NEW.time_key;
            END IF;

            delta_time_key = OLD.time_key;
            delta_amount_sold = NEW.amount_sold - OLD.amount_sold;
            delta_units_sold = NEW.units_sold - OLD.units_sold;
            delta_amount_cost = NEW.amount_cost - OLD.amount_cost;

        ELSIF (TG_OP = 'INSERT') THEN

            delta_time_key = NEW.time_key;
            delta_amount_sold = NEW.amount_sold;
            delta_units_sold = NEW.units_sold;
            delta_amount_cost = NEW.amount_cost;

        END IF;


        -- Insert or update the summary row with the new values.
        <<insert_update>>
        LOOP
            UPDATE sales_summary_bytime
                SET amount_sold = amount_sold + delta_amount_sold,
                    units_sold = units_sold + delta_units_sold,
                    amount_cost = amount_cost + delta_amount_cost
                WHERE time_key = delta_time_key;

            EXIT insert_update WHEN found;

            BEGIN
                INSERT INTO sales_summary_bytime (
                            time_key,
                            amount_sold,
                            units_sold,
                            amount_cost)
                    VALUES (
                            delta_time_key,
                            delta_amount_sold,
                            delta_units_sold,
                            delta_amount_cost
                           );

                EXIT insert_update;

            EXCEPTION
                WHEN UNIQUE_VIOLATION THEN
                    -- do nothing
            END;
        END LOOP insert_update;

        RETURN NULL;

    END;
$maint_sales_summary_bytime$ LANGUAGE plpgsql;

CREATE TRIGGER maint_sales_summary_bytime
AFTER INSERT OR UPDATE OR DELETE ON sales_fact
    FOR EACH ROW EXECUTE PROCEDURE maint_sales_summary_bytime();

INSERT INTO sales_fact VALUES(1,1,1,10,3,15);
INSERT INTO sales_fact VALUES(1,2,1,20,5,35);
INSERT INTO sales_fact VALUES(2,2,1,40,15,135);
INSERT INTO sales_fact VALUES(2,3,1,10,1,13);
SELECT * FROM sales_summary_bytime;
DELETE FROM sales_fact WHERE product_key = 1;
SELECT * FROM sales_summary_bytime;
UPDATE sales_fact SET units_sold = units_sold * 2;
SELECT * FROM sales_summary_bytime;

Event triggers
PL/pgSQL can be used to define event triggers. In AgensGraph, procedures to be called as event triggers have no arguments and event_trigger should be declared as return type.

When a PL/pgSQL function is called with an event trigger, several special variables are automatically created in the top-level block.

  • TG_EVENT
    Data type text; a string representing the event under which the trigger will be executed.

  • TG_TAG
    Data type text; a variable that contains a command tag where the trigger is executed.

The trigger in the example below generates a NOTICE message each time a supported command is executed.

CREATE OR REPLACE FUNCTION snitch() RETURNS event_trigger AS $$
BEGIN
    RAISE NOTICE 'snitch: % %', tg_event, tg_tag;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER snitch ON ddl_command_start EXECUTE PROCEDURE snitch();

Tips for Developing in PL/pgSQL

A good way of development in PL/pgSQL is to write a function using a text editor of your choice and load and test it using psql in another window. If you work like this, we recommend you to write a function using CREATE OR REPLACE FUNCTION. You can then update the function definition by reloading the file. An example is shown below:

CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $$
          ....
$$ LANGUAGE plpgsql;

You can load or reload the following function definition file while running psql as follows:

\i filename.sql

Another good way to develop in PL/pgSQL is to use a GUI database access tool that facilitates development with a procedural language. These tools often provide convenient features (e.g. removing single quotes, easier play back).

Quotation processing
The code of a PL/pgSQL function is specified as a string literal in CREATE FUNCTION. If you use a usual way to enclose a string in single quotes, you need to double the single quotes in the body of the function. Likewise, you should use two backslashes as well (assuming escape string syntax is used). Using quotes twice can make your code difficult to understand. This is because it is easy for the user to know that many adjacent quotes are needed.
Therefore, you are recommended to write function bodies using “dollar quoted” string literals. In the dollar citation method, you should never use quotation marks twice, but instead choose a different dollar citation delimiter for each required nesting level. For example, you can create a CREATE FUNCTION command as follows:

CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $PROC$
          ....
$PROC$ LANGUAGE plpgsql;

In this case, you may use quotes around a simple literal string in SQL commands and use $$ to separate the SQL command to assemble into strings. If you need to quote text that contains $$, you can use $Q$.

The following chart shows what to do when writing quotes without dollar citation marks. This can be useful when converting dollar citation marks to something more understandable.

1 quotation mark
To start and end a function body:

CREATE FUNCTION foo() RETURNS integer AS '
          ....
' LANGUAGE plpgsql;

Within the function body enclosed in single quotes, the quotes should appear in pairs.

2 quotation marks
It is used to express a string literal inside a function body. For example:

a_output := ''Blah'';
SELECT * FROM users WHERE f_name=''foobar'';

In the dollar citation scheme, you can write:

a_output := 'Blah';
SELECT * FROM users WHERE f_name='foobar';

The PL/pgSQL parser figures out both cases exactly.

4 quotation marks
This is used when a string constant inside the function body requires a single quote. For example:

a_output := a_output || '' AND name LIKE ''''foobar'''' AND xyz''

The value that has actually been added to a_output: AND name LIKE 'foobar'AND xyz.

In the dollar citation scheme, you can write:

a_output := a_output || $$ AND name LIKE 'foobar' AND xyz$$

6 quotation marks

Used when a single quote of a string inside the function body is adjacent to the end of the string constant. For example:

a_output := a_output || '' AND name LIKE ''''foobar''''''

The value added to a_output: AND name LIKE 'foobar'.

In the dollar citation scheme, you can write:

a_output := a_output || $$ AND name LIKE 'foobar'$$

10 quotation marks
Two single quotes are needed in a string constant (representing eight quote marks) and are adjacent to the end of the string constant (two or more). You will only need it if you are writing a function that creates another function.

a_output := a_output || '' if v_'' ||
    referrer_keys.kind || '' like ''''''''''
    || referrer_keys.key_string || ''''''''''
    then return ''''''  || referrer_keys.referrer_type
    || ''''''; end if;'';

The value of a_output is as follows

if v_... like ''...'' then return ''...''; end if;

In the dollar citation scheme, you can write:

a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
    || referrer_keys.key_string || $$'
    then return '$$  || referrer_keys.referrer_type
    || $$'; end if;$$;

PL/Python

PL/Python Functions

Functions in PL/Python are declared using standard CREATE FUNCTION syntax.

CREATE FUNCTION funcname (argument-list)
  RETURNS return-type
AS $$
  # PL/Python function body
$$ LANGUAGE plpythonu;

The body of this function is a simple python script. When the function is called, its arguments are passed to the list args; named arguments are passed to the Python script as ordinary variables as well. Such named arguments are generally more readable. The result is returned in the Python code with return or yield (in case of a result-set statement). If you do not provide a return value, Python returns the default (None). PL/Python converts Python’s None to SQL’s Null.

For example, a function that returns the greater of two integers can be defined as:

CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;

The Python code that is given as the body of the function definition is transformed into a Python function. For example, the above results in:

def __plpython_procedure_pymax_23456():
  if a > b:
    return a
  return b

Assuming that 23456 is the OID assigned to the function.

The argument is set to a global variable. Unless a variable is declared globally in a block, the argument variable cannot be reassigned within the function as an expression value containing the variable name itself according to the Python’s scoping rules. For example, the following may not work:

CREATE FUNCTION pystrip(x text)
  RETURNS text
AS $$
  x = x.strip()  # error
  return x
$$ LANGUAGE plpythonu;

When you assign it to x, x becomes a local variable for the entire block. Thus, the x on the right side, which is not a PL/Python function parameter, only refers to the local variable x that has not yet been allocated. You can use a global statement to perform the following:

CREATE FUNCTION pystrip(x text)
  RETURNS text
AS $$
  global x
  x = x.strip()  # ok now
  return x
$$ LANGUAGE plpythonu;

However, it is better not to rely on the detailed implementation of PL/Python; you are recommended to use function parameters as read-only.

Data Values

In general, the purpose of PL/Python is to provide a “natural” mapping between AgensGraph and Python. Information on the data mapping rules is described below.

Data Type Mapping

When a PL/Python function is called, the argument is converted from the AgensGraph data type to the corresponding Python type.

  • AgensGraph boolean is converted to Python boolean.

  • AgensGraph smallint and int are converted to Python int. AgensGraph bigint and oid are converted to long in Python2 and int in Python3.

  • AgensGraph real and double are converted to Python float.

  • AgensGraph numeric is converted to Python Decimal. If this type is available, bring it from the cdecimal package. Otherwise, the standard library decimal.Decimal is used. cdecimal is considerably faster than decimal. In Python3.3 or later versions, there is no difference between the two as cdecimal is integrated into the standard library under the name decimal.

  • AgensGraph bytea is converted to str in Python2 and bytes in Python3. In Python2, you need to treat the string as a sequence of bytes without character encoding.

  • All other data types, including the AgensGraph string format, are converted to Python str. In Python2, this string is in the AgensGraph server encoding. In Python3, it becomes the same Unicode string as all strings.

  • For nonscalar data type, see below.

When returning a PL/Python function, the return value is converted to the declared AgensGraph return data type of the function as follows:

  • When the AgensGraph return type is boolean, the return value is evaluated to see if it is true according to the Python rules. That is, while 0 and an empty string are false, ‘f’ is true.

  • When the AgensGraph return type is bytea, the return value is converted to string (Python2) or bytes (Python3) using each Python built-in function, and the result is converted to bytea.

  • For all other AgensGraph return types, the conversion value is converted to a string using the Python built-in str, and the result is passed to the input function of AgensGraph data type. (If the Python value is float, use the repr built-in instead of str to avoid loss of precision.)

In Python2, strings should be in the AgensGraph server encoding when passed to AgensGraph. A string that is not valid in the current server encoding will cause an error; since not all encoding discrepancies can be detected, garbage values can continue to occur if this is not done correctly. As Unicode strings are automatically converted to the correct encoding, using them can be safer and more convenient. In Python3, all strings are Unicode strings.

*See below for more information on nonscalar data types.

The logical discrepancy between the declared AgensGraph return type and actual return object’s Python data type is not displayed; in any case the value will be returned.

Null, None

If SQL null is passed to a function, the argument value in Python is displayed as none. For example, the function definition in [PL/Python Functions] pymax returns an incorrect answer for null input. You can add STRICT to your function definition to make the work more reasonable. If null is passed, the function will not be called at all and will automatically return null. You may also check for null input in the function body.

CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if (a is None) or (b is None):
    return None
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;

As shown above, to return an SQL null value from a PL/Python function, return the value None. This can be done whether the function is strict or not.

Arrays, Lists

SQL array values are passed to PL/Python as a Python list. To return a SQL array value from a PL/Python function, return a Python sequence (e.g. a list or tuple).

CREATE FUNCTION return_arr()
  RETURNS int[]
AS $$
return (1, 2, 3, 4, 5)
$$ LANGUAGE plpythonu;

SELECT return_arr();
 return_arr  
-------------
 {1,2,3,4,5}
(1 row)

Strings are sequences in Python; Python programmers who are not familiar with this can have undesirable consequences.

CREATE FUNCTION return_str_arr()
  RETURNS varchar[]
AS $$
return "hello"
$$ LANGUAGE plpythonu;

SELECT return_str_arr();
 return_str_arr
----------------
 {h,e,l,l,o}
(1 row)

Composite Types

Composite-type arguments are passed to the function as Python mappings. The element names of the mapping are the attribute names of the composite type. If an attribute in the passed row has a null value, it has a None value in the mapping. Here is an example:

CREATE TABLE employee (
  name text,
  salary integer,
  age integer
);

CREATE FUNCTION overpaid (e employee)
  RETURNS boolean
AS $$
  if e["salary"] > 200000:
    return True
  if (e["age"] < 30) and (e["salary"] > 100000):
    return True
  return False
$$ LANGUAGE plpythonu;

There are multiple ways to return row or composite types from a Python function. To execute a composite type result example, create a TYPE as shown below:

CREATE TYPE named_value AS (
  name   text,
  value  integer
);

A composite result can be returned as a:

  • Sequence type (a tuple or list, but not a set because it is not indexable) Returned sequence objects must have the same number of items as the composite result type has fields. The item with index 0 is assigned to the first field of the composite type, 1 to the second and so on. For example:

    CREATE FUNCTION make_pair (name text, value integer)
      RETURNS named_value
    AS $$
      return [ name, value ]
      # or alternatively, as tuple: return ( name, value )
    $$ LANGUAGE plpythonu;
    

    To return a SQL null for any column, insert none at the corresponding position.

  • Mapping (dictionary)

    The value for each result type column is retrieved from the mapping with the column name as key. Example:

    CREATE FUNCTION make_pair (name text, value integer)
      RETURNS named_value
    AS $$
      return { "name": name, "value": value }
    $$ LANGUAGE plpythonu;
    

    Any extra dictionary key/value pairs are ignored. Missing keys are treated as errors. To return a SQL null value for any column, insert None with the corresponding column name as the key.

  • Object (any object providing method __getattr__)

    This works the same as a mapping. Here is an example:

    CREATE FUNCTION make_pair (name text, value integer)
      RETURNS named_value
    AS $$
      class named_value:
        def __init__ (self, n, v):
          self.name = n
          self.value = v
      return named_value(name, value)
    
      # or simply
      class nv: pass
      nv.name = name
      nv.value = value
      return nv
    $$ LANGUAGE plpythonu;
    

    Functions with OUT parameters are also supported. Here is an example:

    CREATE FUNCTION multiout_simple(OUT i integer, OUT j integer) AS $$
    return (1, 2)
    $$ LANGUAGE plpythonu;
    
    SELECT * FROM multiout_simple();
    

Set-returning Functions

A PL/Python function can also return sets of scalar or composite types. There are several ways to achieve this because the returned object is internally turned into an iterator. The following examples assume we have composite type:

CREATE TYPE greeting AS (
how text,
who text
);

A set result can be returned from a:

  • Sequence type (tuple, list, set)

    CREATE FUNCTION greet (how text)
      RETURNS SETOF greeting
    AS $$
      # return tuple containing lists as composite types
      # all other combinations work also
      return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] )
    $$ LANGUAGE plpythonu;
    
  • Iterator (any object providing __iter__ and next methods)

    CREATE FUNCTION greet (how text)
      RETURNS SETOF greeting
    AS $$
      class producer:
        def __init__ (self, how, who):
          self.how = how
          self.who = who
          self.ndx = -1
    
        def __iter__ (self):
          return self
    
        def next (self):
          self.ndx += 1
          if self.ndx == len(self.who):
            raise StopIteration
          return ( self.how, self.who[self.ndx] )
    
      return producer(how, [ "World", "PostgreSQL", "PL/Python" ])
    $$ LANGUAGE plpythonu;
    
  • Generator (yield)

    CREATE FUNCTION greet (how text)
      RETURNS SETOF greeting
    AS $$
      for who in [ "World", "PostgreSQL", "PL/Python" ]:
        yield ( how, who )
    $$ LANGUAGE plpythonu;
    

The set-returning function with OUT parameters (using RETURNS SETOF record) is also supported. Here is an example:

CREATE FUNCTION multiout_simple_setof(n integer, OUT integer, OUT integer) 
 RETURNS SETOF record 
AS $$
 return [(1, 2)] * n
$$ LANGUAGE plpythonu;

SELECT * FROM multiout_simple_setof(3);

Sharing Data

Global dictionary SD can be used to store data between function calls. This variable is of individual static data. You should be careful since Global dictionary GD is public data that can be used in all Python functions in the session.

As each function has its own execution environment in the Python interpreter, global data and function arguments of myfunc cannot be used in myfunc2. As mentioned above, data in the GD dictionary is an exception.

Anonymous Code Blocks

PL/Python also supports anonymous code blocks that are called together with DO statements.

DO $$
    # PL/Python code
$$ LANGUAGE plpythonu;

Anonymous code blocks do not take arguments and discard all the returned values. Otherwise, they behave like a function.

Trigger Functions

When a function is used as a trigger, the dictionary TD contains the trigger-related value.

TD[“event”]
Contains an event as a string (INSERT, UPDATE, DELETE, or TRUNCATE).

TD[“when”]
Contains one of BEFORE, AFTER, and INSTEAD OF.

TD[“level”]
Contains ROW or STATEMENT.

TD[“new”], TD[“old”]
Contains each trigger row according to one or both trigger events of the fields, in the case of row-level triggers.

TD[“name”]
Contains the trigger name.

TD[“table_name”]
Contains the name of the table on which the trigger occurred.

TD[“table_schema”]
Contains the schema of the table where the trigger occurred.

TD[“relid”]
Contains the OID of the table where the trigger occurred.

TD[“args”]
In the case of CREATE TRIGGER (command) with arguments, you can use from TD["args"][0] to TD["args"][n-1].

When TD ["when"] is BEFORE or INSTEAD OF and TD ["level"] is ROW, the Python function may return None or “OK” to indicate that the row has not changed. “SKIP” aborts the event. When TD ["event"] applies INSERT or UPDATE, it can return “MODIFY” to modify the new row. Otherwise, the return value is ignored.

Database Access

The PL/Python language module automatically imports a Python module called plpy. The functions and constants in this module are available to you in the Python code as plpy.foo.

Database Access Functions

The plpy module provides several functions to execute database commands:

plpy.execute(query [, max-rows])

If you call plpy.execute with a query string and select an optional row limit argument, the corresponding query will be run and the result will be returned in a result object.

The result object emulates a list or dictionary object. The result object can be accessed by row number and column name. For example:

rv = plpy.execute("SELECT * FROM my_table", 5)

returns up to 5 rows from my_table. If my_table has a column my_column, it would be accessed as:

foo = rv[i]["my_column"]

The number of rows returned can be obtained using the built-in len function.

The result object has these additional methods:

  • nrows() Returns the number of rows processed by the command. Note that this is not necessarily the same as the number of rows returned. For example, UPDATE command will set this value but won’t return any rows (unless RETURNING is used).

  • status() Returns the SPI_execute() value.

  • colnames(), coltypes(), coltypmods() Returns a list of column names, list of column type OIDs, and list of type-specific type modifiers for the columns, respectively. These methods raise an exception when called on a result object from a command that did not produce a result set (e.g. UPDATE without RETURNING, or DROP TABLE). But it is OK to use these methods on a result set containing zero rows.

  • str() The standard __str__ method is defined so that it is possible, for example, to debug query execution results using plpy.debug(rv).

The result object can be modified.

Calling plpy.execute will cause the entire result set to be read into memory. Use the function only when the result set will be relatively small. If you don’t want to risk excessive memory usage when fetching large results, use plpy.cursor rather than plpy.execute.

plpy.prepare(query [, argtypes]) plpy.execute(plan [, arguments [, max-rows]])

plpy.prepare prepares the execution plan for a query. It is called with a query string and a list of parameter types, if you have parameter references in the query. Here is an example:

plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", ["text"])

text is the type of the variable you will be passing for $1. The second argument is optional if you don’t want to pass any parameters to the query. After preparing a statement, you use a variant of the function plpy.execute to run it:

rv = plpy.execute(plan, ["name"], 5)

Pass the plan as the first argument (instead of the query string), and a list of values to substitute into the query as the second argument. The second argument is optional if the query cannot expect any parameters. The third argument is the optional row limit as before.

Query parameters and result row fields are converted between PostgreSQL and Python data types as described in Data Values.

When you prepare a plan using the PL/Python module it is automatically saved. Refer to this link for more information. In order to use this function effectively through this feature, you need to use one of the persistent storage dictionaries SD or GD (see [Sharing Data]). Here is an example:

CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
    if "plan" in SD:
        plan = SD["plan"]
    else:
        plan = plpy.prepare("SELECT 1")
        SD["plan"] = plan
    # rest of function
$$ LANGUAGE plpythonu;

plpy.cursor(query) plpy.cursor(plan [, arguments])

The plpy.cursor function accepts the same arguments as plpy.execute (except for the row limit) and returns a cursor object, which allows you to process large result sets in smaller chunks. As with plpy.execute, either a query string or a plan object along with a list of arguments can be used.

The cursor object provides a fetch method that accepts an integer parameter and returns a result object. Each time you call fetch, the returned object will contain the next batch of rows, which is no larger than the parameter value. Once all rows are used, fetch starts returning an empty result object. Cursor objects also provide an iterator interface, creating one row at a time until all rows are exhausted. Data fetched that way is not returned as result objects, but rather as dictionaries, each dictionary corresponding to a single result row.

An example of two ways of processing data from a large table is:

CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$
odd = 0
for row in plpy.cursor("select num from largetable"):
    if row['num'] % 2:
         odd += 1
return odd
$$ LANGUAGE plpythonu;

CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$
odd = 0
cursor = plpy.cursor("select num from largetable")
while True:
    rows = cursor.fetch(batch_size)
    if not rows:
        break
    for row in rows:
        if row['num'] % 2:
            odd += 1
return odd
$$ LANGUAGE plpythonu;

CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$
odd = 0
plan = plpy.prepare("select num from largetable where num % $1 <> 0", ["integer"])
rows = list(plpy.cursor(plan, [2]))

return len(rows)
$$ LANGUAGE plpythonu;

Cursors are automatically disposed of. However, if you want to explicitly release all resources held by a cursor, use the close method. Once closed, a cursor cannot be fetched from anymore.

Trapping Errors

Functions accessing the database might encounter errors, which will cause them to abort and raise an exception. Both plpy.execute and plpy.prepare can raise an instance of a subclass of plpy.SPIError, which by default will terminate the function. This error can be handled just like any other Python exception, by using the try/except construct. For example:

CREATE FUNCTION try_adding_joe() RETURNS text AS $$
    try:
        plpy.execute("INSERT INTO users(username) VALUES ('joe')")
    except plpy.SPIError:
        return "something went wrong"
    else:
        return "Joe added"
$$ LANGUAGE plpythonu;

The actual class of the exception being raised corresponds to the specific condition that caused the error. Refer to this link for a list of possible conditions. The module plpy.spiexceptions defines an exception class for each condition, deriving their names from the condition name. For instance, division_by_zero becomes DivisionByZero, unique_violation becomes UniqueViolation, fdw_error becomes FdwError, and so on. Each of these exception classes inherits from SPIError. This separation makes it easier to handle specific errors. For example:

CREATE FUNCTION insert_fraction(numerator int, denominator int) RETURNS text AS $$
from plpy import spiexceptions
try:
    plan = plpy.prepare("INSERT INTO fractions (frac) VALUES ($1 / $2)", ["int", "int"])
    plpy.execute(plan, [numerator, denominator])
except spiexceptions.DivisionByZero:
    return "denominator cannot equal zero"
except spiexceptions.UniqueViolation:
    return "already have that fraction"
except plpy.SPIError, e:
    return "other error, SQLSTATE %s" % e.sqlstate
else:
    return "fraction inserted"
$$ LANGUAGE plpythonu;

As all exceptions from the plpy.spiexceptions module inherit from SPIError, the except clause handling it will catch any database access error.

As an alternative way of handling different error conditions, you can catch the SPIError exception and determine the specific error condition inside the except block by looking at the sqlstate attribute of the exception object. This attribute is a string value containing the “SQLSTATE” error code. This approach provides approximately the same functionality.

Explicit Subtransactions

Recovering from errors caused by database access as described in [Trapping Errors] can lead to an undesirable situation where some operations succeed before one of them fails; after recovering from that error, the data is left in an inconsistent state. PL/Python offers a solution to this problem in the form of explicit subtransactions.

Subtransaction Context Managers

Consider a function that implements a transfer between two accounts:

CREATE FUNCTION transfer_funds() RETURNS void AS $$
try:
    plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
    plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except plpy.SPIError, e:
    result = "error transferring funds: %s" % e.args
else:
    result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;

If the second UPDATE statement results in an exception being raised, this function will report the error, but the result of the first UPDATE will nevertheless be committed. In other words, the funds will be withdrawn from Joe’s account, but will not be transferred to Mary’s account.

To avoid such issues, you can wrap your plpy.execute calls in an explicit subtransaction. The plpy module provides a helper object to manage explicit subtransactions that gets created with the plpy.subtransaction() function. Objects created by this function implement the context manager interface. By using explicit subtransactions, you may rewrite the function as:

CREATE FUNCTION transfer_funds2() RETURNS void AS $$
try:
    with plpy.subtransaction():
        plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
        plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except plpy.SPIError, e:
    result = "error transferring funds: %s" % e.args
else:
    result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;

The use of try/catch is still required. Otherwise the exception would be transferred to the top of the Python stack and would cause the whole function to abort due to an AgensGraph error; as a result, the operations table would not have any row inserted into it. The subtransaction context manager does not trap errors; it only assures that all database operations executed inside its scope will be atomically committed or rolled back. A rollback of the subtransaction block occurs on any kind of exception exit, not only ones caused by errors originating from database access. A regular Python exception raised inside an explicit subtransaction block would also cause the subtransaction to be rolled back.

Older Python Versions

Context managers syntax using the WITH keyword is available by default in Python 2.6. If using PL/Python with an older Python version, it is still possible to use explicit subtransactions, although not as transparently. You can call the subtransaction manager’s __enter__ and __exit__ functions using the enter and exit convenience aliases. The example function that transfers funds could be written as:

CREATE FUNCTION transfer_funds_old() RETURNS void AS $$
try:
    subxact = plpy.subtransaction()
    subxact.enter()
    try:
        plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
        plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
    except:
        import sys
        subxact.exit(*sys.exc_info())
        raise
    else:
        subxact.exit(None, None, None)
except plpy.SPIError, e:
    result = "error transferring funds: %s" % e.args
else:
    result = "funds transferred correctly"

plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;

Utility Functions

The plpy module also provides functions:

plpy.debug(msg, **kwargs)
plpy.log(msg, **kwargs)
plpy.info(msg, **kwargs)
plpy.notice(msg, **kwargs)
plpy.warning(msg, **kwargs)
plpy.error(msg, **kwargs)
plpy.fatal(msg, **kwargs)

plpy.error and plpy.fatal actually raise a Python exception that is not actually called, causing the current transaction or subtransaction to be aborted. raise plpy.Error(msg) and raise plpy.Fatal(msg) are equivalent to calling plpy.error(msg) and plpy.fatal(msg), respectively, but you cannot pass keyword arguments in raise type. Other functions only generate messages of different priority levels. You may determine whether messages of a particular priority should be reported to the client, written to the server log, or both are controlled by the log_min_messages and client_min_messages configuration variables. See this link for more information.

The msg argument is provided as a positional argument. Two or more positional arguments may be provided for backward compatibility. In this case, the tuple string expression of the positional argument is the message reported to the client.

The following keyword-specific arguments are allowed.

detail
hint
sqlstate
schema_name
table_name
column_name
datatype_name

The string expression of the object passed as a keyword-specific argument is used to enforce the messages reported to the client. Here is an example:

CREATE FUNCTION raise_custom_exception() RETURNS void AS $$
plpy.error("custom exception message",
           detail="some info about exception",
           hint="hint for users")
$$ LANGUAGE plpythonu;

=# SELECT raise_custom_exception();
ERROR:  plpy.Error: custom exception message
DETAIL:  some info about exception
HINT:  hint for users
CONTEXT:  Traceback (most recent call last):
  PL/Python function "raise_custom_exception", line 4, in <module>
    hint="hint for users")
PL/Python function "raise_custom_exception"

Another set of utility functions are plpy.quote_literal(string), plpy.quote_nullable(string),
and plpy.quote_ident(string). They are equivalent to the built-in quoting functions. They are useful when constructing ad-hoc queries. Dynamic PL/Python would be:

plpy.execute("UPDATE tbl SET %s = %s WHERE key = %s" % (
    plpy.quote_ident(colname),
    plpy.quote_nullable(newvalue),
    plpy.quote_literal(keyvalue)))

Environment Variables

Some of the environment variables that are accepted by the Python interpreter may affect PL/Python behavior. They would need to be set in the environment of the main AgensGraph server process, for example in a start script. The available environment variables depend on the version of Python; see the Python manuals for more information.

  • PYTHONHOME

  • PYTHONPATH

  • PYTHONY2K

  • PYTHONOPTIMIZE

  • PYTHONDEBUG

  • PYTHONVERBOSE

  • PYTHONCASEOK

  • PYTHONDONTWRITEBYTECODE

  • PYTHONIOENCODING

  • PYTHONUSERBASE

  • PYTHONHASHSEED

(It appears to be a Python implementation detail beyond the control of PL/Python that some of the environment variables listed on the python man page are only effective in a command-line interpreter and not an embedded Python interpreter.)