Procedural language
AgensGraph can also write user-defined functions in languages other than SQL and C. These other languages are generally referred to as procedural languages (PLs). In the case of functions written in procedural languages, the database server cannot interpret the function’s source text. Thus, the task is passed to a special handler that knows the details of the language. The handler performs all the tasks, including parsing, syntax analysis, and execution. The handler itself is a C language function that, like any C other functions, is compiled into a shared object and loaded on demand. Current AgensGraph has four procedural languages: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python.
Installation
Procedural languages must be installed in each database to be used. However, the procedural languages installed in the template1 database will be automatically available in all subsequent databases created, as the entries in template1 are copied by CREATE DATABASE. The database administrator can determine which languages can be used in which databases and can make only certain languages available, if needed.
In the case of languages provided with the standard distribution, you only need to run CREATE EXTENSION
language_name to install them in the current database. Alternatively, you may use the program createlang to do this from the shell command line. For example, to install a language called PL/Python in the template1
database, see the following example:
createlang plpython template1
The manual procedure described below is recommended only if you are installing a language that is not packaged into extension.
Manual Procedural Language Installation
Procedural languages can be installed in a database in five steps and must be done by the database superuser. In most cases, the necessary SQL commands can be packaged into an extension’s install script and executed using CREATE EXTENSION.
The shared object of the language handler should be compiled and installed in an appropriate library directory. It works the same way as building and installing modules with regular user-defined C functions. Often the language handler relies on external libraries that provide the actual programming language engine; in such a case, the share object must be installed.
The handler must be declared as a command.
CREATE FUNCTION handler_function_name() RETURNS language_handler AS 'path-to-shared-object' LANGUAGE C;
The special return type of
language_handler
tells the database system that this function does not return one of the defined SQL data types and cannot be used directly in the SQL statement.Optionally, the language handler can provide an “inline” handler function that executes anonymous code blocks (DO commands) written in this language. If an inline handler function is provided by the language, it must be declared with the following command:
CREATE FUNCTION inline_function_name(internal) RETURNS void AS 'path-to-shared-object' LANGUAGE C;
Optionally, the language handler can provide a “validator” function that checks accuracy of the function definition without actually executing it. The validator function is called by CREATE FUNCTION. If the validator function is provided by the language, you must declare it with the following command:
CREATE FUNCTION validator_function_name(oid) RETURNS void AS 'path-to-shared-object' LANGUAGE C STRICT;
Finally, PL must be declared with the following command:
CREATE [TRUSTED] [PROCEDURAL] LANGUAGE language-name HANDLER handler_function_name [INLINE inline_function_name] [VALIDATOR validator_function_name] ;
The optional keyword
TRUSTED
specifies that the language does not grant access rights to data that will not be used by the user. Trusted languages are designed for general database users (i.e. users without superuser privileges) and can safely create functions and trigger procedures. As the PL function is executed within the database server, theTRUSTED
flag should only be provided for languages that do not allow access to the database server or file system. PL/pgSQL, PL/Tcl, and PL/Perl languages are considered to be trusted. Languages PL/TclU, PL/PerlU, and PL/PythonU are designed to provide unlimited functionality and should not be marked as trusted.In the default AgensGraph installation, a handler for the PL/pgSQL language is built and installed in the “library” directory. The PL/pgSQL language itself is installed in every database. Although Tcl support is configured and handlers for PL/Tcl and PL/TclU are built and installed in the library directory, the languages themselves are not installed by default in the database. Likewise, even if Perl support is configured, PL/Perl and PL/PerlU handlers are built and installed, Python support is configured, and a PL/PythonU handler is installed, these languages are not installed by default.
PL/pgSQL
Introduction
PL/pgSQL is a loadable procedural language in AgensGraph. The design objective of PL/pgSQL is to be a loadable procedural language with the following features:
Can be used to create functions and trigger procedures;
Adds control structures to the SQL language;
Can perform complex computations;
Inherits all user defined types, functions and operators;
Can be defined to be trusted by the server;
Is easy to use.
Functions created in PL/pgSQL can be used wherever built-in functions can be used. For example, you can create a function that processes complex conditions, and later define the function as an operator or use it in index expression.
In AgensGraph, PL/pgSQL is installed by default. However, as it is still a loadable module, administrators who are strictly security-conscious can remove PL/pgSQL.
Advantages of PL/pgSQL
SQL is a query language used in databases. Although SQL is easy to learn, all SQL statements must be executed separately per statement in the database server.
In other words, a client application sends a query to the database server individually, waits until each query is processed, takes the result, computes it, and then sends the next query to the server. These processes generate internal processing and cause network load if the database and client are on different machines.
As PL/pgSQL is easy to use in procedural languages and SQL is easy to use, you can group queries and operations within a database server and save on client/server communications loads.
Eliminate unnecessary communications between client and server.
Clients do not have to hold unnecessary intermediate results or to transfer them between client and server.
You do not need to do repeated query parsing.
Because of these factors, you can expect a noticeable performance improvement over applications that do not use stored functions.
In addition, PL/pgSQL may use all data types, operations, and functions of SQL.
Supported argument and result data types
Functions written in PL/pgSQL can accept scalar or array data types supported by the server as arguments and can return results. You can also use or return a specified complex type (row type). PL/pgSQL functions can also return records.
PL/pgSQL functions can be declared using the VARIADIC marker to allow arguments of varying numbers. This works in exactly the same way as SQL functions.
PL/pgSQL functions can be declared to accept and return various types, such as anyelement
, anyarray
, anynonarray
, anyenum
, and anyrange
. The actual data types handled by the polymorphic function may vary from call to call.
Structure of PL/pgSQL
Functions written in PL/pgSQL are defined in the server by executing CREATE FUNCTION (command) as follows:
CREATE FUNCTION somefunc(integer, text) RETURNS integer
AS 'function body text'
LANGUAGE plpgsql;
The function body is simply a literal string associated with CREATE FUNCTION. It is more helpful to use dollar ($) quotes to write function bodies than to use a usual single quotes syntax. If there is no dollar citation mark, you should escape it by doubling single quotes or backslashes of the function body. Almost all examples in this section use dollar quote literals in function bodies.
PL/pgSQL is a block-structured language. The full text of a function definition should be a block. The block is defined as follows:
[ <<label>> ]
[ DECLARE
declarations ]
BEGIN
statements
END [ label ];
Each declaration and each statement within a block ends with a semicolon. A block appearing within other block, as shown above, should have a semicolon after END. However, a semicolon is not required at the end of the function body.
Tip : In general, they mistakenly use a semicolon immediately after BEGIN very often. This is incorrect and causes a syntax error.
Label is needed only if you want to identify a block to use in an EXIT statement or to restrict the name of a variable declared in a block. If label is positioned after END, it must match the label at the beginning of the block. Identifiers, like regular SQL commands, are converted to lower case by default if they do not have double quotes.
Annotations work the same way in PL/pgSQL code as in regular SQL. Two dashes (–) are recognized as a comment (from the beginning to the end of the line). To process a comment as a block, include the comment between /* and */.
Every statement in a statement section of a block can be a sub-block. Subblocks may be used for logical grouping or localization of a small group of variables. A variable declared in a subblock masks a variable of a similar name in an outer block for the duration of the subblock. By specifying a block name, you may access an external variable.
CREATE FUNCTION somefunc() RETURNS integer AS $$
DECLARE
quantity integer := 30;
BEGIN
RAISE NOTICE 'Quantity here is %', quantity; -- Prints 30
quantity := 50;
--
-- Create a subblock
--
DECLARE
quantity integer := 80;
BEGIN
RAISE NOTICE 'Quantity here is %', quantity; -- Prints 80
RAISE NOTICE 'Outer quantity here is %', outerblock.quantity; -- Prints 50
END;
RAISE NOTICE 'Quantity here is %', quantity; -- Prints 50
RETURN quantity;
END;
$$ LANGUAGE plpgsql;
Note : Actually there is a hidden “outer block” surrounding the body of the PL/pgSQL function. This block provides the function’s parameter declarations (if any) and special variables such as FOUND. The outer block is labeled with the name of the function. That is, you can limit parameters and special variables to the name of the function.
It is important that using BEGIN/END to group statements in PL/pgSQL is not to be confused with similarly-named SQL commands for transaction control. BEGIN/END in PL/pgSQL is only used for grouping; it does not start/end any transaction. Functions and trigger procedures are always executed within a transaction set by an outer query. You cannot start or commit a transaction because there is no context to start the transaction. However, you may construct a sub-transaction that can be rolled back without affecting external transactions since a block contains an EXCEPTION clause.
Declarations
All variables used in a block should be declared in the declaration section of the block. (The only exception is that a loop variable in a FOR loop that repeats a range of integers is automatically declared as an integer variable, and a loop variable in a FOR loop that queries the result of cursor is automatically declared as a record variable.) PL/pgSQL variables can have any SQL data type, such as integer
, varchar
, and char
.
Below are some examples of variable declarations:
user_id integer;
quantity numeric(5);
url varchar;
myrow tablename%ROWTYPE;
myfield tablename.columnname%TYPE;
arow RECORD;
The following is a general syntax of a variable declaration:
name [ CONSTANT ] type [ COLLATE collation_name ] [ NOT NULL ] [ { DEFAULT | := | = } expression ];
If a DEFAULT
clause is given, it specifies the initial value assigned to the variable at block input. If a DEFAULT
clause is not provided, the variable is initialized to the SQL null value. The CONSTANT
option prevents variables from being allocated after initialization so that the value remains constant for the duration of the block. The COLLATE
option specifies collation to use for variables. If NNOT NULL
is specified, a runtime error occurs if a null value is assigned at execution. All variables declared as NOT NULL
should specify a non-null default value. The equal sign (=) can be used in place of “:=”, which is compatible with PL/SQL.
The default value of a variable is evaluated and assigned to a variable whenever a block is entered (not once per function call). For example, if you assign now()
to a variable of type timestamp
, the function will have the current function call time, not the precompiled time.
quantity integer DEFAULT 32;
url varchar := 'http://mysite.com';
user_id CONSTANT integer := 10;
Function parameter declaration
The parameters passed to the function are named identifiers $1
, $2
, and so on. Optionally, you can declare an alias for the $n
parameter name to improve readability. You may then use the alias or numeric identifier to refer to the parameter value.
There are two ways to create an alias. Naming a parameter in CREATE FUNCTION
command is a preferred way.
CREATE FUNCTION sales_tax(subtotal real) RETURNS real AS $$
BEGIN
RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;
Another way is to explicitly declare the alias using declaration syntax.
name ALIAS FOR $n;
Below is an example of this style:
CREATE FUNCTION sales_tax(real) RETURNS real AS $$
DECLARE
subtotal ALIAS FOR $1;
BEGIN
RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;
Note: These two examples are not exactly the same. In the first case,
subtotal
can be referenced assales_tax.subtotal
, but in the second case it cannot be marked assubtotal
. (If you attach label to an inner block, the partial label can be confined to that label instead.
When a PL/pgSQL function is declared as an output parameter, the output parameter has a $n
name and an optional alias in the same way as a normal input parameter. The output parameter effectively represents a variable that starts with NULL. This should be assigned during execution of the function. The final value of the parameter is the value to be returned. For example, the sales-tax example could be:
CREATE FUNCTION sales_tax(subtotal real, OUT tax real) AS $$
BEGIN
tax := subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;
Output parameters are most useful when returning multiple values. Here is a simple example:
CREATE FUNCTION sum_n_product(x int, y int, OUT sum int, OUT prod int) AS $$
BEGIN
sum := x + y;
prod := x * y;
END;
$$ LANGUAGE plpgsql;
This effectively creates an anonymous record type for the result of the function. If a RETURNS
clause is provided, a RETURNS
record must be specified.
Another way to declare a PL/pgSQL function is to use RETURNS TABLE. For example:
CREATE FUNCTION extended_sales(p_itemno int)
RETURNS TABLE(quantity int, total numeric) AS $$
BEGIN
RETURN QUERY SELECT s.quantity, s.quantity * s.price FROM sales AS s
WHERE s.itemno = p_itemno;
END;
$$ LANGUAGE plpgsql;
This is exactly the same as declaring one or more OUT
parameters and specifying RETURNS SETOF
sometype
.
Alias
You can declare alias for all variables as well as function parameters. Alias is used, in an actual case, to assign a different name to a variable with a predefined name, such as NEW
or OLD
, in the trigger procedure. For example:
DECLARE
prior ALIAS FOR old;
updated ALIAS FOR new;
Since ALIAS
creates two methods of naming the same object, its unrestricted use can be confusing. It is best to use ALIAS only for the purpose of ignoring predefined names.
Expressions
All expressions used in PL/pgSQL statements are processed using the server’s main SQL launcher. The PL/pgSQL statements can be written as follows:
IF expression THEN ...
PL/pgSQL evaluates expressions by providing the same query as the main SQL engine. During configuration of SELECT
command, the PL/pgSQL variable name is changed to a parameter. This allows you to prepare a query plan for SELECT
once and then reuse it for subsequent evaluations using different values of the variable. For example, if you write two integer variables x
and y
as
IF x < y THEN ...
it is used as follows:
PREPARE statement_name(integer, integer) AS SELECT $1 < $2;
This prepared statement is performed with the value of the PL/pgSQL variable provided each time the IF statement is executed. In general, specific information like this is not that useful for PL/pgSQL users. However, it is worth knowing in case of a problem diagnosis.
Basic Statements
This section and the subsequent sections describe all statement types that PL/pgSQL explicitly understands. Those that are not recognized as part of these statement types are considered SQL commands and are sent to the underlying database engine for execution.
Assignment
To assign a value to a PL/pgSQL variable:
variable { := | = } expression;
As mentioned earlier, the expressions in these statements are obtained using SQL SELECT
(command) sent to the underlying database engine. An expression should have a single value (it can be a row value if the variable is a row or record variable). The target variable can be a simple variable (optionally defined by a block name), a row or record variable field, or an array element that is a simple variable or field. The equal sign (=) can be used in place of “:=”, which is compatible with PL/SQL.
If the result data type of the expression does not match the data type of the variable or does not match a specific length or precision, the PL/pgSQL interpreter attempts to implicitly convert the result value using the result type of the output function and the variable type of the input function. If the string form of the result value is of a type that is not allowed in the input function, then the input function may generate a run-time error.
For example:
tax := subtotal * 0.06;
my_record.user_id := 20;
Executing a command without a result
For SQL commands that do not return a row (for example, an INSERT
without a RETURNING
clause), you can execute the commands within a PL/pgSQL function simply by writing them.
The PL/pgSQL variable names that appear in the command text are treated as parameters, and the current values of the variables are provided as the parameter values at runtime. This is exactly the same as the process described hereinabove for expressions.
When you execute an SQL command in this way, PL/pgSQL can cache and reuse the command execution plan.
Calling a function without a useful result value can be sometimes useful for evaluating expressions or SELECT
queries, and the results can be ignored. To do this in PL/pgSQL, we recommend you to use the PERFORM
statement.
PERFORM query;
The query is then executed and the results are discarded. You should write your query the same way you would write an SQL SELECT
command, and replace the initial keyword SELECT
with PERFORM
. For WITH
queries, use PERFORM
and put the query into parentheses (in this case, the query can only return one row). As with commands that do not return results, the PL/pgSQL variable is replaced by the query, and the plan is cached in the same manner. The special variable FOUND
is set to true if the query generated at least one row, and set to false if no rows were generated.
Note: You may expect to get this result by writing your own
SELECT
, but for now,PERFORM
is the only way to do it. An SQL command that can return a row such asSELECT
is treated as an error and rejected if it is without anINTO
clause described in the next section.
For example:
PERFORM create_mv('cs_session_page_requests_mv', my_query);
Run a query as a single row result
A single row (possibly multiple columns) resulting from an SQL command can be assigned to a record variable, row-type variable, or scalar variable list that you create. This is done by writing a basic SQL command and adding an INTO
clause. For example, target can be a record variable, a row variable, or a comma-delimited simple variable, and a list of record/row fields.
SELECT select_expressions INTO [STRICT] target FROM ...;
INSERT ... RETURNING expressions INTO [STRICT] target;
UPDATE ... RETURNING expressions INTO [STRICT] target;
DELETE ... RETURNING expressions INTO [STRICT] target;
For commands that do not return rows, the PL/pgSQL variable is replaced with the rest of the query, as described above, and the plan is cached. It works in SELECT
(INSERT
/UPDATE
/DELETE
that uses RETURNING
) and works according to utility commands that return row set results (e.g. EXPLAIN
). Except for the INTO
clause, SQL commands are the same as those written outside of PL/pgSQL.
Tip: To create a table as a result of
SELECT
within a PL/pgSQL function, you should use theCREATE TABLE ... AS SELECT
statement.
When using a row or variable list as a target, the result column of a query should exactly match the target structure for the numbers and data types. Otherwise, a runtime error will occur. When a record variable is a target, it automatically configures row types of the query result columns.
The INTO
clause may appear at almost any position in an SQL command. It is usually written immediately before or after the select_expressions
list in SELECT
command, or at the end of commands for other command types.
If STRICT
is not specified in an INTO
clause, the target
is set to the first row returned by the query, or set to null if the query did not return a row. (The “first row” is not well defined unless you use ORDER BY
.) The result row after the first row is discarded. You can check the special FOUND variable to see if the row has been returned.
SELECT * INTO myrec FROM emp WHERE empname = myname;
IF NOT FOUND THEN
RAISE EXCEPTION 'employee % not found', myname;
END IF;
In the case where a STRICT
option is specified, the query should return exactly a single row.
Otherwise, either NO_DATA_FOUND
(no rows) or TOO_MANY_ROWS
(two or more rows) is reported as a run-time error. If you want to find certain errors as follows, you may use an exception block.
BEGIN
SELECT * INTO STRICT myrec FROM emp WHERE empname = myname;
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE EXCEPTION 'employee % not found', myname;
WHEN TOO_MANY_ROWS THEN
RAISE EXCEPTION 'employee % not unique', myname;
END;
Successful execution of STRICT
always sets FOUND
to true.
In the case of INSERT
/UPDATE
/DELETE
with RETURNING
, PL/pgSQL will report an error for more than one returned row, even if STRICT
is not specified. This is because there is no such option as ORDER BY
to determine if an affected row should be returned.
In the case where print_strict_params
is enabled for a function and an error occurs due to unmet requirements of STRICT
, DETAIL
of the error message contains information about the parameters passed to the query. You can change the print_strict_params
setting for all functions by setting plpgsql.print_strict_params
; it is, however, affected only by editing of the subsequence function. It is also possible to enable it on a function-by-function basis using compiler options. For example:
CREATE FUNCTION get_userid(username text) RETURNS int
AS $$
#print_strict_params on
DECLARE
userid int;
BEGIN
SELECT users.userid INTO STRICT userid
FROM users WHERE users.username = get_userid.username;
RETURN userid;
END
$$ LANGUAGE plpgsql;
If unsuccessful, this function may generate an error message as follows:
ERROR: query returned no rows
DETAIL: parameters: $1 = 'nosuchuser'
CONTEXT: PL/pgSQL function get_userid(text) line 6 at SQL statement
Note: The
STRICT
option matches behavior ofSELECT INTO
and related statements in Oracle PL/SQL.
Executing dynamic commands
You will often need to create a dynamic command (i.e. a command containing different tables or different data types each time it is executed) within a PL/pgSQL function. The general attempt by PL/pgSQL to cache plans for commands does not work in this scenario. An EXECUTE
statement is provided to handle this kind of problem.
EXECUTE command-string [ INTO [STRICT] target ] [ USING expression [, ... ] ];
Here, command-string
is an expression that creates a string (type text) containing a command to execute. An optional target
is a record variable, a row variable, or a comma-delimited simple variable, and a record/line field list where the result of the command will be stored. The optional USING
expression provides a value to be inserted into the command.
In a calculated command string, PL/pgSQL variables cannot be replaced. All necessary variable values should be inserted into the command string as configured. Alternatively, you can use parameters as described below.
There is also no caching plan for commands executed via EXECUTE
. Instead, commands are always scheduled each time the statement is executed. Therefore, the command strings can be generated dynamically within a function to perform actions in other tables and columns.
The INTO
clause specifies the result of an SQL command that assigns the results of returned rows. When a row or variable list is provided, it should exactly match the structure of the query result (if a record variable is used, it is automatically configured to match the result structure). If multiple rows are returned, only the first row is assigned to the INTO
variable; NULL is assigned to the INTO
variable, if no rows are returned. If the INTO
clause is not specified, the query result is discarded.
Given a STRICT
option, an error is reported if the query does not generate exactly a single row.
Command strings may use parameter values referenced in commands as $1
, $2
, and so on. These symbols refer to values provided in the USING
clause. This method is often preferred over inserting data values into a command string as text. That is, the runtime overhead of converting values into text and back can be avoided, and SQL-injection attacks occur less often as it does not require quotes or escapes. For example:
EXECUTE 'SELECT count(*) FROM mytable WHERE inserted_by = $1 AND inserted <= $2'
INTO c
USING checked_user, checked_date;
Parameter symbols can only be used for data values. To use dynamically-determined table or column names, you must insert the corresponding characters into the command string. For example, if you need to perform the preceding query on a dynamically selected table, you can do the following:
EXECUTE 'SELECT count(*) FROM '
|| quote_ident(tabname)
|| ' WHERE inserted_by = $1 AND inserted <= $2'
INTO c
USING checked_user, checked_date;
A simpler approach will be to use %I
of format()
for the table or column name (newline-delimited, concatenated strings).
EXECUTE format('SELECT count(*) FROM %I '
'WHERE inserted_by = $1 AND inserted <= $2', tabname)
INTO c
USING checked_user, checked_date;
Another limitation on parameter symbols is that they only work with SELECT
, INSERT
, UPDATE
and DELETE
commands. In other syntax types (commonly referred to as utility syntax), you should insert values in text, even if they are data values.
As in the first example above, EXECUTE
with a simple constant command string and USING
parameters is functionally equivalent to writing commands directly in PL/pgSQL and allowing PL/pgSQL variables to be automatically replaced. The important difference is that EXECUTE
re-plans the commands at each run to generate a plan associated with the current parameter values. On the other hand, PL/pgSQL may create a general plan and cache it for reuse. In situations where the best planning depends heavily on parameter values, it is a good idea to use EXECUTE
to make sure that the general plan is not selected.
SELECT INTO
is not currently supported within EXECUTE
. Instead, you should issue a generic SELECT
command and specify INTO
as part of EXECUTE
.
Control Structures
The control structure is probably the most useful and important part of PL/pgSQL. The PL/pgSQL control structure allows you to manipulate AgensGraph data in a very flexible and powerful way.
Return from a function
There are two commands that can return data from functions: RETURN and RETURN NEXT.
RETURN
RETURN expression;
RETURN
with an expression terminates the function and returns the value of the expression to the caller. This format is used for PL/pgSQL functions that do not return sets.
In a function that returns a scalar type, the result of the expression is automatically converted to the return type of the function as described for the assignment. However, to return a composite (row) value, you should write an expression that correctly conveys the set of columns requested. This may require explicit type conversion.
If you declare a function using an output parameter, you need to write only RETURN
without an expression. The current value of the output parameter variable is returned.
If you declare a function that returns void
, you can use the RETURN
statement to terminate the function early. Do not write expressions after RETURN
.
The return value of a function cannot remain undefined. If control reaches the end of the top-level block of the function without using a RETURN
statement, a run-time error occurs. However, this restriction does not apply to functions with output parameters and functions that return void
. In such a case, the RETURN
statement is automatically executed when the top-level block completes.
For example:
-- functions returning a scalar type
RETURN 1 + 2;
RETURN scalar_var;
-- functions returning a composite type
RETURN composite_type_var;
RETURN (1, 2, 'three'::text); -- must cast columns to correct types
RETURN NEXT and RETURN QUERY
RETURN NEXT expression;
RETURN QUERY query;
RETURN QUERY EXECUTE command-string [ USING expression [, ... ] ];
If a PL/pgSQL function is declared to return SETOF
sometype
, you should follow a slightly-different procedure. In such a case, the individual items to be returned are specified in the order of RETURN NEXT
or RETURN QUERY
, and the final RETURN
command without arguments is used to indicate that execution of the function is complete. RETURN NEXT
can be used with both scalar and complex data types. The entire “table” of the result is returned as a composite result type. RETURN QUERY
adds the result of executing the query to the result set of the function. RETURN NEXT
and RETURN QUERY
can be mixed freely in a single set-returning function, in which case the results are concatenated.
RETURN NEXT
and RETURN QUERY
are not actually returned by a function. You can simply add zero or more rows to the result set of the function. Execution continues with the next statement of the PL/pgSQL function. The result set is built when a sequential RETURN NEXT
or RETURN QUERY
command is executed. The last return that should have no argument causes control to terminate the function (or control may reach the end of the function).
RETURN QUERY
has a variant RETURN QUERY EXECUTE
that specifies the queries to be executed dynamically. Parameter expressions, like EXECUTE
, can be inserted into a query string computed via USING
.
If you declare a function using an output parameter, you should write RETURN NEXT
without an expression. At each run, the current value of the output parameter variable is stored for the final return of the result row. If there are multiple output parameters, declare a function that returns a SETOF record; if there is only one output parameter of type sometype
, then you should declare SETOF sometype
to create a set-returning function using the output parameter.
The following is an example of a function that uses RETURN NEXT
.
CREATE TABLE foo (fooid INT, foosubid INT, fooname TEXT);
INSERT INTO foo VALUES (1, 2, 'three');
INSERT INTO foo VALUES (4, 5, 'six');
CREATE OR REPLACE FUNCTION get_all_foo() RETURNS SETOF foo AS
$BODY$
DECLARE
r foo%rowtype;
BEGIN
FOR r IN
SELECT * FROM foo WHERE fooid > 0
LOOP
-- can do some processing here
RETURN NEXT r; -- return current row of SELECT
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql;
SELECT * FROM get_all_foo();
Here is an example of a function that uses RETURN QUERY
.
CREATE FUNCTION get_available_flightid(date) RETURNS SETOF integer AS
$BODY$
BEGIN
RETURN QUERY SELECT flightid
FROM flight
WHERE flightdate >= $1
AND flightdate < ($1 + 1);
-- Since execution is not finished, we can check whether rows were returned
-- and raise exception if not.
IF NOT FOUND THEN
RAISE EXCEPTION 'No flight at %.', $1;
END IF;
RETURN;
END
$BODY$
LANGUAGE plpgsql;
-- Returns available flights or raises exception if there are no
-- available flights.
SELECT * FROM get_available_flightid(CURRENT_DATE);
Note: As described hereinabove, the current implementation of RETURN NEXT and RETURN QUERY stores the entire result set before the function is returned. In other words, performance can be degraded if a PL/pgSQL function generates a very large result set. Data is written to disk to avoid memory depletion, but the function itself is not returned until the entire result set is created. The point at which the current data begins to be written to disk is controlled by the work_mem configuration variable. Administrators having memory large enough to store a bigger volume of result sets should increase this parameter.
Simple loop
By using the LOOP
, EXIT
, CONTINUE
, WHILE
, FOR
, and FOREACH
statements, you can collate a series of commands so that the PL/pgSQL function can repeat them.
LOOP
[ <<label>> ]
LOOP
statements
END LOOP [ label ];
LOOP
defines an unconditional loop that repeats indefinitely until terminated by an EXIT
or RETURN
statement. Optional label
is used by EXIT
and CONTINUE
statements within the nested loop to specify the loop to which the statement refers.
EXIT
EXIT [ label ] [ WHEN boolean-expression ];
If label
is not specified, the innermost loop is terminated and the statement following END LOOP
is executed next. The given label
should be a label of the current or some outer level of the nested loop or block. Then the named loop or block is terminated and the statement continues after the corresponding END of the loop/block.
If WHEN
is specified, loop terminates only if boolean-expression
is true. Otherwise, control passes to the statement after EXIT
.
EXIT
can be used in any type of loop. It is not limited to the use with an unconditional loop.
When used with a BEGIN
block, EXIT
passes control to the next statement after the block terminates. Label should be used for this purpose. EXIT
without label is not considered to match the BEGIN
block.
Here is an example:
LOOP
-- some computations
IF count > 0 THEN
EXIT; -- exit loop
END IF;
END LOOP;
LOOP
-- some computations
EXIT WHEN count > 0; -- same result as previous example
END LOOP;
<<ablock>>
BEGIN
-- some computations
IF stocks > 100000 THEN
EXIT ablock; -- causes exit from the BEGIN block
END IF;
-- computations here will be skipped when stocks > 100000
END;
CONTINUE
CONTINUE [ label ] [ WHEN boolean-expression ];
If label
is not given, the next iteration of the innermost loop begins. That is, loop control is returned (if any) to skip all remaining statements in the loop body and to determine if another loop iteration is needed. If label
is present, specify label of the loop to be continued.
With WHEN
specified, the next iteration of the loop starts only if boolean-expression
is true. Otherwise, control passes to the statement following CONTINUE
.
CONTINUE
can be used in any type of loop. It is not limited to the use with an unconditional loop.
Here is an example:
LOOP
-- some computations
EXIT WHEN count > 100;
CONTINUE WHEN count < 50;
-- some computations for count IN [50 .. 100]
END LOOP;
WHILE
[ <<label>> ]
WHILE boolean-expression LOOP
statements
END LOOP [ label ];
The WHILE
statement repeats a series of statements as long as boolean-expression
is evaluated to be true. Expressions are checked just before each item in the loop body.
Here is an example:
WHILE amount_owed > 0 AND gift_certificate_balance > 0 LOOP
-- some computations here
END LOOP;
WHILE NOT done LOOP
-- some computations here
END LOOP;
FOR(Integer Variant)
[ <<label>> ]
FOR name IN [ REVERSE ] expression .. expression [ BY expression ] LOOP
statements
END LOOP [ label ];
This type of FOR
creates a loop that iterates over a range of integers. Variable names
are automatically defined as integer types and exist only inside a loop (existing definitions of variable names are ignored in the loop). Two expressions representing the upper and lower bounds of the range are evaluated once when they enter the loop. If a BY
clause is not specified, the iteration step is 1; if specified, the value specified in the BY
clause is evaluated once in the loop entry. If REVERSE
is specified, step value is not added but subtracted after each time of iteration.
Some examples of integer FOR loops:
FOR i IN 1..10 LOOP
-- i will take on the values 1,2,3,4,5,6,7,8,9,10 within the loop
END LOOP;
FOR i IN REVERSE 10..1 LOOP
-- i will take on the values 10,9,8,7,6,5,4,3,2,1 within the loop
END LOOP;
FOR i IN REVERSE 10..1 BY 2 LOOP
-- i will take on the values 10,8,6,4,2 within the loop
END LOOP;
If the lower bound is greater than the upper bound (or less than REVERSE), the loop body is never executed. No error occurs.
When a label
is connected to a FOR
loop, the integer loop variable can be referenced with the specified name using that label
.
Loop through query results
Other types of FOR
loops allow you to iterate over the query results and manipulate the data accordingly. The syntax is as follows:
[ <<label>> ]
FOR target IN query LOOP
statements
END LOOP [ label ];
target
is a record variable, a row variable, or a comma-delimited list of scalar variables. target
is assigned to each row of the query result successively, and each row of the loop body is executed. Here is an example:
CREATE FUNCTION cs_refresh_mviews() RETURNS integer AS $$
DECLARE
mviews RECORD;
BEGIN
RAISE NOTICE 'Refreshing materialized views...';
FOR mviews IN SELECT * FROM cs_materialized_views ORDER BY sort_key LOOP
-- Now "mviews" has one record from cs_materialized_views
RAISE NOTICE 'Refreshing materialized view %s ...', quote_ident(mviews.mv_name);
EXECUTE format('TRUNCATE TABLE %I', mviews.mv_name);
EXECUTE format('INSERT INTO %I %s', mviews.mv_name, mviews.mv_query);
END LOOP;
RAISE NOTICE 'Done refreshing materialized views.';
RETURN 1;
END;
$$ LANGUAGE plpgsql;
If the loop is terminated by an EXIT
statement, you can still access the last-assigned row value after the loop.
The query
used in the FOR
statement type can be any SQL command that returns a row to the caller. SELECT
is the most common case, but you can also use INSERT
, UPDATE
, or DELETE
with RETURNING
. Some utility commands such as EXPLAIN
also work.
PL/pgSQL variables are replaced by query text and the query plan is cached for possible reuse.
The FOR-IN-EXECUTE
statement is another way to iterate through rows.
[ <<label>> ]
FOR target IN EXECUTE text_expression [ USING expression [, ... ] ] LOOP
statements
END LOOP [ label ];
This statement is the same as the previous form, except that the source query is specified in a string expression that is evaluated and rescored in each item of the FOR
loop. This allows the programmer to choose the speed of pre-planned queries or flexibility of dynamic queries, just like a normal EXECUTE
statement. Like EXECUTE
, parameter values can be inserted into dynamic commands via USING
.
Another way to specify a query that needs to repeat the result is to declare it as cursor.
Loop through array
The FOREACH
loop is very similar to the FOR
loop, but it iterates through the elements of the array value, instead of repeating the rows returned by the SQL query. Typically, FOREACH
is used to iterate through the components of a composite value expression. Variants that repeat complexes other than arrays can be added later.
A FOREACH
statement that iterates through the array:
[ <<label>> ]
FOREACH target [ SLICE number ] IN ARRAY expression LOOP
statements
END LOOP [ label ];
If SLICE
is not present or SLICE0
is specified, the loop iterates over the individual elements of the array created by evaluating expression
. In the target
variable, each element value is specified in order, and the loop body is executed for each element. The following is an example of iterating over an array element.
CREATE FUNCTION sum(int[]) RETURNS int8 AS $$
DECLARE
s int8 := 0;
x int;
BEGIN
FOREACH x IN ARRAY $1
LOOP
s := s + x;
END LOOP;
RETURN s;
END;
$$ LANGUAGE plpgsql;
Elements are stored in storage order regardless of the number of dimensions of the array. target
is usually a single variable, but it can be a list of variables when repeating composite value arrays. In such a case, for each array element, the variables are assigned in consecutive columns of composite values.
If the SLICE
value is positive, FOREACH
repeats the slice of the array rather than a single element. The SLICE
value should be an integer constant that is not greater than the number of dimensions of the array. target
variable should be an array and receive successive slices of the array value. Each slice is the number of dimensions specified by SLICE
. The following is an example of repeating a one-dimensional slice.
CREATE FUNCTION scan_rows(int[]) RETURNS void AS $$
DECLARE
x int[];
BEGIN
FOREACH x SLICE 1 IN ARRAY $1
LOOP
RAISE NOTICE 'row = %', x;
END LOOP;
END;
$$ LANGUAGE plpgsql;
SELECT scan_rows(ARRAY[[1,2,3],[4,5,6],[7,8,9],[10,11,12]]);
NOTICE: row = {1,2,3}
NOTICE: row = {4,5,6}
NOTICE: row = {7,8,9}
NOTICE: row = {10,11,12}
Cursors
Instead of running the entire query at once, you can set cursors to encapsulate the query and then read a few lines of the query results at a time. A reason for doing this is to avoid memory overrun when the result contains a large number of rows. (However, PL/pgSQL users generally do not have to worry about this as FOR
loops use cursors internally to avoid memory problems.) More interesting way of use is to return a reference to the cursor where a function was created so that the caller can read the line. By doing so, it provides an efficient way to return a large row set from a function.
Declaring cursor variables
All access to PL/pgSQL cursors is always performed via cursor variables, which are special data type refcursor
. One way to create a cursor variable is to declare it as a variable of type refcursor
. Another way to do this is to use the following cursor declaration syntax:
name [ [ NO ] SCROLL ] CURSOR [ ( arguments ) ] FOR query;
When SCROLL
is specified, the cursor may scroll backward. If NO SCROLL
is specified, the reverse fetch is denied. If this option is not specified, it depends on the query whether reverse fetching is allowed or not. If argument
is specified, it is a comma-separated list of name data type pairs
and defines the name to be substituted for the parameter value in the specified query. The actual value to replace this name is specified later when the cursor is opened.
Here is an example:
DECLARE
curs1 refcursor;
curs2 CURSOR FOR SELECT * FROM tenk1;
curs3 CURSOR (key integer) FOR SELECT * FROM tenk1 WHERE unique1 = key;
All three of these variables have a refcursor
data type, but the first query can be used with all queries, the second query bounds a fully specified query, and the last query has a parameterized query bound. (The key
is replaced with the integer parameter value when the cursor is opened.) The variable curs1
is called unbound because it is not bound to a particular query.
Opening cursors
To use a cursor to search for a row, the cursor should be already open. PL/pgSQL supports three types of OPEN
statements, two of which use unbound cursor variables and the third use bound cursor variables.
Note: Bound cursor variables can be used without explicitly opening cursors.
OPEN FOR query
OPEN unbound_cursorvar [ [ NO ] SCROLL ] FOR query;
The cursor variable is opened and the specified query is executed. The cursor can no longer be opened, and it should be declared as an unbound cursor variable (i.e. a simple refcursor
variable). The query should be SELECT
or something else that returns a row (such as EXPLAIN
). Queries are handled in the same way as other PL/pgSQL SQL commands. The PL/pgSQL variable name is replaced, and the query plan is cached for possible reuse. When the PL/pgSQL variable is replaced with a cursor query, the value to be replaced is the value at the time of OPEN
. Subsequent changes to the variable do not affect the behavior of the cursor. The SCROLL
and NO SCROLL
options have the same meaning as in the bound cursors.
Here is an example:
OPEN curs1 FOR SELECT * FROM foo WHERE key = mykey;
OPEN FOR EXECUTE
OPEN unbound_cursorvar [ [ NO ] SCROLL ] FOR EXECUTE query_string
[ USING expression [, ... ] ];
The cursor variable is opened and the specified query is executed. The cursor can no longer be opened, and it should be declared as an unbound cursor variable (i.e. a simple refcursor
variable). The query is specified in a string expression in the same way as in EXECUTE
. As always, this provides flexibility; this means that the query plan can vary each time of execution and variable substitution is not performed in the command string. Like EXECUTE
, parameter values can be inserted into dynamic commands via format()
and USING. The SCROLL
and NO SCROLL
options have the same meaning as bound cursors.
Here is an example:
OPEN curs1 FOR EXECUTE format('SELECT * FROM %I WHERE col1 = $1',tabname) USING keyvalue;
In this example, the table name is inserted into the query via format()
. As the comparison value of col1
is inserted via USING
, no quotes are required.
Opening bounding cursors
OPEN bound_cursorvar [ ( [ argument_name := ] argument_value [, ...] ) ];
This OPEN
format is used when a query is declared to open a cursor variable to which the query is bound. The cursor can no longer be opened. The list of actual argument value expressions can only be used if a cursor is declared to take an argument. These values are replaced in the query.
Query plans for bound cursors are always considered cacheable. There is no EXECUTE
in this case. SCROLL
and NO SCROLL
cannot be specified for OPEN
because the cursor’s scroll behavior has already been determined.
Argument values can be passed using positional or named notation. In position notation, all arguments are specified in order. In named notation, the names of each argument are specified and separated from the argument expression using :=. Like the calling function, you can use both positional notation and named notation together.
Here’s an example (the cursor declaration example above is used):
OPEN curs2;
OPEN curs3(42);
OPEN curs3(key := 42);
Variable substitution is performed in the query of a bound cursor, which means there are two ways to pass the value to the cursor: explicitly using the OPEN
argument, or implicitly passing the value by referencing the PL/pgSQL variable in the query. However, only variables declared before declaration of the bound cursor can be replaced with variables. In both cases, the value to pass is determined at the time of OPEN
. For instance, another way to get the same effect as the curs3
example above is:
DECLARE
key integer;
curs4 CURSOR FOR SELECT * FROM tenk1 WHERE unique1 = key;
BEGIN
key := 42;
OPEN curs4;
Using cursors
Once a cursor is open, you can manipulate it using the statements described below.
You do not need to perform this manipulation since it works in the same manner as when the cursor is first opened. A function can return a refcursor
value and allow the caller to work on the cursor. (Internally, the refcursor
value is simply a string name of so-called a portal containing the active query for the cursor, and can be assigned to another refcursor
variable).
All portals are closed implicitly at the end of the transaction. Thus, the value of refcursor
can be used to reference the open cursor until the end of the transaction.
FETCH
FETCH [ direction { FROM | IN } ] cursor INTO target;
FETCH
fetches the next row as the target of cursor in a row variable, a record variable, or a comma-separated list of simple variables, such asSELECT INTO
. If there is no next row, the target is set to NULL(s). As withSELECT INTO
, you can check the special variableFOUND
to see if you obtained the row.The direction clause can be one of the variations allowed by SQL FETCH, except that it can fetch a single row (i.e.
NEXT
,PRIOR
,FIRST
,LAST
,ABSOLUTE
count,RELATIVE
count,FORWARD
, orBACKWARD
). Omitting direction is equivalent to specifyingNEXT
. If the cursor is not declared or open withSCROLL
option, the direction value that moves backward may fail.The cursor must be the name of a
refcursor
variable that refers to the open cursor portal.Here is an example:
FETCH curs1 INTO rowvar; FETCH curs2 INTO foo, bar, baz; FETCH LAST FROM curs3 INTO x, y; FETCH RELATIVE -2 FROM curs4 INTO x;
MOVE
MOVE [ direction { FROM | IN } ] cursor;
MOVE
relocates a cursor without retrieving the data.MOVE
works the same asFETCH
command except that it only relocates the cursor and does not return the moved rows. As withSELECT INTO
, you can check the special variableFOUND
to see if there is a next row to move.The direction clause is used in the same manner as it is used in SQL FETCH command (e.g.
NEXT
,PRIOR
,FIRST
,LAST
,ABSOLUTE
count,RELATIVE
count,ALL
,FORWARD
[count|ALL
] orBACKWARD
[count|ALL
]). Omitting direction is equivalent to specifyingNEXT
. If the cursor is not declared or open withSCROLL
option, the direction value that moves backward may fail.
Here is an example:MOVE curs1; MOVE LAST FROM curs3; MOVE RELATIVE -2 FROM curs4; MOVE FORWARD 2 FROM curs4;
UPDATE/DELETE WHERE CURRENT OF
UPDATE table SET ... WHERE CURRENT OF cursor; DELETE FROM table WHERE CURRENT OF cursor;
When a cursor is positioned on a table row, it can be identified to update or delete the row with the cursor. There is a restriction on the cursor’s queries (especially not grouped queries) and it is best to use
FOR UPDATE
on the cursor.Here is an example:
UPDATE foo SET dataval = myval WHERE CURRENT OF curs1;
CLOSE
CLOSE cursor;
CLOSE
closes the default portal of the open cursor. It can be used to release cursor variables that can release or reopen resources before the end of a transaction.Here is an example:
CLOSE curs1;
Returning Cursors
The PL/pgSQL function can return a cursor to the caller. This method is useful when returning multiple rows or columns, especially when returning a very large result set. To do this, the function opens a cursor and returns the cursor name to the caller (or simply opens a cursor using the name of the caller or the name of the portal). The caller can fetch rows from the cursor. The cursor is closed by the caller or automatically closed when it is closed.The name of the portal used for a cursor can be specified by the programmer or automatically generated. To specify a portal name, you should first assign a string before the
refcursor
variable is opened. The string value of therefcursor
variable is used as the name of the default portal in OPEN. However, if therefcursor
variable is null,OPEN
automatically creates a name that does not conflict with the existing portal and assigns it to therefcursor
variable.
Note: Since the bound cursor variable is initialized to a string value representing the name, the portal name is the same as the cursor variable name, unless the programmer ignores it through assignment before opening a cursor. However, as the default value of the unbound cursor variable is initially set to null, it gets a unique, automatically generated name unless ignored.
The following example is a way a caller can supply a cursor name.
CREATE TABLE test (col text);
INSERT INTO test VALUES ('123');
CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR SELECT col FROM test;
RETURN $1;
END;
' LANGUAGE plpgsql;
BEGIN;
SELECT reffunc('funccursor');
FETCH ALL IN funccursor;
COMMIT;
The following example uses automatic cursor name generation.
CREATE FUNCTION reffunc2() RETURNS refcursor AS '
DECLARE
ref refcursor;
BEGIN
OPEN ref FOR SELECT col FROM test;
RETURN ref;
END;
' LANGUAGE plpgsql;
-- need to be in a transaction to use cursors.
BEGIN;
SELECT reffunc2();
reffunc2
--------------------
<unnamed cursor 1>
(1 row)
FETCH ALL IN "<unnamed cursor 1>";
COMMIT;
The following example is a way to return multiple cursors from a single function.
CREATE FUNCTION myfunc(refcursor, refcursor) RETURNS SETOF refcursor AS $$
BEGIN
OPEN $1 FOR SELECT * FROM table_1;
RETURN NEXT $1;
OPEN $2 FOR SELECT * FROM table_2;
RETURN NEXT $2;
END;
$$ LANGUAGE plpgsql;
-- need to be in a transaction to use cursors.
BEGIN;
SELECT * FROM myfunc('a', 'b');
FETCH ALL FROM a;
FETCH ALL FROM b;
COMMIT;
Loop through cursor results
There is a FOR
statement (example below) that can iterate through the rows returned by the cursor.
[ <<label>> ]
FOR recordvar IN bound_cursorvar [ ( [ argument_name := ] argument_value [, ...] ) ] LOOP
statements
END LOOP [ label ];
The cursor variable should be bound to some query when it is declared and should not be already open. The FOR statement automatically opens a cursor and closes it when the loop ends. The list of actual argument values should appear only if the cursor is declared to take an argument. This value is replaced in the query in the same way as OPEN
.
The recordvar
variable is automatically defined as type record
and exists only in the loop (the existing definition of the variable name is ignored in the loop). Each row returned by the cursor is assigned to this record variable and the loop body is executed.
Errors and Messages
Reporting errors and messages
You may use the RAISE
statement to report messages and causes errors.
RAISE [ level ] 'format' [, expression [, ... ]] [ USING option = expression [, ... ] ];
RAISE [ level ] condition_name [ USING option = expression [, ... ] ];
RAISE [ level ] SQLSTATE 'sqlstate' [ USING option = expression [, ... ] ];
RAISE [ level ] USING option = expression [, ... ];
RAISE ;
The level
option specifies the severity of the error. Allowable levels are DEBUG
, LOG
, INFO
, NOTICE
, WARNING
, and EXCEPTION
(EXCEPTION
is the default). EXCEPTION
causes an error (usually aborting the current transaction). The other levels generate only messages of different priority levels. Whether messages of a particular priority are reported to the client, or recorded in the server log can be controlled by the log_min_messages and client_min_messages configuration variables.
If there is level
, you can create a format
(it must be a simple string literal, not an expression). A format string specifies the error message text to report. Following the format string, an optional argument expression can be inserted after the message. Within the format string, %
is replaced by the string expressions of the following optional argument values: To print a literal %
, you should write %%
. The number of arguments should match the number of %
placeholders in the format string; if not, an error will occur while compiling the function.
In this example, the value of v_job_id
replaces %
of the string.
RAISE NOTICE 'Calling cs_create_job(%)', v_job_id;
You can attach additional information to the error report by writing an option = expression
entry after USING
. Each expression
can be a string value expression. The allowable option
keywords include:
MESSAGE
Sets the error message text. This option cannot be used in aRAISE
format that includes format strings beforeUSING
.DETAIL
Provides a detailed error message.HINT
Provides a hint message.ERRCODE
Specifies the error code (SQLSTATE) to be reported directly by condition name or with a 5-digit SQLSTATE code.COLUMN, CONSTRAINT, DATATYPE, TABLE, SCHEMA
Provides the names of the related objects.
This example stops a transaction with a given error message and a hint.
RAISE EXCEPTION 'Nonexistent ID --> %', user_id
USING HINT = 'Please check your user ID';
These two examples show an identical way of setting SQLSTATE.
RAISE 'Duplicate user ID: %', user_id USING ERRCODE = 'unique_violation';
RAISE 'Duplicate user ID: %', user_id USING ERRCODE = '23505';
There is a second RAISE statement for which the main argument is the condition name or SQLSTATE to be reported. For example:
RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
Another way is to create RAISE USING
or RAISE
level
USING
and put everything else in the USING
list.
The last variant of RAISE
has no parameters at all. This format can be used only within the EXCEPTION
clause of a BEGIN
block. Re-generate the error currently being processed.
If no condition name or SQLSTATE is specified in RAISE_EXCEPTION (P0001)
command, RAISE_EXCEPTION (P0001)
is used by default. If the message text is not specified, the message name becomes the condition name or SQLSTATE by default.
Note: If you specify a SQLSTATE code as the error code, you can select an error code that is not limited to the predefined error code(s) but consists of a five-digit number and/or uppercase ASCII characters (except
00000
). As the three zero-terminated error codes are category codes, error codes are generated when they can only be trapped by trapping the entire category.
Verifying assertion
The ASSERT
statement is a convenient shorthand for inserting debugging checks into PL/pgSQL functions.
ASSERT condition [ , message ];
condition
is a Boolean expression that is always expected to be asserted to true. If true, no ASSERT statement is executed; if false or null, an ASSERT_FAILURE
exception is thrown. (If an error occurs while asserting the condition
, it is reported as a normal error.)
With an optional message
given, if condition
fails, the result (if not null) replaces the default error message text “assertion failed.” If the assertion is successful, the *message*
expression is not performed.
ASSERT
tests can be enabled or disabled via the configuration parameter plpgsql.check_asserts
, which uses boolean values; the default is on
. If this parameter is off
, the ASSERT
statement does nothing.
ASSERT
is intended to detect program bugs, not to report common error conditions. Use the RAISE
statement described above for reporting errors.
Trigger Procedures
PL/pgSQL can be used to define trigger procedures for data changes or database events. A trigger procedure is created with CREATE FUNCTION (command); it does not have any argument and is declared with a return type of trigger
(in the case of data change triggers) or event_trigger
(in the case of database event triggers). A special local variable called PG_
something
is automatically defined to describe the condition that triggered the call.
Data change triggers
A data change trigger is declared as a function with a return type of trigger without arguments. The function should be declared without arguments, even if it is expected to receive the arguments specified in CREATE TRIGGER
. These arguments are passed through TG_ARGV
as described below.
When the PL/pgSQL function is called by trigger, several special variables are automatically created in the top-level block; the created items are as follows:
NEW
Data typeRECORD
; variable that holds new database rows onINSERT
/UPDATE
operations on row-level triggers. This variable is not specified for statement-level triggers andDELETE
operation.OLD
Data typeRECORD
; variable that holds old database row onUPDATE
/DELETE
operation on the row level triggers. This variable is not specified for statement-level triggers andINSERT
operation.TG_NAME
Data typename
; variable that contains the name of the trigger actually fired.TG_WHEN
Data typetext
; a string ofBEFORE
,AFTER
, orINSTEAD OF
, depending on the trigger definition.TG_LEVEL
Data typetext
; a string ofROW
orSTATEMENT
, depending on the trigger definition.TG_OP
Data typetext
; a string ofINSERT
,UPDATE
,DELETE
, orTRUNCATE
telling for which operation the trigger is actually fired.TG_RELID
Data typeoid
; the object ID of the table that caused the trigger invocation.TG_TABLE_NAME
Data typename
; the name of the table that caused the trigger invocation.TG_TABLE_SCHEMA
Data typename
; the schema name of the table that caused the trigger call.TG_NARGS
Data typeinteger
; the number of arguments given to the trigger procedure in theCREATE TRIGGER
statement.TG_ARGV[]
Data typetext
; the argument index of theCREATE TRIGGER
statement is zero-based. Invalid indices (less than 0 or greater than or equal totg_nargs
) result in a NULL value.
A trigger function must return NULL or a record/row value that exactly matches the structure of the table on which the trigger was executed.
A row-level trigger triggered by BEFORE
can return null to signal the trigger manager to skip the rest of the work for this row (i.e. no subsequent trigger is executed and INSERT
/UPDATE
/DELETE
of this row will not occur). If nonnull is returned, the job advances to the corresponding row value. Returning a row value different from the original value of NEW
changes the row to be inserted or updated. Accordingly, if the trigger function does not change the row value and you want the triggering action to succeed normally, NEW
(or its corresponding value) should be returned. It is possible to change the row to be saved by modifying a single value directly from NEW
and returning a modified NEW
or by creating a completely-new record/row to be returned. In the case of before-trigger of DELETE
, the return value does not have a direct effect, but should not be null to continue the triggering operation. As NEW
is null in DELETE trigger, returning is meaningless in general. A common idiom for DELETE trigger is to return OLD
.
An INSTEAD OF
trigger (which is always a row-level trigger and can only be used in a view) can return null to indicate that no update has been made; skipping the rest of the work for this row should be possible (e.g. the trigger will not start and the row will not be calculated when it is affected by surrounding INSERT
/UPDATE
/DELETE
). Otherwise, a non-null value should be returned to indicate that the trigger has performed the requested operation. The return value should be NEW
in the case of INSERT
and UPDATE
operations, and the trigger function can be modified to support INSERT RETURNING
and UPDATE RETURNING
(this affects the row values passed to the subsequent trigger, or is passed by the specific EXCLUDED
alias reference contained in the ON CONFLICT DO UPDATE
clause of the INSERT
statement). For DELETE
operation, the return value should be OLD
.
The return value of a row-level trigger executed after BEFORE
or AFTER
is always ignored. It may be null. However, if any of these types of triggers fail, the entire operation can be suspended.
The following example shows an example of a trigger procedure in PL/pgSQL.
This trigger in the example below allows the current user name and time to be stamped on a row whenever a row is inserted or updated in the table. Make sure the employee’s name is given and the salary is positive (number).
CREATE TABLE emp (
empname text,
salary integer,
last_date timestamp,
last_user text
);
CREATE FUNCTION emp_stamp() RETURNS trigger AS $emp_stamp$
BEGIN
-- Check that empname and salary are given
IF NEW.empname IS NULL THEN
RAISE EXCEPTION 'empname cannot be null';
END IF;
IF NEW.salary IS NULL THEN
RAISE EXCEPTION '% cannot have null salary', NEW.empname;
END IF;
-- Who works for us when they must pay for it?
IF NEW.salary < 0 THEN
RAISE EXCEPTION '% cannot have a negative salary', NEW.empname;
END IF;
-- Remember who changed the payroll when
NEW.last_date := current_timestamp;
NEW.last_user := current_user;
RETURN NEW;
END;
$emp_stamp$ LANGUAGE plpgsql;
CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
FOR EACH ROW EXECUTE PROCEDURE emp_stamp();
Another way to log changes to a table is to create a new table that holds rows for each insert, update, or delete. This approach can be thought of as auditing any changes of the table. The following example shows an example of an audit trigger procedure in PL/pgSQL.
This example trigger causes the row insert, update, or delete in emp table to be written to the emp_audit table. The current time and user name are stamped on the row together with the type of the operation performed.
CREATE TABLE emp (
empname text NOT NULL,
salary integer
);
CREATE TABLE emp_audit(
operation char(1) NOT NULL,
stamp timestamp NOT NULL,
userid text NOT NULL,
empname text NOT NULL,
salary integer
);
CREATE OR REPLACE FUNCTION process_emp_audit() RETURNS TRIGGER AS $emp_audit$
BEGIN
--
-- Create a row in emp_audit to reflect the operation performed on emp,
-- make use of the special variable TG_OP to work out the operation.
--
IF (TG_OP = 'DELETE') THEN
INSERT INTO emp_audit SELECT 'D', now(), user, OLD.*;
RETURN OLD;
ELSIF (TG_OP = 'UPDATE') THEN
INSERT INTO emp_audit SELECT 'U', now(), user, NEW.*;
RETURN NEW;
ELSIF (TG_OP = 'INSERT') THEN
INSERT INTO emp_audit SELECT 'I', now(), user, NEW.*;
RETURN NEW;
END IF;
RETURN NULL; -- result is ignored since this is an AFTER trigger
END;
$emp_audit$ LANGUAGE plpgsql;
CREATE TRIGGER emp_audit
AFTER INSERT OR UPDATE OR DELETE ON emp
FOR EACH ROW EXECUTE PROCEDURE process_emp_audit();
A variation on the previous example indicates when each entry was last modified by using view that joins the main table to the audit table. This approach still logs the entire audit trail of changes to the table, but provides a brief view of the audit trail, showing only the last modified timestamp derived from the audit trail for each entry. The following example shows an example of an audit trigger on view in PL/pgSQL.
This example allows view to use a trigger to update the view and to make inserting, updating, or deleting rows of the view to be written to the emp_audit
table. The current time and user name are recorded with the type of operation performed, and the view shows the last modification time of each row.
CREATE TABLE emp (
empname text PRIMARY KEY,
salary integer
);
CREATE TABLE emp_audit(
operation char(1) NOT NULL,
userid text NOT NULL,
empname text NOT NULL,
salary integer,
stamp timestamp NOT NULL
);
CREATE VIEW emp_view AS
SELECT e.empname,
e.salary,
max(ea.stamp) AS last_updated
FROM emp e
LEFT JOIN emp_audit ea ON ea.empname = e.empname
GROUP BY 1, 2;
CREATE OR REPLACE FUNCTION update_emp_view() RETURNS TRIGGER AS $$
BEGIN
--
-- Perform the required operation on emp, and create a row in emp_audit
-- to reflect the change made to emp.
--
IF (TG_OP = 'DELETE') THEN
DELETE FROM emp WHERE empname = OLD.empname;
IF NOT FOUND THEN RETURN NULL; END IF;
OLD.last_updated = now();
INSERT INTO emp_audit VALUES('D', user, OLD.*);
RETURN OLD;
ELSIF (TG_OP = 'UPDATE') THEN
UPDATE emp SET salary = NEW.salary WHERE empname = OLD.empname;
IF NOT FOUND THEN RETURN NULL; END IF;
NEW.last_updated = now();
INSERT INTO emp_audit VALUES('U', user, NEW.*);
RETURN NEW;
ELSIF (TG_OP = 'INSERT') THEN
INSERT INTO emp VALUES(NEW.empname, NEW.salary);
NEW.last_updated = now();
INSERT INTO emp_audit VALUES('I', user, NEW.*);
RETURN NEW;
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER emp_audit
INSTEAD OF INSERT OR UPDATE OR DELETE ON emp_view
FOR EACH ROW EXECUTE PROCEDURE update_emp_view();
One of the uses of triggers is to maintain a summary table of other tables. The result summary can be used in place of the original table for a particular query (usually it requires a much shorter execution time). This technique is commonly used in data warehousing where measured or observed data tables (called fact tables) can be very large. The following example shows an example of a PL/pgSQL trigger procedure that maintains a summary table for the data warehouse’s fact tables.
The schemas described here are based in part on a grocery store example included in The Data Warehouse Toolkit by Ralph Kimball.
--
-- Main tables - time dimension and sales fact.
--
CREATE TABLE time_dimension (
time_key integer NOT NULL,
day_of_week integer NOT NULL,
day_of_month integer NOT NULL,
month integer NOT NULL,
quarter integer NOT NULL,
year integer NOT NULL
);
CREATE UNIQUE INDEX time_dimension_key ON time_dimension(time_key);
CREATE TABLE sales_fact (
time_key integer NOT NULL,
product_key integer NOT NULL,
store_key integer NOT NULL,
amount_sold numeric(12,2) NOT NULL,
units_sold integer NOT NULL,
amount_cost numeric(12,2) NOT NULL
);
CREATE INDEX sales_fact_time ON sales_fact(time_key);
--
-- Summary table - sales by time.
--
CREATE TABLE sales_summary_bytime (
time_key integer NOT NULL,
amount_sold numeric(15,2) NOT NULL,
units_sold numeric(12) NOT NULL,
amount_cost numeric(15,2) NOT NULL
);
CREATE UNIQUE INDEX sales_summary_bytime_key ON sales_summary_bytime(time_key);
--
-- Function and trigger to amend summarized column(s) on UPDATE, INSERT, DELETE.
--
CREATE OR REPLACE FUNCTION maint_sales_summary_bytime() RETURNS TRIGGER
AS $maint_sales_summary_bytime$
DECLARE
delta_time_key integer;
delta_amount_sold numeric(15,2);
delta_units_sold numeric(12);
delta_amount_cost numeric(15,2);
BEGIN
-- Work out the increment/decrement amount(s).
IF (TG_OP = 'DELETE') THEN
delta_time_key = OLD.time_key;
delta_amount_sold = -1 * OLD.amount_sold;
delta_units_sold = -1 * OLD.units_sold;
delta_amount_cost = -1 * OLD.amount_cost;
ELSIF (TG_OP = 'UPDATE') THEN
-- forbid updates that change the time_key -
-- (probably not too onerous, as DELETE + INSERT is how most
-- changes will be made).
IF ( OLD.time_key != NEW.time_key) THEN
RAISE EXCEPTION 'Update of time_key : % -> % not allowed',
OLD.time_key, NEW.time_key;
END IF;
delta_time_key = OLD.time_key;
delta_amount_sold = NEW.amount_sold - OLD.amount_sold;
delta_units_sold = NEW.units_sold - OLD.units_sold;
delta_amount_cost = NEW.amount_cost - OLD.amount_cost;
ELSIF (TG_OP = 'INSERT') THEN
delta_time_key = NEW.time_key;
delta_amount_sold = NEW.amount_sold;
delta_units_sold = NEW.units_sold;
delta_amount_cost = NEW.amount_cost;
END IF;
-- Insert or update the summary row with the new values.
<<insert_update>>
LOOP
UPDATE sales_summary_bytime
SET amount_sold = amount_sold + delta_amount_sold,
units_sold = units_sold + delta_units_sold,
amount_cost = amount_cost + delta_amount_cost
WHERE time_key = delta_time_key;
EXIT insert_update WHEN found;
BEGIN
INSERT INTO sales_summary_bytime (
time_key,
amount_sold,
units_sold,
amount_cost)
VALUES (
delta_time_key,
delta_amount_sold,
delta_units_sold,
delta_amount_cost
);
EXIT insert_update;
EXCEPTION
WHEN UNIQUE_VIOLATION THEN
-- do nothing
END;
END LOOP insert_update;
RETURN NULL;
END;
$maint_sales_summary_bytime$ LANGUAGE plpgsql;
CREATE TRIGGER maint_sales_summary_bytime
AFTER INSERT OR UPDATE OR DELETE ON sales_fact
FOR EACH ROW EXECUTE PROCEDURE maint_sales_summary_bytime();
INSERT INTO sales_fact VALUES(1,1,1,10,3,15);
INSERT INTO sales_fact VALUES(1,2,1,20,5,35);
INSERT INTO sales_fact VALUES(2,2,1,40,15,135);
INSERT INTO sales_fact VALUES(2,3,1,10,1,13);
SELECT * FROM sales_summary_bytime;
DELETE FROM sales_fact WHERE product_key = 1;
SELECT * FROM sales_summary_bytime;
UPDATE sales_fact SET units_sold = units_sold * 2;
SELECT * FROM sales_summary_bytime;
Event triggers
PL/pgSQL can be used to define event triggers. In AgensGraph, procedures to be called as event triggers have no arguments and event_trigger
should be declared as return type.
When a PL/pgSQL function is called with an event trigger, several special variables are automatically created in the top-level block.
TG_EVENT
Data typetext
; a string representing the event under which the trigger will be executed.TG_TAG
Data typetext
; a variable that contains a command tag where the trigger is executed.
The trigger in the example below generates a NOTICE
message each time a supported command is executed.
CREATE OR REPLACE FUNCTION snitch() RETURNS event_trigger AS $$
BEGIN
RAISE NOTICE 'snitch: % %', tg_event, tg_tag;
END;
$$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER snitch ON ddl_command_start EXECUTE PROCEDURE snitch();
Tips for Developing in PL/pgSQL
A good way of development in PL/pgSQL is to write a function using a text editor of your choice and load and test it using psql in another window. If you work like this, we recommend you to write a function using CREATE OR REPLACE FUNCTION
. You can then update the function definition by reloading the file. An example is shown below:
CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $$
....
$$ LANGUAGE plpgsql;
You can load or reload the following function definition file while running psql as follows:
\i filename.sql
Another good way to develop in PL/pgSQL is to use a GUI database access tool that facilitates development with a procedural language. These tools often provide convenient features (e.g. removing single quotes, easier play back).
Quotation processing
The code of a PL/pgSQL function is specified as a string literal in CREATE FUNCTION
. If you use a usual way to enclose a string in single quotes, you need to double the single quotes in the body of the function. Likewise, you should use two backslashes as well (assuming escape string syntax is used). Using quotes twice can make your code difficult to understand. This is because it is easy for the user to know that many adjacent quotes are needed.
Therefore, you are recommended to write function bodies using “dollar quoted” string literals. In the dollar citation method, you should never use quotation marks twice, but instead choose a different dollar citation delimiter for each required nesting level. For example, you can create a CREATE FUNCTION
command as follows:
CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $PROC$
....
$PROC$ LANGUAGE plpgsql;
In this case, you may use quotes around a simple literal string in SQL commands and use $$
to separate the SQL command to assemble into strings. If you need to quote text that contains $$
, you can use $Q$
.
The following chart shows what to do when writing quotes without dollar citation marks. This can be useful when converting dollar citation marks to something more understandable.
1 quotation mark
To start and end a function body:
CREATE FUNCTION foo() RETURNS integer AS '
....
' LANGUAGE plpgsql;
Within the function body enclosed in single quotes, the quotes should appear in pairs.
2 quotation marks
It is used to express a string literal inside a function body. For example:
a_output := ''Blah'';
SELECT * FROM users WHERE f_name=''foobar'';
In the dollar citation scheme, you can write:
a_output := 'Blah';
SELECT * FROM users WHERE f_name='foobar';
The PL/pgSQL parser figures out both cases exactly.
4 quotation marks
This is used when a string constant inside the function body requires a single quote. For example:
a_output := a_output || '' AND name LIKE ''''foobar'''' AND xyz''
The value that has actually been added to a_output
: AND name LIKE 'foobar'AND xyz
.
In the dollar citation scheme, you can write:
a_output := a_output || $$ AND name LIKE 'foobar' AND xyz$$
6 quotation marks
Used when a single quote of a string inside the function body is adjacent to the end of the string constant. For example:
a_output := a_output || '' AND name LIKE ''''foobar''''''
The value added to a_output
: AND name LIKE 'foobar'
.
In the dollar citation scheme, you can write:
a_output := a_output || $$ AND name LIKE 'foobar'$$
10 quotation marks
Two single quotes are needed in a string constant (representing eight quote marks) and are adjacent to the end of the string constant (two or more). You will only need it if you are writing a function that creates another function.
a_output := a_output || '' if v_'' ||
referrer_keys.kind || '' like ''''''''''
|| referrer_keys.key_string || ''''''''''
then return '''''' || referrer_keys.referrer_type
|| ''''''; end if;'';
The value of a_output
is as follows
if v_... like ''...'' then return ''...''; end if;
In the dollar citation scheme, you can write:
a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
|| referrer_keys.key_string || $$'
then return '$$ || referrer_keys.referrer_type
|| $$'; end if;$$;
PL/Python
PL/Python Functions
Functions in PL/Python are declared using standard CREATE FUNCTION syntax.
CREATE FUNCTION funcname (argument-list)
RETURNS return-type
AS $$
# PL/Python function body
$$ LANGUAGE plpythonu;
The body of this function is a simple python script. When the function is called, its arguments are passed to the list args
; named arguments are passed to the Python script as ordinary variables as well. Such named arguments are generally more readable. The result is returned in the Python code with return or yield (in case of a result-set statement). If you do not provide a return value, Python returns the default (None
). PL/Python converts Python’s None
to SQL’s Null.
For example, a function that returns the greater of two integers can be defined as:
CREATE FUNCTION pymax (a integer, b integer)
RETURNS integer
AS $$
if a > b:
return a
return b
$$ LANGUAGE plpythonu;
The Python code that is given as the body of the function definition is transformed into a Python function. For example, the above results in:
def __plpython_procedure_pymax_23456():
if a > b:
return a
return b
Assuming that 23456 is the OID assigned to the function.
The argument is set to a global variable. Unless a variable is declared globally in a block, the argument variable cannot be reassigned within the function as an expression value containing the variable name itself according to the Python’s scoping rules. For example, the following may not work:
CREATE FUNCTION pystrip(x text)
RETURNS text
AS $$
x = x.strip() # error
return x
$$ LANGUAGE plpythonu;
When you assign it to x
, x
becomes a local variable for the entire block. Thus, the x
on the right side, which is not a PL/Python function parameter, only refers to the local variable x
that has not yet been allocated. You can use a global
statement to perform the following:
CREATE FUNCTION pystrip(x text)
RETURNS text
AS $$
global x
x = x.strip() # ok now
return x
$$ LANGUAGE plpythonu;
However, it is better not to rely on the detailed implementation of PL/Python; you are recommended to use function parameters as read-only.
Data Values
In general, the purpose of PL/Python is to provide a “natural” mapping between AgensGraph and Python. Information on the data mapping rules is described below.
Data Type Mapping
When a PL/Python function is called, the argument is converted from the AgensGraph data type to the corresponding Python type.
AgensGraph
boolean
is converted to Pythonboolean
.AgensGraph
smallint
andint
are converted to Pythonint
. AgensGraphbigint
andoid
are converted tolong
in Python2 andint
in Python3.AgensGraph
real
anddouble
are converted to Pythonfloat
.AgensGraph
numeric
is converted to PythonDecimal
. If this type is available, bring it from thecdecimal
package. Otherwise, the standard librarydecimal.Decimal
is used.cdecimal
is considerably faster thandecimal
. In Python3.3 or later versions, there is no difference between the two ascdecimal
is integrated into the standard library under the namedecimal
.AgensGraph
bytea
is converted tostr
in Python2 andbytes
in Python3. In Python2, you need to treat the string as a sequence of bytes without character encoding.All other data types, including the AgensGraph string format, are converted to Python
str
. In Python2, this string is in the AgensGraph server encoding. In Python3, it becomes the same Unicode string as all strings.For nonscalar data type, see below.
When returning a PL/Python function, the return value is converted to the declared AgensGraph return data type of the function as follows:
When the AgensGraph return type is
boolean
, the return value is evaluated to see if it is true according to the Python rules. That is, while 0 and an empty string are false, ‘f’ is true.When the AgensGraph return type is
bytea
, the return value is converted to string (Python2) or bytes (Python3) using each Python built-in function, and the result is converted to bytea.For all other AgensGraph return types, the conversion value is converted to a string using the Python built-in
str
, and the result is passed to the input function of AgensGraph data type. (If the Python value isfloat
, use therepr
built-in instead ofstr
to avoid loss of precision.)
In Python2, strings should be in the AgensGraph server encoding when passed to AgensGraph. A string that is not valid in the current server encoding will cause an error; since not all encoding discrepancies can be detected, garbage values can continue to occur if this is not done correctly. As Unicode strings are automatically converted to the correct encoding, using them can be safer and more convenient. In Python3, all strings are Unicode strings.
*See below for more information on nonscalar data types.
The logical discrepancy between the declared AgensGraph return type and actual return object’s Python data type is not displayed; in any case the value will be returned.
Null, None
If SQL null is passed to a function, the argument value in Python is displayed as none. For example, the function definition in [PL/Python Functions] pymax
returns an incorrect answer for null input. You can add STRICT
to your function definition to make the work more reasonable. If null is passed, the function will not be called at all and will automatically return null. You may also check for null input in the function body.
CREATE FUNCTION pymax (a integer, b integer)
RETURNS integer
AS $$
if (a is None) or (b is None):
return None
if a > b:
return a
return b
$$ LANGUAGE plpythonu;
As shown above, to return an SQL null value from a PL/Python function, return the value None
. This can be done whether the function is strict or not.
Arrays, Lists
SQL array values are passed to PL/Python as a Python list. To return a SQL array value from a PL/Python function, return a Python sequence (e.g. a list or tuple).
CREATE FUNCTION return_arr()
RETURNS int[]
AS $$
return (1, 2, 3, 4, 5)
$$ LANGUAGE plpythonu;
SELECT return_arr();
return_arr
-------------
{1,2,3,4,5}
(1 row)
Strings are sequences in Python; Python programmers who are not familiar with this can have undesirable consequences.
CREATE FUNCTION return_str_arr()
RETURNS varchar[]
AS $$
return "hello"
$$ LANGUAGE plpythonu;
SELECT return_str_arr();
return_str_arr
----------------
{h,e,l,l,o}
(1 row)
Composite Types
Composite-type arguments are passed to the function as Python mappings. The element names of the mapping are the attribute names of the composite type. If an attribute in the passed row has a null value, it has a None
value in the mapping. Here is an example:
CREATE TABLE employee (
name text,
salary integer,
age integer
);
CREATE FUNCTION overpaid (e employee)
RETURNS boolean
AS $$
if e["salary"] > 200000:
return True
if (e["age"] < 30) and (e["salary"] > 100000):
return True
return False
$$ LANGUAGE plpythonu;
There are multiple ways to return row or composite types from a Python function. To execute a composite type result example, create a TYPE as shown below:
CREATE TYPE named_value AS (
name text,
value integer
);
A composite result can be returned as a:
Sequence type (a tuple or list, but not a set because it is not indexable) Returned sequence objects must have the same number of items as the composite result type has fields. The item with index 0 is assigned to the first field of the composite type, 1 to the second and so on. For example:
CREATE FUNCTION make_pair (name text, value integer) RETURNS named_value AS $$ return [ name, value ] # or alternatively, as tuple: return ( name, value ) $$ LANGUAGE plpythonu;
To return a SQL null for any column, insert
none
at the corresponding position.Mapping (dictionary)
The value for each result type column is retrieved from the mapping with the column name as key. Example:
CREATE FUNCTION make_pair (name text, value integer) RETURNS named_value AS $$ return { "name": name, "value": value } $$ LANGUAGE plpythonu;
Any extra dictionary key/value pairs are ignored. Missing keys are treated as errors. To return a SQL null value for any column, insert
None
with the corresponding column name as the key.Object (any object providing method
__getattr__
)This works the same as a mapping. Here is an example:
CREATE FUNCTION make_pair (name text, value integer) RETURNS named_value AS $$ class named_value: def __init__ (self, n, v): self.name = n self.value = v return named_value(name, value) # or simply class nv: pass nv.name = name nv.value = value return nv $$ LANGUAGE plpythonu;
Functions with
OUT
parameters are also supported. Here is an example:CREATE FUNCTION multiout_simple(OUT i integer, OUT j integer) AS $$ return (1, 2) $$ LANGUAGE plpythonu; SELECT * FROM multiout_simple();
Set-returning Functions
A PL/Python function can also return sets of scalar or composite types. There are several ways to achieve this because the returned object is internally turned into an iterator. The following examples assume we have composite type:
CREATE TYPE greeting AS (
how text,
who text
);
A set result can be returned from a:
Sequence type (tuple, list, set)
CREATE FUNCTION greet (how text) RETURNS SETOF greeting AS $$ # return tuple containing lists as composite types # all other combinations work also return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] ) $$ LANGUAGE plpythonu;
Iterator (any object providing
__iter__
andnext
methods)CREATE FUNCTION greet (how text) RETURNS SETOF greeting AS $$ class producer: def __init__ (self, how, who): self.how = how self.who = who self.ndx = -1 def __iter__ (self): return self def next (self): self.ndx += 1 if self.ndx == len(self.who): raise StopIteration return ( self.how, self.who[self.ndx] ) return producer(how, [ "World", "PostgreSQL", "PL/Python" ]) $$ LANGUAGE plpythonu;
Generator (
yield
)CREATE FUNCTION greet (how text) RETURNS SETOF greeting AS $$ for who in [ "World", "PostgreSQL", "PL/Python" ]: yield ( how, who ) $$ LANGUAGE plpythonu;
The set-returning function with OUT
parameters (using RETURNS SETOF
record) is also supported. Here is an example:
CREATE FUNCTION multiout_simple_setof(n integer, OUT integer, OUT integer)
RETURNS SETOF record
AS $$
return [(1, 2)] * n
$$ LANGUAGE plpythonu;
SELECT * FROM multiout_simple_setof(3);
Anonymous Code Blocks
PL/Python also supports anonymous code blocks that are called together with DO statements.
DO $$
# PL/Python code
$$ LANGUAGE plpythonu;
Anonymous code blocks do not take arguments and discard all the returned values. Otherwise, they behave like a function.
Trigger Functions
When a function is used as a trigger, the dictionary TD
contains the trigger-related value.
TD[“event”]
Contains an event as a string (INSERT
, UPDATE
, DELETE
, or TRUNCATE
).
TD[“when”]
Contains one of BEFORE
, AFTER
, and INSTEAD OF
.
TD[“level”]
Contains ROW
or STATEMENT
.
TD[“new”], TD[“old”]
Contains each trigger row according to one or both trigger events of the fields, in the case of row-level triggers.
TD[“name”]
Contains the trigger name.
TD[“table_name”]
Contains the name of the table on which the trigger occurred.
TD[“table_schema”]
Contains the schema of the table where the trigger occurred.
TD[“relid”]
Contains the OID of the table where the trigger occurred.
TD[“args”]
In the case of CREATE TRIGGER
(command) with arguments, you can use from TD["args"][0]
to TD["args"][n-1]
.
When TD ["when"]
is BEFORE
or INSTEAD OF
and TD ["level"]
is ROW
, the Python function may return None
or “OK” to indicate that the row has not changed. “SKIP” aborts the event. When TD ["event"]
applies INSERT
or UPDATE
, it can return “MODIFY” to modify the new row. Otherwise, the return value is ignored.
Database Access
The PL/Python language module automatically imports a Python module called plpy
. The functions and constants in this module are available to you in the Python code as plpy.foo
.
Database Access Functions
The plpy
module provides several functions to execute database commands:
plpy.execute(query [, max-rows])
If you call plpy.execute
with a query string and select an optional row limit argument, the corresponding query will be run and the result will be returned in a result object.
The result object emulates a list or dictionary object. The result object can be accessed by row number and column name. For example:
rv = plpy.execute("SELECT * FROM my_table", 5)
returns up to 5 rows from my_table
. If my_table
has a column my_column
, it would be accessed as:
foo = rv[i]["my_column"]
The number of rows returned can be obtained using the built-in len
function.
The result object has these additional methods:
nrows() Returns the number of rows processed by the command. Note that this is not necessarily the same as the number of rows returned. For example,
UPDATE
command will set this value but won’t return any rows (unlessRETURNING
is used).status() Returns the
SPI_execute()
value.colnames(), coltypes(), coltypmods() Returns a list of column names, list of column type OIDs, and list of type-specific type modifiers for the columns, respectively. These methods raise an exception when called on a result object from a command that did not produce a result set (e.g.
UPDATE
withoutRETURNING
, orDROP TABLE
). But it is OK to use these methods on a result set containing zero rows.str() The standard
__str__
method is defined so that it is possible, for example, to debug query execution results usingplpy.debug(rv)
.
The result object can be modified.
Calling plpy.execute
will cause the entire result set to be read into memory. Use the function only when the result set will be relatively small. If you don’t want to risk excessive memory usage when fetching large results, use plpy.cursor
rather than plpy.execute
.
plpy.prepare(query [, argtypes]) plpy.execute(plan [, arguments [, max-rows]])
plpy.prepare
prepares the execution plan for a query. It is called with a query string and a list of parameter types, if you have parameter references in the query. Here is an example:
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", ["text"])
text
is the type of the variable you will be passing for $1. The second argument is optional if you don’t want to pass any parameters to the query. After preparing a statement, you use a variant of the function plpy.execute
to run it:
rv = plpy.execute(plan, ["name"], 5)
Pass the plan as the first argument (instead of the query string), and a list of values to substitute into the query as the second argument. The second argument is optional if the query cannot expect any parameters. The third argument is the optional row limit as before.
Query parameters and result row fields are converted between PostgreSQL and Python data types as described in Data Values.
When you prepare a plan using the PL/Python module it is automatically saved. Refer to this link for more information. In order to use this function effectively through this feature, you need to use one of the persistent storage dictionaries SD
or GD
(see [Sharing Data]). Here is an example:
CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
if "plan" in SD:
plan = SD["plan"]
else:
plan = plpy.prepare("SELECT 1")
SD["plan"] = plan
# rest of function
$$ LANGUAGE plpythonu;
plpy.cursor(query) plpy.cursor(plan [, arguments])
The plpy.cursor
function accepts the same arguments as plpy.execute
(except for the row limit) and returns a cursor object, which allows you to process large result sets in smaller chunks. As with plpy.execute
, either a query string or a plan object along with a list of arguments can be used.
The cursor object provides a fetch
method that accepts an integer parameter and returns a result object. Each time you call fetch
, the returned object will contain the next batch of rows, which is no larger than the parameter value. Once all rows are used, fetch
starts returning an empty result object. Cursor objects also provide an iterator interface, creating one row at a time until all rows are exhausted. Data fetched that way is not returned as result objects, but rather as dictionaries, each dictionary corresponding to a single result row.
An example of two ways of processing data from a large table is:
CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$
odd = 0
for row in plpy.cursor("select num from largetable"):
if row['num'] % 2:
odd += 1
return odd
$$ LANGUAGE plpythonu;
CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$
odd = 0
cursor = plpy.cursor("select num from largetable")
while True:
rows = cursor.fetch(batch_size)
if not rows:
break
for row in rows:
if row['num'] % 2:
odd += 1
return odd
$$ LANGUAGE plpythonu;
CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$
odd = 0
plan = plpy.prepare("select num from largetable where num % $1 <> 0", ["integer"])
rows = list(plpy.cursor(plan, [2]))
return len(rows)
$$ LANGUAGE plpythonu;
Cursors are automatically disposed of. However, if you want to explicitly release all resources held by a cursor, use the close
method. Once closed, a cursor cannot be fetched from anymore.
Trapping Errors
Functions accessing the database might encounter errors, which will cause them to abort and raise an exception. Both plpy.execute
and plpy.prepare
can raise an instance of a subclass of plpy.SPIError
, which by default will terminate the function. This error can be handled just like any other Python exception, by using the try/except
construct. For example:
CREATE FUNCTION try_adding_joe() RETURNS text AS $$
try:
plpy.execute("INSERT INTO users(username) VALUES ('joe')")
except plpy.SPIError:
return "something went wrong"
else:
return "Joe added"
$$ LANGUAGE plpythonu;
The actual class of the exception being raised corresponds to the specific condition that caused the error. Refer to this link for a list of possible conditions. The module plpy.spiexceptions
defines an exception class for each condition, deriving their names from the condition name. For instance, division_by_zero
becomes DivisionByZero, unique_violation
becomes UniqueViolation
, fdw_error
becomes FdwError
, and so on. Each of these exception classes inherits from SPIError
. This separation makes it easier to handle specific errors. For example:
CREATE FUNCTION insert_fraction(numerator int, denominator int) RETURNS text AS $$
from plpy import spiexceptions
try:
plan = plpy.prepare("INSERT INTO fractions (frac) VALUES ($1 / $2)", ["int", "int"])
plpy.execute(plan, [numerator, denominator])
except spiexceptions.DivisionByZero:
return "denominator cannot equal zero"
except spiexceptions.UniqueViolation:
return "already have that fraction"
except plpy.SPIError, e:
return "other error, SQLSTATE %s" % e.sqlstate
else:
return "fraction inserted"
$$ LANGUAGE plpythonu;
As all exceptions from the plpy.spiexceptions
module inherit from SPIError
, the except
clause handling it will catch any database access error.
As an alternative way of handling different error conditions, you can catch the SPIError
exception and determine the specific error condition inside the except
block by looking at the sqlstate
attribute of the exception object. This attribute is a string value containing the “SQLSTATE” error code. This approach provides approximately the same functionality.
Explicit Subtransactions
Recovering from errors caused by database access as described in [Trapping Errors] can lead to an undesirable situation where some operations succeed before one of them fails; after recovering from that error, the data is left in an inconsistent state. PL/Python offers a solution to this problem in the form of explicit subtransactions.
Subtransaction Context Managers
Consider a function that implements a transfer between two accounts:
CREATE FUNCTION transfer_funds() RETURNS void AS $$
try:
plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except plpy.SPIError, e:
result = "error transferring funds: %s" % e.args
else:
result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;
If the second UPDATE
statement results in an exception being raised, this function will report the error, but the result of the first UPDATE
will nevertheless be committed. In other words, the funds will be withdrawn from Joe’s account, but will not be transferred to Mary’s account.
To avoid such issues, you can wrap your plpy.execute
calls in an explicit subtransaction. The plpy
module provides a helper object to manage explicit subtransactions that gets created with the plpy.subtransaction()
function. Objects created by this function implement the context manager interface. By using explicit subtransactions, you may rewrite the function as:
CREATE FUNCTION transfer_funds2() RETURNS void AS $$
try:
with plpy.subtransaction():
plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except plpy.SPIError, e:
result = "error transferring funds: %s" % e.args
else:
result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;
The use of try/catch
is still required. Otherwise the exception would be transferred to the top of the Python stack and would cause the whole function to abort due to an AgensGraph error; as a result, the operations
table would not have any row inserted into it. The subtransaction context manager does not trap errors; it only assures that all database operations executed inside its scope will be atomically committed or rolled back. A rollback of the subtransaction block occurs on any kind of exception exit, not only ones caused by errors originating from database access. A regular Python exception raised inside an explicit subtransaction block would also cause the subtransaction to be rolled back.
Older Python Versions
Context managers syntax using the WITH
keyword is available by default in Python 2.6. If using PL/Python with an older Python version, it is still possible to use explicit subtransactions, although not as transparently. You can call the subtransaction manager’s __enter__
and __exit__
functions using the enter
and exit
convenience aliases. The example function that transfers funds could be written as:
CREATE FUNCTION transfer_funds_old() RETURNS void AS $$
try:
subxact = plpy.subtransaction()
subxact.enter()
try:
plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except:
import sys
subxact.exit(*sys.exc_info())
raise
else:
subxact.exit(None, None, None)
except plpy.SPIError, e:
result = "error transferring funds: %s" % e.args
else:
result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;
Utility Functions
The plpy
module also provides functions:
plpy.debug(msg, **kwargs)
plpy.log(msg, **kwargs)
plpy.info(msg, **kwargs)
plpy.notice(msg, **kwargs)
plpy.warning(msg, **kwargs)
plpy.error(msg, **kwargs)
plpy.fatal(msg, **kwargs)
plpy.error
and plpy.fatal
actually raise a Python exception that is not actually called, causing the current transaction or subtransaction to be aborted. raise plpy.Error(msg)
and raise plpy.Fatal(msg)
are equivalent to calling plpy.error(msg)
and plpy.fatal(msg)
, respectively, but you cannot pass keyword arguments in raise
type. Other functions only generate messages of different priority levels. You may determine whether messages of a particular priority should be reported to the client, written to the server log, or both are controlled by the log_min_messages and client_min_messages configuration variables. See this link for more information.
The msg argument is provided as a positional argument. Two or more positional arguments may be provided for backward compatibility. In this case, the tuple string expression of the positional argument is the message reported to the client.
The following keyword-specific arguments are allowed.
detail
hint
sqlstate
schema_name
table_name
column_name
datatype_name
The string expression of the object passed as a keyword-specific argument is used to enforce the messages reported to the client. Here is an example:
CREATE FUNCTION raise_custom_exception() RETURNS void AS $$
plpy.error("custom exception message",
detail="some info about exception",
hint="hint for users")
$$ LANGUAGE plpythonu;
=# SELECT raise_custom_exception();
ERROR: plpy.Error: custom exception message
DETAIL: some info about exception
HINT: hint for users
CONTEXT: Traceback (most recent call last):
PL/Python function "raise_custom_exception", line 4, in <module>
hint="hint for users")
PL/Python function "raise_custom_exception"
Another set of utility functions are plpy.quote_literal(string)
, plpy.quote_nullable(string)
,
and plpy.quote_ident(string)
. They are equivalent to the built-in quoting functions. They are useful when constructing ad-hoc queries. Dynamic PL/Python would be:
plpy.execute("UPDATE tbl SET %s = %s WHERE key = %s" % (
plpy.quote_ident(colname),
plpy.quote_nullable(newvalue),
plpy.quote_literal(keyvalue)))
Environment Variables
Some of the environment variables that are accepted by the Python interpreter may affect PL/Python behavior. They would need to be set in the environment of the main AgensGraph server process, for example in a start script. The available environment variables depend on the version of Python; see the Python manuals for more information.
PYTHONHOME
PYTHONPATH
PYTHONY2K
PYTHONOPTIMIZE
PYTHONDEBUG
PYTHONVERBOSE
PYTHONCASEOK
PYTHONDONTWRITEBYTECODE
PYTHONIOENCODING
PYTHONUSERBASE
PYTHONHASHSEED
(It appears to be a Python implementation detail beyond the control of PL/Python that some of the environment variables listed on the python
man page are only effective in a command-line interpreter and not an embedded Python interpreter.)