Appendix

Error Codes

Every message generated by the AgensGraph server is assigned a five-character error code that follows the SQL standard's conventions for “SQLSTATE” codes.

In accordance with the standard, the first two characters of an error code denote the error class, and the last three characters indicate a specific condition within that class. Thus, an application that does not recognize a specific error code can still often infer what to do from the error class.

The following table lists all the error codes defined in AgensGraph. (Some of them are not actually used at present, but are still defined because they appear in the SQL standard.) The error classes are also shown. Each error class has a “standard” error code whose last three characters are 000; this code is used only for error conditions that fall within the class but have no more specific code assigned. The symbols in the “Condition Name” column are the condition names to be used in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning conditions, as opposed to errors; those are classes 00, 01, and 02.)

For some types of errors, the server reports the name of the database object (table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message, so applications do not have to extract them from the human-readable message text.
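
For example, a libpq client can read the code and the object names from those separate fields instead of parsing the message text. The sketch below is illustrative (the helper name report_error is made up); PQresultErrorField() and the PG_DIAG_* field codes are the standard libpq accessors:

    #include <stdio.h>
    #include <libpq-fe.h>

    /* After a failed query, read the SQLSTATE and the reported object
     * names from separate error fields instead of parsing message text.
     * Any field may be absent, in which case NULL is returned. */
    static void
    report_error(const PGresult *res)
    {
        const char *sqlstate   = PQresultErrorField(res, PG_DIAG_SQLSTATE);
        const char *table      = PQresultErrorField(res, PG_DIAG_TABLE_NAME);
        const char *constraint = PQresultErrorField(res, PG_DIAG_CONSTRAINT_NAME);

        printf("SQLSTATE:   %s\n", sqlstate ? sqlstate : "(none)");
        if (table)
            printf("table:      %s\n", table);
        if (constraint)
            printf("constraint: %s\n", constraint);
    }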

Error Code   Condition Name
----------   --------------

Class 00 - Successful Completion
00000        successful_completion

Class 01 - Warning
01000        warning
0100C        dynamic_result_sets_returned
01008        implicit_zero_bit_padding
01003        null_value_eliminated_in_set_function
01007        privilege_not_granted
01006        privilege_not_revoked
01004        string_data_right_truncation
01P01        deprecated_feature

Class 02 - No Data
02000        no_data
02001        no_additional_dynamic_result_sets_returned

Class 03 - SQL Statement Not Yet Complete
03000        sql_statement_not_yet_complete

Class 08 - Connection Exception
08000        connection_exception
08003        connection_does_not_exist
08006        connection_failure
08001        sqlclient_unable_to_establish_sqlconnection
08004        sqlserver_rejected_establishment_of_sqlconnection
08007        transaction_resolution_unknown
08P01        protocol_violation

Class 09 - Triggered Action Exception
09000        triggered_action_exception

Class 0A - Feature Not Supported
0A000        feature_not_supported

Class 0B - Invalid Transaction Initiation
0B000        invalid_transaction_initiation

Class 0F - Locator Exception
0F000        locator_exception
0F001        invalid_locator_specification

Class 0L - Invalid Grantor
0L000        invalid_grantor
0LP01        invalid_grant_operation

Class 0P - Invalid Role Specification
0P000        invalid_role_specification

Class 0Z - Diagnostics Exception
0Z000        diagnostics_exception
0Z002        stacked_diagnostics_accessed_without_active_handler

Class 20 - Case Not Found
20000        case_not_found

Class 21 - Cardinality Violation
21000        cardinality_violation

Class 22 - Data Exception
22000        data_exception
2202E        array_subscript_error
22021        character_not_in_repertoire
22008        datetime_field_overflow
22012        division_by_zero
22005        error_in_assignment
2200B        escape_character_conflict
22022        indicator_overflow
22015        interval_field_overflow
2201E        invalid_argument_for_logarithm
22014        invalid_argument_for_ntile_function
22016        invalid_argument_for_nth_value_function
2201F        invalid_argument_for_power_function
2201G        invalid_argument_for_width_bucket_function
22018        invalid_character_value_for_cast
22007        invalid_datetime_format
22019        invalid_escape_character
2200D        invalid_escape_octet
22025        invalid_escape_sequence
22P06        nonstandard_use_of_escape_character
22010        invalid_indicator_parameter_value
22023        invalid_parameter_value
2201B        invalid_regular_expression
2201W        invalid_row_count_in_limit_clause
2201X        invalid_row_count_in_result_offset_clause
2202H        invalid_tablesample_argument
2202G        invalid_tablesample_repeat
22009        invalid_time_zone_displacement_value
2200C        invalid_use_of_escape_character
2200G        most_specific_type_mismatch
22004        null_value_not_allowed
22002        null_value_no_indicator_parameter
22003        numeric_value_out_of_range
22026        string_data_length_mismatch
22001        string_data_right_truncation
22011        substring_error
22027        trim_error
22024        unterminated_c_string
2200F        zero_length_character_string
22P01        floating_point_exception
22P02        invalid_text_representation
22P03        invalid_binary_representation
22P04        bad_copy_file_format
22P05        untranslatable_character
2200L        not_an_xml_document
2200M        invalid_xml_document
2200N        invalid_xml_content
2200S        invalid_xml_comment
2200T        invalid_xml_processing_instruction

Class 23 - Integrity Constraint Violation
23000        integrity_constraint_violation
23001        restrict_violation
23502        not_null_violation
23503        foreign_key_violation
23505        unique_violation
23514        check_violation
23P01        exclusion_violation

Class 24 - Invalid Cursor State
24000        invalid_cursor_state

Class 25 - Invalid Transaction State
25000        invalid_transaction_state
25001        active_sql_transaction
25002        branch_transaction_already_active
25008        held_cursor_requires_same_isolation_level
25003        inappropriate_access_mode_for_branch_transaction
25004        inappropriate_isolation_level_for_branch_transaction
25005        no_active_sql_transaction_for_branch_transaction
25006        read_only_sql_transaction
25007        schema_and_data_statement_mixing_not_supported
25P01        no_active_sql_transaction
25P02        in_failed_sql_transaction
25P03        idle_in_transaction_session_timeout

Class 26 - Invalid SQL Statement Name
26000        invalid_sql_statement_name

Class 27 - Triggered Data Change Violation
27000        triggered_data_change_violation

Class 28 - Invalid Authorization Specification
28000        invalid_authorization_specification
28P01        invalid_password

Class 2B - Dependent Privilege Descriptors Still Exist
2B000        dependent_privilege_descriptors_still_exist
2BP01        dependent_objects_still_exist

Class 2D - Invalid Transaction Termination
2D000        invalid_transaction_termination

Class 2F - SQL Routine Exception
2F000        sql_routine_exception
2F005        function_executed_no_return_statement
2F002        modifying_sql_data_not_permitted
2F003        prohibited_sql_statement_attempted
2F004        reading_sql_data_not_permitted

Class 34 - Invalid Cursor Name
34000        invalid_cursor_name

Class 38 - External Routine Exception
38000        external_routine_exception
38001        containing_sql_not_permitted
38002        modifying_sql_data_not_permitted
38003        prohibited_sql_statement_attempted
38004        reading_sql_data_not_permitted

Class 39 - External Routine Invocation Exception
39000        external_routine_invocation_exception
39001        invalid_sqlstate_returned
39004        null_value_not_allowed
39P01        trigger_protocol_violated
39P02        srf_protocol_violated
39P03        event_trigger_protocol_violated

Class 3B - Savepoint Exception
3B000        savepoint_exception
3B001        invalid_savepoint_specification

Class 3D - Invalid Catalog Name
3D000        invalid_catalog_name

Class 3F - Invalid Schema Name
3F000        invalid_schema_name

Class 40 - Transaction Rollback
40000        transaction_rollback
40002        transaction_integrity_constraint_violation
40001        serialization_failure
40003        statement_completion_unknown
40P01        deadlock_detected

Class 42 - Syntax Error or Access Rule Violation
42000        syntax_error_or_access_rule_violation
42601        syntax_error
42501        insufficient_privilege
42846        cannot_coerce
42803        grouping_error
42P20        windowing_error
42P19        invalid_recursion
42830        invalid_foreign_key
42602        invalid_name
42622        name_too_long
42939        reserved_name
42804        datatype_mismatch
42P18        indeterminate_datatype
42P21        collation_mismatch
42P22        indeterminate_collation
42809        wrong_object_type
42703        undefined_column
42883        undefined_function
42P01        undefined_table
42P02        undefined_parameter
42704        undefined_object
42701        duplicate_column
42P03        duplicate_cursor
42P04        duplicate_database
42723        duplicate_function
42P05        duplicate_prepared_statement
42P06        duplicate_schema
42P07        duplicate_table
42712        duplicate_alias
42710        duplicate_object
42702        ambiguous_column
42725        ambiguous_function
42P08        ambiguous_parameter
42P09        ambiguous_alias
42P10        invalid_column_reference
42611        invalid_column_definition
42P11        invalid_cursor_definition
42P12        invalid_database_definition
42P13        invalid_function_definition
42P14        invalid_prepared_statement_definition
42P15        invalid_schema_definition
42P16        invalid_table_definition
42P17        invalid_object_definition

Class 44 - WITH CHECK OPTION Violation
44000        with_check_option_violation

Class 53 - Insufficient Resources
53000        insufficient_resources
53100        disk_full
53200        out_of_memory
53300        too_many_connections
53400        configuration_limit_exceeded

Class 54 - Program Limit Exceeded
54000        program_limit_exceeded
54001        statement_too_complex
54011        too_many_columns
54023        too_many_arguments

Class 55 - Object Not In Prerequisite State
55000        object_not_in_prerequisite_state
55006        object_in_use
55P02        cant_change_runtime_param
55P03        lock_not_available

Class 57 - Operator Intervention
57000        operator_intervention
57014        query_canceled
57P01        admin_shutdown
57P02        crash_shutdown
57P03        cannot_connect_now
57P04        database_dropped

Class 58 - System Error
58000        system_error
58030        io_error
58P01        undefined_file
58P02        duplicate_file

Class 72 - Snapshot Failure
72000        snapshot_too_old

Class F0 - Configuration File Error
F0000        config_file_error
F0001        lock_file_exists

Class HV - Foreign Data Wrapper Error
HV000        fdw_error
HV005        fdw_column_name_not_found
HV002        fdw_dynamic_parameter_value_needed
HV010        fdw_function_sequence_error
HV021        fdw_inconsistent_descriptor_information
HV024        fdw_invalid_attribute_value
HV007        fdw_invalid_column_name
HV008        fdw_invalid_column_number
HV004        fdw_invalid_data_type
HV006        fdw_invalid_data_type_descriptors
HV091        fdw_invalid_descriptor_field_identifier
HV00B        fdw_invalid_handle
HV00C        fdw_invalid_option_index
HV00D        fdw_invalid_option_name
HV090        fdw_invalid_string_length_or_buffer_length
HV00A        fdw_invalid_string_format
HV009        fdw_invalid_use_of_null_pointer
HV014        fdw_too_many_handles
HV001        fdw_out_of_memory
HV00P        fdw_no_schemas
HV00J        fdw_option_name_not_found
HV00K        fdw_reply_handle
HV00Q        fdw_schema_not_found
HV00R        fdw_table_not_found
HV00L        fdw_unable_to_create_execution
HV00M        fdw_unable_to_create_reply
HV00N        fdw_unable_to_establish_connection

Class P0 - PL/pgSQL Error
P0000        plpgsql_error
P0001        raise_exception
P0002        no_data_found
P0003        too_many_rows
P0004        assert_failure

Class XX - Internal Error
XX000        internal_error
XX001        data_corrupted
XX002        index_corrupted

Terminologies

Database Cluster

  • Server (or Node)
    Refers to hardware (actual or virtual) with AgensGraph installed.

  • Cluster (or ‘Database Cluster’)
    Refers to a storage space (directories, subdirectories, and files) in the file system. A database cluster also holds global object definitions, such as users and privileges, which affect all of its databases. A database cluster contains at least three databases (‘template0’, ‘template1’, ‘postgres’). The role of each database:
    ‘template0’: a template database usable by the CREATE DATABASE command (template0 should never be modified)
    ‘template1’: a template database usable by the CREATE DATABASE command (template1 may be modified by the DBA)
    ‘postgres’: an empty database, primarily for maintenance purposes

  • Instance (or ‘Database Server Instance’ or ‘Database Server’ or ‘Backend’)
    An instance is the group of processes (on a UNIX server) or the service plus shared memory (on a Windows server) that controls and manages exactly one cluster. From an IP perspective, an instance occupies one IP/port combination (e.g. localhost:5432), so different instances can run on different ports of the same server. If a server hosts multiple clusters, it is also possible to run one instance per cluster on the same system.

  • Database
    A database is a storage area in the file system where collections of objects are stored in files. An object consists of data, metadata (table definitions, data types, constraints, views, etc.), and other data such as indexes, all of which are stored either in the default database ‘postgres’ or in a newly created database. The storage area for one database is a single subdirectory tree within the storage area of the database cluster, so a database cluster may contain multiple databases.

  • Schema
    A namespace within a database. A schema consists of named objects (tables, data types, functions, and operators) whose names may duplicate those of objects in other schemas of the same database. Every database contains the default schema ‘public’ and may contain more schemas. All objects of a schema must belong to the same database, and objects in different schemas of the same database may share the same name. Each database also has a special schema, ‘pg_catalog’, which contains all system tables, built-in data types, functions, and operators.

  • Search Path (or ‘Schema Search Path’)
    A list of schema names. If an application uses an unqualified object name (for example, ‘employee_table’ as a table name), the search path is used to look the object up in the listed schemas, in that order. The ‘pg_catalog’ schema is not listed in the search path, but it is always searched first; this ensures that AgensGraph finds system objects.

  • initdb (OS command)
    initdb creates a new cluster (including ‘template0’, ‘template1’, and ‘postgres’).

Consistent Writes

  • Checkpoint
    A checkpoint is a point in time at which the database files are guaranteed to be in a consistent state. At checkpoint time, all outstanding changes are written to the WAL files, all dirty data pages in the shared buffer are flushed to disk, and finally a checkpoint record is written to the WAL. The instance’s checkpoint process is triggered on a regular schedule; you can also force a checkpoint with the CHECKPOINT command in a client program. Because everything must be physically written to disk, executing a checkpoint can take a considerable amount of time.

  • WAL File
    The WAL files contain the changes applied to the data by commands such as INSERT, UPDATE, DELETE, or CREATE TABLE. This is redundant information, as the changes are also written to the data files (for better performance, the data-file writes may happen much later). Depending on the configuration of the instance, more information may be recorded in the WAL files. WAL files live in the pg_wal directory (pg_xlog in versions before 10) in binary format, with a fixed size of 16 MB. A single unit of information within the WAL is called a log record.

  • Logfile
    The instance writes warning and error messages about special situations to a readable text file. This log file can be stored anywhere on the server, except inside the cluster directory.

  • Log Record
    A log record is a single unit of information within the WAL file.

FAQ

What debugging features do you have?

Compile time

First, if you are developing new C code, you should always work in a build configured with the --enable-cassert and --enable-debug options. Enabling debug symbols allows you to trace malfunctioning code with a debugger (e.g. gdb).
When compiling with gcc, the additional flags -ggdb -Og -g3 -fno-omit-frame-pointer are useful because they include much more debugging information. You can pass them to configure as follows:

./configure --enable-cassert --enable-debug CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer"

If you use -O0 instead of -Og, most compiler optimizations, including inlining, are disabled. -Og, however, performs much like a normal optimization level such as -O2 or -Os while providing more debugging information: there are far fewer <value optimized out> variables, and execution order is rearranged less confusingly, at quite reasonable performance. -ggdb -g3 tells gcc to include the maximum amount of debug information in the generated binaries, including things like macro definitions. -fno-omit-frame-pointer is useful when using tracing and profiling tools (e.g. perf), as the frame pointer allows these tools to capture the call stack, not just the function at the top of the stack.

Run time

The postgres server has a -d option that records detailed information (the elog or ereport DEBUGn output). The -d option takes a number specifying the debug level. Be warned that high debug-level values generate large log files. This option is not available when starting the server via ag_ctl, but you can achieve the same effect with something like -o '-c log_min_messages=debug4'.

gdb

If the postmaster is running, start agens in one window, then find the PID of the postgres process serving that agens session using SELECT pg_backend_pid(). Attach the debugger to that PID, either with gdb -p 1234 or with attach 1234 inside a running gdb; the gdblive script can be useful as well. You can then set breakpoints in the debugger and run queries from the agens session.

Set a breakpoint on errfinish to catch the point where errors and log messages are generated. Because this traps every elog and ereport call at all enabled log levels, it may fire very often. If you are only interested in ERROR/FATAL/PANIC, use a conditional gdb breakpoint on errordata[errordata_stack_depth].elevel >= 20, or set source-line breakpoints on the PANIC, FATAL, and ERROR cases in errfinish. Note that not all errors go through errfinish; in particular, authorization checks are reported separately. If your breakpoint does not trigger, run git grep on the error text and see where it is raised.
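
For reference, here is what the backend error-report calls that end up in errfinish look like (a minimal sketch; the function name and the DEBUG message are made up):

    #include "postgres.h"

    /* Both report forms funnel into errfinish(), which is why a
     * breakpoint there traps all elog/ereport output. */
    static void
    example_reports(void)
    {
        elog(DEBUG1, "entering %s", "example_reports");  /* shorthand form */

        ereport(ERROR,
                (errcode(ERRCODE_DIVISION_BY_ZERO),
                 errmsg("division by zero")));
    }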

If you are debugging something that happens during session startup, you can set PGOPTIONS="-W n" and then start agens. Startup is then delayed for n seconds, giving you time to attach the debugger to the process, set the appropriate breakpoints, and then continue the startup sequence.

Alternatively, you can identify the target process for debugging by looking at pg_stat_activity, the logs, pg_locks, pg_stat_replication, and so on.

I broke initdb. How do I debug it?

Sometimes a patch causes initdb to fail. Failures in initdb itself are rare; more often, the failure occurs in one of the postgres backends that initdb starts to perform parts of the setup work.

If one of those backends crashes, simply running initdb under gdb does not help: gdb will not break, because initdb itself does not crash.

What you can do is run initdb under gdb, set a breakpoint on fork, and then continue execution. When the breakpoint triggers, finish the function; gdb will report that a child process has been created. However, that child is not yet what you want, since it is the shell that in turn starts the actual postgres instance.

While initdb is paused, find the newly started postgres instance using ps; pstree -p can be useful for this. Once you have found it, attach a separate gdb session to it with gdb -p $the_postgres_pid. At this point you can safely detach gdb from initdb and debug the failing postgres instance.

I need to change how queries are parsed. Can you briefly describe the parser files?

The parser files are in the src/backend/parser directory.

scan.l defines the lexer, i.e. the algorithm that splits a string (such as an SQL statement) into a stream of tokens. A token is usually a single word that contains no spaces and is separated from the next token by whitespace, but it can also be, for example, a whole string enclosed in single or double quotation marks. The lexer is basically defined as a set of regular expressions that describe the various token types.

gram.y defines the grammar (the syntactic structure) of SQL statements, using the tokens generated by the lexer as basic building blocks. The grammar is defined in BNF notation. BNF is similar to regular expressions but works at the level of tokens rather than characters. Patterns (called rules or productions in BNF) are named and may be recursive; that is, a rule can use itself as a subpattern.

The actual lexer is generated from scan.l with a tool called flex; its manual can be found at http://flex.sourceforge.net/manual/.

The actual parser is generated from gram.y with a tool called bison; a relevant guide can be found at http://www.gnu.org/s/bison/.

If you have not used flex or bison before, you may find them a bit difficult to learn.

How can I efficiently access information in the system catalogs from the backend code?

First you need to find the tuple (row) you are interested in. There are two ways. First, SearchSysCache() and related functions allow you to query the system catalogs using predefined indexes on the catalogs. This is the primary way to access system tables: the first call loads the needed row into the cache, and subsequent requests return the result without accessing the base table at all. A list of the available caches is in src/backend/utils/cache/syscache.c; many column-specific cache lookup wrappers are in src/backend/utils/cache/lsyscache.c.

The rows returned are cache-owned versions of the heap rows, so you must not modify or delete the tuple returned by SearchSysCache(). You must release it with ReleaseSysCache() when you are finished; this tells the cache that it may discard the tuple if necessary. Failing to call ReleaseSysCache() leaves the cache entry locked until the transaction ends. This is tolerable during development, but not in code worth releasing.
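
As a minimal sketch of this pattern (the wrapper name get_type_len is made up; the real helper get_typlen() in lsyscache.c does essentially this):

    #include "postgres.h"

    #include "access/htup_details.h"    /* GETSTRUCT */
    #include "catalog/pg_type.h"        /* Form_pg_type */
    #include "utils/syscache.h"         /* SearchSysCache1, ReleaseSysCache */

    /* Look up a type's length via the TYPEOID syscache. */
    static int16
    get_type_len(Oid typid)
    {
        HeapTuple   tup;
        int16       typlen;

        tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
        if (!HeapTupleIsValid(tup))
            elog(ERROR, "cache lookup failed for type %u", typid);

        /* Copy the field out; the tuple belongs to the cache. */
        typlen = ((Form_pg_type) GETSTRUCT(tup))->typlen;

        ReleaseSysCache(tup);           /* cache may now discard it */
        return typlen;
    }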

If the system cache does not cover your case, you must retrieve the data directly from the heap table, using the buffer cache that is shared by all backends; the backend automatically loads the rows into the buffer cache. To do this, open the table with heap_open(), start a table scan with heap_beginscan(), and call heap_getnext() repeatedly for as long as HeapTupleIsValid() returns true; then call heap_endscan(). Keys can be assigned to the scan; since no index is used, all rows are compared against the keys and only the matching rows are returned.
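
A sketch of such a scan, using the heap API described here as it existed before PostgreSQL 12 (the function name and the choice of pg_class as the target table are illustrative):

    #include "postgres.h"

    #include "access/heapam.h"          /* heap_open, heap_beginscan, ... */
    #include "access/htup_details.h"    /* GETSTRUCT */
    #include "access/skey.h"            /* ScanKeyInit */
    #include "access/stratnum.h"        /* BTEqualStrategyNumber */
    #include "catalog/pg_class.h"       /* RelationRelationId, Form_pg_class */
    #include "utils/fmgroids.h"         /* F_NAMEEQ */
    #include "utils/rel.h"
    #include "utils/snapmgr.h"          /* GetActiveSnapshot */

    /* Sequentially scan pg_class for rows whose relname matches. */
    static void
    scan_pg_class_by_name(const char *relname)
    {
        Relation     rel;
        HeapScanDesc scan;
        HeapTuple    tup;
        ScanKeyData  key;

        /* Key: compare the relname column for equality. */
        ScanKeyInit(&key,
                    Anum_pg_class_relname,
                    BTEqualStrategyNumber, F_NAMEEQ,
                    CStringGetDatum(relname));

        rel = heap_open(RelationRelationId, AccessShareLock);
        scan = heap_beginscan(rel, GetActiveSnapshot(), 1, &key);

        while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
        {
            Form_pg_class classtup = (Form_pg_class) GETSTRUCT(tup);

            elog(NOTICE, "relation has %d attributes", classtup->relnatts);
        }

        heap_endscan(scan);
        heap_close(rel, AccessShareLock);
    }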

You can also fetch rows by block number/offset using heap_fetch(). While scans lock and unlock rows in the buffer cache automatically, with heap_fetch() you must supply a buffer pointer and, when finished, pass it to ReleaseBuffer().

Once you have a row, you can access the HeapTuple structure entries to get data common to all tuples, such as t_self and t_oid. If you need a column of the table, take the HeapTuple pointer and use the GETSTRUCT() macro to access the start of the table-specific data in the tuple, then cast the pointer appropriately (e.g. Form_pg_proc when accessing pg_proc, or Form_pg_type when accessing pg_type). You can then access the tuple’s fields through the struct pointer: ((Form_pg_class) GETSTRUCT(tuple))->relnatts

However, this works only when the column is fixed-width and never null, and when all previous columns are likewise fixed-width and never null. Otherwise the column’s position within the tuple is variable, and you must extract it with heap_getattr() or a related function.
Also, do not modify a live tuple by storing values directly into its struct fields. The best option is to use heap_modifytuple(), passing it the original tuple and the values you want changed; it returns a palloc’ed tuple, which you then pass to heap_update(). You can delete a tuple by passing its t_self to heap_delete(); use t_self for heap_update() as well. Remember, a tuple can be a copy from the system cache, which disappears after you call ReleaseSysCache(); it can be read directly from a disk buffer, which disappears after heap_getnext(), heap_endscan(), or (in the heap_fetch() case) ReleaseBuffer(); or it can be a palloc’ed tuple, which you must pfree() when done.
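
A minimal sketch of the variable-position case (the wrapper name is made up; heap_getattr() computes the attribute's offset and handles NULLs):

    #include "postgres.h"

    #include "access/htup_details.h"    /* heap_getattr */
    #include "utils/rel.h"              /* RelationGetDescr */

    /* Extract a column that is not fixed-width/never-null. 'rel' is the
     * (opened) relation the tuple came from; its tuple descriptor tells
     * heap_getattr() how to locate the attribute. */
    static Datum
    get_column_datum(Relation rel, HeapTuple tuple, AttrNumber attnum,
                     bool *isnull)
    {
        return heap_getattr(tuple, attnum, RelationGetDescr(rel), isnull);
    }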

How can I add a new port?

There are many places that need modification to add a new port. Start with the src/template directory: add an appropriate entry for the OS. Also use src/config.guess to add the OS to src/template/.similar. You need not match the OS version exactly; the configure test looks for an exact OS version number and, if it is not found, falls back to a match without the version number. Edit src/configure.in to add the new OS (see the configuration items above), then run autoconf or patch src/configure.

Then check src/include/port and add a file for the new OS with appropriate values. Hopefully, the spinlock code for your CPU is already present in src/include/storage/s_lock.h. There is also a src/makefiles directory for port-specific Makefile handling, and a backend/port directory in case your OS needs special files.