Backup & Recovery

Like in other database systems, backup is critical in AgensGraph as well. This is because most of the important data for services in general are stored in databases. The backup and restore process is relatively simple, and we will outline the technical and conceptual aspects of the process for a better understanding. Since AgensGraph was developed based on PostgreSQL, the PostgreSQL technical documents were quoted when introducing the features of the RDB side (not GDB side).

This chapter describes file system-based backups and archive-mode backups.

Backup

File system level backup

This is a way of backup at the file system level by directly copying the entire data storage space. It is not much different from the file system backup done by general operating systems. For instance:

tar -cf backup.tar /usr/local/pgsql/data

Backup with a command above is possible. However, as there are two limitations as follows to this approach, this is less useful in practical terms than using pg_dump command.

You must suspend the database server for normal backup operations. Blocking all client connections and copying the file system does not guarantee a safe backup. (This is because neither tar command nor any other similar tools provide a consistent snapshot of the file system state and can accurately reflect information about the internal buffering used by the server). Likewise, when restoring, the server must be in a stopped state.
If you know the locations and intended uses of various files in the database well, you may think that you will be able to copy/use only the files that correspond to individual databases or tables. Even if it is true, normal operation is not possible without commit log files. pg_clog/* files store the commit status of all transactions. Since these log files are processed as database clusters (not as tables), partial copying is impossible; even if you copy physical files and log files of individual tables, they may be related to other tables.

Another thing to consider in the file system-based backups is that if the file system provides a “consistent (and reliable) snapshot” feature, you should be careful when using the feature. A typical backup method using the file system snapshot feature: i) create a “frozen snapshot”; ii) copy all the data cluster files (as mentioned above, there is no point in back up partial files); iii) then unlock the locked snapshot. If the file system provides this feature, backup while the database server is running is possible. At this time, as backup is done while in operation, the database files are saved without properly closing the database server. When you start the database server with backup data, it will recognize the old server as a conflict and replay the WAL log. However, if you run the CHECKPOINT command before starting and keep the WAL logs generated in the course of the backup operation separately, the execution will be done without a big issue. (You may, of course, ignore the WAL logs.)

The problem is that the database server uses multiple file systems. For example, if you create a tablespace that uses different partitions, you should be able to create a “frozen snapshot” of all the file systems used by the database server at the same time. To generate backups in this way, please read through the filesystem documentation carefully and conduct backups after you weigh the benefits against the risks involved.

If it is not possible to create multiple file system snapshots simultaneously, you must stop the database server while creating the snapshot. Another way is to use a rsync application. The work proceeds in two steps. When the server is running, it synchronizes all the data first. Then, it stops the server and synchronizes data once again. This will minimize server downtime because only the changed files will be synchronized while the server is stopped.

File system backups typically require more backup space than SQL dumps. (This is because dump files created by pg_dump do not have indexes, physically, and only contain commands to make the indexes.) For this reason, the restore operation can be done much faster in file system backups.

Archive mode backup

Creating an archive of Write Ahead Logging (WAL)

Internally, the AgensGraph system constantly creates WAL records in manipulating the database. Those records are divided into WAL segment files so they can be stored in physical disk space. A single WAL file is basically a 16MB file (this size is determined when you compile the server). The filename uses the corresponding number in the WAL order. If you set it not to create a WAL archive file, only a few of these files will be created, and you will need to find and “reuse” any log files that you no longer use. Internally, it finds the status information of WAL records, reuses it by changing the checkpoint operation to no-longer-in-use state and recording a new WAL record in its place.

If you create a WAL archive file, you archive existing WAL record information elsewhere (same host, NFS mount point, or even tape) before reusing a certain WAL segment file. The archiving method is specified directly by the administrator. Let the administrator decide even in the case where there is already a file with the same name. However, when reusing the WAL segment file, the AgensGraph server only performs operations according to the “method of archiving” set by the administrator. You may simply use the cp command for the archive, use a more complex custom shell script, or use a backup solution command. This part is entirely up to the administrator.

To create a WAL archive file, specify archive (or hot_standby) with the value of the configuration parameter wal level. Next, set the environment configuration parameter archive mode to on, and the environment configuration parameter archive command to an appropriate system command. All of these settings are configured in the postgresql.conf file. You can use %p (absolute path to the WAL log file) and %f (the name of the log file to be archived) reserved arguments for the shell command to specify as the archive_command value. If you need to type % literally, type %%. In general, this setting is used in the following format.

# Unix
  archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'   
# Windows
  archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  

The above configuration simply copies the WAL segment file to the /mnt/server/archivedir directory (with the same file name). The above is only an example for Unix and Windows operating systems. Actual settings should be changed to match the operating system. %p %f reserved arguments are changed as follows for actual execution of the command.

test ! -f /mnt/server/archivedir/00000001000000A900000065 && 
cp pg_xlog/00000001000000A900000065\ /mnt/server/archivedir/00000001000000A900000065

This will always make the WAL log file fresh at a certain location.

This archive command is executed by the privileges of the system user who ran the AgensGraph server. So, just like processing WAL segment files, you should consider security policies when processing archive files. It should be noted that a fatal security incident can occur if it can be read, overwritten and deleted by anyone.

Another thing to note is that the archive command’s shell execution return value should be 0 (zero) if the command is successful, or a different value if the command is not successful. The server will determine if the log file has been successfully archived or failed with this return value, and if successful, will either erase or reuse the original log file; if unsuccessful, it will retry until it succeeds.

You should not overwrite files that already exist with this archive command. Overwriting may cause unintended errors in the restore operation. (This file may already have been created or used by another database.) It is safest for the administrator to manually process the log file because constant errors may occur if it already exists.

It is recommended to check if the file already exists before you archive a log file. If it exists, set the command to return a non-zero value. On Unix, a test command is provided for this task. On some Unix platforms, the -i option is provided in the cp command to avoid overwriting files that already exist. You must check the return value when using this command. (The GNU cp command returns “0” if there is a log file, which is not desirable.)

When you save a WAL file separately, operational jobs may need to be aborted manually, or saving operation may fail repeatedly because of insufficient storage space. For example, if you want to archive a WAL segment file to a tape storage device and the device does not automatically exchange tapes and there is no free space on the tape; then, the user continues to run the file write operation until an error occurs and may repeat the process. In such a case, the pg_xlog/ directory will not delete or reuse the WAL segment file because the data has not been safely copied to the tape storage, and the new WAL segment files created later will continue to accumulate and eventually stop as there will be no more free space in the pg_xlog/ directory and a PANIC error will occur in the server.

You also have to worry about the data recording speed of the WAL file when it is stored separately. Even if the work is properly done, the same problem may occur if the speed at which the WAL file is created is faster than the speed at which the file is kept separately. If the pg_xlog/ directory has enough free space, it will not be a problem, of course. Before using this feature, you should consider this kind of problem in this section sufficiently; the administrator should monitor whether this function works as intended as well.

The path to the save file should be up to 64 ASCII characters. The file name must use the reserved word %f (i.e. only directories can be changed). The original WAL segment file must use the reserved word %p. Note that changes to the postgresql.conf, pg_hba.conf, and pg_ident.conf files are not reflected in the database to be restored by this backup method because the WAL file contains only transactional operation information. You should back up these configuration files according to your system’s normal file backup policy.

The archive command is executed for files that are all reflected on the server (rollbacked or committed and then checkpointed) among the WAL files. That is, for a database with very small workloads, the interval at which archive commands are executed is very long. If there is a database failure at this interval, data loss occurs. Thus, you must forcefully use the archive command to save this segment file separately after a certain time before all the content of the WAL segment file is processed. If the setting value is too short, the disk space can be wasted (a disadvantage). Typically, the setting in minutes is appropriate. You may also force the user to change the segment file using the pg_switch_xlog function. This function is typically used when bulk data entry, modification, or deletion occurs and immediate backups are required.

If you set the wal_level value to minimal, you will not be able to save this WAL log and restore it based on it. This is because, if you use this setting, you do not keep the restore-related information in the WAL file. Therefore, when you change the wal_level setting, you must restart the server. In the case of archive_command setting, however, reloading the configuration files is sufficient. Stopping this operation during operation may be needed. To do that, specify the archive_command setting as an empty string (‘’). The archive_command setting will remain in the pg_xlog/ directory until you copy the WAL file again.

Recovery

Recovering Using a Continuous Archive Backup

Describes how to recover a database server using backups archived in the archive mode backup method when a problem occurs. The order of operation is as follows:

Stop the server first if it is running.
If the system has enough disk space, copy the entire database cluster directory, any associated custom tablespace files, and any required WAL segment files to a temporary location. To run this on a system that was already running, you would need at least twice as much disk space. If you do not have such free space, at least save the pg_xlog files and server configuration files in the cluster directory of your existing database server. If you save the WAL segment files that have been and have not been backed up, they can be restored to the state just before the server was stopped.
Delete both the data cluster directory for the existing server and the directory for the existing tablespace.
Copy the backup file to its original location. The user at this time must be a system user running the database server. (If you are working as root, you must change it to the owner!) Thus, you need to set the file access permissions and owner to match the system user. When using the next user-defined tablespace, copy the source directory and any files in it that are symbolic links in the pg_tblspc/ directory.
Leave the pg_xlog/ directory empty. If there are any files in this directory, they should be removed. The necessary files are automatically created in this directory during the recovery process. If you have excluded the pg_xlog/ directory from backup, create a directory or a symbolic link to make the directory available to the database server. Again, you need to make sure that the directory permissions and owner are only available to the system user running the database server. If it is a symbolic link, you must make sure that it is the same as the source path used in the previous database so that you can overwrite the previous data.
If you do not have a backup copy of the WAL segment file, copy the pg_xlog/ files you copied in step 2 to the location you want to restore. (copying, not relocating, is recommended. Copying is safe because recovery is not complete yet.)
Create recovery.conf in the database cluster directory. It is also a good idea to temporarily modify pg_hba.conf so that it is not accessible from the outside during the recovery process.
Run the server. The server runs in recovery mode and finds the required WAL files and begins to process the unrecovered transactions in batches. If the server hangs due to external impact during the recovery operation, simply resume the server and continue the recovery operation. At the end of the recovery process, the server changes the name of recovery.conf into recovery.done and waits for the client to connect to it in the normal run state; this is to prevent it from being rerun in recovery mode.
Now connect to the database to see if the data is normal; if not, review the server logs and go back to step 1. If all the data and database status are normal, modify the content of pg_hba.conf and allow external access.

The key to this recovery method is how to apply the backup WAL segment file to the database and until what point the restore should be made. This is specified in recovery.conf. A simple way to create this file is to first copy recovery.conf.sample into the share/ directory of the database distros and modify only the necessary parts. The essential part of recovery.conf is restore_command. This setting is a definition of how to apply the backup WAL segment file in AgensGraph. In short, you can specify a command that is the opposite of archive_command in the server environment settings. The reserved words used here are the same as when specifying the archive_command setting value. The %f value is the file name in the backup archive directory, and the %p value is replaced with a transaction log file. (If relative paths are used, they are processed relative to the current directory in which the server is running.) Reserved word %% is treated as % characters. The following are typical settings.

restore_command = 'cp /mnt/server/archivedir/%f %p'

he above configuration is used to restore the WAL segment files that have been previously backed up in the /mnt/server/archivedir directory. Of course, this command can be much more complicated depending on the backup device you are using. You can also create and use a shell script that includes some commands directly (e.g. mounting a tape and moving it to a desired location like a tape backup device). The important point here is that if the operation specified here fails, the return value of the operation must be non-zero. This operation also requests files that are not in the archive backup directory. The result of this operation must return a non-zero return value (this is not an error). If there is no command in the script, or if the server is shut down normally and a signal other than SIGTERM signal (as part of recovery process) stops the operation, that is considered an error. In such a situation, the recovery operation is stopped, and the server stops operating as well.

The server is able to find a file ending with .backup or .history. If there is no file the server is searching for, it should tell the server that the file does not exist (by returning a non-zero value). The %f filename and the %p filename copied to the server are not always the same. For this reason, you should be careful not to make mistakes while processing this part when you create and use your own shell script.

If you cannot find the WAL segment file in the archive backup directory, look in the pg_xlog/ directory inside the data cluster. If there are more WAL segment files to be processed, the files will be processed even if they are not backed up. However, if the file with the same filename already exists in the archive backup directory, that file will be applied, the file in pg_xlog/ will be ignored, and the new WAL segment name will reflect that log. For this reason, you must clean up the WAL segment files before recovery process.

Typically, the recovery job reflects the last log file in the archive backup directory. As it finds until there is no next file, the last of the recovery log is printed as a “file not found” message. You can also find the file of 00000001.history when you start the recovery process. All of these are logs that occur during a normal recovery process.

The recovery operation can also be stopped at any point by specifying a recovery stop point in recovery.conf. This point is called the “recovery target.” This feature is useful for an inexperienced database administrator to help recover from operational errors, such as accidentally deleting critical tables. The “recovery target” can be specified at a specific time, or can be any string specified by the administrator. It can also be specified with a specific transaction ID. If you cannot use a tool to determine at which transaction ID an accident occurred, simply specify a specific time or a predefined string specified by the administrator for recovery.

The point at which recovery is stopped should be after the base backup time. It should be specified to be later than the time when pg_stop_backup command was executed. You cannot use the base backup to back up the data cluster files to the file system level and restore them to the time they were backed up. If you need to do this, you need the old base backup data and the corresponding WAL backup files.

If there is a problem with the WAL file itself, it only applies to transactions that have been processed normally; both the recovery operation and server stop. In this case, it is necessary to identify the time when the problem occurs and restore it from the beginning to the point of “recovery target.” If the operation is stopped due to external influences during the recovery operation, the cause of the problem must be solved. Then, simply rerun the server and continue the recovery process. The recovery operation is re-executed together with a checkpoint operation in the normal execution environment. Internally, the pg_control file is periodically updated and already-reflected logs are no longer reworked. This means there is no need to worry about duplicate processing.