lt_distributed_restore.py


Description

lt_distributed_restore.py is a utility for restoring a LightDB database from an archive created by lt_distributed_dump.py in one of the non-plain-text formats. It will issue the commands necessary to reconstruct the database to the state it was in at the time it was saved. The archive files also allow lt_distributed_restore.py to be selective about what is restored, or even to reorder the items prior to being restored. The archive files are designed to be portable across architectures.

Obviously, lt_distributed_restore.py cannot restore information that is not present in the archive file. For instance, if the archive was made using the “dump data as INSERT commands” option, lt_distributed_restore.py will not be able to load the data using COPY statements.

Options

lt_distributed_restore.py accepts the following command line arguments.

-f dirname --folder=dirname

Specifies the directory that contains the dump files to restore.

-a --data-only

Restore only the data, not the schema (data definitions). Table data, large objects, and sequence values are restored, if present in the archive.

This option is similar to specifying --section=data.

-c --clean

Clean (drop) database objects before recreating them. (Unless --if-exists is used, this might generate some harmless error messages, if any objects were not present in the destination database.)

-K --recreate-schema

Output commands to drop each schema directly (in CASCADE mode) before outputting the commands that create it. This is faster than the normal mode. (Unless --if-exists is also specified, the restore might generate some harmless error messages if any objects were not present in the destination database.)

-d dbname --dbname=dbname

Connect to database dbname and restore directly into the database. The dbname cannot be a connection string.

-e --exit-on-error

When lt_restore is called by lt_distributed_restore.py, it exits if an error is encountered while sending SQL commands to the database. The default is to continue and to display a count of errors at the end of the restoration.

-F format --format=format

Specify format of the archive. It is not necessary to specify the format, since lt_restore will determine the format automatically. If specified, it can be one of the following:

c custom: The archive is in the custom format of lt_dump.
d directory: The archive is a directory archive.
t tar: The archive is a tar archive.

-I index --index=index

Not supported yet.

-j number-of-jobs --jobs=number-of-jobs

Run the most time-consuming steps of lt_restore — those that load data, create indexes, or create constraints — concurrently, using up to number-of-jobs concurrent sessions. This option can dramatically reduce the time to restore a large database to a server running on a multiprocessor machine. This option is ignored when emitting a script rather than connecting directly to a database server.

Each job is one process or one thread, depending on the operating system, and uses a separate connection to the server.

The optimal value for this option depends on the hardware setup of the server, of the client, and of the network. Factors include the number of CPU cores and the disk setup. A good place to start is the number of CPU cores on the server, but values larger than that can also lead to faster restore times in many cases. Of course, values that are too high will lead to decreased performance because of thrashing.

Only the custom and directory archive formats are supported with this option. The input must be a regular file or directory (not, for example, a pipe or standard input).
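
The advice above on choosing a starting value for --jobs can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and falling back to the local core count is an assumption that holds only when the client and the server run on the same machine. The code builds an argument list from the options documented here and does not execute anything.

```python
import os


def suggest_jobs(server_cores=None):
    """Return a starting value for --jobs.

    server_cores is the number of CPU cores on the database server, if known.
    When it is not given, fall back to the local core count (an assumption
    that only makes sense when client and server share a machine).
    """
    cores = server_cores if server_cores else os.cpu_count()
    return max(1, cores or 1)


def restore_argv(dbname, folder, jobs):
    # Build the argument list only; nothing is run here.
    return ["lt_distributed_restore.py", "-d", dbname, "-f", folder, "-j", str(jobs)]


print(restore_argv("newdb", "dumpdir", suggest_jobs(server_cores=8)))
```

Treat the result as a starting point for measurement, since values above the core count can still be faster and values that are too high cause thrashing.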

-l --list

Not supported yet.

-L list-file --use-list=list-file

Not supported yet.

-n schema --schema=schema

Restore only objects that are in the named schema. Multiple schemas may be specified with multiple -n switches. This can be combined with the -t option to restore just a specific table.

-N schema --exclude-schema=schema

Do not restore objects that are in the named schema. Multiple schemas to be excluded may be specified with multiple -N switches.

When both -n and -N are given for the same schema name, the -N switch wins and the schema is excluded.
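
The precedence rule between -n and -N can be modeled in a few lines. This is a sketch of the documented behavior, not the tool's own code; the function name and the schema names are hypothetical.

```python
def schema_is_restored(schema, include=(), exclude=()):
    """Model of the -n / -N rules described above.

    include: schemas named with -n (empty means no -n switch was given)
    exclude: schemas named with -N
    """
    # -N wins when the same schema is named by both switches.
    if schema in exclude:
        return False
    # With at least one -n, only the named schemas are restored.
    if include:
        return schema in include
    # Without -n, every schema that is not excluded is restored.
    return True


for name in ("sales", "audit", "public"):
    print(name, schema_is_restored(name, include=["sales", "audit"], exclude=["audit"]))
```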

-O --no-owner

Do not output commands to set ownership of objects to match the original database. By default, lt_restore issues ALTER OWNER or SET SESSION AUTHORIZATION statements to set ownership of created schema elements. These statements will fail unless the initial connection to the database is made by a superuser (or the same user that owns all of the objects in the script). With -O, any user name can be used for the initial connection, and this user will own all the created objects.

-P function-name(argtype [, ...]) --function=function-name(argtype [, ...])

Not supported yet.

-s --schema-only

Restore only the schema (data definitions), not data, to the extent that schema entries are present in the archive.

This option is the inverse of --data-only. It is similar to specifying --section=pre-data --section=post-data.

(Do not confuse this with the --schema option, which uses the word “schema” in a different meaning.)

-S username --superuser=username

Specify the superuser user name to use when disabling triggers. This is relevant only if --disable-triggers is used.

-t table --table=table

Restore definition and/or data of only the named table. For this purpose, “table” includes views, materialized views, sequences, and foreign tables. Multiple tables can be selected by writing multiple -t switches. This option can be combined with the -n option to specify table(s) in a particular schema.

Note

When -t is specified, lt_distributed_restore.py makes no attempt to restore any other database objects that the selected table(s) might depend upon. Therefore, there is no guarantee that a specific-table restore into a clean database will succeed.

Note

While lt_distributed_dump.py's -t flag will also dump subsidiary objects (such as indexes) of the selected table(s), lt_distributed_restore.py's -t flag does not include such subsidiary objects.

-T trigger --trigger=trigger

Not supported yet.

-v --verbose

Specifies verbose mode.

-V --version

Print the lt_distributed_restore.py version and exit.

-x --no-privileges --no-acl

Prevent restoration of access privileges (grant/revoke commands).

-1 --single-transaction

Not supported yet.

--disable-triggers

This option is relevant only when performing a data-only restore. It instructs lt_restore to execute commands to temporarily disable triggers on the target tables while the data is restored. Use this if you have referential integrity checks or other triggers on the tables that you do not want to invoke during data restore.

Presently, the commands emitted for --disable-triggers must be done as superuser. So you should also specify a superuser name with -S or, preferably, run lt_distributed_restore.py as a LightDB superuser.

--enable-row-security

This option is relevant only when restoring the contents of a table which has row security. By default, lt_restore will set row_security to off, to ensure that all data is restored into the table. If the user does not have sufficient privileges to bypass row security, then an error is thrown. This parameter instructs lt_restore to set row_security to on instead, allowing the user to attempt to restore the contents of the table with row security enabled. This might still fail if the user does not have the right to insert the rows from the dump into the table.

Note that this option currently also requires the dump be in INSERT format, as COPY FROM does not support row security.

--if-exists

Use conditional commands (i.e., add an IF EXISTS clause) to drop database objects. This option is not valid unless --clean is also specified.
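
A wrapper script could check this dependency before invoking the tool. The following is a minimal sketch under that assumption; the function name is hypothetical and the check mirrors only the rule stated above.

```python
def drop_options(clean=False, if_exists=False):
    """Return the drop-related switches, rejecting --if-exists without --clean."""
    if if_exists and not clean:
        raise ValueError("--if-exists is not valid unless --clean is also specified")
    args = []
    if clean:
        args.append("--clean")
    if if_exists:
        args.append("--if-exists")
    return args


print(drop_options(clean=True, if_exists=True))
```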

--no-comments

Do not output commands to restore comments, even if the archive contains them.

--no-data-for-failed-tables

Not supported yet.

--no-publications

Do not output commands to restore publications, even if the archive contains them.

--no-security-labels

Do not output commands to restore security labels, even if the archive contains them.

--no-subscriptions

Do not output commands to restore subscriptions, even if the archive contains them.

--no-tablespaces

Do not output commands to select tablespaces. With this option, all objects will be created in whichever tablespace is the default during restore.

--section=sectionname

Only restore the named section. The section name can be pre-data, data, or post-data. This option can be specified more than once to select multiple sections. The default is to restore all sections.

The data section contains actual table data as well as large-object definitions. Post-data items consist of definitions of indexes, triggers, rules and constraints other than validated check constraints. Pre-data items consist of all other data definition items.
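
The relationship between --section, --data-only, and --schema-only can be illustrated with a small model. It is a sketch, not the tool's implementation: the names are hypothetical, and the text only says that -a and -s are "similar to" the section lists shown here.

```python
SECTION_ORDER = ("pre-data", "data", "post-data")


def selected_sections(requested=()):
    """Return the sections to restore, in dump order.

    An empty request means the default: all sections.
    """
    unknown = [name for name in requested if name not in SECTION_ORDER]
    if unknown:
        raise ValueError("unknown section name(s): " + ", ".join(unknown))
    if not requested:
        return list(SECTION_ORDER)
    return [name for name in SECTION_ORDER if name in requested]


# Approximate equivalents described in this page.
DATA_ONLY = selected_sections(["data"])                     # similar to -a
SCHEMA_ONLY = selected_sections(["post-data", "pre-data"])  # similar to -s

print(DATA_ONLY, SCHEMA_ONLY, selected_sections())
```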

--strict-names

Not supported yet.

--use-set-session-authorization

Output SQL-standard SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands to determine object ownership. This makes the dump more standards-compatible, but depending on the history of the objects in the dump, might not restore properly.

--table_exists_action

Tells lt_distributed_restore.py what to do if the table it is trying to create already exists. table_exists_action accepts one of four values: skip, append, truncate, or replace. The possible values have the following effects:

skip: leaves the table as is and moves on to the next object.
append: loads rows from the source and leaves existing rows unchanged.
truncate: deletes existing rows and then loads rows from the source.
replace: deletes existing rows and then loads rows from the source.

When you use truncate or replace, ensure that rows in the affected tables are not targets of any referential constraints. When you use append or truncate, ensure that rows from the source are compatible with the existing table before performing any action.
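
The row-level effect of each value can be modeled on plain lists. This is only a sketch of the descriptions above, with a hypothetical function name. It treats truncate and replace identically because the text describes the same row-level result for both; any difference in how the table definition itself is handled is not modeled.

```python
def rows_after_restore(action, existing_rows, source_rows):
    """Return the rows a pre-existing table would hold after the restore."""
    if action == "skip":
        # The table is left as is.
        return list(existing_rows)
    if action == "append":
        # Existing rows stay; source rows are added.
        return list(existing_rows) + list(source_rows)
    if action in ("truncate", "replace"):
        # Existing rows are deleted, then source rows are loaded.
        return list(source_rows)
    raise ValueError("expected skip, append, truncate or replace, got %r" % action)


existing = [(1, "old")]
source = [(2, "new")]
for action in ("skip", "append", "truncate", "replace"):
    print(action, rows_after_restore(action, existing, source))
```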

--parallel-num=number-of-lt_restore

Run number-of-lt_restore lt_restore processes in parallel.

--help

Show help about lt_distributed_restore.py command line arguments, and exit.

lt_distributed_restore.py also accepts the following command line arguments for connection parameters:

-h host --host=host: Specifies the host name of the machine on which the server is running. If the value begins with a slash, it is used as the directory for the Unix domain socket. The default is taken from the LTHOST environment variable, if set, else 'localhost'.
-p port --port=port: Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections. Defaults to the LTPORT environment variable, if set, or '5432'.
-U username --username=username: User name to connect as. Defaults to the LTUSER environment variable, if set, or current user.
-w --no-password: Never issue a password prompt. If the server requires password authentication and a password is not available by other means such as a .pgpass file, the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.
-W --password: Force lt_distributed_restore.py to prompt for a password before connecting to a database.
--role=rolename: Specifies a role name to be used to perform the restore. This option causes lt_restore to issue a SET ROLE rolename command after connecting to the database. It is useful when the authenticated user (specified by -U) lacks privileges needed by lt_restore, but can switch to a role with the required rights. Some installations have a policy against logging in directly as a superuser, and use of this option allows restores to be performed without violating the policy.
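
The default resolution for host, port, and user described above can be sketched as follows. The function name and the socket directory in the example are hypothetical, and the code illustrates the documented fallbacks rather than reproducing the tool's own logic.

```python
import getpass
import os


def connection_defaults(environ=None):
    """Resolve host, port and user the way the option list above describes."""
    env = os.environ if environ is None else environ
    host = env.get("LTHOST", "localhost")
    port = env.get("LTPORT", "5432")
    # Only ask the operating system for the current user when LTUSER is unset.
    user = env["LTUSER"] if "LTUSER" in env else getpass.getuser()
    return {
        "host": host,
        "port": port,
        "user": user,
        # A host value that begins with a slash names a Unix-domain socket directory.
        "unix_socket": host.startswith("/"),
    }


# Hypothetical environment, used only for illustration.
print(connection_defaults({"LTHOST": "/var/run/lightdb", "LTUSER": "alice"}))
```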

The following command-line options control the logging parameters.

-l log-directory --log-directory=log-directory: Specifies the log directory path. The default is '/tmp/ltAdminLogs'.
--log-level-console=log-level-console: Specifies the console log level.
--log-level-file=log-level-file: Specifies the file log level.
--log-filename=log-filename: Specifies the log file name. Default is 'lt_distributed_restore-%Y-%m-%d.log'.
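
The default log file name contains date placeholders. Assuming these are expanded with strftime-style rules (the page does not say so explicitly), the resulting path for a given day can be computed as in this sketch; the function name is hypothetical.

```python
import posixpath
from datetime import date


def log_path(log_directory="/tmp/ltAdminLogs",
             pattern="lt_distributed_restore-%Y-%m-%d.log",
             day=None):
    """Return the log file path for a given day (today when day is None)."""
    day = day or date.today()
    # Assumption: %Y, %m and %d are expanded as in strftime.
    return posixpath.join(log_directory, day.strftime(pattern))


print(log_path(day=date(2024, 1, 5)))
```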

Environment

LTHOST LTOPTIONS LTPORT LTUSER: Default connection parameters

Diagnostics

If you have problems running lt_distributed_restore.py, make sure you are able to select information from the database using, for example, ltsql. Also, any default connection settings and environment variables used by the libpq front-end library will apply.

Notes

If your installation has any local additions to the template1 database, be careful to dump with '--lt-exclude-lightdb-objects'.

The limitations of lt_distributed_restore.py are detailed below.

When restoring data to a pre-existing table and the option --disable-triggers is used, lt_restore called by lt_distributed_restore.py emits commands to disable triggers on user tables before inserting the data, then emits commands to re-enable them after the data has been inserted. If the restore is stopped in the middle, the system catalogs might be left in the wrong state.
lt_distributed_restore.py cannot restore large objects selectively; for instance, only those for a specific table. If an archive contains large objects, then all large objects will be restored, or none of them if they are excluded via -t or other options.

See also the lt_distributed_dump.py documentation for details on limitations of lt_distributed_restore.py.

Once restored, it is wise to run ANALYZE on each restored table so the optimizer has useful statistics; see Section 22.1.3 and Section 22.1.6 for more information.
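
A short script can generate the ANALYZE statements for the restored tables. This is a sketch with hypothetical table names; it only builds SQL text, which could then be run with a client such as ltsql.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def analyze_statements(tables):
    """Build one ANALYZE statement per (schema, table) pair."""
    return [
        "ANALYZE {}.{};".format(quote_ident(schema), quote_ident(table))
        for schema, table in tables
    ]


# Hypothetical list of restored tables.
for statement in analyze_statements([("public", "orders"), ("sales", "Line Items")]):
    print(statement)
```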

Examples

Assume we have dumped a database called mydb into a custom-format dump file:

$ lt_dump -Fc -d mydb -f dumpdir --lt-exclude-lightdb-objects

Note that the dump must be taken with '--lt-exclude-lightdb-objects' to exclude extensions.

To restore the dump into a new distributed database called newdb:

$ lt_restore -d newdb -f dumpdir