Migrating PXF from LightDB-A 5.x to 6.x
Note: PXF also supports gpupgrade-based LightDB-A upgrade in addition to the manual migration described in this topic.
If you are using PXF in your LightDB-A Database 5.x installation, upgrade LightDB-A Database to version 5.21.2 or newer before you migrate PXF to LightDB-A Database 6.10.1 or newer.
The PXF LightDB-A Database 5.x to 6.x migration procedure has two parts. You perform one PXF procedure in your LightDB-A Database 5.x installation, then install, configure, and migrate data to LightDB-A 6.x:
- Step 1: Complete PXF LightDB-A Database 5.x Pre-Migration Actions
- Step 2: Migrating PXF to LightDB-A 6.x
Prerequisites
Before migrating PXF from LightDB-A 5.x to LightDB-A 6.x, ensure that you can:
- Identify your OS version. LightDB-A Database 6.x does not support running PXF on CentOS 6.x or RHEL 6.x due to a limitation with the version of cURL. See Known Issues and Limitations in the LightDB-A Database release notes for more details.
- Identify the version number of your LightDB-A 5.x installation. If it is older than 5.21.2, upgrade to 5.21.2 or a newer 5.x version.
- Identify the version of PXF running in the LightDB-A 5.x cluster.
- Identify the location of your PXF installation:
  - If you installed PXF from a separate rpm or deb, the PXF install location is /usr/local/pxf-gp<greenplum-major-version>.
  - If you are running the PXF bundled in the LightDB-A Database 5.x Server installation, the PXF install location is $GPHOME/pxf.
- Determine if you have gphdfs external tables defined in your LightDB-A 5.x installation.
- Identify the file system location of the PXF $PXF_CONF directory in your LightDB-A 5.x installation. (If you are unsure of the location, you can find the value in pxf-env-default.sh.) In LightDB-A 5.15 and later, $PXF_CONF identifies the user configuration directory that was provided to the pxf cluster init command. This directory contains PXF server configurations, security keytab files, and log files.
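The install-location and $PXF_CONF lookups above can be sketched as two small shell helpers. This is a minimal sketch under stated assumptions: first_existing_dir and show_pxf_conf_setting are hypothetical helper names (not PXF commands), /usr/local/pxf-gp5 assumes the major-version placeholder resolves to 5, and the path to pxf-env-default.sh is supplied by you because this topic does not state where the file lives.

```shell
#!/bin/sh
# Hypothetical helpers for the prerequisite checks; not part of PXF.

# Print the first argument that is an existing directory; fail if none is.
first_existing_dir() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Print every line of the given file that mentions PXF_CONF.
show_pxf_conf_setting() {
  grep 'PXF_CONF' "$1"
}

# Example (run on the 5.x coordinator; both paths are assumptions):
#   first_existing_dir /usr/local/pxf-gp5 "$GPHOME/pxf"
#   show_pxf_conf_setting /path/to/pxf-env-default.sh
```

The candidate directories are passed as arguments so that the same helper works whether PXF came from a separate package or from the bundled server installation.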
Step 1: Complete PXF LightDB-A Database 5.x Pre-Migration Actions
Perform this procedure in your LightDB-A Database 5.x installation:
Log in to the LightDB-A Database coordinator node. For example:
$ ssh gpadmin@<gp5coordinator>
Identify and note the LightDB-A Database version number of your 5.x installation. For example:
gpadmin@gp5coordinator$ psql -d postgres
SELECT version();
                                  version
----------------------------------------------------------------------------
 PostgreSQL 8.3.23 (LightDB-A Database 5.21.2 build commit:610b6d777436fe4a281a371cae85ac40f01f4f5e) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Aug 7 2019 20:38:47
(1 row)
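To check the 5.21.2 minimum without reading the banner by eye, the version string can be parsed and compared in the shell. The sketch below assumes the banner format shown in the example output and a sort that supports -V (GNU coreutils); extract_db_version and version_ge are hypothetical helper names, not PXF or LightDB-A commands.

```shell
#!/bin/sh
# Hypothetical helpers; assume the version() banner format shown in this topic.

# Print the LightDB-A version number (for example 5.21.2) found in a banner.
extract_db_version() {
  printf '%s\n' "$1" |
    sed -n 's/.*LightDB-A Database \([0-9][0-9.]*\) build.*/\1/p'
}

# Succeed when version $1 is greater than or equal to version $2.
# Relies on version sort (sort -V), which is an assumption about your platform.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (run on the 5.x coordinator):
#   banner=$(psql -d postgres -At -c 'SELECT version();')
#   version_ge "$(extract_db_version "$banner")" 5.21.2 || echo "upgrade 5.x first"
```

Comparing with a version-aware sort avoids the string-comparison trap in which 5.9.0 would appear newer than 5.21.2.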
Identify and note the version number of PXF running in your LightDB-A 5.x installation. For example:
gpadmin@gp5coordinator$ pxf version
LightDB-A 6 removes the gphdfs external table protocol. If you have gphdfs external tables defined in your LightDB-A 5.x installation, you must delete or migrate them to pxf as described in Migrating gphdfs External Tables to PXF.
Stop PXF on each segment host as described in Stopping PXF.
If you plan to install LightDB-A Database 6.x on a new set of hosts, be sure to save a copy of the $PXF_CONF directory in your LightDB-A 5.x installation.
Install and configure LightDB-A Database 6.x, migrate LightDB-A 5.x table definitions and data to your LightDB-A 6.x installation, and then continue your PXF migration with Step 2: Migrating PXF to LightDB-A 6.x.
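One way to save the copy of the $PXF_CONF directory mentioned in this procedure is a compressed archive. The following is a minimal sketch, not a prescribed method: backup_pxf_conf is a hypothetical helper name, and the archive location is whatever path you choose.

```shell
#!/bin/sh
# Hypothetical helper; not part of PXF.

# Archive the directory $1 into the gzip-compressed tar file $2.
# The archive stores the directory by its base name so that it can be
# unpacked under any parent directory on the new hosts.
backup_pxf_conf() {
  conf_dir=$1
  archive=$2
  if [ ! -d "$conf_dir" ]; then
    echo "not a directory: $conf_dir" >&2
    return 1
  fi
  tar -czf "$archive" -C "$(dirname "$conf_dir")" "$(basename "$conf_dir")"
}

# Example (run on the 5.x coordinator; the archive path is an assumption):
#   backup_pxf_conf "$PXF_CONF" /tmp/pxf_conf_backup.tar.gz
```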
Step 2: Migrating PXF to LightDB-A 6.x
After you upgrade to LightDB-A Database 6.x, and table definitions and data from the LightDB-A 5.x installation have been migrated or upgraded, perform the following procedure to configure and start the new PXF software:
Log in to the LightDB-A Database 6.x coordinator node. For example:
$ ssh gpadmin@<gp6coordinator>
Identify and note the version number of your LightDB-A Database 6.x installation.
PXF releases different software packages for LightDB-A Database version 5 and version 6.
If you are running version 6.x of the independent PXF distribution that you installed via an rpm or deb in your LightDB-A 5.x installation:
- Install the same PXF 6.x version for LightDB-A 6 on your LightDB-A 6.x hosts as described in Installing PXF.
- Copy the PXF extension control file from the PXF installation directory to the new LightDB-A 6.x install directory:
gpadmin@gp6coordinator$ pxf cluster register
- Start PXF on each host:
gpadmin@gp6coordinator$ pxf cluster start
- Skip the following steps and exit this procedure.
If you are running PXF version 5.x, you must install the latest independent PXF 5.16.x distribution for LightDB-A 6 on your LightDB-A 6.x hosts as described in Installing PXF.
Note: If you are interested in running PXF version 6.x, upgrade to that version after you complete this entire procedure and verify that PXF is working in your LightDB-A 6.x installation.
Identify and note the PXF (to) version number. For example:
gpadmin@gp6coordinator$ pxf version
If you installed LightDB-A Database 6.x on a new set of hosts, copy the $PXF_CONF directory from your LightDB-A 5.x installation to the coordinator node. Consider copying the directory to the same file system location at which it resided in the 5.x cluster. For example, if PXF_CONF=/usr/local/greenplum-pxf:
gpadmin@gp6coordinator$ scp -r gpadmin@<gp5coordinator>:/usr/local/greenplum-pxf /usr/local/
Initialize PXF on each segment host as described in Initializing PXF, specifying the PXF_CONF directory that you copied in the step above.
If you are migrating PXF from LightDB-A Database version 5.23 or earlier and you have configured any JDBC servers that access Kerberos-secured Hive, you must now set the hadoop.security.authentication property in the jdbc-site.xml file to explicitly identify use of the Kerberos authentication method. Perform the following for each of these server configs:
- Navigate to the server configuration directory.
- Open the jdbc-site.xml file in the editor of your choice and uncomment or add the following property block to the file:
<property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
</property>
- Save the file and exit the editor.
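To find the JDBC server configurations that still need this edit, a short shell loop can list every jdbc-site.xml that never mentions the property. This is a sketch under assumptions: list_jdbc_sites_missing_auth is a hypothetical helper name, and the layout <PXF_CONF>/servers/<server_name>/jdbc-site.xml is assumed rather than stated in this topic.

```shell
#!/bin/sh
# Hypothetical helper; assumes server configs live in <pxf_conf>/servers/<name>/.

# Print each jdbc-site.xml under $1/servers that does not contain the
# string hadoop.security.authentication anywhere in the file.
list_jdbc_sites_missing_auth() {
  for file in "$1"/servers/*/jdbc-site.xml; do
    # Skip the unexpanded pattern when no server has a jdbc-site.xml.
    [ -f "$file" ] || continue
    grep -qF 'hadoop.security.authentication' "$file" || printf '%s\n' "$file"
  done
}

# Example (run on the 6.x coordinator after copying the directory):
#   list_jdbc_sites_missing_auth "$PXF_CONF"
```

A file in which the property block is present but commented out still contains the string, so the helper does not report it. Treat the output as a list of files that certainly need the block added, and review the remaining Hive server configurations by hand.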
If you are migrating PXF from LightDB-A Database version 5.26 or earlier: The PXF Hive and HiveRC profiles now support column projection using column name-based mapping. If you have any existing PXF external tables that specify one of these profiles, and the external table relied on column index-based mapping, you may be required to drop and recreate the tables:
- Identify all PXF external tables that you created that specify a Hive or HiveRC profile.
- For each external table that you identified in the previous step, examine the definitions of both the PXF external table and the referenced Hive table. If the column names of the PXF external table do not match the column names of the Hive table:
  - Drop the existing PXF external table. For example:
DROP EXTERNAL TABLE pxf_hive_table1;
  - Recreate the PXF external table using the Hive column names. For example:
CREATE EXTERNAL TABLE pxf_hive_table1( hivecolname int, hivecolname2 text )
  LOCATION( 'pxf://default.hive_table_name?PROFILE=Hive')
FORMAT 'custom' (FORMATTER='pxfwritable_import');
  - Review any SQL scripts that you may have created that reference the PXF external table, and update column names if required.
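When you have the LOCATION URIs of your external tables in hand (for example, from saved DDL scripts), a small shell test can pick out the ones that name the Hive or HiveRC profile. This is only a sketch: uses_hive_profile is a hypothetical helper name, the URI shape is taken from the CREATE EXTERNAL TABLE example in this topic, and case-insensitive matching is used as a precaution rather than as a statement about how PXF parses profile names.

```shell
#!/bin/sh
# Hypothetical helper; not part of PXF.

# Succeed when the LOCATION URI in $1 names exactly the Hive or HiveRC
# profile. Other profile names that merely start with "Hive" do not match,
# because the name must be followed by "&" or the end of the URI.
uses_hive_profile() {
  printf '%s\n' "$1" | grep -Eiq 'PROFILE=(Hive|HiveRC)(&|$)'
}

# Example:
#   uses_hive_profile 'pxf://default.hive_table_name?PROFILE=Hive' && echo "review this table"
```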
Synchronize the PXF configuration from the LightDB-A Database 6.x coordinator host to the standby coordinator and each segment host in the cluster. For example:
gpadmin@gp6coordinator$ pxf cluster sync
Start PXF on each LightDB-A Database 6.x segment host:
gpadmin@gp6coordinator$ pxf cluster start
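The sync and start commands can be chained so that PXF is started only when the synchronization succeeded. The wrapper below is a minimal sketch: sync_and_start is a hypothetical function name, and the PXF variable is an assumption introduced here so that the sequence can be dry-run with a stand-in command.

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands shown in this procedure.

# Command to invoke; defaults to the pxf CLI on the PATH.
PXF=${PXF:-pxf}

# Run "pxf cluster sync", then "pxf cluster start"; stop at the first failure.
sync_and_start() {
  "$PXF" cluster sync || { echo "pxf cluster sync failed" >&2; return 1; }
  "$PXF" cluster start || { echo "pxf cluster start failed" >&2; return 1; }
}

# Example (run on the 6.x coordinator as gpadmin):
#   sync_and_start
# Dry run that only prints the subcommands:
#   PXF=echo; sync_and_start
```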
Your LightDB-A Database cluster is now running version 5.16.x of PXF, and running it from the PXF installation directory (/usr/local/pxf-gp<greenplum-major-version>). Should you wish to upgrade PXF in the future, consult the PXF upgrade documentation.
Verify the migration by testing that each PXF external table can access the referenced data store.