R Data Science Library Package
R packages are modules that contain R functions and data sets. LightDB-A Database provides a collection of data science-related R libraries that can be used with the LightDB-A Database PL/R language. You can download these libraries in .gppkg format from VMware Tanzu Network.
This chapter contains the following information:
- R Data Science Libraries
- Installing the R Data Science Library Package
- Uninstalling the R Data Science Library Package
For information about the LightDB-A Database PL/R Language, see LightDB-A PL/R Language Extension.
R Data Science Libraries
Libraries provided in the R Data Science package include:
abind adabag arm assertthat backports BH bitops car caret caTools cli clipr coda colorspace compHclust crayon curl data.table DBI Deriv dichromat digest doParallel dplyr e1071 ellipsis fansi fastICA fBasics fGarch flashClust foreach forecast foreign fracdiff gdata generics ggplot2 glmnet glue gower gplots
gss gtable gtools hms hybridHclust igraph ipred iterators labeling lattice lava lazyeval lme4 lmtest lubridate magrittr MASS Matrix MatrixModels mcmc MCMCpack minqa ModelMetrics MTS munsell mvtnorm neuralnet nloptr nnet numDeriv pbkrtest pillar pkgconfig plogr plyr prodlim purrr quadprog quantmod quantreg R2jags
R2WinBUGS R6 randomForest RColorBrewer Rcpp RcppArmadillo RcppEigen readr recipes reshape2 rjags rlang RobustRankAggreg ROCR rpart RPostgreSQL sandwich scales SparseM SQUAREM stabledist stringi stringr survival tibble tidyr tidyselect timeDate timeSeries tseries TTR urca utf8 vctrs viridisLite withr xts zeallot zoo
Installing the R Data Science Library Package
Before you install the R Data Science Library package, make sure that your LightDB-A Database is running, that you have sourced lightdb_a_path.sh, and that the $MASTER_DATA_DIRECTORY and $GPHOME environment variables are set.
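The environment part of this prerequisite can be checked before you start. The following is a minimal sketch, not a LightDB-A utility: check_lightdb_env is a hypothetical helper name, and it only tests whether the two variables are non-empty. It does not confirm that the database is running.

```shell
# Hypothetical helper (not part of LightDB-A): reports required variables
# that are unset or empty and returns non-zero if any are missing.
check_lightdb_env() {
    env_missing=0
    if [ -z "${MASTER_DATA_DIRECTORY:-}" ]; then
        echo "missing: MASTER_DATA_DIRECTORY"
        env_missing=1
    fi
    if [ -z "${GPHOME:-}" ]; then
        echo "missing: GPHOME"
        env_missing=1
    fi
    return "$env_missing"
}

# Usage: check_lightdb_env || echo "source lightdb_a_path.sh first"
```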
1. Locate the R Data Science library package that you built or downloaded.
   The file name format of the package is DataScienceR-<version>-relhel<N>_x86_64.gppkg.
2. Copy the package to the LightDB-A Database coordinator host.
3. Follow the instructions in Verifying the LightDB-A Database Software Download to verify the integrity of the LightDB-A Procedural Languages R Data Science Package software.
4. Use the gppkg command to install the package. For example:
   $ gppkg -i DataScienceR-<version>-relhel<N>_x86_64.gppkg
   gppkg installs the R Data Science libraries on all nodes in your LightDB-A Database cluster. The command also sets the R_LIBS_USER environment variable and updates the PATH and LD_LIBRARY_PATH environment variables in your lightdb_a_path.sh file.
5. Restart LightDB-A Database. You must re-source lightdb_a_path.sh before restarting your LightDB-A cluster:
   $ source /usr/local/greenplum-db/lightdb_a_path.sh
   $ gpstop -r
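The install, re-source, and restart commands above can be collected into one wrapper. This is a sketch under stated assumptions: install_datascience_r and the DRY_RUN switch are hypothetical names introduced here, and the default path to lightdb_a_path.sh is the one used in the example above, which may differ on your system. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Sketch only: install_datascience_r is a hypothetical wrapper, not a
# LightDB-A utility. With DRY_RUN=1 it prints each command instead of
# running it.
install_datascience_r() {
    pkg=$1
    # Default path taken from the example above; pass your own as $2.
    path_script=${2:-/usr/local/greenplum-db/lightdb_a_path.sh}

    # Refuse files that do not look like the R Data Science package.
    case "${pkg##*/}" in
        DataScienceR-*.gppkg) ;;
        *) echo "unexpected package name: $pkg" >&2; return 1 ;;
    esac

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "gppkg -i $pkg"
        echo ". $path_script"
        echo "gpstop -r"
        return 0
    fi

    gppkg -i "$pkg" || return 1    # install on all nodes
    . "$path_script" || return 1   # pick up R_LIBS_USER, PATH, LD_LIBRARY_PATH
    gpstop -r                      # restart the cluster
}
```

Running it first with DRY_RUN=1 lets you confirm the package path and the path script before anything changes on the cluster.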
The LightDB-A Database R Data Science Modules are installed in the following directory:
$GPHOME/ext/DataScienceR/library
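To confirm that a particular library landed in this directory, you can look for its package folder. The sketch below assumes the usual layout of an R library tree, in which each installed package is a subdirectory containing a DESCRIPTION file. has_r_library is a hypothetical helper name, and the check runs only on the host where you call it.

```shell
# Hypothetical helper: succeeds if an R package directory with a DESCRIPTION
# file exists under the library directory (default taken from the text above).
has_r_library() {
    r_libdir=${2:-$GPHOME/ext/DataScienceR/library}
    [ -f "$r_libdir/$1/DESCRIPTION" ]
}

# Usage: has_r_library data.table && echo "data.table is installed"
```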
Note: rjags libraries are installed in the $GPHOME/ext/DataScienceR/extlib/lib directory. If you want to use rjags and your $GPHOME is not /usr/local/greenplum-db, you must perform additional configuration steps to create a symbolic link at /usr/local/greenplum-db that points to $GPHOME on each node in your LightDB-A Database cluster. For example:
$ gpssh -f all_hosts -e 'ln -s $GPHOME /usr/local/greenplum-db'
$ gpssh -f all_hosts -e 'chown -h gpadmin /usr/local/greenplum-db'
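After creating the link, you may want to verify on a host that /usr/local/greenplum-db and $GPHOME resolve to the same directory. The following sketch uses a hypothetical helper, same_dir, which compares the physical paths reported by pwd -P. It is an illustration, not part of the product tooling.

```shell
# Hypothetical helper: succeeds if both paths resolve to the same physical
# directory (symbolic links are followed by cd, then reported by pwd -P).
same_dir() {
    dir_a=$(cd "$1" >/dev/null 2>&1 && pwd -P) || return 1
    dir_b=$(cd "$2" >/dev/null 2>&1 && pwd -P) || return 1
    [ "$dir_a" = "$dir_b" ]
}

# Usage: same_dir /usr/local/greenplum-db "$GPHOME" || echo "link missing or wrong"
```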
Uninstalling the R Data Science Library Package
Use the gppkg utility to uninstall the R Data Science Library package. You must include the version number in the package name you provide to gppkg.
To determine your R Data Science Library package version number and remove this package:
$ gppkg -q --all | grep DataScienceR
DataScienceR-<version>
$ gppkg -r DataScienceR-<version>
The command removes the R Data Science libraries from your LightDB-A Database cluster. It also removes the R_LIBS_USER environment variable and updates the PATH and LD_LIBRARY_PATH environment variables in your lightdb_a_path.sh file to their pre-installation values.
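The query-and-remove steps above can be scripted so that the version does not have to be copied by hand. This sketch assumes the listing prints one package name per line, as in the sample output above. datascience_r_package is a hypothetical helper name.

```shell
# Hypothetical helper: reads a package listing on stdin and prints the first
# entry that starts with "DataScienceR-". Assumes one package name per line,
# as in the sample output above.
datascience_r_package() {
    awk '/^DataScienceR-/ { print $1; exit }'
}

# Usage:
#   pkg=$(gppkg -q --all | datascience_r_package)
#   [ -n "$pkg" ] && gppkg -r "$pkg"
```

The emptiness check in the usage lines guards against calling gppkg -r with no package name when the package is not installed.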
Re-source lightdb_a_path.sh and restart LightDB-A Database after you remove the R Data Science Library package:
$ . /usr/local/greenplum-db/lightdb_a_path.sh
$ gpstop -r
Note: When you uninstall the R Data Science Library package from your LightDB-A Database cluster, any UDFs you have created that use R libraries installed with this package will return an error.
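Before uninstalling, it can help to list the PL/R functions in a database so you know which UDFs to review. The sketch below prints a catalog query over pg_proc, pg_language, and pg_namespace. It assumes PL/R is registered under the language name plr. plr_udf_query is a hypothetical helper, and mydb is a placeholder database name. The query lists every PL/R function and does not detect which R libraries a function loads.

```shell
# Hypothetical helper: prints a catalog query that lists functions written in
# the "plr" language. It does not inspect which R libraries each one uses.
plr_udf_query() {
    cat <<'SQL'
SELECT n.nspname AS schema_name, p.proname AS function_name
FROM pg_proc p
JOIN pg_language l ON l.oid = p.prolang
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE l.lanname = 'plr'
ORDER BY 1, 2;
SQL
}

# Usage (mydb is a placeholder database name):
#   psql -d mydb -c "$(plr_udf_query)"
```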