Many sites have limited network access between nodes running DataStax Enterprise and the internet, and these connectivity constraints can make downloading software directly from the internet slow or impossible. Lifecycle Manager (LCM) supports installation of "offline" nodes through a variety of mechanisms.
LCM offers many options for offline installation so that it can be operated effectively in a wide variety of environments. With so many options to choose from, settling on a comprehensive strategy can be daunting.
The simplest option is often to use a proxy, if one is available or can be created. A proxy allows LCM to complete all necessary downloads by itself; without one, it is often necessary to combine multiple offline techniques to obtain all the required files. With a little extra configuration, a proxy can also cache most downloads, which makes it a good choice for accelerating installs at sites with slow connections to the internet. The sample Squid config at the bottom of this article can help you get started.
However, there often isn't a choice. The site-specific tools and policies that necessitate offline installs in the first place will constrain the viable strategies. If that's the case, read on and craft a customized offline strategy that works for your site.
Different Mechanisms for Different Software Sources
When LCM deploys DataStax Enterprise on a target node, it generally must download software from several sources, each with different constraints on how it can be adapted for offline environments.
DataStax software in .deb formatted packages must be downloaded from debian.datastax.com, and software in .rpm formatted packages from rpm.datastax.com. In offline environments, DSE packages can be downloaded either via a proxy or a local repository mirror.
If target nodes are able to perform http/https downloads via a proxy like Squid, then LCM can configure the targets to use the proxy when downloading DSE packages. This can be configured on the Package Proxy page of your configuration profile(s). See the Configuring a proxy for package downloads documentation for details.
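Before running an install job, it can be useful to confirm from a target node that the repository is reachable through the proxy. The following is a minimal Python sketch, not part of LCM: the function names are illustrative, and the proxy address used in the usage note is a hypothetical placeholder for your own.

```python
import urllib.error
import urllib.request


def opener_via_proxy(proxy_url):
    """Return a urllib opener that routes http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(opener, url, timeout=10):
    """Send a HEAD request through the opener and return the HTTP status code.

    HTTP error responses (for example 401 from a password-protected
    repository) are returned as status codes rather than raised, because
    they still show that the request travelled through the proxy.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

For example, `fetch_status(opener_via_proxy("http://proxy.example.internal:3128"), "https://rpm.datastax.com/")` returns a status code if the proxy path works and raises `urllib.error.URLError` if the proxy cannot be reached.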
Note that DSE packages are password protected. Most proxies allow downloads from password-protected URLs but will not cache them, so LCM installs can still overwhelm sites with slow internet access even when a proxy is in use.
Some proxies can be explicitly configured to cache certain password-protected URLs. Configuring a proxy this way violates RFCs and, if access to the proxy is misconfigured, could result in accidentally redistributing DataStax software publicly without authorization. Cache authenticated downloads only when the proxy is private. An example Squid 3.x config that demonstrates caching of authenticated downloads is provided later in this article.
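One way to tell whether the proxy is actually serving a package from its cache is to inspect the response headers of a repeated download. The sketch below assumes the proxy adds an `X-Cache` header of the form `HIT from <host>` or `MISS from <host>`, as Squid 3.x typically does; other proxies, or a Squid configured differently, may not emit it. The function name is illustrative.

```python
def cache_result(headers):
    """Return 'HIT' or 'MISS' from a Squid-style X-Cache header, or None if absent.

    `headers` is any mapping of header names to values; names are
    compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "x-cache" and value.strip():
            token = value.split()[0].upper()
            if token in ("HIT", "MISS"):
                return token
    return None
```

Downloading the same package twice and seeing `MISS` followed by `HIT` suggests that caching of authenticated downloads is working.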
The other approach for allowing offline access to DSE packages is to download from a local mirror of the DataStax package repository rather than the official repo hosted at datastax.com. This can be configured in LCM by editing your repository and choosing a private repo.
In general, customers who want to do this have someone available who knows how to create the mirror in the first place, but we are aiming to document the process of mirroring the DSE package repositories in ENGOPS-528.
Operating System Packages
DataStax packages have a small number of dependencies on packages provided by your operating system vendor. In general, customers with offline DSE nodes have already installed their operating system and have configured it to be able to install and update packages provided by the operating system vendor.
If not, offline access to operating system packages can be provided by:
- Proxy: This can be configured in LCM as described above for DSE packages.
- Repository Mirror: Operating system vendors provide mechanisms to mirror their package repositories locally and this can be configured outside of LCM.
- Pre-Install: If some manual method is available to install operating system packages outside of LCM, they can be pre-installed. LCM will confirm that they are present and will not attempt to download them again.
Package Signing Key
DSE packages in .rpm format are signed with a DataStax-owned key. Target nodes need to use this key to ensure that the DSE packages haven't been tampered with while transiting the network. The following mechanisms are available to make this key available to offline nodes:
- Proxy: The file can be downloaded through a proxy, as described above for downloading DSE packages. It is typically downloaded via https and therefore will not be cached. However, it is <5KB in size and does not impose a significant bandwidth or download time overhead.
- Custom URL: It is possible to download the signing key outside of LCM and to host it on an internal web server accessible to the target nodes. Custom repo key URLs can be configured by editing your repository in LCM and choosing a private repo.
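When the signing key is hosted internally, it is worth confirming that the hosted copy is byte-for-byte identical to the one obtained from DataStax, since the key is what target nodes rely on to detect tampering. A minimal sketch using SHA-256 checksums follows; the function names are illustrative, and the file paths you pass in are your own.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files have identical SHA-256 digests."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Comparing the copy served by the internal web server against a copy downloaded on a trusted, internet-connected machine catches truncated or altered uploads before any target node imports the key.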
LCM is capable of installing Java for you. In LCM 6.0.0 through 6.5.latest, Oracle Java will be installed by default, and requires a custom download-url to be specified. In LCM 6.7.0 and greater, OpenJDK will be installed by default.
The OpenJDK runtime is installed from the Java packages provided by the operating system. See the Operating System Packages section above for details on offline configuration.
Oracle Java Runtime
The Oracle Java runtime is downloaded from oracle.com. It may be made available in offline environments via:
- Custom URL: It is possible to download the Oracle Java tarball outside of LCM and to host it on an internal web server accessible to the target nodes. LCM can be configured to use a custom URL for Java as described in the Manage Java Installs documentation.
- Pre-installed: Any appropriately licensed and supported Java can be installed outside of LCM as a prerequisite before running the install job, provided DSE and LCM are able to find the installed Java. LCM can be configured not to manage Java as described in the Manage Java Installs documentation.
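For the pre-installed option, a quick check on each target node can show whether a Java executable is discoverable at all. The sketch below is only an approximation: it looks at `JAVA_HOME` and then `PATH`, which is an assumption about common conventions and not a description of how DSE or LCM actually locate Java. The function name is illustrative.

```python
import os
import shutil


def find_java(env=None):
    """Return the path to a java executable, or None if none is found.

    Checks $JAVA_HOME/bin/java first, then searches the PATH from `env`
    (defaults to the current process environment).
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return shutil.which("java", path=env.get("PATH", ""))
```

A `None` result before the install job runs suggests that the pre-installed Java is either missing or not exposed through the environment of the user running the check.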
Oracle Unlimited Strength JCE Policy
If LCM is used to install Oracle Java, then the Oracle unlimited strength JCE policy may optionally be downloaded and installed by LCM to enable DSE to perform strong encryption. The options available for making the JCE policy file available to offline targets are the same as those described above for the Oracle Java tarball.
Example Squid Config
To get the most caching and the fastest downloads possible, custom configs that violate RFCs are necessary. Do not run this configuration on a publicly accessible proxy server! LCM is compatible with many http/https caching proxies, but an example configuration for Squid 3.x is provided below for reference:
# The order of maximum_object_size, cache_replacement_policy, and cache_dir matter.
# They may not be ordered as needed in the default squid.conf.
#
# Cache large objects like rpm/deb packages
maximum_object_size 2048 MB
#
# Favor caching a few large objects over many smaller ones.
cache_replacement_policy heap LFUDA
# Cache to disk, not just in-memory
cache_dir ufs /var/spool/squid3 5000 16 256
# Squid defaults to being accessible on localhost only
# Customize the localnet to match your network environment
acl localnet src 192.168.0.0/24
http_access allow localnet
# Custom Patterns to cache DataStax deb/rpm packages, even though they're authenticated.
# This violates RFC's, should not be done on a publicly accessible proxy.
# It is necessary to accelerate downloads from a private proxy, though.
refresh_pattern debian.datastax.com/enterprise/.*deb$ 129600 100% 129600 ignore-auth
refresh_pattern debian.datastax.com/enterprise/.*$ 0 20% 4320 ignore-auth refresh-ims
refresh_pattern rpm.datastax.com/enterprise/.*rpm$ 129600 100% 129600 ignore-auth
refresh_pattern rpm.datastax.com/enterprise/.*$ 0 20% 4320 ignore-auth refresh-ims
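Squid checks refresh_pattern rules in order and applies the first one that matches, which is why each package-file rule above precedes the broader catch-all rule for the same host. The refresh_pattern times are in minutes, so 129600 is 90 days and 4320 is 3 days. The Python sketch below replays that first-match logic against sample URLs. Python's `re` module stands in for Squid's regex engine here, which is an approximation that holds for these simple patterns, and the sample URL paths in the usage note are hypothetical.

```python
import re

# Patterns copied from the refresh_pattern lines above, in the same order.
# The labels are descriptive only.
RULES = [
    (r"debian.datastax.com/enterprise/.*deb$", "deb package: cache up to 90 days"),
    (r"debian.datastax.com/enterprise/.*$", "deb repo metadata: cache up to 3 days"),
    (r"rpm.datastax.com/enterprise/.*rpm$", "rpm package: cache up to 90 days"),
    (r"rpm.datastax.com/enterprise/.*$", "rpm repo metadata: cache up to 3 days"),
]


def first_matching_rule(url):
    """Return the label of the first rule whose pattern matches `url`, else None."""
    for pattern, label in RULES:
        if re.search(pattern, url):
            return label
    return None
```

For example, `first_matching_rule("http://rpm.datastax.com/enterprise/example.rpm")` selects the long-lived package rule, while a metadata path such as `http://rpm.datastax.com/enterprise/repodata/repomd.xml` falls through to the short-lived catch-all. If the two rules for a host were swapped, the catch-all would match first and packages would expire after 3 days.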