Installing Seeq Data Lab Server
Overview
Seeq Data Lab Server is supported on-premise on Ubuntu and Red Hat Enterprise Linux (RHEL) and relies heavily on the container platform called Docker.
The best way to get Seeq Data Lab is through Seeq’s Software-as-a-Service (SaaS) offering. All of the technical details of installation and maintenance are handled for you, and you benefit from scalability features that are only available via Seeq SaaS. If you are not yet taking advantage of Seeq SaaS, please contact us to start a conversation.
If Seeq SaaS is not yet an option, we offer Seeq Data Lab Server – a single-machine configuration that is suitable for on-premise or private-cloud use. Note that Seeq Data Lab Server is only supported on Linux, and only on the Ubuntu and Red Hat Enterprise Linux distributions.
Prerequisites
Currently, Seeq Data Lab Server is only supported on two Linux distributions - Ubuntu and Red Hat Enterprise Linux (RHEL).
To install Data Lab on Linux with Docker, everything you need is included in the seeq-data-lab-<version>-64bit-linux.tar.gz installer tarball. You can find the download link here: https://www.seeq.com/customer-download
Minimum Hardware Requirements
The hardware required for running Seeq Data Lab Server on Docker is highly dependent on how Data Lab Server is used. Each Data Lab project requires memory and CPU based on the number and nature of the notebooks running within it.
If you are finding that operations within notebooks are slow, are running out of memory, or are running out of disk space, you will need to increase CPU, memory and/or storage resources. Seeq Data Lab relies completely on the hardware resources that have been allocated to the virtual machine that Docker is running on.
Based on our observations, each non-resource-intensive notebook consumes about 1600 MiB of memory and 800 millicores of CPU.
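As a rough illustration of how the per-notebook figures above scale, the following sketch multiplies them by a notebook count. The function name estimate_resources is hypothetical, and the result is only a starting point, since real notebooks vary widely in what they consume.

```shell
# Rough sizing from the per-notebook figures quoted above:
# about 1600 MiB of memory and 800 millicores of CPU per simple notebook.
estimate_resources() {
  local notebooks="$1"
  local mem_mib=$(( notebooks * 1600 ))
  local cpu_millicores=$(( notebooks * 800 ))
  # Round the CPU figure up to whole cores.
  local cores=$(( (cpu_millicores + 999) / 1000 ))
  echo "${notebooks} notebooks: ${mem_mib} MiB memory, ${cores} cores"
}

estimate_resources 10
estimate_resources 40
```

For 10 and 40 notebooks this gives 8 and 32 cores and roughly 16 GB and 64 GB of memory, which lines up with the first two columns of the sizing table.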
As adoption of Seeq Data Lab increases, users will likely leverage the scheduled notebook execution mechanism (spy.jobs) which will increase the number of simultaneously executing notebooks over time.
For general guidance, use the table below. If notebooks are slow or are running out of memory or disk space, increase CPU, memory and/or storage resources beyond these figures.
Simultaneously Executing Simple Notebooks | up to 10 | up to 40 | up to 80 | up to 160 |
CPU Architecture | 64-bit (all sizes) |
CPU Cores | 8 | 32 | 64 | 128 |
Memory | 16 GB | 64 GB | 128 GB | 256 GB |
Available Storage | 100 GB free disk space | 500 GB free disk space |
OS | Ubuntu LTS versions 18.04 - 20.04 (all sizes) |
Can Seeq Data Lab Server and Seeq Server be installed on the same hardware/VM?
No – Seeq Data Lab Server must be installed on its own hardware/VM so that it does not compete for resources with the Seeq Server. Both pieces of software assume that they have complete use of CPU, memory and disk space on the machine.
Does Seeq Data Lab adversely affect performance of Seeq Server?
Seeq Data Lab executes Python code separately from the main Seeq Server. Therefore, the Python code itself will not “compete” for CPU/memory resources with Seeq Workbench or Seeq Organizer directly. However, the SPy module accesses data through Seeq Server and has facilities to leverage the Seeq calculation engine, so there can be material impacts to Seeq Server load when Data Lab is used. You will see this impact in the Server Load percentage in the bottom-right corner of Seeq Workbench/Organizer.
Can Seeq Data Lab Server be connected to multiple Seeq servers?
No, Seeq Data Lab Server can only be connected to one Seeq server.
Can a cloud-based Seeq Data Lab Server be used with an on-premise Seeq server?
Yes, if you are deploying Seeq Data Lab into your private cloud, but a private network connection must exist between Seeq Data Lab Server and Seeq Server. That can be achieved with cloud provider features such as Azure ExpressRoute and AWS Direct Connect.
It is not possible to use SaaS-based Seeq Data Lab with a non-SaaS-based Seeq Server: you cannot connect a Seeq SaaS Data Lab server to your on-premise or private cloud Seeq Server.
Installing Seeq Data Lab Server
Download The Installer
You can get the link for the installer archive from https://www.seeq.com/customer-download under Seeq Data Lab.
If you're logged on to the server and want to download the installer via command line, you can use the following command:
curl -O -J 'https://download.seeq2.com/<download link>'
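After downloading, it can help to confirm that the file follows the expected seeq-data-lab-<version>-64bit-linux.tar.gz naming pattern and to note which version it contains. The helper below is a hypothetical sketch, not part of the Seeq tooling, and the version string in the example is a placeholder.

```shell
# Print the <version> part of an installer file name, or fail if the name
# does not match seeq-data-lab-<version>-64bit-linux.tar.gz.
installer_version() {
  local name
  name=$(basename "$1")
  case "$name" in
    seeq-data-lab-*-64bit-linux.tar.gz)
      local version="${name#seeq-data-lab-}"
      version="${version%-64bit-linux.tar.gz}"
      echo "$version"
      ;;
    *)
      echo "unexpected installer name: $name" >&2
      return 1
      ;;
  esac
}

# R0.0.0 is a placeholder version string, not a real release.
installer_version "seeq-data-lab-R0.0.0-64bit-linux.tar.gz"
```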
Ubuntu
Extract And Install
tar xvf seeq-data-lab-<version>-64bit-linux.tar.gz
sudo seeq-data-lab-installer/install -f /opt/seeq/seeq-data-lab -g /var/opt/seeq -u seeq
Configure
Run the following commands to configure Data Lab to point at the main Seeq Server (replacing <value> with the appropriate value):
sudo seeq config set Network/DataLab/Hostname localhost
sudo seeq config set Network/DataLab/Port 34231 # the port of the Data Lab server (usually 34231)
sudo seeq config set Network/Hostname <value> # the host IP or URL of the main Seeq Server
# If the main Seeq server is configured to listen over HTTPS
sudo seeq config set Network/Webserver/SecurePort 443 # the secure port of the main Seeq Server (usually 443)
# If the main Seeq server is NOT configured to listen over HTTPS
sudo seeq config set Network/Webserver/Port <value>
If you are using SSL and wish to have users redirected from the unsecured http://your.site.com to the secure https://your.site.com version of Seeq, Seeq recommends port forwarding. See Secure Configuration Options (SSL/TLS).
If the Data Lab Server will be communicating with the Seeq Server over a private network, but users will be accessing Seeq over a public network, the Network/Hostname above should be set to the public hostname, and the Network/Webserver/PrivateUrl should be set to the base URL for the Seeq Server as accessed over the private network:
sudo seeq config set Network/Webserver/PrivateUrl <value>
If not overridden, the Network/Webserver/PrivateUrl will default to the value of Network/Webserver/Url, which is constructed from the Network/Hostname and either Network/Webserver/SecurePort or Network/Webserver/Port.
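To make the default concrete, the sketch below shows one plausible way a base URL could be assembled from a hostname and either a secure or a plain port. The function build_base_url is hypothetical, and the exact format Seeq generates (for example, whether default ports are omitted) is an assumption not confirmed by this article.

```shell
# Build a base URL from a hostname plus either a secure port or a plain port.
# Usage: build_base_url <hostname> <secure_port or ""> <port or "">
build_base_url() {
  local host="$1"
  local secure_port="$2"
  local port="$3"
  if [ -n "$secure_port" ]; then
    # A secure port takes precedence and implies HTTPS.
    echo "https://${host}:${secure_port}"
  elif [ -n "$port" ]; then
    echo "http://${host}:${port}"
  else
    echo "no port configured" >&2
    return 1
  fi
}

build_base_url "your.site.com" "443" ""    # HTTPS case
build_base_url "your.site.com" "" "8080"   # HTTP case; 8080 is a placeholder port
```

If the URL that users see differs from the one Data Lab should use internally, that is the situation where setting Network/Webserver/PrivateUrl explicitly applies.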
On the main Seeq server, open a Seeq Command Prompt and set the hostname of the Data Lab server:
sudo seeq config set Network/DataLab/Hostname <value> # the host IP (not URL) of the Data Lab server
sudo seeq config set Network/DataLab/Port 34231 # the port of the Data Lab server (usually 34231)
Troubleshooting
It has been observed that 502 Bad Gateway errors are resolved by restarting the main Seeq server after setting the Data Lab hostname on it:
sudo seeq restart
Self-Signed SSL Certificate
If Data Lab is to be run using a self-signed SSL certificate, create your certificate using the names seeq-cert.pem and seeq-key.pem, then install it on both Seeq Server and Seeq Data Lab.
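The article does not prescribe how to create the certificate. One common approach is OpenSSL, sketched below under several assumptions: a 2048-bit RSA key, a 365-day validity period, and a single hostname used as both the common name and the subject alternative name. The function name is hypothetical, and the -addext option requires OpenSSL 1.1.1 or later.

```shell
# Create seeq-cert.pem and seeq-key.pem in <out_dir> for <hostname>.
# Key size, validity period and subject layout are assumptions.
make_self_signed_cert() {
  local host="$1"
  local out_dir="$2"
  mkdir -p "$out_dir" || return 1
  openssl req -x509 -newkey rsa:2048 -nodes \
    -days 365 \
    -subj "/CN=${host}" \
    -addext "subjectAltName=DNS:${host}" \
    -keyout "${out_dir}/seeq-key.pem" \
    -out "${out_dir}/seeq-cert.pem"
}

# Example: make_self_signed_cert your.site.com ./certs
```

The hostname passed in should match the name that users and Data Lab use to reach Seeq; otherwise clients may still reject the certificate.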
# Seeq Global Folder $GLOBAL
sudo seeq config get Folders/Global
# Seeq Data Folder $DATA
sudo seeq config get Folders/Data
In Seeq Server, copy the seeq-cert.pem and seeq-key.pem files to the $GLOBAL/keys folder.
In Seeq Data Lab, copy the seeq-cert.pem and seeq-key.pem files to the $GLOBAL/keys folder.
In Seeq Data Lab, copy the seeq-cert.pem file to the $DATA/data-lab/keys folder and rename the file to extra_ca_certs.pem.
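The two Data Lab copy steps above can be sketched as a small helper. The function name and its arguments are hypothetical; the folder paths are the values reported by seeq config get Folders/Global and Folders/Data, and on a real server the copies will usually need sudo. Creating the keys folders with mkdir -p is an assumption in case they do not exist yet.

```shell
# Copy the certificate files into place on the Seeq Data Lab machine.
# Usage: install_data_lab_certs <global_dir> <data_dir> <cert_dir>
install_data_lab_certs() {
  local global_dir="$1"
  local data_dir="$2"
  local cert_dir="$3"
  mkdir -p "${global_dir}/keys" "${data_dir}/data-lab/keys" || return 1
  # Certificate and key go to the global keys folder.
  cp "${cert_dir}/seeq-cert.pem" "${cert_dir}/seeq-key.pem" "${global_dir}/keys/" || return 1
  # A second copy of the certificate is stored under the name extra_ca_certs.pem.
  cp "${cert_dir}/seeq-cert.pem" "${data_dir}/data-lab/keys/extra_ca_certs.pem"
}

# Example: install_data_lab_certs "$GLOBAL" "$DATA" ./certs
```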
If you are not using port forwarding, you will have to set your SecurePort to 1234 on both Seeq Server and Seeq Data Lab:
sudo seeq config set Network/Webserver/SecurePort 1234
Run As A Service
Now Data Lab can be enabled as a service and started:
sudo seeq service enable
sudo seeq start
Troubleshooting
It has been observed that connectivity issues are resolved by switching the Seeq Server host URL to host IP (or vice versa) in Network/Hostname on the Data Lab server.
If sudo seeq start has already been run, then sudo seeq restart may be needed.
Red Hat Enterprise Linux
Seeq Data Lab depends on the Docker container management system. In Red Hat Enterprise Linux 7 and later, Red Hat has adopted Podman as its preferred container management system, but Seeq Data Lab does not support Podman. Therefore you will need to install docker-ce (see instructions below) which will in turn uninstall the nftables routing subsystem in favor of iptables. (Docker is incompatible with nftables.)
Configure Subscriptions And Repos
Red Hat Enterprise Linux (RHEL) requires a couple of packages be installed and/or removed in order to run the install script and Seeq Data Lab.
RHEL 7: Register with Subscription Manager and add extra repo
On RHEL 7, it’s necessary to enable a repo to have access to the container-selinux package.
If the RHEL server has not been registered during the OS installation, then it will be necessary to register the system now:
sudo subscription-manager register
If you get the message "The system is already registered. Use --force to override", you can skip to Enabling the extra repo.
The responses to the subscription-manager register command will depend on the Red Hat subscription you want to use. During development, you can sign up for a free developer subscription at https://developers.redhat.com/register/ and then use the account information that you used to sign up in the responses to subscription-manager register.
Once you’ve registered, make sure to refresh the local data and attach a compatible subscription to the newly registered system:
sudo subscription-manager refresh
sudo subscription-manager attach --auto
Enabling the extra repo
sudo subscription-manager repos --enable=rhel-7-server-extras-rpms
If you get a message like "Repositories disabled by configuration.", you can use sudo subscription-manager config --rhsm.manage_repos=1 to change the configuration.
RHEL 8: Remove container tools
A prerequisite for installing docker-ce on RHEL 8 is to uninstall container-tools from your server:
sudo yum module remove container-tools
The container-tools module must be uninstalled because it is not compatible with docker-ce.
If you get a message like "No packages marked for removal", ignore the message and go to the next step.
Add docker-ce Repo
Add the docker-ce repository to your server:
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Install docker-ce
Once the docker-ce repository has been added, you can install docker-ce and libseccomp-devel, a package that Seeq depends on, by running:
sudo yum install docker-ce libseccomp-devel -y
Start Docker
To start Docker you can use:
sudo systemctl start docker
You can check whether Docker started successfully using:
sudo docker ps
If Docker started successfully, you should see output like:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
...
Configure Docker to automatically start on boot
To start Docker and containerd automatically on boot on RHEL, use the commands below:
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
Note: If you need to disable this behavior you can use:
sudo systemctl disable docker.service
sudo systemctl disable containerd.service
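For review before running anything, the Docker preparation commands from this section can be collected per RHEL major version. The sketch below only prints the commands and executes nothing; the function name is hypothetical, and the RHEL 7 registration steps with subscription-manager register are left out because they depend on your subscription.

```shell
# Print the Docker preparation commands for a given RHEL major version.
# Nothing is executed; review the output, then run the commands manually.
print_rhel_docker_steps() {
  local rhel_major="$1"
  case "$rhel_major" in
    7) echo "sudo subscription-manager repos --enable=rhel-7-server-extras-rpms" ;;
    8) echo "sudo yum module remove container-tools" ;;
    *) echo "no steps listed here for RHEL ${rhel_major}" >&2; return 1 ;;
  esac
  echo "sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo"
  echo "sudo yum install docker-ce libseccomp-devel -y"
  echo "sudo systemctl start docker"
  echo "sudo systemctl enable docker.service"
  echo "sudo systemctl enable containerd.service"
}

print_rhel_docker_steps 8
```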
Docker Compose V1 and V2
Seeq R58 and later require both Docker Compose V1 and V2. The Seeq installer attempts to install both versions automatically. Docker Compose V2 should have been installed with docker-ce above. If you see an installation error for Docker Compose V1, you can use the following process to install it manually.
Download this file to your local machine and then upload it to the server: https://github.com/docker/compose/releases/download/1.29.2/docker-compose-Linux-x86_64
Move the file to /usr/local/bin and rename it to docker-compose.
Run chmod +x on the file.
Run the Seeq Data Lab installer again.
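The move, rename and chmod steps can be combined using the standard install utility, as in the sketch below. The function name is hypothetical. Note that install copies the file rather than moving it, and writing to /usr/local/bin normally requires sudo.

```shell
# Install an already-downloaded Docker Compose V1 binary as "docker-compose".
# Usage: install_compose_v1 <downloaded_file> [target_dir]
install_compose_v1() {
  local src="$1"
  local target_dir="${2:-/usr/local/bin}"
  # install(1) copies the file and sets the executable mode in one step.
  install -m 0755 "$src" "${target_dir}/docker-compose"
}

# Example (as root): install_compose_v1 ./docker-compose-Linux-x86_64
```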
Extract And Install
See Ubuntu instructions.
Configure
Run As A Service
Troubleshooting
Starting Seeq or updating the configuration from the Seeq Prompt fails with a permission denied error
Problem: When trying to start Seeq or update the configuration from the Seeq Prompt, the following error is seen:
PermissionError: [Errno 13] Permission denied: '/opt/seeq/seeq-data-lab/install.properties'
Solution: While logged in as a user that has appropriate permissions, run the following and try again:
chown seeq:seeq /opt/seeq/seeq-data-lab/install.properties
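To confirm whether ownership is actually the problem, before or after running chown, the current owner of the file can be compared with the expected one. This is a hypothetical helper that relies on the GNU stat utility shipped with both Ubuntu and RHEL.

```shell
# Report whether a file is owned by the expected user:group.
# Usage: check_owner <file> <user:group>
check_owner() {
  local file="$1"
  local expected="$2"
  local actual
  # GNU stat: %U is the owning user name, %G the owning group name.
  actual=$(stat -c '%U:%G' "$file") || return 2
  if [ "$actual" = "$expected" ]; then
    echo "ok: $file is owned by $actual"
  else
    echo "mismatch: $file is owned by $actual, expected $expected"
    return 1
  fi
}

# Example: check_owner /opt/seeq/seeq-data-lab/install.properties seeq:seeq
```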
Upgrading
Stop the Seeq Data Lab service by issuing sudo seeq stop.
Back up your installation before proceeding; see Backup and Restoration of Seeq Data Lab below.
Follow the same instructions as in the Install section with the new version.
RHEL: Restore SELinux Contexts
On RHEL, before starting the service again, make sure to restore the SELinux contexts.
Backup and Restoration of Seeq Data Lab
Because Seeq Data Lab interacts with its underlying storage using standard File System semantics, backups can occur while Seeq Data Lab is running; system down-time is not required.
For those installations where Data Lab is deployed on a RHEL or Ubuntu server running on a Virtual Machine, the use of the underlying VM snapshot and restoration mechanisms native to VMware, Azure, AWS, or other cloud services works well. Backups can be scheduled and executed by the IT infrastructure without special handling for Seeq Data Lab. In general, Seeq recommends full-system backup/restore/DR practices when deploying Data Lab running Docker on RHEL or Ubuntu.
For those wishing to be more explicit, or for installations wishing to back up just the Seeq files specifically, backing up the contents of the seeq home directory (such as /home/seeq if the installation followed the example listed at the beginning of this article) will be sufficient and complete. Restoration of the files can be made directly into the same directory; a restart of the seeq-data-lab service is recommended following any restoration.
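A file-level backup of this kind can be sketched with tar. The article does not name a specific tool, so the use of tar and both function names are assumptions; the paths in the example are placeholders for your own seeq home directory and backup location.

```shell
# Archive a directory (for example the seeq home directory) into a .tar.gz.
# Usage: backup_dir <source_dir> <archive_path>
backup_dir() {
  local src="$1"
  local archive="$2"
  # -C switches to the parent so the archive holds a relative path.
  tar -czpf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# Unpack an archive made by backup_dir under the given parent directory.
# Usage: restore_dir <archive_path> <parent_dir>
restore_dir() {
  local archive="$1"
  local parent="$2"
  tar -xzpf "$archive" -C "$parent"
}

# Example: backup_dir /home/seeq /backups/seeq-home.tar.gz
#          restore_dir /backups/seeq-home.tar.gz /home
```

Running both commands as root helps preserve file ownership and permissions across the backup and restore.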
To reconstitute a Seeq Data Lab when the Data Lab filesystem has been individually backed up:
Restore or deploy a fresh VM to host Seeq Data Lab, following the instructions listed above for deploying Seeq Data Lab.
Download and re-install the same version of Seeq using the same command option as the original installation (such as "-g /home/seeq").
Ensure the seeq-data-lab service is stopped.
Restore the contents of /home/seeq from the backup.
Start the seeq-data-lab service.