Restricted Access to Sensitive Content in Seeq Data Lab
Overview
Sometimes you need to write a Data Lab script or Add-on that integrates with external applications and therefore requires credentials to access them. You may also want other users to be able to execute your script or Add-on without exposing your credentials to them.
This page describes common practices in Seeq Data Lab for sharing scripts or Add-ons while keeping sensitive information, such as credentials, hidden from other users.
Access levels for Data Lab projects

| Access level | Read | Write |
|---|---|---|
| Open and view Data Lab projects | ✓ | ✓ |
| Create new files | ✗ | ✓ |
| Open any project file including notebooks | ✗ | ✓ |
| Save or overwrite any project file | ✗ | ✓ |
| Terminal access | ✗ | ✓ |
| Execute notebook cells | ✗ | ✓ |
| Execute notebook in Add-on Mode | Only if the notebook has been deployed as an Add-on Tool | ✓ |
| Execute Data Lab Functions endpoints | ✓ | ✓ |
Hidden Files in Data Lab
Files created in a Data Lab project are stored in the home directory of the project's file system. A user with write access to the project can hide a file or directory by prefixing its name with a dot (e.g. `.my_hidden_file.txt`). A dot-prefixed file or directory is not visible in the file browser of the Data Lab project. A user with write access can still view and edit hidden files from a terminal in the same Data Lab project (e.g. running the command `ls -a` lists all files in the directory, including hidden ones). A read-only user, however, cannot access the terminal and therefore cannot list hidden files or read their content.

As a read-only user, you cannot view hidden files, but you can still execute Data Lab Functions endpoints and run notebooks deployed as Add-on Tools in Add-on Mode.
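Note that hiding a file only removes it from the project's file browser; it does not change file permissions, and code running inside the project can still read it. A quick check from a notebook cell (which requires write access to execute) is sketched below:

```python
import os

# Dot-prefixed files are hidden from the Data Lab file browser, but they
# remain ordinary files on disk, so code running in the project can read them.
hidden_files = [name for name in os.listdir(".") if name.startswith(".")]
print(hidden_files)  # e.g. ['.my_hidden_file.txt']
```

This is exactly what makes the pattern in the next section work: a Data Lab Functions endpoint can read a hidden file at runtime even when the caller only has read access to the project.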
Secret Managers and Key Vaults
When your Data Lab code needs to access external applications, it is good architectural practice to use a dedicated secrets manager (e.g. AWS Secrets Manager or Azure Key Vault). The authentication information required to access a particular secret in your secrets manager can be stored in a Data Lab hidden file. Instead of storing the external application's credentials in Data Lab, your code makes a runtime call to the secrets manager service to retrieve the credentials dynamically.
The benefit of this practice is that secrets managers handle application credentials, tokens, and API keys throughout their lifecycles. This enables features like automatic secret rotation and proper management of the identity roles that can access each secret. Furthermore, if your code requires access to more than one external application, you only need to store the credentials for your secrets manager rather than a separate secret for every application you access.
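As a concrete illustration, below is a minimal sketch of what a `get_secret()` helper (like the one referenced in the example later on this page) might look like. It assumes the hidden `.secrets_auth.ini` file created in the Setup section below; the secret name and AWS region are hypothetical placeholders:

```python
import configparser
import json

import boto3  # installed in the Setup section via pip


def get_secret(secret_name="s3-upload-example", region_name="us-east-1"):
    # Read the AWS access keys from the hidden file. Read-only users of the
    # project cannot open a terminal, so they cannot see this file's content.
    config = configparser.ConfigParser()
    config.read(".secrets_auth.ini")
    auth = config["AWS auth"]

    # Authenticate to AWS Secrets Manager as the identity that is allowed to
    # read the secret, then fetch the secret value at runtime.
    client = boto3.client(
        "secretsmanager",
        region_name=region_name,
        aws_access_key_id=auth["ACCESS_KEY"],
        aws_secret_access_key=auth["SECRET_KEY"],
    )
    response = client.get_secret_value(SecretId=secret_name)

    # Key/value secrets are returned as a JSON string, e.g. containing
    # AWS_S3_BUCKET, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY.
    return json.loads(response["SecretString"])
```

Because the external application's credentials are fetched at call time, rotating them in the secrets manager requires no change to the Data Lab project.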
Example
To illustrate the process of integrating with external applications, let's walk through an example of a Data Lab notebook that reads the data for the signals shown in the display pane of Seeq Workbench and stores those samples as a `.csv` file in an AWS S3 bucket.
Prerequisites
- An S3 bucket: https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html
- An AWS Secrets Manager secret where the credentials to access the S3 bucket are stored. The following values need to be stored: `AWS_S3_BUCKET`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY`: https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html
- Postman installed in the local environment.
Setup
In a new Data Lab project:
1. Open a new Terminal (File > New > Terminal) and run `pip install fsspec s3fs boto3`.
2. Locally on your computer, create a text file named `.secrets_auth.ini` with the authentication information of the AWS role that has access to the secrets manager, and upload the `.secrets_auth.ini` file to the Data Lab project. (Note: you may have to re-add the leading `.` after uploading the file to the project if you're using macOS. To rename the file once it has been uploaded, run `mv secrets_auth.ini .secrets_auth.ini` in the terminal.)

```
[AWS auth]
ACCESS_KEY=YourAccessKey
SECRET_KEY=YourSecretKey
```
3. Upload the S3DataUpload.ipynb notebook to the Data Lab project. For this example, the name of the notebook should be `S3DataUpload`; if you modify the name, you will have to adjust it in the URLs of the API calls below.
4. As an initial test, you can try accessing the endpoints from an external application such as Postman. Send a request to authenticate to the Seeq server:

```
curl --location 'https://{{seeq_hostname}}/api/auth/login' \
--header 'Content-Type: application/vnd.seeq.v1+json' \
--header 'Accept: application/vnd.seeq.v1+json' \
--data '{
  "username": "{{seeq_access_key}}",
  "password": "{{password}}",
  "authProviderClass": "Auth",
  "authProviderId": "Seeq"
}'
```
If you get a `200 OK` response, copy the `x-sq-auth` token from the response headers. Otherwise, verify `username`, `password`, and `seeq_hostname`.
5. Send a request to the `GET /hello` test endpoint. Make sure to either create Postman environment variables for `seeq_hostname`, `datalab_project_id`, and `x-sq-auth`, or substitute the values directly in the code below:

```
curl --location 'https://{{seeq_hostname}}/data-lab/{{datalab_project_id}}/functions/notebooks/S3DataUpload/endpoints/hello' \
--header 'x-sq-auth: {{x-sq-auth}}' \
--data ''
```
A successful `200 OK` response will show the response body `"hello from Data Lab"`.
6. Open Seeq Workbench and display the signal data you want to save to S3. Take note of the workbook ID and worksheet ID. This example will upload the data shown in the display pane for the time window shown.
7. To access the data displayed in Workbench from Postman, send an API request to the `GET /worksheet/items` endpoint. Make sure to change all the values enclosed in `{{ }}`:

```
curl --location 'https://{{seeq_hostname}}/data-lab/{{datalab_project_id}}/functions/notebooks/S3DataUpload/endpoints/worksheet/items?workbook_id={{workbook_id}}&worksheet_id={{worksheet_id}}' \
--header 'x-sq-auth: {{x-sq-auth}}' \
--data ''
```
8. To upload the data displayed in Workbench to the S3 bucket, use the `GET /upload` endpoint. Make sure to change all the values enclosed in `{{ }}`:

```
curl --location 'https://{{seeq_hostname}}/data-lab/{{datalab_project_id}}/functions/notebooks/S3DataUpload/endpoints/upload?workbook_id={{workbook_id}}&worksheet_id={{worksheet_id}}&file_name={{file_name}}' \
--header 'x-sq-auth: {{x-sq-auth}}' \
--data ''
```
9. To download data from the S3 bucket, use the `GET /download` endpoint. Make sure to change all the values enclosed in `{{ }}`:

```
curl --location 'https://{{seeq_hostname}}/data-lab/{{datalab_project_id}}/functions/notebooks/S3DataUpload/endpoints/download?workbook_id={{workbook_id}}&worksheet_id={{worksheet_id}}&file_name={{file_name}}' \
--header 'x-sq-auth: {{x-sq-auth}}' \
--data ''
```
Once you are ready to share this project, you can give other users read-only access. Read-only users will be able to make requests to the same endpoints but will not be able to execute cells in the `S3DataUpload.ipynb` notebook.
How it works
The `.secrets_auth.ini` file contains the access and secret keys used to authenticate to AWS Secrets Manager. These keys are generated within AWS and belong to an identity that has access to the Secrets Manager. That access can be revoked from the AWS environment at any time, in which case this example would stop working. Additionally, the credentials for the S3 bucket can be changed, either manually or automatically, in the AWS environment, and the example application will continue to have access to the S3 bucket as long as the identity whose access keys we stored can still read the S3 credentials in AWS Secrets Manager.
In the `S3DataUpload.ipynb` notebook, we use the SPy module to access the data shown in Workbench for the workbook and worksheet IDs passed in the request. Before uploading to the S3 bucket, we read the access and secret keys of the identity that can access AWS Secrets Manager. Notice that these keys are stored in a hidden file that cannot be accessed by a read-only user. The keys are used in the function `get_secret()`, which runs dynamically when the endpoint is called from another application or tool. We shared the project with read permission only: read-only users can make requests to the Data Lab Functions endpoints we created but cannot execute cells directly in the Data Lab project. Thus, we control what the endpoints return, and a read-only user never sees the S3 bucket credentials retrieved during execution. To get started with Data Lab Functions development, see Data Lab Functions.
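For illustration only, below is a sketch of what the notebook's upload logic might look like. It reuses the hypothetical `get_secret()` helper sketched earlier; the Data Lab Functions endpoint annotations, the handling of the worksheet's display range, and the hostname are simplified placeholders rather than the exact contents of S3DataUpload.ipynb:

```python
from seeq import spy  # SPy is available in Data Lab projects


def upload_worksheet_to_s3(workbook_id, worksheet_id, file_name):
    # spy.search accepts a worksheet URL and returns the items displayed in
    # its details pane (the hostname here is a placeholder).
    items = spy.search(
        f"https://yourserver.seeq.site/workbook/{workbook_id}/worksheet/{worksheet_id}"
    )

    # Pull the samples. Without explicit start/end arguments SPy uses a
    # default recent window; a full implementation would pass the
    # worksheet's display range.
    df = spy.pull(items)

    # Retrieve the S3 credentials at runtime. They exist only in memory
    # during execution and are never returned to the caller.
    secret = get_secret()

    # pandas writes directly to S3 through fsspec/s3fs (installed in Setup).
    df.to_csv(
        f"s3://{secret['AWS_S3_BUCKET']}/{file_name}",
        storage_options={
            "key": secret["AWS_ACCESS_KEY_ID"],
            "secret": secret["AWS_SECRET_ACCESS_KEY"],
        },
    )
    return f"Uploaded {file_name} to {secret['AWS_S3_BUCKET']}"
```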
which is run dynamically when the endpoint is called from another application or tool. We shared the project with read permissions only. Read-only users can make requests to the Data Lab Functions endpoints we have created but are not able to execute cells directly in the Data Lab project. Thus, we control what the output of the endpoint is and the read-only user cannot see the S3 bucket credentials that are retrieved during execution. For getting started with Data Lab Functions development, see Data Lab Functions