Understanding Google BigQuery

Overview

The BigQuery connector uses a configuration pattern for connecting to a BigQuery data warehouse. It requires the name of a project, a dataset, and a table or a view; it also requires the name of the column that represents time and the names of the columns that contain values.

This article shows how to create a dataset, upload CSV files, create a view, create a BigQuery connection in Seeq, and trend a signal.

BigQuery

Sign in to your Google Cloud Platform (GCP) and select a project, or create an account and a project will be created for you.

Create a dataset

To create a dataset, click the options icon and choose Create dataset.

A Create dataset dialog displays. Provide a Dataset ID, choose a Data location, and click CREATE DATASET.

Upload some CSV files

We are going to use two tables : one table for tag names and another for tag data. The tag_table.csv file has the following columns:

TAG_NAME - the name of the tag
TAG_INDEX - the ID of the tag, which is used to join to the data table
LAST_CHANGE - when the tag was last modified

The raw_data.csv file has the following columns:

TAG_INDEX - the ID of the tag, which is used to join to the tag table
VAL - the sample
QUALITY- true means the sample has passed a quality check
TIMESTAMP- when the sample was collected
LAST_CHANGE- when the sample was last modified

Download the following files:

tag_table.csv

raw_data.csv

Open the Dataset. Click the options icon and select Open.

The Dataset is opened in a tab. Click the CREATE TABLE button. The Create table dialog displays.

For Create table from, select Upload and then browse for the tag_table.csv that you downloaded.

For Table enter tag_table.

Under Schema, click the + button three times and then enter the Field names as shown:

Expand Advanced options, and change Header rows to skip to 1.

Click CREATE TABLE. The tag_table table is created.

Repeat the process to upload the raw_data.csv file, name the Table raw_data , and use the following schema definition:

Remember to change Header rows to skip to 1.

Your Sample dataset should now have two tables:

Create a view

Open the Sample dataset and click COMPOSE NEW QUERY.

A query editor opens in a new tab. Copy the following query into the editor:

CODE

SELECT
  rd.TIMESTAMP,
  tt.TAG_NAME,
  rd.VAL,
  rd.QUALITY
FROM
  `bigquery-sandbox-342919.Sample.raw_data` rd
LEFT JOIN
  `bigquery-sandbox-342919.Sample.tag_table` tt
ON
  rd.TAG_INDEX = tt.TAG_INDEX
WHERE
  rd.QUALITY = TRUE
ORDER BY
  rd.TIMESTAMP,
  tt.TAG_NAME

Replace bigquery-sandbox-342919 with our project ID. Click RUN and verify the results, it should look something like this:

To create a view, click SAVE and chose Save View.

The Save view dialog displays. Select the Sample Dataset, specify signal for Table, and click SAVE.

The signal view is created within the Sample dataset, and has the following schema:

Generate an access key

The Seeq configuration for BigQuery requires an access key. Use the following steps to create one. From the menu choose IAM & Admin > IAM

In the left panel, select Service Accounts.

The Service accounts page is displayed. Click CREATE SERVICE ACCOUNT. The Create service account dialog displays.

For Service account name, specify something like seeq-connector. A Service account id will be created for you. Provide a Service account description.

Click CREATE AND CONTINUE. For project access, set the Role to Viewer.

Click CONTINUE.

Click DONE. The service account is displayed. Click the options icon and select Manage keys.

The KEYS tab displays. Click ADD KEY and select Create new key.

The Create private key dialog displays. Choose JSON and click CREATE.

A JSON file is downloaded in your browser. You will need this in the next step.

Seeq

To connect to our Sample dataset, we need an access key and connection.

Upload the access key

Find the JSON file that was downloaded in the previous step and copy it to the remote agent. For the example, the file has been copied to the configuration/link directory within the data folder, but this is not required. Just make sure that the Seeq service runs with an account that can read the key file in the location where you copied it to.

Create a BigQuery connection

Navigate to the Administration | Datasources page, and click Add connection. The Create new connection dialog displays.

Select the desired Agent name, select the BigQuery Connector, and select the BigQuery Connection Template Connection. Provide a Datasource name and accept the default value for Datasource ID and ignore Max concurrent requests, Max results per request, and Transforms.

For the Additional configuration, specify the TimeZone, TimeColumn, GroupBy (notice the array), DataSet, DataColumns (notice the array), ProjectId, Name, and AccessKeyFile.

The TimeZone, ProjectId and AccessKeyFile will be different for you.

The key file in this example is a relative path, which will be relative to the Seeq data folder.
There are no conditions in this example and so we set ConditionTables to an empty array.
Escape any backslashes with another backslash; for example "C:\file.txt" would be entered as "C:\\file.txt".

Click CREATE. The connector should index and you should see 4 signals.

Trend the signals

Create a worksheet. Find the Asset Tree for the BQ.Samples connection and navigate into the project, dataset, and view. You should see four tags:

Select the VAL signal for one of the tags: