Controlling Signal Freshness
Freshness describes how close a signal displayed in Seeq is to the best data in the external data source. By default, Seeq will display the most current data possible, at least as far as the data source is concerned. When signals are used in a live doc context, the displayed data can be up to one “update interval” out of date.
Seeq also has the concept of forecast data, represented as dotted lines in the trend. Although forecasts can represent data that may occur in the future, the concept applies to any data that isn’t known with enough certainty to benefit from caching.
Freshness can be expensive in terms of load on the Seeq server or request load to the external data source. This is especially true for forecast data and for calculations that use many inputs or aggregate inputs over time.
A function called setRefreshRate() allows a user to achieve better analysis performance in exchange for less freshness. This function allows you to avoid the costs by choosing a refresh timestamp that occurs before the signal would become a forecast.
You can think of the refresh timestamp as a version of now() that moves forward in discrete intervals such that the queries to the external data source result in fully cacheable and non-forecast data.
Once the refresh timestamp is computed, no data will be requested after that timestamp. Since the timestamp is configured to exclude forecast data, the data should be entirely cached and no additional calculation or external data will be needed, giving a very fast result.
General Usage
In its simplest form, the formula looks like
$signal.setRefreshRate($period [, $secondaryPeriod])
This also works for conditions. The $secondaryPeriod defaults to the $period and can be used to increase freshness when the signal is an aggregation.
The refresh timestamp is determined by the period of the input data and possibly the period of the input's source. This requires knowledge that only the formula author has. The refresh timestamp is computed as
(now() - $secondaryPeriod).floorTime($period)
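As a rough illustration, the refresh timestamp calculation can be sketched outside of Seeq in Python. The helper names floor_time and refresh_timestamp are hypothetical, and the sketch assumes that floorTime() aligns to multiples of the period counted from midnight on 1970-01-01, with no time zone handling; Seeq's actual alignment rules may differ.

```python
from datetime import datetime, timedelta

# Assumption: periods are aligned to this fixed origin (not confirmed Seeq behavior).
EPOCH = datetime(1970, 1, 1)


def floor_time(ts: datetime, period: timedelta) -> datetime:
    """Round ts down to the nearest multiple of period, counted from EPOCH."""
    return ts - ((ts - EPOCH) % period)


def refresh_timestamp(now: datetime, period: timedelta, secondary_period=None) -> datetime:
    """Mirror (now() - $secondaryPeriod).floorTime($period)."""
    if secondary_period is None:
        # The secondary period defaults to the period.
        secondary_period = period
    return floor_time(now - secondary_period, period)


# "now" is 2:45pm with a 1 hour period and the default secondary period.
print(refresh_timestamp(datetime(2024, 1, 8, 14, 45), timedelta(hours=1)))
# 2024-01-08 13:00:00

# "now" is 12:15am with a 1 day period and a 1 hour secondary period.
print(refresh_timestamp(datetime(2024, 1, 9, 0, 15), timedelta(days=1), timedelta(hours=1)))
# 2024-01-08 00:00:00
```

The refresh timestamp only changes when now() minus the secondary period crosses a period boundary, which is why repeated requests between boundaries can be served from the cache.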
Recommendations
Only use setRefreshRate() as the last function in a formula. If you chain more operations after it in the same formula, the formula can’t achieve the caching results.
Consider using setRefreshRate() as the only function in a formula. This enables you to put the original and optimized versions side-by-side in your trend to see the different performance and freshness effects.
For most plain signals and conditions, $secondaryPeriod is unnecessary. The secondary period is most useful on signals that are aggregates, where it can improve the freshness of the output.
If you’re trying to optimize a rolling aggregate, it may be easier to compute the rolling aggregate more often and avoid using $secondaryPeriod. For example, an average of 4 months every 2 weeks could be changed to an average of 4 months every day, and then just use setRefreshRate(1d). This is easier to reason about than the secondary period.
Scenarios
Freshness is an advanced topic and it isn’t obvious how (or why) to choose the values. These examples rationalize and clarify the behavior details.
Example 1
We have a signal that produces only one sample every hour, at half past. There is no use in querying the data source on every refresh unless it’s been an hour since the last check.
$signal.setRefreshRate(1h)
Trend the signal when "now" is 2:45pm and the view range is “today”
The refresh timestamp is computed as 1:00pm. The secondary period defaults to 1 hour, so the formula evaluates (2:45pm - 1h).floorTime(1h)
The external data source is queried from midnight until 1:00pm
The results include a sample at 12:30pm and the boundary value at 1:30pm
Because the 1:30pm sample is after the query range, we can cache everything up to 1:00pm.
Refresh the screen at 3:05pm.
The refresh timestamp is now 2:00pm, so the cache needs to be filled from 1:00 to 2:00pm
The external data source will return samples 12:30pm, 1:30pm, and 2:30pm because the boundary values are always included
The region from 1-2pm is safely cacheable, which will include the 1:30pm sample
This will result in the displayed data ending between 1 and 2 hours behind now, but the data source is queried only once: the first time a user looks after the refresh timestamp has moved past the cached data.
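The walkthrough in Example 1 can be replayed with a small Python simulation. This is only a sketch: the date, the helper names, and the fixed 1970-01-01 alignment origin are assumptions made for illustration, and the sample list simply stands in for the external data source.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for floorTime()
HOUR = timedelta(hours=1)


def refresh_timestamp(now, period, secondary_period=None):
    """Mirror (now() - $secondaryPeriod).floorTime($period)."""
    secondary = period if secondary_period is None else secondary_period
    shifted = now - secondary
    return shifted - ((shifted - EPOCH) % period)


# Hypothetical day of data: one sample per hour, at half past.
day = datetime(2024, 1, 8)
samples = [day + timedelta(hours=h, minutes=30) for h in range(24)]


def cached_samples(now):
    """Return the refresh timestamp and the samples at or before it."""
    cutoff = refresh_timestamp(now, HOUR)
    return cutoff, [s for s in samples if s <= cutoff]


for now in (day.replace(hour=14, minute=45), day.replace(hour=15, minute=5)):
    cutoff, cached = cached_samples(now)
    print(now.time(), "refresh timestamp:", cutoff.time(), "last cached sample:", cached[-1].time())
```

At 2:45pm the last cacheable sample is 12:30pm, and at 3:05pm it is 1:30pm, matching the cache boundaries of 1:00pm and 2:00pm described above.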
Example 2
To understand these values, we'll use an example signal that is defined as the trailing 7 day average of a source temperature signal, computed once per day.
Trend the signal when "now" is Monday at 2pm. This will have a sample every day at midnight, including
A sample at midnight yesterday that represents a fully known 7 day average of Sunday-Sunday.
A forecast sample at midnight tonight that represents the average from last Monday through 2pm today.
Refresh the screen at 3pm. Now the samples behave as follows
The sample at midnight yesterday is fully in the past, so it is pulled from the cache.
The sample at midnight tonight must be recomputed using all the data from last Monday through 3pm today.
This makes it clear that there is no utility in computing the new results until after midnight tonight. This implies that the input period (the $period argument) should be set to 1 day.
By default, the secondary period is the same as the input period. Given the refresh timestamp formula, this means the latest sample is between 1 and 2 periods behind now. When the input is an aggregate (or other function that reduces the density of the data), this may be too pessimistic.
Continuing to refine the scenario, let's assume that the source data logs a temperature reading once an hour at half past.
Trend the signal when "now" is Monday at 11:45pm. The refresh timestamp is still Sunday midnight and no attempt is made to compute the upcoming midnight sample.
Refresh the screen at 12:15am. Now the refresh timestamp is 12:00am Tuesday, but the last source sample is 11:30. We don't have the next temperature sample to know if the temperature is rising or falling during the last 30 minutes of the day, so we can't cache that final weekly average sample yet.
For this scenario, you should set the secondaryPeriod to 1 hour. The effect is that the last sample of the result is between 1 and 25 hours behind now, a substantial improvement over the default of 24 to 48 hours.
The final formula is
$signal.aggregate(average(), periods(7d, 1d), endKey())
.setRefreshRate(1d, 1h)
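To check the lag figures quoted in this example, the sketch below sweeps "now" across two days in one minute steps and measures how far the refresh timestamp trails it. The function names, the start date, and the 1970-01-01 alignment origin are assumptions for illustration; the sketch models only the refresh timestamp arithmetic, not Seeq's caching.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for floorTime()
DAY = timedelta(days=1)
HOUR = timedelta(hours=1)


def refresh_timestamp(now, period, secondary_period=None):
    """Mirror (now() - $secondaryPeriod).floorTime($period)."""
    secondary = period if secondary_period is None else secondary_period
    shifted = now - secondary
    return shifted - ((shifted - EPOCH) % period)


def lag_range(period, secondary_period):
    """Smallest and largest gap between now and the refresh timestamp."""
    start = datetime(2024, 1, 8)
    lags = []
    for minute in range(2 * 24 * 60):
        now = start + timedelta(minutes=minute)
        lags.append(now - refresh_timestamp(now, period, secondary_period))
    return min(lags), max(lags)


print("setRefreshRate(1d):    ", lag_range(DAY, DAY))
print("setRefreshRate(1d, 1h):", lag_range(DAY, HOUR))
```

With the default secondary period the lag runs from 24 hours to just under 48 hours. With a 1 hour secondary period it runs from 1 hour to just under 25 hours.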