GroveStreams

FAQ

Frequently Asked Questions

What is GroveStreams?
Who would use GroveStreams?
What is meant by high performance and near real-time?
Do I need to buy or lease servers?
How am I billed?
Is GroveStreams secure?
How does data get into GroveStreams?
Can I control my remote device using GroveStreams?
What are Components and Streams?
What's a Derived Stream?
What's a Component Template?
What is a GroveStreams Organization?
How often can a Feed be uploaded?
Do you support automatic deleting of data?
What are Cycles and Rollup Calendars?
Do you have Time Zone support?
Do you support monitoring and tracking mobile device geo locations?
How are Time Filters used?
How much functionality is exposed in the GroveStreams' HTTP REST and MQTT APIs?
Can GroveStreams be Rebranded, OEM'd, or White Labeled?

Answers

What is GroveStreams?

GroveStreams is a cutting-edge data streaming platform in the cloud providing decision making capabilities to many users and devices as data arrives from many sources.

The number of devices that generate data grows every day, and traditional systems cannot effectively capture, analyze and react to the amount of data these devices generate in a timely manner. GroveStreams has been built from the ground up using the latest technologies that power today's largest Internet companies such as Google, Yahoo and Facebook. It brings the scalability and reliability that these large websites offer today to your business or organization.

GroveStreams' data streaming analytics is designed to meet this demand so that your business or organization can quickly react to changes as they're happening. GroveStreams isn't just built to allow you to react to data; it's also built to allow your devices to react accordingly by using the GroveStreams platform-independent open API.

GroveStreams is an open platform, in the cloud, that any organization, user or device can take advantage of.

GroveStreams specializes in:

Capturing, Analyzing and Acting on Large Amounts of Time Series Data Streams and Data Points. GroveStreams can manage large numbers of data streams for each organization. Each stream can store over 60 million data points (or samples). This is roughly equivalent to two years of data for a one-second interval stream!
The following categories:
- "Real-time Business Intelligence", "Operational Intelligence"
- "Internet of Things", "Device Cloud", "M2M or Machine to Machine", "Sensor Network", "Smart Grid"
- "Data Logging"
Sample times accurate to the millisecond
Many data types supported (short, integer, long, float, double, text, datetime, boolean, geo coordinates, files, etc.)
File Streams. Upload, store, retrieve, and view any file type (.wav, .mpg, .pdf, etc.) within a stream.

Actionable Analytics:

User defined roll-ups (e.g., aggregate 1 second data into 5 minutes into hour into day into month into quarter into year and so on)
- Real-time Roll-up calculations (Sum, Time Weighted Avg, Min, Max, Min Occurrence (to the second), Max Occurrence, First, Last)
- Utility/energy time of use billing determinant extraction with support for Holidays
- Interval gap detection: monitor the quality of your data as it arrives
Derived streams: Streams can be derived from:
- Formula expressions involving other streams across your organization
- Internal or external RSS feeds
- The aggregation of a large number of other streams

GS SQL: A high performance query language based on SQL with time-series extensions:

SELECT Sample( Range(sd='2021-03-01T00:00:00-05:00', ed='2021-04-01T00:00:00-05:00') ) FROM Stream	Return fixed range using ISO8601
SELECT Sample( Range(sd=-1d, ed=now) ) FROM Stream	Return last day from current datetime
SELECT Sample( Range(currentCycle='month') ) FROM Stream	Return current month
Select Sample( cycleId='day', stat='max' ) as maximum, FormatDate(Sample( cycleId='day', stat='MAXOCCURRENCE' )) as 'maximum time' From Stream	Return 12 month intervals each with the maximum sample value for each month and the datetime it occurred.
SELECT formatDate(time) AS 'time', sample( range(last=100) ) FROM Stream WHERE cid='generator22' && id='kw' && sample > 100	Return the last 100 samples and times for a specific stream with values greater than 100.
SELECT formatDate(time) AS 'time', sample( timeFilterId='spring', range(previousCycle='year') ) FROM Stream WHERE cid='generator22' UNION ALL SELECT formatDate(time) AS 'time', sample( timeFilterId='summer', range(previousCycle='year') ) FROM Stream WHERE cid='generator22' ORDER BY time	Return the Spring and Summer samples for the previous year; ordered by time.

AI-Powered Forecasting: Choose from the latest models to get the most accurate forecasts: TFTModel (Temporal Fusion Transformer), NBEATSModel, ARIMA, Prophet, TCNModel (Temporal Convolutional Network), TransformerModel, ExponentialSmoothing, RNNModel (Recurrent Neural Network)
Correlation Detection: Our correlation detector is invaluable for selecting covariates, identifying leading indicators, or exploring data relationships.

Customizable drag-and-drop HTML dashboarding
Automatic Registration. Components and streams can register themselves automatically and appear in existing dashboards and aggregation analytics as they upload their initial feed data.
RSS Feeds: Components and Streams provide RSS feeds. View RSS feeds within Dashboards.
Event Monitoring with customizable templates for email, SMS, MMS, and HTTP call notifications
Component location tracking with Mapbox integration
User Role based access security
Public/private web UI settings. Make your organization accessible to only your users or allow anonymous Guest users with the ability to set Guest access rights.
HTTP REST and MQTT APIs. Almost all functionality is exposed in GS' APIs.
Fine-grained API access security
Time Filtering (Time of Use filtering)
Time zone support
Custom units and formatting
Browser based User Interface. All HTML with no plug-ins such as Flash. 100% thin client.
Compatible with popular reporting tools (MS Power BI Report Builder, SAP, IBM Cognos, Tableau) via our OData reporting connector.
Mobile (smart phone) dashboards

Who would use GroveStreams?

Anyone or any device that needs to collect large amounts of time series data, monitor it, analyze it and react to it or other devices' data quickly. Whether you want to monitor one data stream from a single source or many more streams from many sources, GroveStreams is available to you and others including:

Open-source electronic enthusiasts (such as Arduino users)
Sensor/device driven organizations or businesses
Governments
Businesses
Utilities
Healthcare providers
GroveStreams (We use GroveStreams to monitor our own servers)
And many others ...

Use GroveStreams when you:

Need near real-time:
- Environmental monitoring:
  - "Notify me when the temperature of my freezer is higher than 25 °F during weekend off-hours"
  - "Notify my clients if a commodity price increases by 10% over one day"
  - "Record oil production from thousands of wells - notify me of any well problems"
  - "Notify my doctor if my wearable biometric devices detect anomalies"
- Complex Billing calculations:
  - "What is my ongoing energy cost for the last minute and the current month for my time of use block rate?"
Need to collect and store large amounts of data at small sample sizes and interval sizes
Need to know if the data is reliable
Need to monitor a mobile component's location
- "Take action when one of our cargo ships is within 50 miles of its home port"
- "Notify me if two cargo ships come within two miles of each other"
Need to monitor rolled-up data statistics as data arrives
- For example, you may have a meter or sensor uploading one second interval data every 10 seconds but you only want to monitor your daily peak. You can create a roll-up calendar and associate it with that stream so that one second intervals roll up to 5 minute data, which rolls up to hour data, then day data, then month data and so on as data arrives. You can then dashboard your daily cycle and watch it change as one second data is uploaded into GroveStreams.
Have devices that you want to sell to customers but don't have the time, money or expertise to develop the software to manage the data generated by them.

What is meant by high performance and near real-time?

GroveStreams can run on hundreds of servers that utilize partitioning and parallel processing. Most transactions happen within milliseconds, including uploading and downloading tens of thousands of metrics. For example, a GroveStreams request to download one day of one-second samples (86,400 values) usually takes under 1 second.

The GroveStreams API supports the batching of samples across streams and components during uploading or downloading. This can allow for many streams and their samples to be uploaded with sub-second response times.

Long running tasks are managed by the GroveStreams job framework. An example of a long running task is a component reconcile job. If a user changes a stream's interval size or its data type and the stream has several years of one second data, it could take some time updating all this data, so a job is kicked off and runs asynchronously. Feeds can still be uploaded/downloaded while jobs are running. Note that some jobs, such as the reconcile job, require that a feed only append data while a job is running.

Component locations, roll-up calculations, gap filling, constraints, event detection and notifications occur as feeds are uploaded.

Do I need to buy or lease servers?

No. Data from one or more devices can be uploaded directly into GroveStreams as long as one of your devices has Internet access. The option to lease an entire private cluster is available to large customers.

How am I billed?

GroveStreams is free for small users. Larger users will be billed monthly via a registered credit card. Enterprise customers may request monthly invoicing.

All organizations, including organizations that fall within the free limits, can include an unlimited number of users. Visit our pricing page and the pricing FAQ page for more information.

Is GroveStreams secure?

Yes. Our data center is SOC2 certified. The GroveStreams website is password protected and allows for all requests to occur via SSL. Strong passwords that expire can be configured, along with two-factor authentication (2FA).

Our HTTP API uses access keys.
Our MQTT API uses X.509 certificates.

Both APIs can have specific rights assigned for resource types and/or specific resource instances. See the Developers page for more details about API keys.

How does data get into GroveStreams?

Data is uploaded into GroveStreams several ways:

Via the GroveStreams HTTP REST and MQTT APIs.
Via User Defined Connectors.
Via RSS feeds. Data streams can be derived from internal or external RSS feeds. For example, users can derive a stream from a weather website to automatically import weather data.
Via import files. The GroveStreams file import framework is very powerful and fast. You can upload very large files and you can upload thousands of files for each import. GroveStreams currently supports its proprietary json feed file format and a customizable text delimited file format. Contact us if you would like us to support a popular or custom file format - we will work with you to get your data into GroveStreams. Imports can be scheduled from within GroveStreams to run periodically.
Manually via the GroveStreams website.

MQTT API monitoring and testing can be done within Observation Studio.

Can I control my remote device using GroveStreams?

Yes. The easiest and most secure method is with our MQTT servers and X.509 certificates.

A secondary HTTP technique is to return a command or configuration signal back to the device every time the device sends data to the platform. This technique allows commands to travel through a firewall without having to configure your firewall.

What are Components and Streams?

A Stream represents a collection of one or more data points. Each data point can be associated with a single time stamp or a time range. Timestamps are accurate to the millisecond.

There are three types of streams:

Regular Streams: Each data point is associated with a datetime. Datetimes can be random or arrive at fixed intervals. There can be tens of millions of data points for each stream. Regular streams are the most common stream type.
Interval Streams: Each data point is associated with a start datetime (inclusive) and an end datetime (exclusive). There can be 10s of millions of interval data points for each stream.
Point Streams: There is only one data point. This value can change frequently or infrequently, but only one data point is stored.

Stream data types can be numbers, dates and times, text, booleans (yes,no), longitude, latitude or elevation. Each Stream can have time filters, constraints, gap filling methods, a unit and other items applied to its feed data as it is uploaded. Stream data can be uploaded via the GroveStreams' API or a stream can be defined as a derived stream. It is not required for a component to upload all of its streams at the same time with the exception of mobile components. Mobile components are required to upload their longitude and latitude streams at the same time.

A component is a "thing" that is also a container for a group of streams and events that share a similar location. For example, a component might be a sensor that monitors temperature and humidity. Temperature would be one stream and humidity would be another stream. If the sensor were mobile, its component could also have latitude and longitude streams that record the location of the sensor component over time.

A component is an abstract concept that represents physical "things" or non-physical "things". Examples of components:

Physical "things" :
- Utility Meters
- Smart Plugs
- Sensors (Temperature, Accelerometer, etc.)
- Motion Detectors
- Cardiac Telemetry Devices
- Cellular Phones
- Vehicles, Planes, Boats
Nonphysical "things"
- A computer program that tracks computer memory and CPU in a server farm
- A computer program that polls a relational database for information
- A computer program that uploads frequently changing financial information such as sales figures, expenses, stock quotes, etc.

What's a Derived Stream?

Streams can be configured to be derived from RSS feeds, from the aggregation of other streams, or from expressions.

From RSS Feed: Streams can be derived from internal or external RSS feeds. GroveStreams allows for a simple filter to be applied to RSS feed results so that certain values can be extracted from a feed.

RSS feeds are a popular way of bringing external data into GroveStreams for mashing up with other streams. Many RSS feeds are available on the Internet including:

Weather data
Stock quotes
Currency exchange rates
External, 3rd party sensor data

GroveStreams updates RSS derived streams hourly.

From Aggregation:
Streams can be derived by aggregating anywhere from two to thousands of other streams. Calculate statistics such as the maximums, minimums, sums, averages and gap counts for each interval in the aggregation time range. Because aggregation can be a lengthy process, it is started manually or scheduled to run periodically.
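
The per-interval statistics described above can be illustrated with a short Python sketch. This is not GroveStreams code: the function name aggregate_intervals and the use of None to stand for a gap are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional, Sequence


def aggregate_intervals(streams: Sequence[Sequence[Optional[float]]]) -> List[Dict]:
    """Aggregate aligned interval streams, producing one result per interval.

    Each inner sequence is one stream. None marks a gap (an assumption of
    this sketch, not a GroveStreams convention).
    """
    results = []
    for values in zip(*streams):  # one tuple of values per interval
        present = [v for v in values if v is not None]
        results.append({
            "max": max(present) if present else None,
            "min": min(present) if present else None,
            "sum": sum(present) if present else None,
            "avg": sum(present) / len(present) if present else None,
            "gap_count": len(values) - len(present),
        })
    return results


# Three hypothetical meter streams, three intervals each.
meters = [
    [1.0, 2.0, None],
    [3.0, None, None],
    [5.0, 4.0, None],
]
stats = aggregate_intervals(meters)
```

Here stats holds one dictionary per interval, so the third entry reports three gaps and no statistics.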

From Expression:
A derived expression stream is calculated from any other component streams within an organization. Calculations can occur as data arrives from API calls, or they can occur periodically, usually every one to three minutes, for large batch calculations. Derivation only happens when dependents have data available for the time period being calculated. Different stream roll-up cycles and interval offsets can be used inside a derivation expression.

Derived streams are just like other streams except that their feed data is derived by the GroveStreams' derivation engine. They can be graphed and monitored in dashboards and such. Derived streams can be derived from other derived streams.

Expression variables are streams. If a variable is an interval stream, then the user can choose the interval offset for that variable. For example, if an hourly interval with a time span of 2:00 pm to 3:00 pm is being calculated, a dependent variable's offset can be set to -1 and that variable's data for time span 1:00 pm to 2:00 pm will be used in the calculation. Offsets are useful for things like calculating rolling averages or comparing monthly costs.

Expressions can contain any stream data types, conditional "if" operators (<=, >=, <, >, !=, ==, &&, ||) and many functions (e.g., trig functions, log functions, exponents, absolute value, random number and many more).

Example of a derived stream expression that calculates one second 3 point rolling averages from component1.stream1:

Variables:

Variable	Offset	Stream	Cycle	Cycle Function
n_minus_0	0	component1.stream1	Second	-
n_minus_1	-1	component1.stream1	Second	-
n_minus_2	-2	component1.stream1	Second	-

Expression:
(n_minus_0 + n_minus_1 + n_minus_2) / 3
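
To make the offsets concrete, here is a minimal Python sketch of the same three-point rolling average. It only mimics the idea outside of GroveStreams: the function name rolling_average_3 and the use of None for a gap are assumptions, and an interval is left as a gap whenever any of its three dependents has no data.

```python
from typing import List, Optional


def rolling_average_3(samples: List[Optional[float]]) -> List[Optional[float]]:
    """Three-point rolling average using offsets 0, -1 and -2.

    None marks a gap. A value is produced only when all three dependent
    intervals have data; otherwise the result is a gap.
    """
    out: List[Optional[float]] = []
    for i in range(len(samples)):
        if i < 2:
            out.append(None)  # offsets -1 and -2 fall before the data starts
            continue
        n_minus_0, n_minus_1, n_minus_2 = samples[i], samples[i - 1], samples[i - 2]
        if None in (n_minus_0, n_minus_1, n_minus_2):
            out.append(None)
        else:
            out.append((n_minus_0 + n_minus_1 + n_minus_2) / 3)
    return out


averages = rolling_average_3([3.0, 6.0, 9.0, 12.0, None, 15.0])
# averages == [None, None, 6.0, 9.0, None, None]
```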

Other usages of Derived Streams:

A Stream that filters out the Winter on Peak kilowatt usage
A Stream that converts Fahrenheit to Celsius
A Stream that calculates summer weekend afternoon peak temperatures
A Stream that sums all the energy for all plant meters (aggregates a collection of component streams)
A Stream that calculates energy cost every minute, hour, day, month and year
A Stream that measures the distance between a component and another component every second
A Stream that applies a diversity factor

Derived stream variables can be different data types. A stream defined as text can be derived from the sum of two other streams of types long and double. GroveStreams will attempt to do the data type conversion wherever it is possible. If the conversion is not possible, the resulting interval will be considered a Gap or NULL.

As you can see, derived streams are a very powerful tool! You can view derived stream screen snapshots here.

What's a Component Template?

Large organizations typically have a large number of devices that are identical (such as utility electric meters). It makes no sense to manually create a new component instance every time a device comes on-line.

A component template is used to simplify creating multiple component instances that only differ by a few attributes and stream feed data. When a component uses the feed API to upload data it can pass in a component template Id. A component instance will be created automatically within the organization, if a component does not already exist for that feed (also known as device automatic registration).

One or more components created from a component template can remain linked to that template. Any changes made to a template can be applied to all components linked to it. This makes mass modeling changes easy to do without any required programming.

What is a GroveStreams Organization?

An organization is a workspace, typically representing a home, business, or organization. Each organization has its own set of components, streams, dashboards, maps, and other items. An organization allows you to control user and device access to the organization. You are automatically the "owner" and given full access rights when you create an organization. Other users may invite you to their organizations with rights they give you. All of the organizations you "own" or are a member of will appear in your GroveStreams start page (the first page that appears when you sign in).

GroveStreams data is stored hierarchically in this manner:

Organizations
        Components
                Streams
        Component Templates
        Maps
        Dashboards

How often can a Feed be uploaded?

Multiple streams can be uploaded in batch from each public facing IP source every ten seconds or longer. Note that the stream may sample every second and upload those 10 samples every 10 seconds. See the HTTP API Limits and MQTT Limits pages for more details.
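
The sampling-versus-upload pattern described above can be sketched in Python as a simple batching step. Nothing here calls the GroveStreams API: the Sample type, the batch_samples helper and the example timestamps are all hypothetical, and the actual upload call is intentionally left out.

```python
from typing import Iterator, List, Tuple

Sample = Tuple[int, float]  # (UTC epoch milliseconds, value)


def batch_samples(samples: List[Sample], batch_size: int = 10) -> Iterator[List[Sample]]:
    """Yield consecutive batches, each intended for a single upload."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


# 25 one-second samples starting at an arbitrary epoch time.
readings = [(1_700_000_000_000 + i * 1000, 20.0 + i) for i in range(25)]
batches = list(batch_samples(readings))
# Two full batches of 10 samples, then a final batch of 5.
```

A device following this pattern would send one batch per upload, waiting at least ten seconds between uploads.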

Do you support automatic deleting of data?

Yes. Some users may wish to retain stream data for a short period of time. A "delete profile" can be created and associated with each stream. Each profile specifies a user defined time span to retain stream data. A couple of times a day, the GroveStreams delete profile system job will run and delete stream data according to its delete profile. GroveStreams is not a round-robin database (RRD). Data is retained as long as you need it.

What are Cycles and Rollup Calendars?

Cycle
Each interval stream's data time range is specified by the base cycle associated with a stream. A cycle is a user defined recurring period. A cycle can be defined to be fixed or custom:

Fixed size: User defines the start datetime and time zone and then selects "X" number of Seconds, Minutes, Hours, Days, Weeks, Months or Years
Custom: User defines a collection of start and end datetimes for each interval and a time zone

Examples of Fixed Cycles:

Second: 1 Second, starting at December 31, 2011 12:00:00 am
Quarterly: 3 Month, starting at January 15, 2012 12:00:00 pm

Example of Custom Cycle:

Seasonal: User defined Spring, Summer, Fall, Winter seasons

The time zone is used for the start datetime for fixed cycles and for the start and end datetimes for custom cycles. The time zone can be set to use the component's time zone so that the same cycle can be used across time zones.

Rollup Calendar
A rollup calendar defines how stream data can be rolled-up for viewing and analysis. Rollup calendars are optional.

If a stream has a base cycle set for one second intervals, it quickly becomes cumbersome to view and analyze a year of data, which is 31,536,000 intervals. For the "float" data type, which is 4 bytes, a year's worth of data would be about 120 Megabytes. Instead of downloading 31 million intervals and sifting through them, a user is better off downloading a set of rolled-up stream data at a larger interval size and then drilling in on the interesting cycles. A good example of a graph that demonstrates the power of roll-ups is the Google Finance graph. Unfortunately, unlike GroveStreams, Google hard-coded their roll-up cycles (1d, 5d, 1m, 6m, 1y, 5y).
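
The figures above can be checked with a few lines of Python. The only assumptions are a non-leap year and reading "Megabytes" as units of 1,048,576 bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60
intervals_per_year = 365 * SECONDS_PER_DAY   # non-leap year
bytes_per_float = 4

total_bytes = intervals_per_year * bytes_per_float
mebibytes = total_bytes / (1024 * 1024)

print(intervals_per_year)    # 31536000
print(total_bytes)           # 126144000
print(round(mebibytes, 1))   # 120.3
```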

Also, you may want to monitor your energy usage in 1 second intervals so as to watch for certain events, but you are probably billed based on monthly kW and/or kWh. A stream with a rollup calendar would be an ideal solution. You can view/monitor 1 second intervals and you can also view/monitor your one hour and one month cycle intervals too, at the same time. You'll be able to watch your one month cycle interval change nearly every second... actually, every 10 seconds since you can only upload every 10 seconds :(

A rollup calendar is defined by simply referencing a collection of cycles, described above. When the rollup calendar is saved, its cycles are sorted and validated to ensure there are no gaps. A stream's feed can only be uploaded to the base cycle. A stream's non-base cycle data can be used (for graphing and such) just like its base cycle data.

Example
Let's define a rollup calendar that rolls one second data into several other cycles up to a year. The rollup calendar will reference these cycles, which were defined earlier:

1 Second Cycle
5 Minute Cycle
1 Hour Cycle
1 Day Cycle
1 Month Cycle
1 Year Cycle

After the rollup calendar is defined and associated with a stream, the stream's data for each rolled-up cycle is available as base cycle data is uploaded. When the stream uploads 10 intervals of data (1 second each), all the rollup data is immediately available via the API and browser user interface. Rollup cycle data can be requested with these functions:

First : Returns the first interval value of the underlying cycle
Last : Returns the last interval value of the underlying cycle
Min : Returns the smallest interval value of the underlying cycle
Max : Returns the largest interval value of the underlying cycle
Avg : Returns the time weighted average of the underlying cycle
Sum : Returns the sum of the intervals of the underlying cycle
Min Occurrence : Returns the start datetime the minimum occurred for the base cycle, accurate to the second
Max Occurrence : Returns the start datetime the maximum occurred for the base cycle, accurate to the second
SampleCount : Returns the number of samples that occurred for this cycle's range.
GapCount : Returns the number of base cycle interval gaps that occurred for this cycle's range. For example, if the base cycle is 1 Second, the requested rollup cycle is 1 Year, and the request datetime range is Jan 1, 2010 12:00 am to Jan 1, 2011 12:00 am, then a single value representing how many 1 second intervals are Gaps is returned.
NonGapCount : Like GapCount, but returns the number of base cycle intervals that are not gaps for the requested datetime range.
IntvlCount : Returns the number of base cycle intervals whether they are a gap or not
MilliSecCount : Returns the number of milliseconds for the requested datetime range
NonGapMilliSecCount : Returns the number of nonGap milliseconds for the requested datetime range

Gap intervals are ignored for some of the above functions (First, Last, Min, Max, Avg, Min Occurrence, Max Occurrence) for interval streams.
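
As a rough illustration of several of the functions above, the following Python sketch rolls fixed-size base intervals up into a larger cycle. It is not the GroveStreams rollup engine: the roll_up function is hypothetical, None stands for a gap, Sum is assumed to add only the non-gap values, and because all base intervals have the same length the time weighted average reduces to a plain mean.

```python
from typing import Dict, List, Optional


def roll_up(base: List[Optional[float]], factor: int) -> List[Dict]:
    """Roll fixed-size base intervals up into cycles of `factor` intervals.

    None marks a gap. Gaps are skipped by first/last/min/max/avg, and this
    sketch assumes sum also skips them.
    """
    cycles = []
    for start in range(0, len(base), factor):
        chunk = base[start:start + factor]
        values = [v for v in chunk if v is not None]
        cycles.append({
            "first": values[0] if values else None,
            "last": values[-1] if values else None,
            "min": min(values) if values else None,
            "max": max(values) if values else None,
            "avg": sum(values) / len(values) if values else None,
            "sum": sum(values) if values else None,
            "gap_count": len(chunk) - len(values),
            "non_gap_count": len(values),
            "intvl_count": len(chunk),
        })
    return cycles


# Ten one-second intervals rolled up into two five-second cycles.
one_second = [1.0, 4.0, None, 3.0, 2.0, 5.0, 5.0, 5.0, None, None]
five_second = roll_up(one_second, factor=5)
```

The first five-second cycle has an average of 2.5 with one gap, and the second has an average of 5.0 with two gaps.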

Only base cycle information can be uploaded via interval stream feeds. The other feeds are calculated and therefore cannot be uploaded. Rollup feeds reflect any changes to base cycle data during any time period.

Note that an interval stream's base Cycle does not have to exist within the rollup calendar it is referencing. It just needs to be able to "fit" evenly into one of the cycles and it will be rolled-up from the one that it fits into and above. For the example above, a stream might have a base cycle of "30 Minute". It would upload a value for each 30 minute interval and that value would start rolling up to 1 Hour then to 1 Day and so on up to a Year.

What does "fit" evenly mean? A cycle defined as "7 Seconds" will not fit evenly into a rollup cycle defined as "8 Seconds" or "1 Minute" but will fit evenly into a cycle defined as "42 Seconds".
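
For fixed-size cycles expressed in seconds, the "fits evenly" rule amounts to a divisibility check. The helper below is a hypothetical Python illustration, not part of GroveStreams, and it does not attempt to cover calendar-based cycles such as months or custom cycles.

```python
def fits_evenly(base_seconds: int, rollup_seconds: int) -> bool:
    """Return True when a whole number of base intervals fills the rollup cycle."""
    return rollup_seconds % base_seconds == 0


print(fits_evenly(7, 8))        # False
print(fits_evenly(7, 60))       # False (1 Minute)
print(fits_evenly(7, 42))       # True
print(fits_evenly(1800, 3600))  # True (30 Minute into 1 Hour)
```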

Do you have Time Zone support?

Yes. Our time zone support is very flexible. All components must reference a Time Zone. All cycles reference a time zone or point to a component's time zone.

Rollup calculations use the time zone and cycle reference date to determine the actual UTC time to associate with the rollup interval times. When data is requested via the API it is always requested using UTC datetime epoch milliseconds.

The ability to fix a time zone to a cycle or to a component allows for rollups to occur in a specific time zone (such as the corporate headquarters). For example, an oil company may want to monitor daily oil production from each well. Their definition of a day might be the corporate headquarters time zone day. They would set their time zone in each rollup cycle, forcing each well's daily oil production to be calculated using the corporate headquarters definition of a day. If the same oil company wanted to track production in each well's time zone then they would just point each rollup cycle time zone to the component's time zone.
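
The effect of choosing one time zone over another can be seen by computing where a "day" starts and ends in UTC epoch milliseconds. The Python sketch below is only an illustration: day_bounds_utc_ms is a hypothetical helper, the two offsets are invented, and fixed UTC offsets stand in for real time zones, so daylight saving changes are not modeled.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Tuple


def day_bounds_utc_ms(day: date, utc_offset_hours: int) -> Tuple[int, int]:
    """Return the [start, end) of a local calendar day as UTC epoch milliseconds."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    start = datetime(day.year, day.month, day.day, tzinfo=tz)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


hq_day = day_bounds_utc_ms(date(2021, 3, 1), -6)   # hypothetical headquarters offset
well_day = day_bounds_utc_ms(date(2021, 3, 1), 3)  # hypothetical well offset

# The same calendar day starts nine hours apart in UTC.
hours_apart = (hq_day[0] - well_day[0]) / 3_600_000
```

A daily rollup computed with the headquarters offset therefore covers a different set of samples than one computed with the well's own offset.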

Do you support monitoring and tracking mobile device geo locations?

Yes. A component can have fixed geo coordinate attributes, or it can have a stream for each geo coordinate. GroveStreams allows for storing, retrieving, deriving and rolling up geo coordinates; basically, you can do anything with a longitude or latitude stream that you can do with other streams. Geo coordinates are stored as double value types. Geo streams can be regular, interval, or point streams.

A component's current location, along with any of its active events, can be viewed in a map.
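
Distance checks between two components' latitude and longitude values, such as "within two miles of each other", can be sketched with the haversine formula. The Python below is a generic illustration and not how GroveStreams computes distance; the function names and the Earth radius constant are assumptions of this sketch.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, assumed for this sketch


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within(lat1: float, lon1: float, lat2: float, lon2: float, miles: float) -> bool:
    """Return True when the two points are no more than `miles` apart."""
    return haversine_miles(lat1, lon1, lat2, lon2) <= miles


# Two hypothetical ship positions 0.01 degrees of latitude apart (under one mile).
ships_are_close = within(0.0, 0.0, 0.01, 0.0, miles=2)
```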

How are Time Filters used?

Time Filters allow for the ignoring of samples that occur during specified time spans (seasons) and time intervals for each day of the week.

Time Filters are useful for things such as extracting energy time-of-use billing determinants or detecting if a door was opened on a certain day and at a certain time.

Time filters are applied to stream feeds as they are uploaded or derived. Sample times or interval ranges not included in the filter are converted to Gaps (or null values) for interval streams and ignored for regular streams. This is advantageous because:

It improves performance: most Gaps are not stored, so they do not need to be retrieved
Gap values created by a Time Filter are ignored by the GroveStreams' Rollup-Calendar engine. For example, the request for a Cycle Gap Count will ignore intervals excluded by the Time Filter.
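
The different treatment of interval and regular streams can be sketched in Python. This is an illustration rather than GroveStreams' implementation: the on_peak filter (weekdays, 08:00 up to 18:00) and both helper functions are hypothetical, and None stands for a gap.

```python
from datetime import datetime
from typing import Callable, List, Optional, Tuple

Sample = Tuple[datetime, float]


def on_peak(ts: datetime) -> bool:
    """Hypothetical filter: weekdays from 08:00 up to, but not including, 18:00."""
    return ts.weekday() < 5 and 8 <= ts.hour < 18


def filter_interval_stream(
    samples: List[Sample], keep: Callable[[datetime], bool]
) -> List[Tuple[datetime, Optional[float]]]:
    """Interval streams: excluded intervals become gaps (None)."""
    return [(ts, value if keep(ts) else None) for ts, value in samples]


def filter_regular_stream(
    samples: List[Sample], keep: Callable[[datetime], bool]
) -> List[Sample]:
    """Regular streams: excluded samples are ignored."""
    return [(ts, value) for ts, value in samples if keep(ts)]


samples = [
    (datetime(2021, 3, 1, 7, 0), 1.0),  # Monday, before the window
    (datetime(2021, 3, 1, 9, 0), 2.0),  # Monday, inside the window
    (datetime(2021, 3, 6, 9, 0), 3.0),  # Saturday
]
interval_result = filter_interval_stream(samples, on_peak)
regular_result = filter_regular_stream(samples, on_peak)
```

The interval result keeps all three timestamps with two of them as gaps, while the regular result keeps only the Monday 09:00 sample.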

How much functionality is exposed in the GroveStreams' HTTP REST and MQTT APIs?

Almost all of it.

The entire web based user interface utilizes only the GroveStreams APIs for non-static page data. A popular way to become familiar with the GroveStreams HTTP API is to use the Google Chrome browser and hit F12 to monitor all HTTP calls from the browser to GroveStreams.com.

Feel free to design your own user interface using our API.

Can GroveStreams be Rebranded, OEM'd, or White Labeled?

Yes. GroveStreams can be rebranded as your own application.

See the GroveStreams Help Guide for more information about rebranding.