FAQ
Frequently Asked Questions
What is GroveStreams?
Who would use GroveStreams?
What is meant by high performance and near real-time?
Do I need to buy or lease servers?
How am I billed?
Is GroveStreams secure?
How does data get into GroveStreams?
Can I control my remote device using GroveStreams?
What are Components and Streams?
What's a Derived Stream?
What's a Component Template?
What is a GroveStreams Organization?
How often can a Feed be uploaded?
Do you support automatic deleting of data?
What are Cycles and Rollup Calendars?
Do you have Time Zone support?
Do you support monitoring and tracking mobile device geo locations?
How are Time Filters used?
How much functionality is exposed in the GroveStreams' RESTful API?
Can GroveStreams be Rebranded, OEM'd, or White Labeled?
Answers
What is GroveStreams?
GroveStreams is a cutting-edge data streaming platform in the cloud
providing decision making capabilities to many users and devices as
data arrives from many sources.
GroveStreams' data streaming analytics is designed to meet this demand so that your business or organization can quickly react to changes as they're happening. GroveStreams isn't just built to allow you to react to data; it's also built to allow your devices to react accordingly by using the GroveStreams platform-independent open API.
The number of devices that generate data increases every day, and traditional systems cannot effectively capture, analyze and react to the amount of data these devices are generating in a timely manner. GroveStreams has been built from the ground up using the latest technologies that power today's largest Internet companies such as Google, Yahoo and Facebook. It brings the scalability and reliability that these large websites offer today to your business or organization.
GroveStreams is an open platform, in the cloud, that any organization, user or device can take advantage of.
GroveStreams specializes in:
- Capturing, Analyzing and Acting on Large Amounts of Time Series Data Streams and Data Points. GroveStreams can manage large numbers of data streams for each organization. Each stream can store over 60 million data points (or samples). This is equivalent to nearly two years of data for a stream defined as a one second interval stream!
- The following categories:
- "Real-time Business Intelligence", "Operational Intelligence"
- "Internet of Things", "Device Cloud", "M2M or Machine to Machine", "Sensor Network", "Smart Grid"
- "Data Logging"
- Sample times accurate to the millisecond
- Many data types supported (short, integer, long, float, double, text, datetime, boolean, geo coordinates, files, etc.)
- File Streams. Upload, store, retrieve, and view any file type (.wav, .mpg, .pdf, etc.) within a stream.
- Actionable Analytics:
- User defined roll-ups (e.g., aggregate 1 second data into 5 minutes into hour into day into month into quarter into year and so on)
- Real-time Roll-up calculations (Sum, Time Weighted Avg, Min, Max, Min Occurrence (to the second), Max Occurrence, First, Last)
- Utility/energy time of use billing determinant extraction with support for Holidays
- Interval gap detection: monitor the quality of your data as it arrives
- Derived streams: Streams can be derived from:
- Formula expressions involving other streams across your organization
- Internal or external RSS feeds
- The aggregation of a large number of other streams
- GS SQL: A high performance query language based on SQL with time-series extensions:
SELECT Sample( Range(sd='2021-03-01T00:00:00-05:00', ed='2021-04-01T00:00:00-05:00') ) FROM Stream
Return fixed range using ISO8601
SELECT Sample( Range(sd=-1d, ed=now) ) FROM Stream
Return last day from current datetime
SELECT Sample( Range(currentCycle='month') ) FROM Stream
Return current month
Select Sample( cycleId='day', stat='max' ) as maximum, FormatDate(Sample( cycleId='day', stat='MAXOCCURRENCE' )) as 'maximum time' From Stream
Return 12 month intervals each with the maximum sample value for each month and the datetime it occurred.
SELECT formatDate(time) AS 'time', sample( range(last=100) ) FROM Stream WHERE cid='generator22' && id='kw' && sample > 100
Return the last 100 samples and times for a specific stream with values greater than 100.
SELECT formatDate(time) AS 'time', sample( timeFilterId='spring', range(previousCycle='year') ) FROM Stream WHERE cid='generator22' UNION ALL SELECT formatDate(time) AS 'time', sample( timeFilterId='summer', range(previousCycle='year') ) FROM Stream WHERE cid='generator22' ORDER BY time
Return the Spring and Summer samples for the previous year; ordered by time.
- Customizable drag and drop html dash-boarding
- Automatic Registration. Components and streams can register themselves automatically and appear in existing dashboards and aggregation analytics as they upload their initial feed data.
- RSS Feeds: Components and Streams provide RSS feeds. View RSS feeds within Dashboards.
- Event Monitoring with customizable templates for email, SMS, MMS, and HTTP call notifications
- Component location tracking with Mapbox integration
- User Role based access security
- Public/private web UI settings. Make your organization accessible to only your users or allow anonymous Guest users with the ability to set Guest access rights.
- HTTP REST and MQTT APIs. Almost all functionality is exposed in GS' APIs.
- Fine-grained API access security
- Time Filtering (Time of Use filtering)
- Time zone support
- Custom units and formatting
- Browser based User Interface. All HTML with no plug-ins such as Flash. 100% thin client.
- Mobile (smart phone) dashboards
Who would use GroveStreams?
Anyone or any device that needs to collect large amounts of
time series data, monitor it, analyze it and react to it or other
devices' data quickly. Whether you want to monitor one data stream
from a single source or many more streams from many sources,
GroveStreams is available to you and others including:
- Open-source electronic enthusiasts (such as Arduino users)
- Sensor/device driven organizations or businesses
- Governments
- Businesses
- Utilities
- Healthcare providers
- GroveStreams (We use GroveStreams to monitor our own servers)
- And many others ...
Use GroveStreams when you:
- Need near real-time:
- Environmental monitoring:
- "Notify me when the temperature of my freezer is higher than 25 °F during weekend off-hours"
- "Notify my clients if a commodity price increases by 10% over one day"
- "Record oil production from thousands of wells - notify me of any well problems"
- "Notify my doctor if my wearable biometric devices detect anomalies"
- Complex Billing calculations:
- "What is my ongoing energy cost for the last minute and the current month for my time of use block rate?"
- Need to collect and store large amounts of data at small sample sizes and interval sizes
- Need to know if the data is reliable
- Need to monitor a mobile component's location
- "Take action when one of our cargo ships is within 50 miles of its home port"
- "Notify me if two cargo ships come within two miles of each other"
- Need to monitor rolled-up data statistics as data arrives
- For example, you may have a meter or sensor uploading one second interval data every 10 seconds but you only want to monitor your daily peak. You can create a roll-up calendar and associate it with that stream so that one second intervals roll up to 5 minute data, which rolls up to hour data, then day data, then month data and so on as data arrives. You can then dashboard your daily cycle and watch it change as one second data is uploaded into GroveStreams.
- Have devices that you want to sell to customers but don't have the time, money or expertise to develop the software to manage the data generated by them.
What is meant by high performance and near real-time?
GroveStreams can run on hundreds of servers that utilize partitioning and parallel processing. Most transactions complete within milliseconds, including uploading and downloading tens of thousands of metrics. For example, a GroveStreams request to download one day of one-second samples (86,400 values) usually takes under 1 second.
The GroveStreams API supports the batching of samples across streams and components during uploading or downloading. This can allow for many streams and their samples to be uploaded with sub-second response times.
Long running tasks are managed by the GroveStreams job framework. An example of a long running task is a component reconcile job. If a user changes a stream's interval size or its data type and the stream has several years of one second data, it could take some time updating all this data, so a job is kicked off and runs asynchronously. Feeds can still be uploaded/downloaded while jobs are running. Note that some jobs, such as the reconcile job, require that a feed only append data while a job is running.
Component locations, roll-up calculations, gap filling, constraints, event detection and notifications occur as feeds are uploaded.
Do I need to buy or lease servers?
No. Data from one or more devices can be uploaded directly into GroveStreams as long as one of your devices has Internet access. The option to lease an entire private cluster is available to large customers.
How am I billed?
GroveStreams is free for small users. Larger users will be billed monthly via a registered credit card. Enterprise customers may request monthly invoicing.
All organizations, including organizations that fall within the free limits, can include an unlimited number of users. Visit our pricing page and the pricing FAQ page for more information.
Is GroveStreams secure?
Yes. Our data center is SOC2 certified. The GroveStreams website is password protected and allows all requests to occur via SSL. Strong passwords that expire can be configured, along with two factor authentication (2FA).
Our HTTP API uses access keys.
Our MQTT API uses X.509 certificates.
Both APIs can have specific rights assigned for resource types and/or specific resource instances. See the Developers page
for more details about API keys.
How does data get into GroveStreams?
Data can be uploaded into GroveStreams in several ways:
- Via the GroveStreams HTTP REST and MQTT APIs.
- Via RSS feeds. Data streams can be derived from internal or external RSS feeds. For example, users can derive a stream from a weather website to automatically import weather data.
- Via import files. The GroveStreams file import framework is very powerful and fast. You can upload very large files and you can upload thousands of files for each import. GroveStreams currently supports its proprietary JSON feed file format and a customizable delimited text file format. Contact us if you would like us to support a popular or custom file format; we will work with you to get your data into GroveStreams. Imports can be scheduled from within GroveStreams to run periodically.
- Manually via the GroveStreams website.
MQTT API monitoring and testing can be done within Observation Studio.
Can I control my remote device using GroveStreams?
Yes. The easiest and most secure method is with our MQTT servers and X.509 certificates.
A secondary HTTP technique is to return a command or configuration signal
back to the device every time the device sends data to the platform. This technique allows
commands to travel through a firewall without having to configure your firewall.
What are Components and Streams?
A Stream represents a collection of one or
more data points. Each data point can be associated with a single
time stamp or a time range. Timestamps are accurate to the
millisecond.
There are three types of streams:
- Regular Streams: Each data point is associated with a datetime. Datetimes can be random or arrive at fixed intervals. There can be 10s of millions of data points for each stream. Regular streams are the most common stream type.
- Interval Streams: Each data point is associated with a start datetime (inclusive) and an end datetime (exclusive). There can be 10s of millions of interval data points for each stream.
- Point Streams: There is only one data point. This value can change frequently or infrequently, but only one data point is stored.
Stream data types can be numbers, dates and times, text, booleans (yes,no), longitude, latitude or elevation. Each Stream can have time filters, constraints, gap filling methods, a unit and other items applied to its feed data as it is uploaded. Stream data can be uploaded via the GroveStreams' API or a stream can be defined as a derived stream. It is not required for a component to upload all of its streams at the same time with the exception of mobile components. Mobile components are required to upload their longitude and latitude streams at the same time.
A component is a "thing" that is also a container for a group of streams and events that share a similar location. For example, a component might be a sensor that monitors temperature and humidity. Temperature would be one stream and humidity would be another stream. If the sensor were mobile, its component could also have latitude and longitude streams that record the location of the sensor component over time.
A component is an abstract concept that represents physical "things" or non-physical "things". Examples of components:
- Physical "things" :
- Utility Meters
- Smart Plugs
- Sensors (Temperature, Accelerometer, etc.)
- Motion Detectors
- Cardiac Telemetry Devices
- Cellular Phones
- Vehicles, Planes, Boats
- Nonphysical "things"
- A computer program that tracks computer memory and CPU in a server farm
- A computer program that polls a relational database for information
- A computer program that uploads frequently changing financial information such as sales figures, expenses, stock quotes, etc.
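The relationship between components and streams can be pictured with a small data model. The sketch below is only an illustration in Python; the class and field names (Component, Stream, StreamType, add_sample) are assumptions made for this example and are not part of the GroveStreams API.

```python
from dataclasses import dataclass, field
from enum import Enum


class StreamType(Enum):
    REGULAR = "regular"    # each data point has a single datetime
    INTERVAL = "interval"  # each data point has a start (inclusive) and end (exclusive)
    POINT = "point"        # only one data point is stored


@dataclass
class Stream:
    stream_id: str
    stream_type: StreamType
    unit: str = ""
    # (epoch_millis, value) pairs; hypothetical in-memory storage
    samples: list = field(default_factory=list)

    def add_sample(self, epoch_millis, value):
        if self.stream_type is StreamType.POINT:
            # A point stream keeps only its latest data point
            self.samples = [(epoch_millis, value)]
        else:
            self.samples.append((epoch_millis, value))


@dataclass
class Component:
    component_id: str
    time_zone: str
    streams: dict = field(default_factory=dict)

    def add_stream(self, stream):
        self.streams[stream.stream_id] = stream


# A sensor component with a temperature, a humidity and a status stream
sensor = Component("sensor1", "America/Chicago")
sensor.add_stream(Stream("temperature", StreamType.REGULAR, unit="°F"))
sensor.add_stream(Stream("humidity", StreamType.REGULAR, unit="%"))
sensor.add_stream(Stream("status", StreamType.POINT))

sensor.streams["temperature"].add_sample(1_700_000_000_000, 71.5)
sensor.streams["temperature"].add_sample(1_700_000_001_000, 71.6)
sensor.streams["status"].add_sample(1_700_000_000_000, "ok")
sensor.streams["status"].add_sample(1_700_000_001_000, "warning")
```

After these calls the temperature stream holds two samples, while the status point stream holds only the most recent one.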
What's a Derived Stream?
Streams can be configured to be derived from RSS feeds or
from expressions.
From RSS Feed: Streams can be derived from internal or external RSS feeds. GroveStreams allows for a simple filter to be applied to RSS feed results so that certain values can be extracted from a feed.
RSS feeds are a popular way of bringing external data into GroveStreams for mashing up with other streams. Many RSS feeds are available on the Internet including:
- Weather data
- Stock quotes
- Currency Exchange Rates
- External, 3rd party sensor data
GroveStreams updates RSS derived streams hourly.
From Aggregation:
Streams can be derived by aggregating from two to thousands of other streams. Statistics such as the maximums, minimums, sums, averages and gap counts are calculated for each interval in the aggregation time range. Because aggregation can be a lengthy process, it is either started manually or scheduled to run periodically.
From Expression:
A derived expression stream is calculated from any other component streams within an organization. Calculations can occur as data arrives from API calls, or they can occur periodically, usually every one to three minutes, for large batch calculations. Derivation only happens when dependents have data available for the time period being calculated. Different stream roll-up cycles and interval offsets can be used inside a derivation expression.
Derived streams are just like other streams except that their feed data is derived by the GroveStreams' derivation engine. They can be graphed and monitored in dashboards and such. Derived streams can be derived from other derived streams.
Expression variables are streams. If a variable is an interval stream, then the user can choose the interval offset for that variable. For example, if an hourly interval with a time span of 2:00 pm to 3:00 pm is being calculated, a dependent variable's offset can be set to -1 and that variable's data for time span 1:00 pm to 2:00 pm will be used in the calculation. Offsets are useful for things like calculating rolling averages or comparing monthly costs.
Expressions can contain any stream data types, conditional "if" operators (<=, >=, <, >, !=, ==, &&, ||) and many functions (e.g., trig functions, log functions, exponents, absolute value, random number and many more).
Example of a derived stream expression that calculates one second 3 point rolling averages from component1.stream1:
Variables:
Variable | Offset | Stream | Cycle | Cycle Function
n_minus_0 | 0 | component1.stream1 | Second | -
n_minus_1 | -1 | component1.stream1 | Second | -
n_minus_2 | -2 | component1.stream1 | Second | -
Expression:
(n_minus_0 + n_minus_1 + n_minus_2) / 3
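To make the offsets concrete, here is a minimal Python sketch of the same 3 point rolling average computed outside of GroveStreams. It assumes the stream is held as a plain list of one second values in which None marks a gap, and it treats an interval as a gap whenever any of the three offset values is unavailable. The function name and the sample data are made up for illustration; this is not the derivation engine itself.

```python
def rolling_average_3(samples):
    """Average each value with the two values before it (offsets 0, -1, -2)."""
    result = []
    for i in range(len(samples)):
        if i < 2:
            # Offsets -1 and -2 point before the start of the data
            result.append(None)
            continue
        window = samples[i - 2 : i + 1]
        if any(value is None for value in window):
            # A dependent has no data for this period, so nothing is derived
            result.append(None)
        else:
            result.append(sum(window) / 3)
    return result


stream1 = [3.0, 6.0, 9.0, None, 12.0, 15.0, 18.0]
averages = rolling_average_3(stream1)
print(averages)  # [None, None, 6.0, None, None, None, 15.0]
```

A single gap in the source removes three derived intervals here, because each derived value depends on three consecutive source values.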
Other usages of Derived Streams:
- A Stream that filters out the Winter on Peak kilowatt usage
- A Stream that converts Fahrenheit to Celsius
- A Stream that calculates summer weekend afternoon peak temperatures
- A Stream that sums all the energy for all plant meters (aggregates a collection of component streams)
- A Stream that calculates energy cost every minute, hour, day, month and year
- A Stream that measures the distance between a component and another component every second
- A Stream that applies a diversity factor
Derived stream variables can be different data types. A stream defined as text can be derived from the sum of two other streams of types long and double. GroveStreams will attempt to do the data type conversion wherever it is possible. If the conversion is not possible, the resulting interval will be considered a Gap or NULL.
As you can see, derived streams are a very powerful tool! You can view derived stream screen snapshots here.
What's a Component Template?
Large organizations typically have a large number of devices that are identical (such as utility electric meters). It makes no sense to create a new component instance by hand every time a device comes on-line.
A component template is used to simplify creating multiple component instances that only differ by a few attributes and stream feed data. When a component uses the feed API to upload data it can pass in a component template Id. A component instance will be created automatically within the organization, if a component does not already exist for that feed (also known as device automatic registration).
One or more components created from a component template can remain linked to that template. Any changes made to a template can be applied to all components linked to it. This makes mass modeling changes easy to do without any required programming.
What is a GroveStreams Organization?
An organization is a workspace, typically representing a home, business, or organization. Each organization has its own set of components, streams, dashboards, maps, and other items. An organization allows you to control user and device access to the organization. You are automatically the "owner" and given full access rights when you create an organization. Other users may invite you to their organizations with rights they give you. All of the organizations you "own" or are a member of will appear in your GroveStreams start page (the first page that appears when you sign in).
GroveStreams data is stored hierarchically in this manner:
Organizations
- Components
  - Streams
- Component Templates
- Maps
- Dashboards
How often can a Feed be uploaded?
Multiple streams can be uploaded in a batch from each public facing IP source once every ten seconds, or less often. Note that a stream may sample every second and upload those 10 samples every 10 seconds. See the HTTP API Limits and MQTT Limits pages for more details.
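One way a device can stay within this limit is to buffer its samples locally and send them as one batch. The Python sketch below shows the idea only: the BatchUploader class, its add method and the send callback are hypothetical names, the clock is passed in so the example is easy to test, and the actual upload call (for example an HTTP request to the feed API) is left to the caller.

```python
class BatchUploader:
    """Buffer samples and hand them to `send` at most once per interval."""

    def __init__(self, send, min_interval_s=10.0):
        self._send = send                  # hypothetical callback that performs the upload
        self._min_interval_s = min_interval_s
        self._buffer = []
        self._last_upload_s = None

    def add(self, now_s, stream_id, epoch_millis, value):
        self._buffer.append((stream_id, epoch_millis, value))
        due = (
            self._last_upload_s is None
            or now_s - self._last_upload_s >= self._min_interval_s
        )
        if due:
            # Send a copy of everything buffered so far as one batch
            self._send(list(self._buffer))
            self._buffer.clear()
            self._last_upload_s = now_s


# Simulate 21 seconds of one second samples; `sent` stands in for the network
sent = []
uploader = BatchUploader(send=sent.append)
for second in range(21):
    uploader.add(now_s=second, stream_id="temperature",
                 epoch_millis=second * 1000, value=20.0)

print([len(batch) for batch in sent])  # [1, 10, 10]
```

The first sample is sent immediately; after that, each batch carries the ten samples collected since the previous upload.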
Do you support automatic deleting of data?
Yes. Some users may
wish to retain stream data for a short period of time. A "delete
profile" can be created and associated with each stream. Each
profile specifies a user defined time span to retain stream data. A
couple of times a day, the GroveStreams delete profile system job
will run and delete stream data according to its delete profile.
GroveStreams is not a round-robin database (RRD). Data is retained
as long as you need it.
What are Cycles and Rollup Calendars?
Cycle
Each interval
stream's data time range is specified by the base cycle associated
with a stream. A cycle is a user defined recurring period. A cycle
can be defined to be fixed or custom:
- Fixed size: User defines the start datetime and time zone and then selects "X" number of Seconds, Minutes, Hours, Days, Weeks, Months or Years
- Custom: User defines a collection of start and end datetimes for each interval and a time zone
Examples of Fixed Cycles:
- Second: 1 Second, starting at December 31, 2011 12:00:00 am
- Quarterly: 3 Month, starting at January 15, 2012 12:00:00 pm
Example of Custom Cycle:
- Seasonal: User defined Spring, Summer, Fall, Winter seasons
The time zone is used for the start datetime for fixed cycles and for the start and end datetimes for custom cycles. The time zone can be set to use the component's time zone so that the same cycle can be used across time zones.
Rollup Calendar
A rollup calendar defines
how stream data can be rolled-up for viewing and analysis. Rollup
calendars are optional.
If a stream has a base cycle set for one second intervals, it quickly becomes cumbersome to view and analyze a year of data, which is 31,536,000 intervals. For the "float" data type, which is 4 bytes, a year's worth of data would be about 120 Megabytes. Instead of downloading 31 million intervals and sifting through them, a user is better off downloading a set of rolled-up stream data at a larger interval size and then drilling in on the interesting cycles. A good example of a graph that demonstrates the power of roll-ups is the Google Finance graph. Unfortunately, unlike GroveStreams, Google hard-coded their roll-up cycles (1d, 5d, 1m, 6m, 1y, 5y).
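The figures in the paragraph above can be reproduced with a few lines of Python. This is only a back-of-the-envelope check, assuming a 365-day year and counting a megabyte as 1,048,576 bytes; the variable names are ours.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# One value per second for a 365-day year
intervals_per_year = 365 * SECONDS_PER_DAY

# 4 bytes per "float" value
bytes_per_year = intervals_per_year * 4
megabytes_per_year = bytes_per_year / (1024 * 1024)

# The same year rolled up to 5 minute intervals
five_minute_intervals_per_year = intervals_per_year // (5 * 60)

print(intervals_per_year)              # 31536000
print(round(megabytes_per_year, 1))    # 120.3
print(five_minute_intervals_per_year)  # 105120
```

Rolling one second data up to 5 minute intervals reduces a year from about 31.5 million values to about 105 thousand.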
Also, you may want to monitor your energy usage in 1 second intervals so as to watch for certain events, but you are probably billed based on monthly kW and/or kWh. A stream with a rollup calendar would be an ideal solution. You can view/monitor 1 second intervals and you can also view/monitor your one hour and one month cycle intervals too, at the same time. You'll be able to watch your one month cycle interval change nearly every second... actually, every 10 seconds since you can only upload every 10 seconds :(
A rollup calendar is defined by simply referencing a collection of cycles, described above. When the rollup calendar is saved, it is sorted and validated to ensure that it contains no gaps. A stream's feed can only be uploaded to the base cycle. Data for any non-base cycle of a stream can be used (for graphing and such) just like base cycle data.
Example
Let's define a rollup calendar that rolls one second data into several other cycles up to a year. These cycles were defined earlier and will be referenced by the rollup calendar:
- 1 Second Cycle
- 5 Minute Cycle
- 1 Hour Cycle
- 1 Day Cycle
- 1 Month Cycle
- 1 Year Cycle
After the rollup calendar is defined and associated with a stream, the stream's data for each rolled-up cycle is available as base cycle data is uploaded. When the stream uploads 10 intervals of data (1 second each), all the rollup data is immediately available via the API and browser user interface. Rollup cycle data can be requested with these functions:
- First : Returns the first interval value of the underlying cycle
- Last : Returns the last interval value of the underlying cycle
- Min : Returns the smallest interval value of the underlying cycle
- Max : Returns the largest interval value of the underlying cycle
- Avg : Returns the time weighted average of the underlying cycle
- Sum : Returns the sum of the intervals of the underlying cycle
- Min Occurrence : Returns the start datetime the minimum occurred for the base cycle, accurate to the second
- Max Occurrence : Returns the start datetime the maximum occurred for the base cycle, accurate to the second
- SampleCount : Returns the number of samples that occurred for this cycle's range.
- GapCount : Returns the number of base cycle interval gaps that occurred for this cycle's range. For example, if the base cycle is 1 Second, the requested rollup cycle is 1 Year, and the request datetime range is Jan 1, 2010 12:00 am to Jan 1, 2011 12:00 am, then a single value representing how many 1 second intervals are Gaps is returned.
- NonGapCount : Like GapCount, but returns the number of base cycle intervals that are not gaps for the requested datetime range.
- IntvlCount : Returns the number of base cycle intervals whether they are a gap or not
- MilliSecCount : Returns the number of milliseconds for the requested datetime range
- NonGapMilliSecCount : Returns the number of nonGap milliseconds for the requested datetime range
Gap intervals are ignored for some of the above functions (First, Last, Min, Max, Avg, Min Occurrence, Max Occurrence) for interval streams.
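As a rough illustration of what several of these functions return, the Python sketch below rolls a short list of base cycle values up into larger cycles. It is a simplified model, not the GroveStreams rollup engine: values are held in a plain list where None marks a gap, every base interval has the same length (so the time weighted average reduces to a simple mean), and the rollup function name is our own.

```python
def rollup(base_values, intervals_per_cycle):
    """Roll base cycle interval values (None = gap) up into larger cycles."""
    rolled = []
    for start in range(0, len(base_values), intervals_per_cycle):
        chunk = base_values[start : start + intervals_per_cycle]
        present = [value for value in chunk if value is not None]  # gaps are ignored
        stats = {
            "IntvlCount": len(chunk),
            "GapCount": len(chunk) - len(present),
            "NonGapCount": len(present),
        }
        if present:
            stats.update(
                First=present[0],
                Last=present[-1],
                Min=min(present),
                Max=max(present),
                Sum=sum(present),
                Avg=sum(present) / len(present),
            )
        rolled.append(stats)
    return rolled


# Ten one second intervals rolled up into two five second cycles
one_second = [2, 4, None, 6, 8, 1, None, None, 3, 5]
five_second = rollup(one_second, 5)

print(five_second[0]["Max"], five_second[0]["Avg"], five_second[0]["GapCount"])  # 8 5.0 1
print(five_second[1]["Sum"], five_second[1]["Avg"], five_second[1]["GapCount"])  # 9 3.0 2
```

A cycle made up entirely of gaps gets counts only, with no First, Last, Min, Max, Sum or Avg entries.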
Only base cycle information can be uploaded via interval stream feeds. The other feeds are calculated and therefore cannot be uploaded. Rollup feeds reflect any changes to base cycle data during any time period.
Note that an interval stream's base Cycle does not have to exist within the rollup calendar it is referencing. It just needs to be able to "fit" evenly into one of the cycles and it will be rolled-up from the one that it fits into and above. For the example above, a stream might have a base cycle of "30 Minute". It would upload a value for each 30 minute interval and that value would start rolling up to 1 Hour then to 1 Day and so on up to a Year.
What does "fit" evenly mean? A cycle defined as "7 Seconds" will not fit evenly into a rollup cycle defined as "8 Seconds" or "1 Minute" but will fit evenly into a cycle defined as "42 Seconds".
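For fixed size cycles that can be expressed in seconds, the "fits evenly" rule is a divisibility check. The short Python sketch below shows it; the fits_evenly function is a name chosen for this example, and cycles of varying length (such as months) would need calendar-aware handling that is not shown here.

```python
def fits_evenly(base_seconds, rollup_seconds):
    """True if a whole number of base cycles fills the rollup cycle exactly."""
    return rollup_seconds % base_seconds == 0


print(fits_evenly(7, 8))             # False
print(fits_evenly(7, 60))            # False: "1 Minute" is 60 seconds
print(fits_evenly(7, 42))            # True: six 7 second cycles
print(fits_evenly(30 * 60, 60 * 60)) # True: "30 Minute" fits into "1 Hour"
```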
Do you have Time Zone support?
Yes. Our time zone support
is very flexible. All components must reference a Time Zone. All
cycles reference a time zone or point to a component's time zone.
Rollup calculations use the time zone and cycle reference date to determine the actual UTC time to associate with the rollup interval times. When data is requested via the API it is always requested using UTC datetime epoch milliseconds.
The ability to fix a time zone to a cycle or to a component allows for rollups to occur in a specific time zone (such as the corporate headquarters). For example, an oil company may want to monitor daily oil production from each well. Their definition of a day might be the corporate headquarters time zone day. They would set their time zone in each rollup cycle, forcing each well's daily oil production to be calculated using the corporate headquarters definition of a day. If the same oil company wanted to track production in each well's time zone then they would just point each rollup cycle time zone to the component's time zone.
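The effect of the chosen time zone on a daily rollup can be seen by computing where one calendar day starts and ends in UTC epoch milliseconds. The Python sketch below assumes a headquarters at UTC-5 and a well at UTC-6, using fixed offsets to keep the example short (named time zones would also account for daylight saving time). The function and constant names are ours.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for illustration only
HEADQUARTERS_TZ = timezone(timedelta(hours=-5))
WELL_TZ = timezone(timedelta(hours=-6))


def day_range_epoch_millis(year, month, day, tz):
    """Return (start, end) of a local calendar day as UTC epoch milliseconds."""
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


hq_start, hq_end = day_range_epoch_millis(2021, 3, 1, HEADQUARTERS_TZ)
well_start, well_end = day_range_epoch_millis(2021, 3, 1, WELL_TZ)

print(hq_start, hq_end)       # 1614574800000 1614661200000
print(well_start - hq_start)  # 3600000
```

The same "March 1" covers a UTC range that starts one hour later for the well than for headquarters, so a daily total depends on which time zone the rollup cycle uses.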
Do you support monitoring and tracking mobile device geo locations?
Yes. A component can have fixed geo coordinate attributes, or it can have a stream for each geo coordinate. GroveStreams allows for storing, retrieving, deriving and rolling up geo coordinates (basically, you can do anything with a longitude or latitude stream that you can do with other streams). Geo coordinates are stored as double value types. Geo streams can be regular, interval, or point streams.
A component's current location, along with any of its active events, can be viewed in a map.
How are Time Filters used?
Time Filters allow for the ignoring of samples that occur
during specified time spans (seasons) and time intervals for each
day of the week.
Time Filters are useful for things such as extracting energy time-of-use billing determinants or detecting if a door was opened on a certain day and at a certain time.
Time filters are applied to stream feeds as they are uploaded or derived. Sample times or interval ranges not included in the filter are converted to Gaps (or null values) for interval streams and ignored for regular streams. This is advantageous because:
- It improves performance: most Gaps are not stored, so they do not need to be retrieved
- Gap values created by a Time Filter are ignored by the GroveStreams' Rollup-Calendar engine. For example, the request for a Cycle Gap Count will ignore intervals excluded by the Time Filter.
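The difference between the two stream types can be sketched in Python. The example below assumes a very simple filter made of a set of weekdays and an hour range evaluated in UTC; real Time Filters also support seasons and time zones. All function names here are hypothetical and only illustrate the behavior described above.

```python
from datetime import datetime, timezone


def to_millis(dt):
    return int(dt.timestamp() * 1000)


def in_filter(epoch_millis, weekdays, start_hour, end_hour):
    """True if the sample time falls on an allowed weekday and hour (UTC)."""
    dt = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    return dt.weekday() in weekdays and start_hour <= dt.hour < end_hour


def filter_interval_stream(samples, weekdays, start_hour, end_hour):
    # Intervals outside the filter become gaps (None)
    return [
        (t, v if in_filter(t, weekdays, start_hour, end_hour) else None)
        for t, v in samples
    ]


def filter_regular_stream(samples, weekdays, start_hour, end_hour):
    # Samples outside the filter are ignored
    return [(t, v) for t, v in samples if in_filter(t, weekdays, start_hour, end_hour)]


WEEKDAYS = {0, 1, 2, 3, 4}  # Monday to Friday
samples = [
    (to_millis(datetime(2021, 3, 1, 15, tzinfo=timezone.utc)), 4.2),  # Monday 15:00
    (to_millis(datetime(2021, 3, 1, 20, tzinfo=timezone.utc)), 3.1),  # Monday 20:00
    (to_millis(datetime(2021, 3, 6, 15, tzinfo=timezone.utc)), 5.0),  # Saturday 15:00
]

# Keep weekday samples between 14:00 and 18:00
interval_result = filter_interval_stream(samples, WEEKDAYS, 14, 18)
regular_result = filter_regular_stream(samples, WEEKDAYS, 14, 18)

print([v for _, v in interval_result])  # [4.2, None, None]
print([v for _, v in regular_result])   # [4.2]
```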
How much functionality is exposed in the GroveStreams' HTTP REST and MQTT APIs?
Almost all of it.
The entire web based user interface utilizes only the GroveStreams APIs for non-static page data. A popular way to become familiar with the GroveStreams HTTP API is to use the Google Chrome browser and hit F12 to monitor all HTTP calls from the browser to GroveStreams.com.
Feel free to design your own user interface using our API.
Can GroveStreams be Rebranded, OEM'd, or White Labeled?
Yes.
GroveStreams can be rebranded as your own application.
See the GroveStreams Help Guide for more information about rebranding.
Contact us at info@grovestreams.com for inquiries about licensing an entire GroveStreams cluster.