Frequently Asked Questions
What is GroveStreams?
Who would use GroveStreams?
What is meant by high performance and near real-time?
Do I need to buy or lease servers?
How am I billed?
Is GroveStreams secure?
How does data get into GroveStreams?
Can I control my remote device using GroveStreams?
What are Components and Streams?
What's a Derived Stream?
What's a Component Template?
What is a GroveStreams Organization?
How often can a Feed be uploaded?
Do you support automatic deleting of data?
What are Cycles and Rollup Calendars?
Do you have Time Zone support?
Do you support monitoring and tracking mobile device geo locations?
How are Time Filters used?
How much functionality is exposed in the GroveStreams' RESTful API?
Can GroveStreams be Rebranded, OEM'd, or White Labeled?
What is GroveStreams?
GroveStreams is a cutting-edge platform in the cloud providing decision making capabilities to many users and devices as data arrives from many sources.
GroveStreams' patent-pending data streaming analytics is designed to meet this demand so that your business or organization can quickly react to changes as they're happening. GroveStreams isn't just built to allow you to react to data; it's also built to allow your devices to react accordingly by using the GroveStreams platform-independent open API.
The proliferation of devices that generate data increases every day and traditional systems cannot effectively capture, analyze and react to the amount of data these devices are generating in a timely manner. GroveStreams has been built from the ground up using the latest technologies that power today's largest Internet companies such as Google, Yahoo and Facebook. It brings the scalability and reliability that these large websites offer today to your business or organization.
GroveStreams is an open platform, in the cloud, that any organization, user or device can take advantage of.
GroveStreams specializes in:
- Capturing, Analyzing and Acting on Large Amounts of Time Series Data Streams and Data Points. GroveStreams can manage large numbers of data streams for each organization. Each stream can store over 60 million data points (or samples). This is equivalent to nearly two years of data for a stream defined as a one second interval stream!
- The following categories:
- Sample times accurate to the millisecond
- Many data types supported (short, integer, long, float, double, text, datetime, boolean, geo coordinates, etc.)
- Actionable Analytics:
- User defined roll-ups (e.g., aggregate 1 second data into 5 minutes into hour into day into month into quarter into year and so on)
- Roll-up calculations (Sum, Time Weighted Avg, Min, Max, Min Occurrence (to the second), Max Occurrence, First, Last)
- Utility/energy time of use billing determinant extraction
- Interval gap detection: monitor the quality of your data as it arrives
- Derived Streams. Streams can be derived from:
- Formula expressions involving other streams across your organization
- Internal or external RSS feeds
- The aggregation of a large number of other streams
- Customizable drag and drop html dash-boarding
- Embeddable Live Charts and Grids. Embed live GroveStreams charts and grids within external websites (just like you can do with YouTube videos). See the GroveStreams.com home page for an example of live embeddable charts.
- Automatic Registration. Components and streams can register themselves automatically and appear in existing dashboards and aggregation analytics as they upload their initial feed data.
- RSS Feeds: Components and Streams provide RSS feeds. View RSS feeds within Dashboards.
- Event Monitoring with customizable email, SMS and http call notifications
- Component location tracking with Microsoft Bing Maps integration
- User Role based access security
- Public/private web UI settings. Make your organization accessible to only your users or allow anonymous Guest users with the ability to set Guest access rights.
- RESTful API. Almost all functionality is exposed in a public API.
- Fine-grained API access security
- Time Filtering (Time of Use filtering)
- Time zone support
- Custom units and formatting
- Browser based User Interface. All HTML with no plug-ins such as flash. 100% thin client.
- Mobile (smart phone) application
Who would use GroveStreams?
Anyone or any device that needs to collect large amounts of time series data, monitor it, analyze it and react to it or other devices' data quickly. Whether you want to monitor one data stream from a single source or many more streams from many sources, GroveStreams is available to you and others including:
- Open-source electronic enthusiasts (such as Arduino users)
- Sensor/device driven organizations or businesses
- Healthcare providers
- GroveStreams (We use GroveStreams to monitor our own servers)
- And many others ...
Use GroveStreams when you:
- Need near real-time:
- Environmental monitoring:
- "Notify me when the temperature of my freezer is higher than 25 °F during weekend off-hours"
- "Notify my clients if a commodity price increases by 10% over one day"
- "Record oil production from thousands of wells - notify me of any well problems"
- "Notify my doctor if my wearable biometric devices
- Complex Billing calculations:
- "What is my ongoing energy cost for the last minute and the current month for my time of use block rate?"
- Need to collect and store large amounts of data at small sample sizes and interval sizes
- Need to know if the data is reliable
- Need to monitor a mobile component's
- "Take action when one of our cargo ships is within 50 miles of its home port"
- "Notify me if two cargo ships come within two miles of each other"
- Need to monitor rolled-up data statistics as data arrives
- For example, you may have a meter or sensor uploading one second interval data every 10 seconds but you only want to monitor your daily peak. You can create a roll-up calendar and associate it with that stream so that one second intervals roll up to 5 minute data, which rolls up to hour data, then day data, then month data and so on as data arrives. You can then dashboard your daily cycle and watch it change as one second data is uploaded into GroveStreams.
- Have devices that you want to sell to customers but don't have the time, money or expertise to develop the software to manage the data generated by them.
What is meant by high performance and near real-time?
GroveStreams is powered by 64-bit Java running on the Linux operating system. It can run on hundreds of servers that utilize partitioning and parallel processing. Most transactions happen in a few milliseconds, including uploading and downloading tens of thousands of metrics. For example, a GroveStreams request to download one day of one-second samples (86,400 values) usually takes under 1 second.
The GroveStreams API supports the batching of samples across streams and components during uploading or downloading. This can allow for many streams and their samples to be uploaded with sub-second response times.
Long running tasks are managed by the GroveStreams job framework. An example of a long running task is a component reconcile job. If a user changes a stream's interval size or its data type and the stream has several years of one second data, it could take some time updating all this data, so a job is kicked off and runs asynchronously. Feeds can still be uploaded/downloaded while jobs are running. Note that some jobs, such as the reconcile job, require that a feed only append data while a job is running.
How am I billed?
GroveStreams is free for small users. Users who are flagged as an organization's payer are billed monthly via a registered credit card.
Is GroveStreams secure?
Yes. The GroveStreams user interface is password protected and allows for all requests to occur via SSL.
GroveStreams provides the ability for an organization to provide access security to its API. Each organization can create API access keys. An API access secret key is passed with every API call. API key security is very granular. Each API key can have specific rights assigned for resource types and/or specific resource instances. This avoids hard-coding passwords in your devices or other websites that are utilizing the API. For example, a single API key can be generated for read only (GET) access to a single Stream's Feed or one API key can be generated that grants access to the entire API for all actions (GET, PUT, POST, DELETE). See the Developers page for more details about API keys.
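The granularity described above can be pictured with a small Python sketch. It is only an illustration of the idea: the `KeyRight` class, the `is_allowed` function, the `"*"` wildcard and the resource names are hypothetical and are not part of the GroveStreams API. See the Developers page for how API keys are actually defined.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class KeyRight:
    """One right granted to an API key (hypothetical model, not the real API)."""
    resource_type: str                 # e.g. "feed", or "*" for every type
    methods: FrozenSet[str]            # HTTP methods the key may use
    resource_id: Optional[str] = None  # None means every instance of the type


def is_allowed(rights: Iterable[KeyRight], method: str,
               resource_type: str, resource_id: str) -> bool:
    """Return True if any right covers this method on this resource."""
    return any(
        right.resource_type in ("*", resource_type)
        and method in right.methods
        and right.resource_id in (None, resource_id)
        for right in rights
    )


# Read-only (GET) access to a single stream's feed.
read_only_key = [KeyRight("feed", frozenset({"GET"}), "stream-42")]
# Access to the entire API for all actions.
full_access_key = [KeyRight("*", frozenset({"GET", "PUT", "POST", "DELETE"}))]

print(is_allowed(read_only_key, "GET", "feed", "stream-42"))
print(is_allowed(read_only_key, "PUT", "feed", "stream-42"))
```

A device that only needs to read one feed would carry the first key, so a leaked key exposes nothing else in the organization.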
How does data get into GroveStreams?
Data is uploaded into GroveStreams several ways:
- Via the GroveStreams RESTful API. Devices can upload (or download) batched stream data as often as once every 10 seconds. All data can also be downloaded via the API. This is the preferred upload method for device generated data. See the Developers page for more information.
- Via RSS feeds. Data streams can be derived from internal or external RSS feeds. For example, users can derive a stream from a weather website to automatically import weather data.
- Via import files. The GroveStreams file import framework is very powerful and fast. You can upload very large files and you can upload thousands of files for each import. GroveStreams currently supports its proprietary JSON feed file format and a customizable text delimited file format. Contact us if you would like us to support a popular or custom file format; we will work with you to get your data into GroveStreams. Imports can be scheduled from within GroveStreams to run periodically.
- Manually via the GroveStreams website.
Can I control my remote device using GroveStreams?
Yes. With the correct security rights, you can return a command or configuration signal back to the device every time the device sends data to the platform. This technique allows commands to travel through a firewall without having to configure your firewall. If you send data at intervals of every 10s, 1M, or 1 Hour, it is during those times that a control signal would be sent. If your stream is random in nature, then your update signal would follow the same pattern. Sending an “I'm alive” data signal from your device at an appropriate interval would make your control capability more robust.
What are Components and Streams?
A Stream represents a collection of one or more data points. Each data point can be associated with a single time stamp or a time range. Timestamps are accurate to the millisecond.
There are three types of streams:
- Regular Streams: Each data point is associated with a datetime. Datetimes can be random or arrive at fixed intervals. There can be tens of millions of data points for each stream. Regular streams are the most common stream type.
- Interval Streams: Each data point is associated with a start datetime (inclusive) and an end datetime (exclusive). There can be tens of millions of interval data points for each stream.
- Point Streams: There is only one data point. This value can change frequently or infrequently, but only one data point is stored.
Stream data types can be numbers, dates and times, text, booleans (yes,no), longitude, latitude or elevation. Each Stream can have time filters, constraints, gap filling methods, a unit and other items applied to its feed data as it is uploaded. Stream data can be uploaded via the GroveStreams' API or a stream can be defined as a derived stream. It is not required for a component to upload all of its streams at the same time with the exception of mobile components. Mobile components are required to upload their longitude and latitude streams at the same time.
A component is a "thing" that is also a container for a group of streams and events that share a similar location. For example, a component might be a sensor that monitors temperature and humidity. Temperature would be one stream and humidity would be another stream. If the sensor were mobile, its component could also have latitude and longitude streams that record the location of the sensor component over time.
A component is an abstract concept that represents physical "things" or non-physical "things". Examples of components:
- Physical "things" :
- Utility Meters
- Smart Plugs
- Sensors (Temperature, Accelerometer, etc.)
- Motion Detectors
- Cardiac Telemetry Devices
- Cellular Phones
- Vehicles, Planes, Boats
- Nonphysical "things"
What's a Derived Stream?
Streams can be configured to be derived from RSS feeds or from expressions.
From RSS Feed: Streams can be derived from internal or external RSS feeds. GroveStreams allows for a simple filter to be applied to RSS feed results so that certain values can be extracted from a feed.
RSS feeds are a popular way of bringing external data into GroveStreams for mashing up with other streams. Many RSS feeds are available on the Internet including:
- Weather data
- Stock quotes
- Currency exchange rates
- External, 3rd party sensor data
GroveStreams updates RSS derived streams hourly.
Streams can be derived by aggregating anywhere from two to thousands of other streams. Calculate statistics such as the maximums, minimums, sums, averages and gap counts for each interval in the aggregation time range. Because aggregation can be a lengthy process, it is started manually or scheduled to run periodically.
A Derived expression stream is calculated from any other component streams within an organization. Calculations occur periodically (usually every few seconds) and only occur when all dependents have data available for the time period being calculated. Different stream roll-up cycles and interval offsets can be used inside a derivation expression.
Derived streams are just like other streams except that their feed data is derived by the GroveStreams' derivation engine. They can be graphed and monitored in dashboards and such. Derived streams can be derived from other derived streams.
Expression variables are streams. If a variable is an interval stream, then the user can choose the interval offset for that variable. For example, if an hourly interval with a time span of 2:00 pm to 3:00 pm is being calculated, a dependent variable's offset can be set to -1 and that variable's data for the time span 1:00 pm to 2:00 pm will be used in the calculation. Offsets are useful for things like calculating rolling averages or comparing monthly costs.
Expressions can contain any stream data types, conditional "if" operators (<=, >=, <, >, !=, ==, &&, ||) and many functions (e.g., trig functions, log functions, exponents, absolute value, random number and many more).
Example of a derived stream expression that calculates one second 3 point rolling averages from component1.stream1:
(n + n_minus_1 + n_minus_2) / 3
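To make the expression concrete, here is a minimal Python sketch of the same calculation done outside GroveStreams. It assumes that n, n_minus_1 and n_minus_2 are the same stream read at interval offsets 0, -1 and -2, and that an interval with a missing dependent is left as a gap (shown as None). The function name `rolling_average_3` is made up for this illustration.

```python
from typing import List, Optional


def rolling_average_3(values: List[Optional[float]]) -> List[Optional[float]]:
    """Three-point rolling average over one-second values; None marks a gap."""
    result: List[Optional[float]] = []
    for i in range(len(values)):
        if i < 2:
            # The two earlier intervals do not exist yet, so nothing is calculated.
            result.append(None)
            continue
        n_minus_2, n_minus_1, n = values[i - 2 : i + 1]
        if n is None or n_minus_1 is None or n_minus_2 is None:
            # A dependent has no data for this interval: leave a gap.
            result.append(None)
        else:
            result.append((n + n_minus_1 + n_minus_2) / 3)
    return result


averages = rolling_average_3([3.0, 6.0, 9.0, 12.0, None, 3.0])
print(averages)  # [None, None, 6.0, 9.0, None, None]
```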
Other usages of Derived Streams:
- A Stream that filters out the Winter on Peak kilowatt usage
- A Stream that converts Fahrenheit to Celsius
- A Stream that calculates summer weekend afternoon peak temperatures
- A Stream that sums all the energy for all plant meters (aggregates a collection of component streams)
- A Stream that calculates energy cost every minute, hour, day, month and year
- A Stream that measures the distance between a component and another component every second
- A Stream that applies a diversity factor
Derived stream variables can be different data types. A stream defined as text can be derived from the sum of two other streams of types long and double. GroveStreams will attempt to do the data type conversion wherever it is possible. If the conversion is not possible, the resulting interval will be considered a Gap.
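The convert-or-gap behavior can be sketched in a few lines of Python. This is not the derivation engine's code; `convert_or_gap` is a hypothetical helper, Python's own conversion rules stand in for GroveStreams' rules, and None stands in for a Gap.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def convert_or_gap(value: object, target_type: Callable[[object], T]) -> Optional[T]:
    """Convert value to the target type, or return None (a gap) if that fails."""
    try:
        return target_type(value)
    except (TypeError, ValueError):
        return None


long_value, double_value = 3, 2.5
# A text stream derived from the sum of a long and a double.
as_text = convert_or_gap(long_value + double_value, str)
# Text that cannot be read as a number becomes a gap.
as_number = convert_or_gap("not a number", float)
print(as_text, as_number)  # 5.5 None
```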
As you can see, derived streams are a very powerful tool! You can view derived stream screen snapshots here.
What's a Component Template?
Large organizations typically have a large number of devices that are identical (such as utility electric meters). It makes no sense to create a new component instance by hand every time a device comes on-line.
A component template is used to simplify creating multiple component instances that only differ by a few attributes and stream feed data. When a component uses the feed API to upload data it can pass in a component template Id. A component instance will be created automatically within the organization, if a component does not already exist for that feed (also known as device automatic registration).
One or more components created from a component template can remain linked to that template. Any changes made to a template can be applied to all components linked to it. This makes mass modeling changes easy to do without any required programming.
What is a GroveStreams Organization?
An organization is a workspace, typically representing a home, business, or organization. Each organization has its own set of components, streams, dashboards, maps, and other items. An organization allows you to control user and device access to the organization. You are automatically the "owner" and are given full access rights when you create an organization. Other users may invite you to their organizations with the rights they give you. All of the organizations you "own" or are a member of will appear on your GroveStreams start page (the first page that appears when you sign in).
GroveStreams data is stored hierarchically in this manner:
How often can a Feed be uploaded?
Multiple streams can be uploaded in batch from each public facing IP source every ten seconds or longer. Note that the stream may sample every second and upload those 10 samples every 10 seconds. See the API Limits page for more details about our limits.
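One way a device could respect this limit is to buffer its one-second samples and hand them over as a single batch no more often than every ten seconds. The Python sketch below shows only that buffering logic. The `FeedBatcher` class, its method names and the shape of the batch are assumptions made for illustration; the actual upload call and payload format are described on the Developers page and are not shown here.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

MIN_UPLOAD_SPACING_MS = 10_000  # the ten-second spacing described above

Batch = Dict[str, List[Tuple[int, float]]]


class FeedBatcher:
    """Buffers samples per stream and releases them at most every ten seconds."""

    def __init__(self) -> None:
        self._pending: Batch = defaultdict(list)
        self._last_upload_ms: Optional[int] = None

    def add_sample(self, stream_id: str, time_ms: int, value: float) -> None:
        self._pending[stream_id].append((time_ms, value))

    def take_batch(self, now_ms: int) -> Optional[Batch]:
        """Return everything buffered, or None if it is too soon or nothing is pending."""
        if not self._pending:
            return None
        too_soon = (self._last_upload_ms is not None
                    and now_ms - self._last_upload_ms < MIN_UPLOAD_SPACING_MS)
        if too_soon:
            return None
        batch = dict(self._pending)
        self._pending = defaultdict(list)
        self._last_upload_ms = now_ms
        return batch


batcher = FeedBatcher()
for second in range(10):  # sample two streams once per second
    batcher.add_sample("temperature", second * 1000, 20.0 + second)
    batcher.add_sample("humidity", second * 1000, 40.0)
batch1 = batcher.take_batch(now_ms=10_000)    # both streams, 10 samples each
batcher.add_sample("temperature", 10_000, 30.0)
too_soon = batcher.take_batch(now_ms=15_000)  # only 5 seconds later: None
batch2 = batcher.take_batch(now_ms=20_000)    # released after 10 seconds
```

The caller would send each returned batch in a single API request, so both streams and all of their samples travel together.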
Do you support automatic deleting of data?
Yes. Some users may wish to retain stream data for a short period of time. A "delete profile" can be created and associated with each stream. Each profile specifies a user defined time span to retain stream data. A couple of times a day, the GroveStreams delete profile system job will run and delete stream data according to its delete profile. GroveStreams is not a round-robin database (RRD). Data is retained as long as you need it.
What are Cycles and Rollup Calendars?
Each interval stream's data time range is specified by the base cycle associated with a stream. A cycle is a user defined recurring period. A cycle can be defined to be fixed or custom:
- Fixed size: User defines the start datetime and time zone and then selects "X" number of Seconds, Minutes, Hours, Days, Weeks, Months or Years
- Custom: User defines a collection of start and end datetimes for each interval and a time zone
Examples of Fixed Cycles:
- Second: 1 Second, starting at December 31, 2011 12:00:00 am
- Quarterly: 3 Month, starting at January 15, 2012 12:00:00 pm
Example of Custom Cycle:
- Seasonal: User defined Spring, Summer, Fall, Winter seasons
The time zone is used for the start datetime for fixed cycles and for the start and end datetimes for custom cycles. The time zone can be set to use the component's time zone so that the same cycle can be used across time zones.
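For a fixed cycle measured in seconds, minutes, hours or days, the interval that contains a given sample follows from the cycle's start datetime and size. The Python sketch below illustrates that arithmetic only. `fixed_interval_bounds` is a hypothetical helper, UTC is used to keep the example short, and month or year cycles would need calendar arithmetic that is not shown.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def fixed_interval_bounds(sample: datetime, cycle_start: datetime,
                          size: timedelta) -> Tuple[datetime, datetime]:
    """Return the [start, end) interval of a fixed-size cycle containing sample."""
    index = (sample - cycle_start) // size  # whole intervals since the cycle start
    start = cycle_start + index * size
    return start, start + size


# A 5 minute cycle starting at December 31, 2011 12:00:00 am.
cycle_start = datetime(2011, 12, 31, 0, 0, 0, tzinfo=timezone.utc)
sample_time = datetime(2012, 1, 1, 10, 7, 30, tzinfo=timezone.utc)
start, end = fixed_interval_bounds(sample_time, cycle_start, timedelta(minutes=5))
print(start.isoformat(), end.isoformat())
```

The start is inclusive and the end is exclusive, matching the interval stream definition earlier in this FAQ.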
A rollup calendar defines how stream data can be rolled-up for viewing and analysis. Rollup calendars are optional.
If a stream has a base cycle set for one second intervals, it quickly becomes cumbersome to view and analyze a year of data, which is 31,536,000 intervals. For the "float" data type, which is 4 bytes, a year's worth of data would be about 120 Megabytes. Instead of downloading 31 million intervals and sifting through them, a user is better off downloading a set of rolled-up stream data at a larger interval size and then drilling in on the interesting cycles. A good example of a graph that demonstrates the power of roll-ups is the Google Finance graph. Unfortunately, unlike GroveStreams, Google hard-coded their roll-up cycles (1d, 5d, 1m, 6m, 1y, 5y).
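The figures above can be checked with a few lines of Python. The calculation assumes a 365 day year and counts only the 4 byte values themselves; any storage overhead is ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60

intervals_per_year = 365 * SECONDS_PER_DAY          # 31,536,000 one second intervals
raw_bytes = intervals_per_year * 4                  # 4 bytes per "float" value
mebibytes = raw_bytes / (1024 * 1024)               # about 120.3

# Interval counts for the same year at larger cycle sizes.
five_minute_intervals = intervals_per_year // 300   # 105,120
hourly_intervals = intervals_per_year // 3600       # 8,760
daily_intervals = intervals_per_year // SECONDS_PER_DAY  # 365

print(intervals_per_year, raw_bytes, round(mebibytes, 1))
print(five_minute_intervals, hourly_intervals, daily_intervals)
```

A year of hourly roll-ups is 8,760 values instead of 31.5 million, which is why drilling in from a larger cycle is so much cheaper.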
Also, you may want to monitor your energy usage in 1 second intervals so as to watch for certain events, but you are probably billed based on monthly kW and/or kWh. A stream with a rollup calendar would be an ideal solution. You can view/monitor 1 second intervals and you can also view/monitor your one hour and one month cycle intervals at the same time. You'll be able to watch your one month cycle interval change nearly every second... actually, every 10 seconds since you can only upload every 10 seconds :(
A rollup calendar is defined by simply referencing a collection of cycles, described above. When the rollup calendar is saved, it is sorted and validated to ensure it contains no gaps. A stream's feed can only be uploaded to the base cycle. A stream's non-base cycle data can be used (for graphing and such) just like its base cycle data.
Let's define a rollup calendar that rolls one second data into several other cycles up to a year. Cycles that were defined earlier and that the rollup calendar will reference:
- 1 Second Cycle
- 5 Minute Cycle
- 1 Hour Cycle
- 1 Day Cycle
- 1 Month Cycle
- 1 Year Cycle
After the rollup calendar is defined and associated with a stream, the stream's data for each rolled-up cycle is available as base cycle data is uploaded. When the stream uploads 10 intervals of data (1 second each), all the rollup data is immediately available via the API and browser user interface. Rollup cycle data can be requested with these functions:
- First : Returns the first interval value of the underlying cycle
- Last : Returns the last interval value of the underlying cycle
- Min : Returns the smallest interval value of the underlying cycle
- Max : Returns the largest interval value of the underlying cycle
- Avg : Returns the time weighted average of the underlying cycle
- Sum : Returns the sum of the intervals of the underlying cycle
- Min Occurrence : Returns the start datetime the minimum occurred for the base cycle, accurate to the second
- Max Occurrence : Returns the start datetime the maximum occurred for the base cycle, accurate to the second
- SampleCount : Returns the number of samples that occurred for this cycle's range.
- GapCount : Returns the number of base cycle interval gaps that occurred for this cycle's range. For example, if the base cycle is 1 Second, the requested rollup cycle is 1 Year, and the request datetime range is Jan 1, 2010 12:00 am to Jan 1, 2011 12:00 am, then a single value representing how many 1 second intervals are Gaps is returned.
- NonGapCount : Like GapCount, but returns the number of base cycle intervals that are not gaps for the requested datetime range.
- IntvlCount : Returns the number of base cycle intervals whether they are a gap or not
- MilliSecCount : Returns the number of milliseconds for the requested datetime range
- NonGapMilliSecCount : Returns the number of nonGap milliseconds for the requested datetime range
Gap intervals are ignored for some of the above functions (First, Last, Min, Max, Avg, Min Occurrence, Max Occurrence) for interval streams.
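As an illustration of how gaps affect these functions, the Python sketch below rolls a handful of base cycle intervals up into a single summary. It is not the rollup engine's implementation. The function `roll_up` is hypothetical, None stands in for a Gap, and two simplifications are assumed: all base intervals are the same size, so the time weighted average reduces to a plain mean of the non-gap values, and Sum also skips gaps.

```python
from typing import Dict, List, Optional


def roll_up(base_values: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize equal-sized base intervals; None marks a gap."""
    non_gaps = [v for v in base_values if v is not None]
    summary: Dict[str, Optional[float]] = {
        "IntvlCount": len(base_values),
        "GapCount": len(base_values) - len(non_gaps),
        "NonGapCount": len(non_gaps),
    }
    names = ["First", "Last", "Min", "Max", "Sum", "Avg"]
    if non_gaps:
        results = [non_gaps[0], non_gaps[-1], min(non_gaps), max(non_gaps),
                   sum(non_gaps), sum(non_gaps) / len(non_gaps)]
        summary.update(zip(names, results))
    else:
        # Every base interval is a gap, so the rolled-up values are gaps too.
        summary.update(dict.fromkeys(names))
    return summary


summary = roll_up([2.0, None, 4.0, 6.0, None])
print(summary["Avg"], summary["GapCount"])  # 4.0 2
```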
Only base cycle information can be uploaded via interval stream feeds. The other feeds are calculated and therefore cannot be uploaded. Rollup feeds reflect any changes to base cycle data during any time period.
Note that an interval stream's base Cycle does not have to exist within the rollup calendar it is referencing. It just needs to be able to "fit" evenly into one of the cycles and it will be rolled-up from the one that it fits into and above. For the example above, a stream might have a base cycle of "30 Minute". It would upload a value for each 30 minute interval and that value would start rolling up to 1 Hour then to 1 Day and so on up to a Year.
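The "fits evenly" rule can be sketched in Python for cycles with a fixed length in seconds. The function `rollup_targets` and the dictionary below are hypothetical. Month and year cycles are left out because their length varies, so they need calendar-aware checks that this sketch does not attempt.

```python
from typing import Dict, List

# Fixed-length cycles from the example calendar, as sizes in seconds.
CALENDAR_SECONDS: Dict[str, int] = {
    "1 Second": 1,
    "5 Minute": 300,
    "1 Hour": 3_600,
    "1 Day": 86_400,
}


def rollup_targets(base_seconds: int, calendar: Dict[str, int]) -> List[str]:
    """Return the first larger cycle the base cycle fits evenly into, and all above it."""
    ordered = sorted(calendar.items(), key=lambda item: item[1])
    for position, (_, size) in enumerate(ordered):
        if size > base_seconds and size % base_seconds == 0:
            return [name for name, _ in ordered[position:]]
    return []


thirty_minute_targets = rollup_targets(30 * 60, CALENDAR_SECONDS)
print(thirty_minute_targets)  # ['1 Hour', '1 Day']
```

A "30 Minute" base cycle skips the 1 Second and 5 Minute cycles and starts rolling up at 1 Hour, as in the example above.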
Do you have Time Zone support?
Yes. Our time zone support is very flexible. All components must reference a Time Zone. All cycles reference a time zone or point to a component's time zone.
Rollup calculations use the time zone and cycle reference date to determine the actual UTC time to associate with the rollup interval times. When data is requested via the API it is always requested using UTC datetime epoch milliseconds.
The ability to fix a time zone to a cycle or to a component allows for rollups to occur in a specific time zone (such as the corporate headquarters). For example, an oil company may want to monitor daily oil production from each well. Their definition of a day might be the corporate headquarters time zone day. They would set their time zone in each rollup cycle, forcing each well's daily oil production to be calculated using the corporate headquarters definition of a day. If the same oil company wanted to track production in each well's time zone then they would just point each rollup cycle time zone to the component's time zone.
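The effect of the chosen time zone on a daily rollup can be seen by computing where a "day" begins and ends in UTC epoch milliseconds. The Python sketch below is an illustration under simplifying assumptions: the two zones are hypothetical fixed UTC offsets (a real time zone also has daylight saving rules), and `day_bounds_utc_ms` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def day_bounds_utc_ms(year: int, month: int, day: int,
                      tz: timezone) -> Tuple[int, int]:
    """Return the start and end of a local day as UTC epoch milliseconds."""
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


HEADQUARTERS_TZ = timezone(timedelta(hours=-6))  # hypothetical headquarters offset
WELL_TZ = timezone(timedelta(hours=3))           # hypothetical well site offset

hq_start, hq_end = day_bounds_utc_ms(2012, 1, 15, HEADQUARTERS_TZ)
well_start, well_end = day_bounds_utc_ms(2012, 1, 15, WELL_TZ)

# The same calendar day covers different UTC ranges in each zone.
print((hq_start - well_start) / 3_600_000)  # 9.0 hours apart
```

Rolling every well up on the headquarters day uses the first range for all wells; pointing the cycle at the component's time zone uses each well's own range.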
Do you support monitoring and tracking mobile device geo locations?
Yes. A component can have fixed geo coordinate attributes, or it can have a stream for each geo coordinate. GroveStreams allows for storing, retrieving, deriving and rolling-up geo coordinate streams; basically, you can do anything with a longitude or latitude stream that you can do with other streams. Geo coordinates are stored as double value types. Geo streams can be regular, interval, or point streams.
How are Time Filters used?
Time Filters allow for the ignoring of samples that occur during specified time spans (seasons) and time intervals for each day of the week.
Time Filters are useful for things such as extracting energy time-of-use billing determinants or detecting if a door was opened on a certain day and at a certain time.
Time filters are applied to stream feeds as they are uploaded or derived. Sample times or interval ranges not included in the filter are converted to Gaps (or null values) for interval streams and ignored for regular streams. This is advantageous because:
- It improves performance: since most Gaps are not stored, they do not need to be retrieved
- Gap values created by a Time Filter are ignored by the GroveStreams' Rollup-Calendar engine. For example, the request for a Cycle Gap Count will ignore intervals excluded by the Time Filter.
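A Time Filter for an interval stream can be pictured with the Python sketch below. It is a simplified illustration, not the GroveStreams filter model: `apply_time_filter` is a hypothetical function, only days of the week and one daily time window are supported, datetimes are treated as already being in the stream's local time, and None stands in for a Gap.

```python
from datetime import datetime, time
from typing import List, Optional, Set, Tuple

Interval = Tuple[datetime, Optional[float]]  # (interval start, value)


def apply_time_filter(intervals: List[Interval], allowed_weekdays: Set[int],
                      window_start: time, window_end: time) -> List[Interval]:
    """Turn intervals that start outside the filter into gaps (None)."""
    filtered: List[Interval] = []
    for start, value in intervals:
        included = (start.weekday() in allowed_weekdays
                    and window_start <= start.time() < window_end)
        filtered.append((start, value if included else None))
    return filtered


WEEKDAYS = {0, 1, 2, 3, 4}  # Monday through Friday
uploaded = [
    (datetime(2012, 1, 13, 9, 0), 5.0),   # Friday morning: kept
    (datetime(2012, 1, 13, 19, 0), 7.0),  # Friday evening: outside the window
    (datetime(2012, 1, 14, 9, 0), 3.0),   # Saturday: day not included
]
filtered = apply_time_filter(uploaded, WEEKDAYS, time(8, 0), time(18, 0))
print([value for _, value in filtered])  # [5.0, None, None]
```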
How much functionality is exposed in the GroveStreams' RESTful API?
Almost all of it.
The entire web based user interface utilizes only the GroveStreams API for non-static page data. A popular way to become familiar with the GroveStreams API is to use the Google Chrome browser and hit F12 to monitor all HTTP calls from the browser to GroveStreams.com.
Can GroveStreams be Rebranded, OEM'd, or White Labeled?
Yes. GroveStreams can be rebranded as your own application.
See the GroveStreams Help Guide for more information about rebranding.
Contact us at firstname.lastname@example.org for inquiries about licensing an entire GroveStreams cluster.