Open source in-memory data grid (IMDG) company Hazelcast has reached the 0.4 release of Hazelcast Jet. This developer and data architect-level technology works with fast-flowing continuous data from IoT devices to process and calculate results in real-time stream processing environments.
Embeddable processing
Hazelcast Jet is application-embeddable, meaning it is a software ‘engine’ that acts as a workhorse inside another application, or more likely in this case, a distributed application stack. Further, it is a distributed processing engine for big data stream processing and batch processing.
New functionality in Hazelcast Jet 0.4 includes event-time processing with tumbling, sliding and ‘session’ windowing. We will explain why windows and windowing (lower case w, not the Microsoft kind) matter to the IoT in just a moment.
Continuous IoT data streams
The idea here is to give developers (and therefore users) the ability to use a feature-rich stream processing architecture that is capable of working with continuous data streams – the kind that might, for example, come from an IoT device.
Hazelcast Jet, according to the company, is appropriate for applications such as sensor updates in IoT architectures supporting house thermostats, lighting systems and in-store e-commerce systems. It might also be used for somewhat less IoT-centric social media platforms due to their ‘continuous’ stream of application data.
Why stream processing matters
In terms of IoT data management trends, Hazelcast suggests that stream processing has overtaken batch processing as a preferred method of processing big data sets for companies that require immediate insight into data. However, to get value from data, it must be ‘partitioned’, meaning that we need to take a fragment of the data stream and analyze it.
To create and classify so-called ‘data windows’ during processing, each data element in the stream needs to be associated with a timestamp.
In Hazelcast Jet 0.4, this is achieved via event-time processing, which uses a logical, data-dependent timestamp embedded in the event itself. However, a major drawback of event-time processing is that events may arrive out of order or late, so you can never be sure whether you have seen all the events that belong to a given time window.
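To make the late-arrival problem concrete, here is a minimal sketch in plain Python. It does not use the Hazelcast Jet API. The window length, the `count_per_window` helper and the sample events are all assumptions made for illustration. The sketch counts events per fixed event-time window and naively treats a window as final as soon as an event from a later window shows up, so any straggler for the earlier window is reported as late.

```python
from collections import defaultdict

WINDOW_MS = 1000  # assumed window length in milliseconds (illustrative)


def count_per_window(events):
    """Count events per fixed event-time window.

    `events` is an iterable of (timestamp_ms, payload) pairs in ARRIVAL
    order. A window is naively treated as final once an event belonging
    to a later window has been seen. Returns (counts, late_events).
    """
    counts = defaultdict(int)
    closed_before = 0  # windows starting before this time are final
    late = []
    for ts, payload in events:
        window_start = ts - ts % WINDOW_MS
        if window_start < closed_before:
            # The window this event belongs to was already finalized.
            late.append((ts, payload))
            continue
        counts[window_start] += 1
        closed_before = max(closed_before, window_start)
    return dict(counts), late


# Event "d" carries timestamp 950 but arrives after "c" (timestamp 1200).
arrivals = [(100, "a"), (900, "b"), (1200, "c"), (950, "d"), (2100, "e")]
counts, late = count_per_window(arrivals)
print(counts)  # {0: 2, 1000: 1, 2000: 1}
print(late)    # [(950, 'd')]
```

Event "d" belongs to the first window by its own timestamp, yet it is missing from that window's count because it arrived after the window had been closed. Real engines usually wait a bounded time for stragglers before closing a window, trading latency for completeness.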
Cleaner windows
To alleviate this issue, the latest release of Hazelcast Jet also includes windowing functionality that enables users to evaluate stream processing jobs at regular time intervals, regardless of how many incoming messages the job is processing.
“The Jet project is progressing faster than we could have hoped. The new functionality in 0.4 brings stream processing for the first time,” said Greg Luck, CEO of Hazelcast.
“As with batch, we are achieving a new performance level, giving us a real edge over alternative market solutions. Jet’s architecture is performance and low-latency driven, which is why there are no real surprises in the results of our latest benchmark. Driven by the community, Jet is an easy to deploy fast data solution for programmers built on the premise of simplicity.”
The company highlights the following windowing features:
Fixed/tumbling: Time is partitioned into same-length, non-overlapping chunks and each event belongs to exactly one window.
Sliding: Windows have a fixed length, but successive windows start a fixed interval (the step) apart. The step can be smaller than the window length, in which case windows overlap and an event can belong to more than one window. Typically, the window length is a multiple of the step.
Session: Windows have various sizes and are defined based on data, which should carry some session identifiers.
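The three window types can be sketched in a few lines of plain Python. This is an illustration of the general technique, not Hazelcast Jet code. The function names are hypothetical, timestamps are plain integers, and the session example assumes that a session ends after a configurable inactivity gap, which is a common convention rather than something the company specifies here.

```python
def tumbling_window(ts, size):
    """Return the single [start, end) window of length `size` containing ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, step):
    """Return every [start, end) window of length `size` that contains ts,
    where window starts are multiples of `step`."""
    windows = []
    start = ts - ts % step  # latest window start at or before ts
    while start > ts - size:
        windows.append((start, start + size))
        start -= step
    return sorted(windows)


def session_windows(events, gap):
    """Group (session_id, ts) events into per-session (first_ts, last_ts) spans.

    A new span begins when more than `gap` time units pass without an
    event for that session id (assumed inactivity-gap rule).
    """
    by_key = {}
    for key, ts in sorted(events, key=lambda e: e[1]):
        spans = by_key.setdefault(key, [])
        if spans and ts - spans[-1][1] <= gap:
            spans[-1] = (spans[-1][0], ts)  # extend the open session
        else:
            spans.append((ts, ts))  # start a new session
    return by_key


print(tumbling_window(7, 5))      # (5, 10)
print(sliding_windows(7, 10, 5))  # [(0, 10), (5, 15)]
print(session_windows([("u1", 1), ("u1", 3), ("u2", 4), ("u1", 20)], gap=5))
# {'u1': [(1, 3), (20, 20)], 'u2': [(4, 4)]}
```

With a window length of 10 and a step of 5, the event at time 7 falls into two overlapping sliding windows, whereas the tumbling case always yields exactly one window per event.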
Dimensions of space and time
As we start to break down the component parts of the IoT and understand the mechanics of its data flows at this much more granular level, we will start to understand how elements of data in the IoT exist and travel around much like atoms in our own planet’s atmosphere. Only when we can ascribe time, value, defined shape, role and event-specific logic to these values can we start to truly manage the relative space and time of the IoT itself.
Einstein (or perhaps Dr Who) would surely be proud of us. Let’s dig deeper.