This blog post was contributed by senior field architect Mark Hudson from StreamBase's London office.
The Moving Finger writes; and, having writ,
Moves on; nor all your Piety nor Wit
Shall lure it back to cancel half a Line,
Nor all your Tears wash out a Word of it.
-- The Rubaiyat of Omar Khayyam
The famous Persian poet was referring, I suppose, to the impossibility of living one's life over again. Fortunately, in the world of stream processing, things are easier - with the right tools. It is indeed feasible to watch what the moving finger of real-time data writes. Better still, yes you can go back - at least to extract valuable insights while there is still time to avoid shedding tears.
Typical StreamBase customers in financial markets are avid consumers of time-based data - "tick data" in the industry jargon - for pricing trades and spotting profit opportunities.
Outside financial markets, there’s increasing interest from every kind of company in the potential of CEP. Very frequently, we hear of the need to mine usable insights from a flood of "sensor data".
SENSORS AND (NOT) SENSIBILITY
What's a sensor for this purpose? Well, it could be a gadget in a pump or a generator, way out in the wilds (sometimes literally - in the depths of a forest maybe). It could be an SMS-like message from an aircraft flight computer. It could be a sensor on a door, part of a burglar alarm system: one among many, all connected to a remote monitoring centre.
The common requirement here is figuring out the signal from the noise. Sensors generally aren't clever. Usually, they are cheap, reliable and rather dumb. They send a lot of repeat messages. Each sensor usually reports a limited number of measurements or alarm conditions. The sensors aren't clever enough, or don't have enough context awareness, to report more than bare, isolated facts.
The result? The central monitoring function is "flooded" by a very large total volume of individually low-quality (high-noise) signals. As often as not, time is a factor in figuring out the real signal. Events - signals, alarm conditions, unusually small or large measurements - mean a great deal more when several happen in a short burst.
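To make the "short burst" idea concrete, here is a minimal sketch in plain Python (not StreamBase code). It counts the events from one sensor inside a sliding time window and flags the moment there are enough of them. The class name BurstDetector and the parameters window_seconds and threshold are my own illustrative assumptions, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque


class BurstDetector:
    """Flags when `threshold` or more events fall within `window_seconds`.

    Hypothetical helper for illustration; assumes timestamps (in seconds)
    are fed in non-decreasing order.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._times = deque()  # timestamps of recent events, oldest first

    def observe(self, timestamp):
        """Record one event; return True if the window now holds a burst."""
        self._times.append(timestamp)
        # Drop events that have slid out of the window.
        while self._times and timestamp - self._times[0] >= self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.threshold
```

With a 60-second window and a threshold of 3, isolated alarms are ignored, but the third alarm inside any one minute returns True.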
MAKING SENSE OF SENSOR DATA
It's usually even more significant if a certain set of events happens in a specific order. When I get a migraine, it’s flashing lights, then headache, then nausea, in that order: a “march of events”, as my doctor calls it.
Suppose we have a generator with temperature, voltage and vibration sensors. A reading of 160°C typically happens several times a day. The sensor reports every minute, so an individual "hot" period can result in 5-10 "high temperature" alerts, but this means little in itself. Similarly, the under-voltage and excessive-vibration indicators fire all the time. Crying wolf, if you like.
A much more significant sequence would be something like the following:
1) Vibration alarms being sent for 10 minutes then
2) Temperature consistently higher than the hourly average for a sustained 20 minutes then
3) Voltage 10% below the specified level for 20 minutes
with all of these things happening in the same hour, in that order.
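Each of the three conditions above is itself a small calculation over time. As a rough illustration of the second one, here is a Python sketch (again, not StreamBase code) that compares each temperature reading with the average of the preceding hour and fires once the reading has stayed above that average for 20 minutes. The class name SustainedAboveAverage and its parameters are hypothetical, and I am assuming one reading per minute with timestamps in seconds.

```python
from collections import deque


class SustainedAboveAverage:
    """Fires once when readings stay above the rolling average long enough.

    Hypothetical helper for illustration; timestamps are in seconds and
    must arrive in order.
    """

    def __init__(self, average_window=3600.0, sustain_for=1200.0):
        self.average_window = average_window  # seconds of history to average
        self.sustain_for = sustain_for        # seconds the condition must hold
        self._history = deque()               # (timestamp, value), oldest first
        self._total = 0.0                     # running sum of values in history
        self._above_since = None              # start of the current "hot" run
        self._fired = False                   # avoid re-firing within one run

    def observe(self, timestamp, value):
        """Feed one reading; return True when the condition is first met."""
        # Expire readings older than the averaging window.
        while self._history and timestamp - self._history[0][0] > self.average_window:
            _, old_value = self._history.popleft()
            self._total -= old_value
        # Average of the earlier readings only (None until we have history).
        average = self._total / len(self._history) if self._history else None

        self._history.append((timestamp, value))
        self._total += value

        if average is None or value <= average:
            # The run is broken (or has not started): reset.
            self._above_since = None
            self._fired = False
            return False
        if self._above_since is None:
            self._above_since = timestamp
        if not self._fired and timestamp - self._above_since >= self.sustain_for:
            self._fired = True
            return True
        return False
```

The vibration and voltage conditions could be sketched the same way, each producing a single "criterion met" event rather than a stream of raw alerts.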
HIDDEN MEANING IN THE SENSORY DELUGE
You can probably see where I'm going with this. There is a worn bearing that causes a cooling fan to malfunction for a while. This lets the temperature go up, at which point a feedback loop in the generator controller throttles back the voltage until things cool down.
Normally, say with a database and a language such as Java, it would be very difficult to detect these kinds of multiple, complex, ordered & time-dependent conditions. But with CEP, we can build a system with just 20 visual elements!
Here's our generator scenario written in StreamBase's visual EventFlow language. It starts with a flow of sensor messages. These typically come from an “input adapter” such as the IBM WebSphere MQ example shown in the picture. (StreamBase has over 150 other adapters too.)
Finally – though not shown explicitly here – we would take the “bearing problems” combined signals, and act on these. Maybe we send an email or SMS to the operations team. Maybe we send a message directly to the generator to shut down. Maybe we log the event in a database for inspection every morning. Any or all of these responses would be equally easy to plug in.
CEP, such as the StreamBase EventFlow here, makes it very natural to express the solution. You can do some very complicated programming easily. It looks and feels the way an engineer would describe the requirements.
By the way, here’s the StreamBase configuration panel for that clever little Pattern operator:
As you can see, there is a specialized mini-syntax in here for event ordering. Here ‘input1’ is the complex vibration alarms criterion, ‘input2’ is the complex temperature criterion, and ‘input3’ is the voltage criterion. (That isn’t obvious from the top-level EventFlow diagram, but if you were to look into the details, you’d find it so.) So we’re telling the Pattern operator to watch for the sequence: vibration criterion, followed by temperature criterion, followed by voltage criterion, as in the original specification. We could even pick & choose further between the details of our three ordered event criteria, by using a more complex Predicate than just the ‘true’ value here. But you’d need to see more details of the underlying operator settings to understand this fully.
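For readers without StreamBase to hand, the ordering logic can be approximated in a few lines of plain Python. To be clear, this is not the Pattern operator's syntax or its implementation, just a sketch of the same idea: accept the three criterion events only in the stated order and only within one hour. The class name SequenceMatcher and the event labels are my own assumptions; events must arrive in time order and each step name must be distinct.

```python
class SequenceMatcher:
    """Detects named steps occurring in order within a time window.

    Hypothetical helper for illustration; timestamps are in seconds.
    """

    def __init__(self, steps, window_seconds):
        self.steps = list(steps)
        self.window_seconds = window_seconds
        # _starts[i] holds the start time of the most recent partial match
        # that has completed steps[0] through steps[i].
        self._starts = [None] * len(self.steps)

    def observe(self, timestamp, kind):
        """Feed one criterion event; return True if it completes the sequence."""
        if kind not in self.steps:
            return False
        i = self.steps.index(kind)
        if i == 0:
            start = timestamp  # a new partial match begins here
        else:
            start = self._starts[i - 1]
            # Out of order, or the earlier steps are now too old: ignore.
            if start is None or timestamp - start > self.window_seconds:
                return False
        if i == len(self.steps) - 1:
            self._starts = [None] * len(self.steps)  # reset after a match
            return True
        self._starts[i] = start
        return False
```

Created as SequenceMatcher(["vibration", "temperature", "voltage"], 3600), it ignores a voltage event that arrives before any temperature event, and reports a match only when all three arrive in order inside the hour.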
AGILE EVENT HANDLING
Just as important, the specialist engineer can understand the logic, and can work with the developer to achieve the correct pattern detection. I'm not a generator engineer, so there are probably some flaws in my scenario. But it doesn't really matter. The right engineer would be able to spot patterns that are diagnostic of underlying issues. That would include problems serious enough to justify a service visit - even to that generator, far out in the woods, powering an aircraft warning light.
Equally, even expert engineers might not know all the important patterns already. Playing around with the data allows new patterns to be spotted.
The burglar alarm scenarios are even easier to think up. Three intruder alerts on five successive days, at different doors belonging to the same company, anyone? But our recognition of such "insightful" patterns grows all the time. So, it's important to trap new combinations in an agile way.
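The burglar alarm example can be sketched the same way in Python. I am reading it as "alerts at three or more different doors of one company within a five-day window", which is my interpretation rather than a precise specification. The class name IntruderPattern and its parameters are hypothetical, and alerts are assumed to arrive in time order.

```python
from collections import defaultdict, deque

FIVE_DAYS = 5 * 24 * 3600  # window length in seconds


class IntruderPattern:
    """Flags a company whose alerts hit enough different doors in a window.

    Hypothetical helper for illustration; timestamps are in seconds.
    """

    def __init__(self, window_seconds=FIVE_DAYS, min_doors=3):
        self.window_seconds = window_seconds
        self.min_doors = min_doors
        # company -> deque of (timestamp, door), oldest first
        self._alerts = defaultdict(deque)

    def observe(self, timestamp, company, door):
        """Record one alert; return True if the company now matches."""
        recent = self._alerts[company]
        recent.append((timestamp, door))
        # Forget alerts older than the window.
        while recent and timestamp - recent[0][0] > self.window_seconds:
            recent.popleft()
        distinct_doors = {d for _, d in recent}
        return len(distinct_doors) >= self.min_doors
```

Changing the window or the door count is a one-line edit, which is the kind of quick adjustment that trapping new combinations calls for.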
In conclusion, I can’t promise “Piety nor Wit”. That’s too much for any software (even StreamBase!).
However, with a clutch of Aggregate operators, a Pattern operator, and a few others, finding true sense in the data could at least be yours. You may even get your engineer out into the woods before the moving finger spells disaster. Perhaps even Omar Khayyam would have been impressed!
Drop me an email if you’d like to see the details of a StreamBase EventFlow solution like this for sensor monitoring.