In the increasingly complicated world of IT (Big Data, Cloud, Super Datacentres, DevOps and Agile Development), legacy IT support teams are overwhelmed by the new world that they live in.
Traditional server architectures on a three- or five-year refresh cycle are being replaced with cloud solutions which may only exist for a matter of hours, but are equally important to the success of the organisations they underpin.
In most organisations, IT monitoring systems are designed in from the ground up, as most CIOs and CTOs have been caught out by an unexplained outage at some point. However, many IT departments still live blissfully unaware of what is happening in their infrastructure... until it’s too late!
Regardless of the actual monitoring technology, you will generally see a team of IT professionals attempting to understand this raft of new data. They have to assess multiple streams of monitoring data, understand their context, point the finger at the potential problem and then perform some form of remediation to resolve it.
This has led to the rise of Complex Event Processing (CEP) engines coupled with IT workflow automation tools, such as NetIQ Aegis, which uses StreamBase as its CEP engine.
NetIQ Aegis allows you to listen to multiple streams of data from your IT monitoring systems simultaneously, for example:
- Server Alerts
- Storage Systems
- Network Switches
- Network Intrusion Detection Systems
- Firewalls
- Audit Logs
You can then create “triggers” using the StreamBase CEP to perform data correlation in real time; this is referred to as a temporal query. In the old world, a query was run against static data; here, imagine a static query that the data washes over.
This kind of “data in flight” analysis has changed the financial markets, where you can now analyse events as they are happening. This type of processing allows you to watch for events across all available data streams in real time, e.g.
- IF you see a missing ping alert within any 5-minute window
- AND the URL www.myurl.com stops responding
- AND there is an alert from the storage subsystem
- THEN Trigger
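The rule above can be sketched in a few lines of Python. This is only an illustration of the sliding-window idea, not StreamBase or NetIQ Aegis syntax: the `Event` and `WindowCorrelator` names and the event kinds (`missing_ping`, `url_down`, `storage_alert`) are all assumptions made up for the example, and it assumes events arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass

WINDOW_SECONDS = 5 * 60  # the 5-minute window from the rule above

# Hypothetical event kinds; real monitoring feeds will use their own names.
REQUIRED = {"missing_ping", "url_down", "storage_alert"}


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some fixed starting point
    kind: str         # e.g. "missing_ping"
    source: str       # which stream the event came from


class WindowCorrelator:
    """Fires when every required event kind is seen inside one time window."""

    def __init__(self, required=REQUIRED, window=WINDOW_SECONDS):
        self.required = set(required)
        self.window = window
        self.recent = deque()  # relevant events still in the window, oldest first

    def feed(self, event):
        """Add one event; return True if the rule triggers on it."""
        if event.kind not in self.required:
            return False  # not part of this rule, ignore it
        self.recent.append(event)
        # Drop events that have aged out of the window.
        while event.timestamp - self.recent[0].timestamp > self.window:
            self.recent.popleft()
        seen = {e.kind for e in self.recent}
        if seen == self.required:
            self.recent.clear()  # avoid re-triggering on the same events
            return True
        return False
```

Feeding it a missing ping at 0 seconds, a URL failure at 60 seconds and a storage alert at 120 seconds triggers on the third event. If the storage alert arrives after the ping has aged out of the window, nothing fires.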
Every time the CEP engine sees a group of matching events it raises a new alert or trigger... This is where it gets interesting...
NetIQ Aegis can use event triggers to kick off workflow processes. These workflows can contain any number of steps to remediate known faults, raise service desk tickets or make any number of changes to your systems to resolve the problem, automatically, with no human input.
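As a rough sketch of the idea, a trigger can be mapped to an ordered list of workflow steps. This is not how NetIQ Aegis defines workflows; the step functions, the `site_outage` trigger name and the `WORKFLOWS` table are hypothetical, and the steps only return a description of what a real step would do.

```python
from typing import Callable, Dict, List

# A step takes the trigger details and returns a short log line.
Step = Callable[[dict], str]


def restart_web_service(trigger: dict) -> str:
    # Hypothetical remediation step; a real one would call the affected system.
    return f"restarted web service for {trigger['url']}"


def raise_ticket(trigger: dict) -> str:
    # Hypothetical service desk step.
    return f"raised ticket: {trigger['name']} on {trigger['url']}"


# Hypothetical mapping of trigger names to their workflow steps.
WORKFLOWS: Dict[str, List[Step]] = {
    "site_outage": [restart_web_service, raise_ticket],
}


def run_workflow(trigger: dict) -> List[str]:
    """Run every step registered for the trigger, in order, and collect the log."""
    # Unknown triggers fall back to raising a ticket for a human to look at.
    steps = WORKFLOWS.get(trigger["name"], [raise_ticket])
    return [step(trigger) for step in steps]
```

Calling `run_workflow({"name": "site_outage", "url": "www.myurl.com"})` runs the restart step and then the ticket step. Falling back to a ticket for unrecognised triggers keeps a person in the loop for faults that have no known fix.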
This isn’t to say that we don’t need skilled IT professionals anymore, far from it, but why have a highly skilled worker repeating very standard and similar fixes all day long, when they could be put to use making your business more successful? Complex Event Processing can help you achieve this.
About the author
Peter Rossi is an IT automation expert in the UK, currently working on an exciting new highly automated cloud project, Skyscape Cloud Services (www.skyscapecloud.com). Independently of this, Peter is the author of a well-known Microsoft scripting blog, www.poshpete.com, which focusses on the use of PowerShell in automation.