Key Indicators (KIs) Versus Key Performance Indicators (KPIs)

January 31, 2008

SL’s new web page, Solutions for CEP Engine Users, discusses how CEP is a “technology that is used to help companies detect both opportunities and threats in real-time with minimal coding and reusable key performance indicators (KPIs) and business models.”

I agree with SL, but would like to suggest my friends at SL expand the notion of KPIs in CEP to include the idea of KIs.  In my opinion, the SL phrase should read,  “technology that is used to help companies detect both opportunities and threats in real-time with minimal coding and reusable key indicators (KIs) and business models.”  

The reason for my suggestion is that KPIs are a subset of KIs.   KIs designate, in my mind, more than just performance.  

CEP is used to detect both opportunities and threats in real-time, which may or may not be performance related. For example, when a CEP engine detects evidence of fraudulent behavior, this is a KI. The knowledge, or pattern, used to estimate this situation is a KI, not a KPI, per se. Also, when a CEP application is processing market data and indicates that it is the right time to purchase an equity and enter the market, the knowledge used in this decision support application is a KI, not a KPI.

Therefore, I recommend that when folks think about the notion of “key performance indicators” (KPIs) in CEP and BAM, they also think in terms of “key indicators” (KIs). Detecting opportunities and threats in real-time is much broader than the traditional notion of KPIs.

An Overture to the 2007 CEP Blog Awards

January 9, 2008

Before announcing the winners of the 2007 CEP Blog Awards I thought it would be helpful to introduce the award categories to our readers.

I have given considerable thought to how to structure The CEP Blog Awards. This was not an easy task, as you might imagine, given the confusion in the event processing marketspace. So here goes.

For the 2007 CEP Blog Awards I have created three event processing categories. Here are the categories and a brief description of each one:

The CEP Blog Award for Rule-Based Event Processing

Preface: I was also inclined to call this category “process-based event processing” or “control-based event processing” and might actually do so in the future. As always, your comments and feedback are important and appreciated.

Rule-based (or process-based) event processing is a major subcategory of event processing. Rule-based approaches to event processing are very useful for stateful event-driven process control, track and trace, dynamic resource management and basic pattern detection (see slide 12 of this presentation). Rule-based approaches are optimal for a wide range of production-related event processing systems.

However, just like any system, there are engineering trade-offs using this approach. Rule-based systems tend not to scale well when the number of rules (facts) is large. Rule-based approaches can also be difficult to manage in a distributed multi-designer environment. Moreover, rule-based approaches are suboptimal for self-learning and tend not to process uncertainty very well. Nevertheless, rule-based event processing is a very important CEP category.

The CEP Blog Award for Event Stream Processing

Stream-centric approaches to event processing are also a very important overall category of event processing. Unlike a stateful, process-driven rule-based approach, event stream processing optimizes high performance continuous queries over sliding time windows. High performance, low latency event processing is one of the main design goals for many stream processing engines.

Continuous queries over event streams are generally designed to be executed over intervals of milliseconds, seconds, or perhaps a bit longer. Process-driven event processing, on the other hand, can manage processes, resources, states and patterns over long time intervals, for example, hours and days, not just milliseconds and seconds.
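To make the idea of a continuous query over a sliding time window concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular stream processing engine works; the class name, the method name and the choice of a running average as the query are all assumptions made for the example.

```python
from collections import deque


class SlidingWindowAverage:
    """Hypothetical continuous query: average of values in the last N seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def on_event(self, timestamp, value):
        # Add the new event, then evict everything that fell out of the window.
        self.events.append((timestamp, value))
        self.total += value
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value
        # The query result is re-evaluated on every incoming event.
        return self.total / len(self.events)
```

The result is updated incrementally as each event arrives, which is what lets this style of processing keep latency low. The state it keeps is only the contents of the window, which is why it suits short intervals better than processes that run for hours or days.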

Therefore, event stream processing tends to be optimized for a different set of problems than process-based (which I am calling rule-based this year) event processing. Similar to rule or process-based approaches, most current stream processing engines do not manage or deal with probability, likelihood and uncertainty very well (if at all).

The CEP Blog Award for Advanced Event Processing

For lack of a better term, I call this category advanced event processing. Advanced event processing will more than likely have a rule-based and/or a stream-based event processing component. However, to be categorized as advanced event processing software, the software platform must also be able to perform more advanced event processing that can deal with probability, fuzzy logic and/or uncertainty. Event processing software in this category should also have the capability to automatically learn, or be trained, similar to artificial neural networks (ANNs).

Some of my good colleagues might prefer to call this category AI-capable event processing (or intelligent event processing), but I prefer to call this award category advanced event processing for the 2007 awards. If you like the term intelligent event processing, let’s talk about this in 2008!

Ideally, advanced event processing software should have plug-in modules that permit the event processing architect, or systems programmer, to select and configure one or more different analytical methods at design-time. The results from one method should be available to other methods; for example, the output of a stream processing module might be the input to a neural network (NN) or Bayesian belief network (BN) module. In another example pipeline operation, the output of a Bayesian classifier could be the input to a process or rule-based event processing module within the same run-time environment.
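A rough sketch of such a pipeline follows: a stream stage feeds a Bayesian stage, which feeds a rule stage. Everything here is hypothetical, including the function names, the event labels, the prior and the likelihood values; a single application of Bayes' rule stands in for a full Bayesian belief network.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H | evidence) from a prior and two likelihoods."""
    numerator = p_evidence_given_h * prior
    return numerator / (numerator + p_evidence_given_not_h * (1.0 - prior))


def stream_module(events, threshold):
    """Stream stage: did the window contain a burst of failed logins?"""
    return sum(1 for e in events if e == "login_failed") >= threshold


def bayesian_module(burst_detected, prior=0.01):
    """Bayesian stage: update the threat probability (assumed likelihoods)."""
    if burst_detected:
        return bayes_posterior(prior, 0.9, 0.05)
    return bayes_posterior(prior, 0.1, 0.95)


def rule_module(p_threat, alert_above=0.1):
    """Rule stage: a simple condition-action rule on the posterior."""
    return "ALERT" if p_threat > alert_above else "IGNORE"


def run_pipeline(events):
    # The output of each module is the input to the next one.
    return rule_module(bayesian_module(stream_module(events, threshold=3)))
```

With these assumed numbers, a window containing three failed logins raises the threat probability from 1% to roughly 15% and triggers the alert, while a quiet window does not. The stages share only their inputs and outputs, so any one of them could be swapped for a different analytical method.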

For all three categories for 2007, there should be a graphical user interface for design-time construction and modeling. There should also be a robust run-time environment and most, if not all, of the other “goodies” that we expect from event processing platforms.

Most importantly, there should be reference customers for the software and the company. The CEP Blog Awards will be only given to companies with a proven and public customer base.

In my next post on this topic, I’ll name the Awardees for 2007. Thank you for standing by. If you have any questions or comments, please contact me directly.

Adapters and Analytics: COTS? NOT!

December 26, 2007

Marc Adler shows why his musings are rapidly becoming one of my “must read” blogs in his post, CEP Vendors and the Ecosystem.

We have been making similar points in the event processing blogosphere, namely the importance of adapters and analytics. Today, event processing vendors are surprisingly weak in both areas.

For one thing, there was way too much emphasis on rules-based analytics in 2007. Where are the rest of the plug-and-play commercial analytics end users need for event processing??

And another thing….. 🙂

Why are there so few choices of adapters and why do we have to write our own??

Sometimes I think that if I read another press release on 500,000 events per second I’m going to shout out – the event processing software on the market today cannot even connect to a simple UNIX domain socket out-of-the-box, so how about ZERO events per second!
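For readers wondering what such an adapter involves, here is a minimal sketch of one that reads events from a UNIX domain socket. The newline-delimited wire format and the function names are assumptions made for illustration; real feeds define their own framing.

```python
import socket


def connect_unix(path):
    """Connect to a UNIX domain stream socket at a filesystem path."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock


def read_events(sock):
    """Yield newline-delimited events from a connected socket until EOF."""
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # the peer closed the connection
            break
        buffer += chunk
        # A single recv may hold several events, or only part of one.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line:
                yield line.decode("utf-8")
    if buffer:  # trailing event without a final newline
        yield buffer.decode("utf-8")
```

An engine would iterate over read_events(connect_unix(path)) and hand each event to its analytics. The point is that the adapter is a small amount of code, which is why its absence out-of-the-box is so frustrating.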

The bottom line is that the market is still wide open for a software vendor to come to the party with a wide array of plug-and-play, grab-and-go, adapters and analytics.  

Folks are talking COTS, but more often it is NOTS.

Due Diligence on CEP Vendors – Think Business Not Technology

December 2, 2007

In another one of his excellent blog posts, Financial Due Diligence, Marc Adler mentions a New York Times article that describes the same effects on software companies I discussed a few weeks ago in The Subprime Crisis and the Impact on the CEP Market.

Marc blogged that some CEP “pure-play” companies have been laying off their employees due to the current crisis in financial services.   He is quite correct that companies should purchase CEP technologies from software vendors who have a strong, sustainable and viable business.    Marc suggests that companies serious about acquiring CEP software should:

  • Ensure the source code is written in a “mainstream” language;
  • Keep the vendor’s source code in escrow;
  • Do an analysis on the vendor’s business model and balance sheet; and,
  • Understand the depth and situations of the key technical personnel.

In my roles as lead systems engineer, consultant and trusted advisor on many IT projects over twenty years, I used a standard weighted matrix that factored in the criteria above (and other criteria). I’ll try to find and post it – but it is pretty standard systems engineering analysis.

Many of the CEP vendors have been promoting and marketing “low latency.” As Marc implies, the key criteria when evaluating a product are business and sustainability factors. Most companies have very similar technologies, and today’s leader in one small technical detail will be a laggard tomorrow. Technology changes very fast, and being a nose ahead in the horse race is less important than having strong healthy legs.

So, don’t be fooled by technology babble and buzzwords when you are making investments in event processing. Do your due diligence on the business, including the financials, as Marc reminds us.
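Until the original turns up, here is a sketch of how a weighted matrix of this kind is typically computed. It is not the matrix I used on those projects; the weights, the 0–10 scores and the two vendors below are invented purely for illustration.

```python
def weighted_score(weights, scores):
    """Weighted average of per-criterion scores (each score on a 0-10 scale)."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Hypothetical weights: business viability counts twice as much as the rest.
WEIGHTS = {
    "mainstream_language": 2,
    "source_escrow": 2,
    "financial_health": 4,
    "key_personnel": 2,
}

# Hypothetical vendors: A is technically strong, B is financially strong.
VENDOR_A = {"mainstream_language": 8, "source_escrow": 5,
            "financial_health": 3, "key_personnel": 6}
VENDOR_B = {"mainstream_language": 6, "source_escrow": 8,
            "financial_health": 9, "key_personnel": 7}

ranking = sorted(
    [("Vendor A", weighted_score(WEIGHTS, VENDOR_A)),
     ("Vendor B", weighted_score(WEIGHTS, VENDOR_B))],
    key=lambda pair: pair[1],
    reverse=True,
)
```

With these made-up numbers Vendor B scores 7.8 against 5.0 for Vendor A. The value of the exercise lies less in the final number than in forcing the evaluation team to agree on the weights before looking at any product.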

COTS Software Versus (Hard) Coding in EP Applications

November 21, 2007

Opher Etzion has kindly asked me to write a paragraph or two on commercial-off-the-shelf  (COTS) software versus (hard) coding software in event processing applications. 

My thoughts on this topic are similar to my earlier blog musings, Latency Takes a Back Seat to Accuracy in CEP Applications.

If you buy an EP engine (today) because it permits you to run some quick-and-dirty (rule-based) analytics against a stream of incoming events, and you can do this quickly without spending considerable software development costs, and the learning curve and implementation curve for the COTS is relatively low, this could be a good business decision, obviously. Having a software license for an EP engine that permits you to quickly develop and change analytics, variables and parameters on-the-fly is useful.

On the other hand, the current state of many CEP platforms, and their declarative programming modelling capabilities, is that they focus on If-Then-Else, Event-Condition-Action (ECA), rule-based analytics. Sophisticated processing requires more functionality than just ECA rules, because most advanced detection-oriented applications are not just ECA solutions.

For many classes of EP applications today, writing code may still be the best way to achieve the results (accuracy, confidence) you are looking for, because CEP software platforms have not yet evolved to plug-and-play analytical platforms, providing a wide range of sophisticated analytics in combination with quality modelling tools for all business users and their advanced detection requirements.

For this reason, and others which I don’t have time to write about today, I don’t think that we can make blanket statements that “CEP is about using engines versus writing programs or hard coding procedures.” Event processing, in the context of specific business problems, is the “what” and using a CEP/EP modelling tool and execution engine is only one of the possible “hows” in an event processing architecture.
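For readers less familiar with the pattern, an ECA rule boils down to something like the sketch below. The class, the dispatch function and the example trade rule are hypothetical and not taken from any vendor's rule language.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EcaRule:
    event_type: str                    # Event: which events the rule listens to
    condition: Callable[[dict], bool]  # Condition: predicate over the event
    action: Callable[[dict], str]      # Action: what to do when it holds


def dispatch(rules, event):
    """Run every rule whose event type matches and whose condition is true."""
    results = []
    for rule in rules:
        if rule.event_type == event["type"] and rule.condition(event):
            results.append(rule.action(event))
    return results


# Hypothetical rule: flag any trade above a fixed amount for review.
RULES = [
    EcaRule(
        event_type="trade",
        condition=lambda e: e["amount"] > 10_000,
        action=lambda e: f"review trade {e['id']}",
    )
]
```

Note what is missing: the condition is a crisp true-or-false test on a single event, with no notion of likelihood, confidence or learning. That gap is the reason ECA rules alone fall short for advanced detection.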

As we know, CEP/EP engines, and the marketplace for using them, are still evolving and maturing; hence, there are many CEP/EP applications, today, and in the foreseeable future, that require hard coding to meet performance objectives, when performance is measured by actual business-decision results (accuracy). 

Furthermore, as many of our friends point out, if you truly want the fastest, lowest latency possible, you need to be as close to the “metal” as possible, so C and C++ will always be faster than Java byte code running in a sandbox written in C or C++.

And, as you (Opher) correctly point out, most of the end users we talk to do not process megaevents per second on a single platform, so this is a marketing red herring. This brings us back to our discussions on the future of distributed object caching, grid computing and virtualization – so I’ll stop and go out for some fried rice and shrimp, and some cold Singha beer.

Latency Takes a Back Seat to Accuracy in CEP Applications

November 21, 2007

Opher asks, “The only motivation to use EP COTS is to cope with high performance requirements” – true or false?

The answer: True and False.

If high performance is discussed in the context of event processing speed and latency, then it is Absolutely False that speed and latency are the most important performance criteria for event processing applications. 

Detection accuracy (the performance of the detection algorithms for detecting derived events or situations) is the most important criterion, hands down.

Emerging CEP/EP applications are centered around the concept of detecting (and acting upon) opportunities and threats in real-time. The most important performance criterion is the confidence in the detection of the derived event, or situation, depending on your EP vocabulary.

For example, one of the most promising areas for CEP/EP applications is fraud detection.   There is a fundamental tradeoff in most, if not all, detection-oriented systems – the tradeoff between false positives and negatives.  The same is also true for cybertrading and other detection-oriented applications. 

If you miss an opportunity or threat, it does not matter how fast you missed it, or how low the latency was in processing, you simply missed it!     In theory, you could process events just below the speed of light –  So what?!  Making mistakes faster than others is not considered to be a superior skill that leads to a higher paying job!  (Well, we all have known quite a few who made a lot of mistakes but were buddy-buddy with the boss, but that is another story for another day!)

Likewise, if you detect a false opportunity or threat, it does not matter if you detected it in nanoseconds, or if the latency was just below the speed of light. Detecting false positives does not demonstrate superior performance.
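The tradeoff between false positives and false negatives is easy to see with a toy example. In the sketch below a detector assigns each event a fraud score and flags anything at or above a threshold; the scores, the labels and the function name are made up for illustration.

```python
def confusion_counts(scored_events, threshold):
    """Count detection outcomes when events scoring >= threshold are flagged."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for score, is_fraud in scored_events:
        flagged = score >= threshold
        if flagged and is_fraud:
            counts["tp"] += 1  # real threat, detected
        elif flagged:
            counts["fp"] += 1  # false alarm
        elif is_fraud:
            counts["fn"] += 1  # real threat, missed
        else:
            counts["tn"] += 1  # correctly ignored
    return counts


# Hypothetical (score, actually_fraud) pairs from an imperfect detector.
EVENTS = [(0.95, True), (0.80, False), (0.60, True),
          (0.40, False), (0.30, True), (0.10, False)]

strict = confusion_counts(EVENTS, 0.9)  # few alerts: 0 false alarms, 2 misses
loose = confusion_counts(EVENTS, 0.2)   # many alerts: 2 false alarms, 0 misses
```

Moving the threshold only trades one kind of mistake for the other. Reducing both at once takes a better detection model, and no amount of latency tuning changes any of these counts.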

Most, but not all, of the current CEP/EP vendors have relatively simple rules-based detection approaches and many have marketed “low latency” as their core capability.  The fact of the matter, well expressed by Kevin Pleiter, highlighted in Complex Event Processing – Believe the Hype? earlier this week, is that performance is critical, if the definition of performance is “accuracy” and “actionable” detection.  Latency takes a back seat to accuracy – as it should.

Kevin echoed what I have been saying for a number of years in the CEP community. Detection accuracy that leads to high confidence, actionable business decisions is the most important performance criterion for CEP applications.

So, if we define performance in the context of event processing accuracy and confidence in decision making, then the answer is that it is Absolutely True that performance is one of the most important criteria for event processing applications.

Latency discussions are a distraction, a red herring,  something intended to divert attention from the real problem or matter at hand. 

Clustered Databases Versus Virtualization for CEP Applications

November 16, 2007

In my earlier post, A Model For Distributed Event Processing, I promised to address grid computing, distributed object caching and virtualization, and how these technologies relate to complex event processing.   Some of my readers might forget my earlier roots in networking if I continue to talk about higher level abstractions!  So, in this follow-up post I will discuss virtualization relative to database clustering.

In typical clustered database environments there are quite a few major performance constraints. These constraints limit our capability to architect and design solutions for distributed, complex, cooperative event processing problems and scenarios. Socket-based interprocess communications (IPCs) within database clusters create a performance limitation constrained by low bandwidth, high latency, and processing overhead.

In addition, the communications performance between the application layer and the database layer can be limited by both TCP and operating system overhead. To make matters worse, hardware input-output constraints limit scalability for connecting database servers to disk storage. These are standard distributed computing constraints.

The physical architecture to address scalability in emerging distributed CEP solutions requires a low-latency network communications infrastructure (sometimes called a fabric). This simply means that event processing agents (EPAs) require virtualization technologies such as Remote Direct Memory Access (RDMA). CEP agents (often called CEP engines) should have the capability to write data directly to the memory spaces of a CEP agent fabric (sometimes called an event processing network, EPN). This is similar to the concept of shared memory as an IPC in UNIX-based systems applied to distributed computing, so all “old hat” UNIX systems engineers will easily grok these concepts.

RDMA virtualization helps improve performance by bypassing operating-system and TCP overhead resulting in significantly higher bandwidth and lower latency in the EPF (Event Processing Fabric – I just minted a new three letter acronym, sorry!).  This, in turn, improves the communication speed between event processing agents in an event processing network (EPN), or EPF (depending on your taste in acronyms).

Scheduling tasks such as distributed semaphore checking and lock management can also operate more efficiently and with higher performance. Distributed table scans, decision tree searches, rule-engine evaluations, Bayesian and neural analytics can all be performed in parallel, dramatically improving both performance and scalability of distributed event processing applications.

In addition, by adopting transparent protocols with existing socket APIs, the CEP architect can bypass both operating-system and TCP protocol overhead. In other words, communications infrastructures for CEP that optimize networking, interprocess communications, and storage provide architects with the underlying tools to build better solutions to computationally complex problems.

Many of the communications constraints of earlier distributed architectures for solving complex problems, such as blackboard architectures, can be mitigated with advances in virtualization. So, in a nutshell, virtualization technologies are one of the most important underlying capabilities required for distributed, high performance CEP applications, in my opinion.

The article, Virtualization hot; ITIL, IPv6 not,  appears to indicate that some of the top IT managers at Interop New York might agree with me.  

Unfortunately for a few software vendors, virtualization threatens to dilute their market share for ESB and message bus sales. (OBTW, SOA is DOA.) “Old hat” UNIX system programmers will recall how the UNIX IPC called “message queues” lost favor to sockets, pipes and shared memory. A similar trend is happening in the virtualization world with RDMA as a distributed shared memory technology versus message-based communications technologies. I will opine more on this topic later.