On Elephants and Analytics

In On EP and Analytics, my good friend and respected colleague Opher Etzion applies the well-known metaphor of the big elephant to describe how, if you observe only certain specific domains of a subject, like fraud detection, your view of the whole elephant is biased by your lack of perspective on the entire animal.

I am pleased that dear Opher continues to use this metaphor in counterpoint, because the same metaphor can be used to describe the carefully selected group of vendors that have banded together to call themselves CEP vendors. This group, many of them founding members of the EPTS, has formed a merry band of well-intended event processing “specialists”, and the same lovely elephant causes this group of bonded colleagues to make elephant-blinded statements, as Opher has made in his quoted post:

“Currently most CEP applications do not require analytics.” 

The reason, I believe, that Opher makes the statement above is that the group of software vendors calling themselves “CEP vendors” represents a very small part of the overall event processing elephant; and hence, since these self-described CEP applications appear to require very little or no analytics, then, by the same logic, CEP requires no analytics.

(I should outline the boolean logic in a future post!)

For example, one friend and colleague in Thailand is the CTO of True Internet, a leading telecommunications, voice, video and Internet service provider in Thailand. True processes myriad events on its network using a dynamic, self-learning neural networking technology. The US company providing this very clever and highly recommended event processing application does not call itself a “CEP vendor”; however, it processes complex events better, and in more interesting ways, than the band of merry self-described “CEP players”.

Again, visualize the gentle giant elephant metaphor that Opher likes to use as a basis for his comments in CEP counterpoint.

When folks define the term “complex event processing” to match a technology marketing campaign that is primarily driven by software running rules against time-series data streaming in sliding time windows, and then go on to apply those same software capabilities to problems that are suitable for that domain, then you match Opher’s elegant description of “a small view of the overall elephant”.

The fact of the matter is that the overall domain of event processing is, in revenue terms, at least two orders of magnitude larger (maybe more) than the combined annual revenue of the self-described vendors marketing what they call “CEP engines.” The very large “rest of the big elephant” is doing what is also “complex event processing” in everyday operations that are somehow overlooked in “other” analysis and counterplay.

Therefore, I kindly remain unmoved from my view that the self-described CEP community, as currently organized, is not immune to counterpoint using the same gentle giant elephant metaphor. I like this metaphor and hope well-respected colleagues will continue to use it, because we can easily apply this elegant manner of discussion to explain why the current group of self-described CEP vendors are, in a manner of speaking, selling Capital Market Snake Oil: they are making outrageous claims about the capabilities of their products, as if they can solve the entire “elephant” of event processing problems. Recently, in this article, CEP was positioned as a technology to mitigate corporate megadisasters like the subprime meltdown.

Advice:  Tone down the hype.

Furthermore, the noise in the counterarguments marginalizes most of the real event processing challenges faced by customers.

In consistent and well-respected rebuttal, Opher likes to use the “glass half-full, half-empty” metaphor. Opher’s point is a valid attempt to paint my operational realism as “half empty” negativism, while at the same time positioning the promotion of the (narrow) event processing capabilities of the self-described CEP rules community as “half-full” thinking.

For the record, I do not see my worldview as “half full” or “half empty”, but as an unbiased pragmatic view based on day-to-day interaction with customers who have what they would call “complex event processing” problems.

These same customers would fall over laughing if we tried to bolt one of these rule-based, time-series streaming data processing engines onto their network and told them they could detect anything other than trivial business events, business opportunities and threats, in near real-time.

Is it “half empty” thinking to caution people that a “glass” of software, touted as the answer to a wide range of complex, tangible business problems (a recent news article even went so far as to imply CEP would have magically stopped the subprime crisis!), is not really all that it is hyped to be?

If so, then I plead guilty to honesty and realism, with the added offense of a sense of fiscal responsibility to customers and end users.

10 Responses to On Elephants and Analytics

  1. Hi Tim, just for the record – WareLite is not a CEP vendor. We belong to yet another part of the elephant – aka Event Driven Processing (not Data Driven Processing). I hope we are still in the half-empty part of the glass (that should be on the top, I guess).
    Best Regards
    Daniel

  2. Peter Lin says:

    Nice post, Tim. I agree with you. There seems to be a disconnect between products and reality. Sadly, I fear many CTOs won’t be able to tell the difference and will fall for the pitch. Just my biased opinion, but the VWAP examples are a joke.

  3. [...] some ongoing discussions in blogland as a result of a recent blog posting here by my good friend Tim Bass where he applauds the use of analytics to the point of damning the [...]

  4. In my view the purpose of CEP is to deal with complex events and high volume data streams, which in most cases cannot be addressed by analytics products. Analytics products can feed information to CEP engines or could get processed events from them. If we take the position that analytics are required, we might as well state that dashboards and data and event collectors are required too. CEP products are usually generic engines that are part of a specific solution blueprint. CEP can be applied to security and fraud detection problems, transaction and application performance management, high volume trading systems, etc.

    Companies that currently claim to be pure-play CEP vendors will most likely sell their products as OEM to larger vendors, be acquired by them, or re-position themselves to solve specific problems. They may sell CEP to enterprises’ development groups or end users to solve specific problems. In those environments CEP engines will be integrated with best-of-breed or homegrown analytics products or dashboards. So the fact that Tibco bought Insightful helps them enhance their overall product catalog and gives them some competitive advantage against bigger players, but I do not see how it will make them the best-of-breed CEP company, or why it would make their CEP better than other vendors’ solutions, just because they added an analytics product to their portfolio.

  5. Tim Bass says:

    Hi David,

    Thank you for stopping by and visiting the blog.

    Factually, CEP has almost zero to do with “high volume data streams.” The “high volume” angle was completely manufactured by the event stream processing vendors, who in the beginning mostly rejected CEP (it was “too complex” for marketing, they said) and, instead, called their technology, more appropriately in my view, “event stream processing” (ESP).

    Also, if you read the original CEP literature, there is really nothing about “high volume” data streams. The literature is all there on CEP; the thrust is detection of opportunities and threats, and volume is not the defining principle.

    In other words, the key performance metric is detection accuracy and detection confidence, not volume or latency.

    Yours faithfully, Tim

  6. You say that the key performance metric is detection accuracy and detection confidence.

    Most half-decent Business Intelligence based application vendors already have most of the basic tools to perform advanced analytics on data sets. In the mobile telco world, call data is analysed for revenue leakage – essentially a detection confidence problem using advanced analytics. Following the logic in your recent posts, this meets your description of CEP.

    I believe that CEP is particularly good at dealing with “windows” of events/data. If there’s no “window”, then I don’t think I’d bother using CEP to solve the problem – I’d stick with existing analytic techniques. But because most use cases involve a window (either of time or of volume), you’ll see that the sorts of problems that it is really good at addressing are ones that involve either rapidly changing data or large volumes of data.

    So I’d suggest that while CEP doesn’t have to be applied to a problem involving an aspect of volume or latency, I can’t see why I’d bother with CEP unless it did.

    Brian

  7. Opher Etzion says:

    I agree with Tim here (surprise, surprise) that “low latency” is not a property that describes EP applications; there are some that need it and some that don’t (again, the elephant analogy). There is some bias towards this kind of thinking, since the early adopters in the capital markets industry have viewed low latency as important, but there are plenty of other applications in which it does not hold. The reason to use EP COTS is the TCO – they are more suitable for applications that are looking at detecting business situations out of events and combinations of events than regular languages or retrospective analytic tools.

    cheers,

    Opher

  8. Opher, I believe that perhaps you have misunderstood what I’m saying – I’ll try to restate.

    There are some things that CEP is really good at, which is a subset of all the things that CEP can possibly do. In this regard, I agree that Latency is not a mandatory property for all EP applications.

    But I have yet to find a use case (or other problem description) that doesn’t involve an aspect of either Volume or Latency, where CEP demonstrates capabilities above and beyond “other” existing well-known mature technologies. In other words, if a problem exists today, there is a choice of possible solutions. Which problems can CEP excel at (to the point of being the best choice) that don’t involve aspects of Volume or Latency? If anyone can think of an example….

    Brian

  9. As Opher stated, there are some who need “low latency” and there are some who do not. Therefore “low latency” must become a property of event processing for the cases where it is important, regardless of whether it was described in the original CEP literature. The speed of processing high volumes of events that may arrive as data streams is also important in CEP. If one vendor takes 5 hours to detect a KPI, with a given accuracy and confidence, that another could detect in 5 minutes, then clearly as a user I’d go with the one that can do it in 5 minutes.

  10. Tim Bass says:

    Hi David,

    Yes, Opher and I agree that there are event processing applications that require low latency. However, that is not what you said in your earlier comment at The CEP Blog. In your comment you defined CEP as follows:

    “In my view the purpose of CEP is to deal with complex events and high volume data streams…”

    Your original statement is simply not true.

    For example, the purpose of a car is to transport people (and their goods) on the ground, hopefully in a safe way. Sometimes you need to go fast in your car, but that is not the purpose of the car.

    In a more specific example, most security events on the network occur over long periods of time and are not defined by speed. A criminal might scan your network on Monday, exploit a vulnerability the following Thursday, and plant malware on a server on the weekend. Processing fast rules in stream-oriented time windows will generally not detect sophisticated network attacks.

    Because self-described CEP software is neither designed for nor capable of detecting the majority of CEP applications defined in the original literature, the same vendors have focused on simply processing low latency events, mostly using CEP as a marketing term.

    CEP is not about low latency event processing – it is about detecting complex events. Latency and speed are always important in computing, but latency and speed do not define the primary purpose of computing.

    In closing, getting it wrong fast is not superior to getting it right a bit slower. Accuracy and confidence in detection are paramount in CEP, not speed and latency.

    Yours sincerely, Tim
