Clouding and Confusing the CEP Community

April 20, 2008

Ironically, our favorite software vendors have decided, in a nutshell, to redefine Dr. David Luckham’s definition of “event cloud” to match the lack of capabilities in their products.

This is really funny, if you think about it. 

The definition of “event cloud” was coordinated over a long period (more than two years) with the leading vendors in the event processing community and is based on the same concepts found in David’s book, The Power of Events.

But, since the stream-processing-oriented vendors do not yet have the analytical capability to discover unknown causal relationships in contextually complex data sets, they have chosen to reduce and redefine the term “event cloud” to match their products’ lack of capability.  Why not simply admit they can only process a subdomain of the CEP space as defined by both Dr. Luckham and the CEP community-at-large?

What’s the big deal?   Stream processing is a perfectly respectable profession!

David, along with the “event processing community,” defined the term “event cloud” as follows:

Event cloud: a partially ordered set of events (poset), either bounded or unbounded, where the partial orderings are imposed by the causal, timing and other relationships between the events.

Notes: Typically an event cloud is created by the events produced by one or more distributed systems. An event cloud may contain many event types, event streams and event channels. The difference between a cloud and a stream is that there is no event relationship that totally orders the events in a cloud. A stream is a cloud, but the converse is not necessarily true.

Note: CEP usually refers to event processing that assumes an event cloud as input, and thereby can make no assumptions about the arrival order of events.
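
To make the distinction concrete, here is a minimal sketch (in Python, with invented event names) contrasting a stream, which is totally ordered by timestamp, with a cloud, where only some events are related by known causal links:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Event:
    id: str
    timestamp: float                                         # occurrence or arrival time
    causes: frozenset = field(default_factory=frozenset)     # ids of known causal predecessors

# A stream: every pair of events is comparable by timestamp (a total order).
stream = [Event("tick1", 1.0), Event("tick2", 2.0), Event("tick3", 3.0)]

# A cloud: only some pairs are related, here by known causal links (a partial order).
cloud = [
    Event("order_placed", 1.0),
    Event("price_spike", 1.2),                                      # unrelated to the order
    Event("order_filled", 2.0, causes=frozenset({"order_placed"})),
]

def causally_related(a: Event, b: Event) -> bool:
    """True if one event is a known direct cause of the other."""
    return a.id in b.causes or b.id in a.causes

# 'price_spike' and 'order_filled' are not ordered by any causal relationship,
# even though their timestamps differ -- no total order can safely be assumed.
print(causally_related(cloud[1], cloud[2]))    # False
```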

Oddly enough, quite a few event processing vendors seem to have succeeded at confusing their customers, as is evident in this post, Abstracting the CEP Engine, where a customer has seemingly been convinced by the (disinformational) marketing pitches – “there are no clouds of events, only ordered streams.”

I think the problem is that folks are not comfortable with uncertainty and hidden causal relationships, so they give the standard “let’s run a calculation over a stream” example and state “that is all there is…” confusing the customers who know there is more to solving complex event processing problems.

So, let’s make this simple (we hope), referencing the invited keynote at DEBS 2007, Mythbusters: Event Stream Processing Versus Complex Event Processing.

In a nutshell…. (these examples are in the PDF above, BTW)

The set of market data from Citigroup (C) is an example of multiple “event streams.”

The set of all events that influence the NASDAQ is an “event cloud”.

Why?

Because a stream of market data is a linearly ordered set of data, related by the timestamp of each transaction and linked (relatively speaking) in context because it is all Citigroup market data.  So, event processing software can process a stream of market data, perform a VWAP calculation if it chooses, and estimate a good time to enter and exit the market.  This is “good”.
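
For readers who like to see the “good” case in code, here is a minimal sketch of a VWAP over a sliding time window of a single trade stream. The field names, window length and sample trades are illustrative assumptions on my part, not any vendor’s API:

```python
from collections import deque

class SlidingVwap:
    """Volume-weighted average price over a sliding time window (in seconds)."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self.trades = deque()          # (timestamp, price, volume)
        self.pv_sum = 0.0              # running sum of price * volume
        self.vol_sum = 0.0             # running sum of volume

    def on_trade(self, ts: float, price: float, volume: float) -> float:
        # Add the new trade, then evict trades that have fallen out of the window.
        self.trades.append((ts, price, volume))
        self.pv_sum += price * volume
        self.vol_sum += volume
        while self.trades and self.trades[0][0] < ts - self.window:
            old_ts, old_price, old_vol = self.trades.popleft()
            self.pv_sum -= old_price * old_vol
            self.vol_sum -= old_vol
        return self.pv_sum / self.vol_sum if self.vol_sum else 0.0

# Hypothetical Citigroup (C) trades: (timestamp, price, volume)
vwap = SlidingVwap(window_seconds=60)
for ts, price, vol in [(0, 23.10, 500), (20, 23.15, 300), (90, 23.05, 400)]:
    print(f"t={ts:>3}  VWAP={vwap.on_trade(ts, price, vol):.4f}")
```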

However, the same software, at this point in time, cannot process the many market data feeds in the NASDAQ and provide a reasonable estimate of why the market moved in a certain direction, based on a statistical analysis of a large set of event data where the cause-and-effect features (in this case, relationships) are difficult to extract.  (BTW, this is generally called “feature extraction” in the scientific community.)

Why?

Because the current state of the art in stream-processing-oriented event processing software does not perform the backward chaining required to infer causality from large sets of data where causality is unknown, undiscovered and uncertain.
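
To illustrate what backward chaining even means in this context, here is a toy sketch that walks a set of invented causal rules backwards from a goal (“why did the index drop?”) to observed events. The rules, events and names are all hypothetical; real causal discovery over large, uncertain event sets is enormously harder than this:

```python
# Invented causal rules: conclusion <- alternative lists of premises (purely illustrative).
RULES = {
    "index_drop":      [["sector_selloff"], ["liquidity_shock"]],
    "sector_selloff":  [["bank_downgrade", "heavy_sell_volume"]],
    "liquidity_shock": [["funding_freeze"]],
}

# Events actually observed in the (hypothetical) cloud.
OBSERVED = {"bank_downgrade", "heavy_sell_volume"}

def explain(goal, seen=frozenset()):
    """Backward-chain from a goal to observed events; return one supporting chain, or None."""
    if goal in OBSERVED:
        return [goal]
    if goal in seen:                          # guard against cycles in the rule base
        return None
    for premises in RULES.get(goal, []):
        chain = []
        for p in premises:
            sub = explain(p, seen | {goal})
            if sub is None:
                break
            chain.extend(sub)
        else:                                 # every premise in this alternative was explained
            return chain + [goal]
    return None

print(explain("index_drop"))
# ['bank_downgrade', 'heavy_sell_volume', 'sector_selloff', 'index_drop']
```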

Forward chaining, continuous queries and time series analytics across sliding time windows of streaming data can only cover a subset of the overall CEP domain as defined by Dr. Luckham et al.

It is really that simple.   Why cloud and confuse the community?

We like forward chaining using continuous queries and time series analysis across sliding time windows of streaming data. 

  • There is nothing dishonorable about forward chaining using continuous queries and time series analysis across sliding time windows of streaming data.   
  • There is nothing wrong with forward chaining using continuous queries and time series analysis across sliding time windows of streaming data. 
  • There is nothing embarrassing about forward chaining using continuous queries and time series analysis across sliding time windows of streaming data. 

Forward chaining using continuous queries and time series analysis across sliding time windows of streaming data is a subset of the CEP space, just like the definition above, repeated below:

The difference between a cloud and a stream is that there is no event relationship that totally orders the events in a cloud. A stream is a cloud, but the converse is not necessarily true.

It is really simple.   Why cloud a concept so simple and so accurate?


Bankers Voice Scepticism Over New Event Processing Technologies

November 28, 2007

This week I completed a presentation on complex event processing at Wealth Management Asia 2007 where I had a chance to field some tough questions from risk management experts working for some of the top banks in the region.

In particular, one of the meeting attendees voiced strong scepticism over emerging event processing technologies.   The basis for his scepticism was, in his words, that the other “65 systems” the bank had deployed to detect fraud and money laundering (AML) simply did not work.  He specifically referenced Mantas as one of the expensive systems that did not meet the bank’s requirements.

My reply was that one of the advantages of emerging event processing platforms is the “white box” ability to add new rules, or other analytics, “on the fly” without the need to go back to the vendor for another expensive upgrade. 

Our friend the banker also mentioned the huge problem of “garbage-in, garbage-out” where the data for real-time analytics is not “clean enough” to provide confidence in the processing results. 

I replied that this is always the problem with stand-alone detection-oriented systems that do not integrate with each other, for example his “65 systems problem.”    Event processing solutions must be based on standards-based distributed communications, for example a high speed messaging backbone or distributed object caching architecture, so enterprises may correlate the output of different detection platforms to increase confidence.   Increasing confidence, in this case, means lowering false alarms while, at the same time, increasing detection sensitivity.
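
As a sketch of that correlation idea (detector names, scores and thresholds are all hypothetical), imagine each stand-alone system publishing a score per account, with a correlator that only raises an alert when independent detectors corroborate each other; that is the mechanism that lowers false alarms without giving up sensitivity:

```python
# Hypothetical per-account scores (0..1) published by three stand-alone detection systems.
detector_scores = {
    "acct-1001": {"aml_engine": 0.92, "fraud_rules": 0.15, "txn_anomaly": 0.88},
    "acct-1002": {"aml_engine": 0.95, "fraud_rules": 0.05, "txn_anomaly": 0.10},
}

CORROBORATION = 2      # require at least two independent detectors to agree
THRESHOLD = 0.8        # per-detector alert threshold (illustrative)

def correlate(scores: dict) -> list:
    """Raise an alert only when enough detectors independently exceed the threshold."""
    alerts = []
    for account, by_detector in scores.items():
        hits = [name for name, s in by_detector.items() if s >= THRESHOLD]
        if len(hits) >= CORROBORATION:
            alerts.append((account, hits))
    return alerts

# acct-1001 is corroborated by two detectors; acct-1002 is a likely false alarm.
print(correlate(detector_scores))   # [('acct-1001', ['aml_engine', 'txn_anomaly'])]
```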

As I have learned over a long 20-year career in IT consulting, the enemy of the right approach to solving a critical IT problem is the trail of previous failed solutions.   In this case, a long history of expensive systems that do not work as promised is creating scepticism over the benefits of CEP.


COTS Software Versus (Hard) Coding in EP Applications

November 21, 2007

Opher Etzion has kindly asked me to write a paragraph or two on commercial-off-the-shelf  (COTS) software versus (hard) coding software in event processing applications. 

My thoughts on this topic are similar to my earlier blog musings, Latency Takes a Back Seat to Accuracy in CEP Applications.

If you buy an EP engine (today) because it permits you to run some quick-and-dirty (rule-based) analytics against a stream of incoming events, and you can do this quickly without incurring considerable software development costs, and the learning and implementation curves for the COTS product are relatively low, this could be a good business decision, obviously.   Having a software license for an EP engine that permits you to quickly develop and change analytics, variables and parameters on the fly is useful.

On the other hand, the current state of many CEP platforms, and their declarative programming and modelling capabilities, is that they focus on if-then-else, Event-Condition-Action (ECA), rule-based analytics.  Sophisticated processing requires more functionality than just ECA rules, because most advanced detection-oriented applications are not just ECA solutions.
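
To make the ECA point concrete, here is a toy sketch of what an ECA rule pipeline boils down to (event types, conditions and actions are invented for the example); anything beyond this if-then-else shape, such as statistical inference across many related events, is exactly where ECA-only engines run out of road:

```python
# A toy Event-Condition-Action rule: (event type, condition on the event, action).
def large_withdrawal(evt):            # Condition
    return evt["amount"] > 10_000

def flag_for_review(evt):             # Action
    print(f"flagging {evt['account']}: withdrawal of {evt['amount']}")

ECA_RULES = [("withdrawal", large_withdrawal, flag_for_review)]

def on_event(evt):
    """Dispatch an incoming event through the ECA rule set."""
    for event_type, condition, action in ECA_RULES:
        if evt["type"] == event_type and condition(evt):
            action(evt)

on_event({"type": "withdrawal", "account": "acct-7", "amount": 25_000})
```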

For many classes of EP applications today, writing code may still be the best way to achieve the results (accuracy, confidence) you are looking for, because CEP software platforms have not yet evolved into plug-and-play analytical platforms providing a wide range of sophisticated analytics in combination with quality modelling tools for all business users and their advanced detection requirements.

For this reason, and others which I don’t have time to write about today, I don’t think we can make blanket statements such as “CEP is about using engines versus writing programs or hard-coding procedures.”   Event processing, in the context of specific business problems, is the “what”, and using a CEP/EP modelling tool and execution engine is only one of the possible “hows” in an event processing architecture.

As we know, CEP/EP engines, and the marketplace for using them, are still evolving and maturing; hence, there are many CEP/EP applications, today and in the foreseeable future, that require hard coding to meet performance objectives, when performance is measured by actual business-decision results (accuracy).

Furthermore, as many of our friends point out, if you truly want the fastest, lowest latency possible, you need to be as close to the “metal” as possible, so C and C++ will always be faster than Java bytecode running in a sandbox that is itself written in C or C++.

And, as you (Opher) correctly point out, and as most of the end users we talk to confirm, they do not process megaevents per second on a single platform, so this is a marketing red herring.  This brings us back to our discussions on the future of distributed object caching, grid computing and virtualization – so I’ll stop and go out for some fried rice and shrimp, and some cold Singha beer.


Clustered Databases Versus Virtualization for CEP Applications

November 16, 2007

In my earlier post, A Model For Distributed Event Processing, I promised to address grid computing, distributed object caching and virtualization, and how these technologies relate to complex event processing.   Some of my readers might forget my earlier roots in networking if I continue to talk about higher level abstractions!  So, in this follow-up post I will discuss virtualization relative to database clustering.

In typical clustered database environments there are quite a few major performance constraints.  These constraints limit our capability to architect and design solutions for distributed, complex, cooperative event processing problems and scenarios.  Socket-based interprocess communication (IPC) within database clusters creates a performance limitation: low bandwidth, high latency, and processing overhead.

In addition, the communications performance between the application layer and the database layer can be limited by both TCP and operating system overhead.  To make matters worse, hardware input-output constraints limit scalability when connecting database servers to disk storage.   These are standard distributed computing constraints.

The physical architecture to address scalability in emerging distributed CEP solutions requires a low-latency network communications infrastructure (sometimes called a fabric).  This simply means that event processing agents (EPAs) require virtualization technologies such as Remote Direct Memory Access (RDMA).  CEP agents (often called CEP engines) should have the capability to write data directly into the memory spaces of a CEP agent fabric (sometimes called an event processing network, EPN).   This is the concept of shared memory as an IPC mechanism in UNIX-based systems, applied to distributed computing, so all “old hat” UNIX systems engineers will easily grok it.
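
For the “old hat” crowd, the shared-memory analogy can be sketched in a few lines. The example below uses Python’s multiprocessing shared memory purely as a stand-in for what an RDMA-backed fabric does across machines: one agent writes an event record directly into memory another agent reads, with no socket copy in between:

```python
from multiprocessing import Array, Process

def producer_agent(buf):
    # One agent writes an "event" record directly into memory the other agent maps.
    buf.value = b"tick!"

if __name__ == "__main__":
    # The shared buffer stands in for the fabric's shared memory region.
    buf = Array("c", 64)                 # 64 bytes of shared memory
    p = Process(target=producer_agent, args=(buf,))
    p.start()
    p.join()
    print(buf.value)                     # b'tick!' -- read with no socket hop in between
```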

RDMA virtualization helps improve performance by bypassing operating-system and TCP overhead, resulting in significantly higher bandwidth and lower latency in the EPF (Event Processing Fabric – I just minted a new three-letter acronym, sorry!).  This, in turn, improves the communication speed between event processing agents in an event processing network (EPN), or EPF (depending on your taste in acronyms).

Scheduling tasks such as distributed semaphore checking and lock management can also operate more efficiently and with higher performance.    Distributed table scans, decision tree searches, rule-engine evaluations, and Bayesian and neural analytics can all be performed in parallel, dramatically improving both the performance and scalability of distributed event processing applications.
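
A rough sketch of that fan-out (the analytics below are simple placeholders, not real CEP operators): independent evaluations are submitted to a process pool and run in parallel rather than one after another:

```python
from concurrent.futures import ProcessPoolExecutor

# Placeholder analytics -- each stands in for a table scan, rule evaluation, etc.
def rule_engine_pass(events):
    return sum(1 for e in events if e > 0.9)       # e.g. count of threshold breaches

def moving_average(events):
    return sum(events) / len(events)

def max_drawdown(events):
    peak, worst = events[0], 0.0
    for x in events:
        peak = max(peak, x)
        worst = max(worst, peak - x)
    return worst

if __name__ == "__main__":
    events = [0.2, 0.95, 0.4, 0.99, 0.1, 0.7]
    analytics = [rule_engine_pass, moving_average, max_drawdown]
    # Fan the independent analytics out in parallel instead of running them serially.
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(fn, events) for fn in analytics]
        for fn, fut in zip(analytics, futures):
            print(fn.__name__, fut.result())
```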

In addition, by adopting transparent protocols with existing socket APIs, the CEP architect can bypass both operating-system and TCP protocol overhead.   In other words, communications infrastructures for CEP that optimize networking, interprocess communications, and storage provide architects with the underlying tools to build better solutions to computationally complex problems.

Many of the communications constraints of earlier distributed architectures for solving complex problems, such as blackboard architectures, can be mitigated with advances in virtualization.  So, in a nutshell, virtualization technologies are one of the most important underlying capabilities required for distributed, high-performance CEP applications, in my opinion.

The article, Virtualization hot; ITIL, IPv6 not,  appears to indicate that some of the top IT managers at Interop New York might agree with me.  

Unfortunately for a few software vendors, virtualization threatens to dilute their market share for ESB and message bus sales.  (OBTW, SOA is DOA.)   “Old hat” UNIX system programmers will recall how the UNIX IPC called “message queues” lost favor to sockets, pipes and shared memory.   A similar trend is happening in the virtualization world, with RDMA as a distributed shared memory technology versus message-based communications technologies.  I will opine more on this topic later.


Event Cloud Computing – IBM Turning Data Centers Into ‘Computing Cloud’

November 15, 2007

I predict we may see fewer debates on the use of the term “event cloud” in relation to CEP in the future, now that both IBM and Google have made announcements about “cloud computing” and the “computing cloud”: IBM Turning Data Centers Into ‘Computing Cloud’

“The initiative also builds on IBM’s announcement with Google last month that they are developing cloud computing environments for academic use.  But today’s announcement is aimed more widely at corporations and government users that want the extreme scale made possible by lashing together pools of computers.”

This also furthers the notion and architecture we have been advocating: that event processing will move toward a distributed, cooperative computing model.


A Model For Distributed Event Processing

November 1, 2007

In my last post, Analytical Patterns for Complex Event Processing, I provided an overview of a few slides I presented in March 2006 at the first event processing symposium, titled Processing Patterns for Predictive Business.  In that same presentation (slide 15), I also introduced a generic high-level architecture (HLA) for event processing, shown in the illustration below:

HLA for Distributed Event Processing

The figure above is an application of historical work, from more than a decade ago, on distributed blackboard architectures, applied to complex event processing.

The genesis of the blackboard architectural concept was in the field of Artificial Intelligence (AI) to address issues of information sharing among multiple heterogeneous problem-solving agents.   As the name suggests,  the term “blackboard architecture” is a processing metaphor where intelligent agents collaborate around a blackboard to solve a complex problem.

Basically, the HLA consists of two key functional elements: (1) a distributed data set (the “blackboard”) and (2) “knowledge sources” (KSs) that function as (self-actuated or orchestrated) intelligent agents working together to solve complex problems.
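
A minimal sketch of the pattern (the knowledge sources and data are invented for illustration): each KS watches the shared data set, contributes a partial result when it can, and the loop ends when one of them posts a solution or nobody can contribute anything new:

```python
# Minimal blackboard sketch: knowledge sources cooperate via a shared data set.
def ks_filter_trades(board):
    # Contribute the subset of raw events this KS understands.
    if "raw_events" in board and "trades" not in board:
        board["trades"] = [e for e in board["raw_events"] if e["type"] == "trade"]

def ks_summarize(board):
    # A second KS builds on the first one's contribution.
    if "trades" in board and "solution" not in board:
        board["solution"] = f"{len(board['trades'])} trades observed"

KNOWLEDGE_SOURCES = [ks_filter_trades, ks_summarize]

def run_blackboard(board):
    """Offer the board to each KS until one posts a solution or nobody can contribute."""
    while "solution" not in board:
        before = dict(board)
        for ks in KNOWLEDGE_SOURCES:
            ks(board)
        if board == before:              # no KS contributed anything new -- stop
            break
    return board.get("solution")

raw = [{"type": "trade", "px": 23.1}, {"type": "quote", "bid": 23.0}, {"type": "trade", "px": 23.2}]
print(run_blackboard({"raw_events": raw}))   # -> '2 trades observed'
```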

When I was first introduced to Dr. David Luckham’s book, The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, I immediately understood that distributed (complex) event processing, which includes an event processing network (EPN) and collaborative distributed event processing agents (EPAs), follows the same generic architectural pattern as other distributed, collaborative problem-solving software architectures. This also made perfect sense to me considering Dr. Luckham’s strong background in AI at Stanford.

In a nutshell, in my mind, “CEP engines” should operate as intelligent agents collaborating to solve complex distributed computing problems.   Professionally, I have a much stronger interest in collaborative, distributed, agent-based network computing than in stand-alone event processing.

Exciting complementary technologies for complex event processing are distributed object caching and grid computing, which I will discuss in more detail in a later post.  Together, these architectures, analytics and technologies help paint a total picture of the future of event processing, at least in my mind.