Rules Engines and Bayes’ Theorem

Charles Young kindly calls out my blog post, Bending Rules for CEP, and the discussion on rules and Bayesian analytics in his post, Rules Engines and Bayes’ Theorem.

FWIW, implementing a very simple Bayes network with a rules engine, as in Charles’ interesting example, does little to counter the argument that rules engines are not an efficient implementation for larger Bayes nets and more complex data sets. Implementing a somewhat trivial solution is not a convincing way to make a point that does not hold up against the state of the art in Bayesian networks for complex problems. The complexity and inefficiency of rule-based systems come with larger data sets, a long-standing problem with rules.

If rule-based systems were efficient for large data sets, complex spam filters would use rule-based systems, but they don’t. Most modern fraud detection algorithms are implemented with Bayesian algorithms (not rules). Few, if any, credible large companies use rules for these classes of problems anymore – they all use specially formulated Bayesian algorithms.
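
To make the contrast concrete, here is a minimal sketch of the kind of Bayesian scoring a spam filter might perform. It is an illustration only, not any particular product's algorithm: the function name, the token likelihoods, and the 0.5 prior are all invented for the example, and it relies on the usual naive assumption that tokens are independent given the class.

```python
import math

def spam_posterior(tokens, p_token_given_spam, p_token_given_ham, prior_spam=0.5):
    """Return P(spam | tokens) under the naive independence assumption."""
    log_spam = math.log(prior_spam)
    log_ham = math.log(1.0 - prior_spam)
    for tok in tokens:
        # Skip tokens for which we have no likelihood estimate in both classes.
        if tok in p_token_given_spam and tok in p_token_given_ham:
            log_spam += math.log(p_token_given_spam[tok])
            log_ham += math.log(p_token_given_ham[tok])
    # Normalise in log space so long messages do not underflow.
    shift = max(log_spam, log_ham)
    spam = math.exp(log_spam - shift)
    ham = math.exp(log_ham - shift)
    return spam / (spam + ham)

# Hypothetical likelihoods, invented purely for illustration.
P_TOKEN_GIVEN_SPAM = {"winner": 0.30, "free": 0.20, "meeting": 0.01}
P_TOKEN_GIVEN_HAM = {"winner": 0.001, "free": 0.02, "meeting": 0.10}

if __name__ == "__main__":
    for message in (["free", "winner"], ["meeting"]):
        score = spam_posterior(message, P_TOKEN_GIVEN_SPAM, P_TOKEN_GIVEN_HAM)
        print(message, round(score, 4))
```

Adding a new token here only means adding another likelihood estimate; no new rule has to be written and matched for each combination of tokens.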

In fact, in the late 90s at Langley Air Force Base we soon discovered the same problem dealing with massive distributed email bomb attacks on the Internet (one example reference, also see Popular Science article, WAR.COM, by Frank Vizard).  After documenting our rule-based approach, subsequent researchers and implementations all commented that a rules-based approach is primitive (paraphrased) compared to modern Bayesian techniques.

These comments are not designed, BTW, to disparage rules or rule-based systems.  Rules are great, but they are not very efficient for large, complex problems.

I can provide more historical and current references on this topic if anyone is interested.

Note:

See Introduction to Bayesian Belief Nets by Russ Greiner, University of Alberta, and note the basic complexity in the example, Forecasting Oil Prices, on slide 31.

5 Responses to Rules Engines and Bayes’ Theorem

  1. Thanks. Let me be clear, though. I never suggested that Rules Engines are a natural or best choice for implementing full-blown Bayesian analysis. That would be a really silly argument. In much the same way, I would never suggest that full-blown CEP requirements can adequately be met simply by using Rules Engines, although I do think they are quite well suited to many straightforward event handling scenarios. All I was trying to show is that some of the statements made about Rules Engine technology are not accurate, and therefore do not serve the discussion well.

    The idea that there is some kind of fundamental impedance mismatch between rules processing and Bayesian analysis that means that any attempt to implement Bayes theorem in conjunction with a Rules Engine is either impossible, or doomed to inefficiency, is simply not true, any more than the apparent claims in the paper you referenced that Rules Engines cannot efficiently or naturally handle dependencies among uncertain beliefs. The example I used may be trivial, but it still demonstrates this point clearly.

    Would I want to implement sophisticated Bayesian analysis using a Rules Engine in order to solve very complex problems? No, of course not. I would want to use a toolset that has been explicitly designed for this purpose. I have no argument with you on that! However, that is similar to saying that I would not want to build everything from scratch in C++, Java or C#, either.

    Rules engines have been handling general purpose pattern matching requirements over data sets, both large and small, for a very long time. Over a quarter of a century, they have become very efficient, and many lessons have been learned about the higher level representation and management of large rule sets. All this overlaps heavily with CEP and other approaches, so let’s learn from each other (on this side of the fence, we clearly have a lot to learn about complex event processing), and let’s get the facts straightened out so that we can move on to promote synergy rather than sectarianism based on a correct understanding of the various technologies involved.

  2. Tim Bass says:

    Hi Charles,

    I do not agree with you that educating folks on where rule engines are optimized versus where neural networks or Bayesian networks are optimized is contributing to “sectarianism”. Educating people on the pros and cons of different technical approaches is our responsibility.

    Folks don’t implement complex detection or classification-oriented solutions using rules-based systems. There are many Bayesian tool kits in Java, C, C++ and other languages available; hence, there are few people who would build even the most complex (or simple) Bayes classifier “from scratch”.

    Rules are very complementary to most, if not all, classes of CEP applications. I am glad we agree that rules are not the “holy grail” of all CEP classes of problems.

  3. Yes, I do think there are strong grounds for more agreement than disagreement here. Personally, I would like to see further development of Rules Engines by taking on some of the lessons to be learned from the world of CEP. I think they can (and do) naturally serve a very useful role in servicing simpler event processing scenarios and that this can be extended. I also think that the rules processing approach could prove potentially highly complementary to the more advanced scenarios which CEP addresses. Likewise, as far as Bayesian analysis is concerned (and other approaches), my little foray into implementing Bayes theorem within a rule set suggests that these worlds are complementary, and that there really isn’t any fundamental barrier that prevents synergy by bringing these different worlds together. They can be joined at the hip, if it proves beneficial to do so.

  4. peter lin says:

    For what it’s worth, here is my experience with performing aggregations on real-time transactions. I don’t know if others would consider it within CEP/ESP domain, but it might be interesting for others.

    Back in 2003, I worked on an OMS (order management system) pre-trade compliance application. One of the things we needed to do was perform simple aggregations on 12-16 dimensions. When we tried to use the rule engine to reactively calculate the aggregates in real-time for 500K-1 million rows of data, it brought the rule engine to a crawl. The memory space required quickly became the limiting factor. With 1 million rows of data, and hundreds of thousands of aggregates, the system couldn’t handle it. My solution to the problem was to call out to Microsoft Analysis Services, which handled the aggregates with 2-3 cubes.

    Basically, I used rules to filter the events (aka transactions). Rather than blindly get all aggregates for a given transaction set of 500-20K, I used rules to filter and get just the aggregates I needed to perform compliance validation. The compliance rules were from 1940 Act and FSA regulations. Using a naive approach and fetching all aggregates that “could” be involved in the transaction set was about 5-10x slower than getting just the aggregates needed.

    We also had requirements like integrating with Tibco analytics. When rules are used to complement analytic tools, the solution is very powerful. When one considers cases like Stanford’s Stanley vehicle, combining analytics with rule engines does produce some very impressive results. Although Stanley used Kalman filters and not Bayesian filters, I think it still provides a strong example of how to integrate analytics with rule engines.

    peter

  5. Peter, I so strongly agree with you (yes, I really do agree with you, quite a lot of the time 🙂 ) The answer to managing complex requirements lies in the careful separation of concerns at the architectural level, and the application of appropriate technologies in appropriate fashions to those various concerns. Again, I wouldn’t naturally use a rules engine to implement complex Bayesian analytics.

    The point here surely is getting the synergy between rules-based programming and various forms of analytics. I don’t believe that there is any fundamental barrier to combining Bayesian analytics with rule processing. From the perspective of a rules developer, that is really quite exciting, because it means we can happily handle dependencies among uncertain beliefs. There are a range of architectural and implementation possibilities – everything from the integration of different specialised services that handle different concerns, the extension of existing systems using specialised toolsets through to the kind of code I wrote in Jess, implementing Bayes theorem directly and naturally in a rules-orientated language. It’s great to have that kind of range of possibilities available to us.
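
As a rough illustration of the idea discussed in these comments, the sketch below expresses Bayes' theorem as a single update that fires once for each matching fact, in the spirit of a forward-chaining rule. It is not the Jess code Charles wrote: the rule table, the fact names, and the probabilities are hypothetical, and applying the update one fact at a time assumes the pieces of evidence are conditionally independent given the hypothesis, which a full Bayesian network does not require.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E)."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / evidence

# Hypothetical rule table: (fact name, P(fact | H), P(fact | not H)).
RULES = [
    ("alarm", 0.9, 0.1),
    ("smoke", 0.8, 0.2),
]

def run_rules(facts, prior):
    """Fire every rule whose fact is asserted, updating the belief in H."""
    belief = prior
    for fact, p_given_h, p_given_not_h in RULES:
        if fact in facts:
            belief = bayes_update(belief, p_given_h, p_given_not_h)
    return belief

if __name__ == "__main__":
    print(run_rules({"alarm"}, prior=0.01))
    print(run_rules({"alarm", "smoke"}, prior=0.01))
```

A sketch like this shows why the trivial case is easy to express as rules. It says nothing about how the approach scales once the evidence variables depend on one another, which is the point in dispute above.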
