

Real-time Big Data or Small Data?


Have you heard of products like IBM’s InfoSphere Streams, Tibco’s Event Processing product, or Oracle’s CEP product? All are commercially available stream processing technologies that help you process events in real time.

I’ve been asked what I consider “Big Data” versus “Small Data” in this domain. Here’s my view.

Real-Time Analytics: Small Data versus Big Data

Data Volume
    Small Data: None
    Big Data: None

Data Velocity
    Small Data: 100K events / day (<<1K events / second)
    Big Data: Billion+ events / day (>>1K events / second)

Data Variety
    Small Data: 1-6 structured sources AND a single destination (an output file, a SQL database, a BI tool)
    Big Data: 6+ structured and 6+ unstructured sources AND many destinations (a custom application, a BI tool, several SQL databases, NoSQL databases, Hadoop)

Data Models
    Small Data: Used for “transport” mainly. Little to no ETL, in-stream analytics, or complex event processing is performed.
    Big Data: Transport is the foundation. However, distributed ETL and linearly scalable in-memory and in-stream analytics are applied, and complex event processing is the norm.

Business Functions
    Small Data: One line of business (e.g. financial trading)
    Big Data: Several lines of business, up to a 360 view

Business Intelligence
    Small Data: No queries are performed against the data in motion. This is simply a mechanism for transporting transactions or events from the source to a database. Transport times are <1 second. Example: connect to desktop trading applications and transport trade events to an Oracle database.
    Big Data: ETL, sophisticated algorithms, complex business logic, and even queries can be applied to the stream of events while they are in motion. Analytics span all data sources and, thus, all business functions. Transport and analytics occur in <1 second. Example: connect to desktop trading applications, market data feeds, and social media, and provide instantaneous trending reports. Allow traders to subscribe to information pertinent to their trades and have analytics applied in real time for personalized reporting.
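To make the Data Velocity figures concrete, here is a minimal sketch of the arithmetic behind them. The function names and the hard 1K events / second cutoff are my own illustrative assumptions; the comparison above uses "<<" and ">>", so treat the threshold as a rough dividing line, and note that the calculation assumes events arrive evenly across the day (real peaks will be higher).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def events_per_second(events_per_day: float) -> float:
    """Average event rate, assuming events are spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


def velocity_class(events_per_day: float,
                   threshold_per_second: float = 1_000.0) -> str:
    """Label a workload 'small' or 'big' by its average rate.

    The 1,000 events/second default is an assumed hard cutoff standing in
    for the fuzzier '<<1K' and '>>1K' wording in the comparison.
    """
    rate = events_per_second(events_per_day)
    return "big" if rate > threshold_per_second else "small"


small_rate = events_per_second(100_000)        # roughly 1.16 events/second
big_rate = events_per_second(1_000_000_000)    # roughly 11,574 events/second

print(f"100K events/day -> {small_rate:.2f} events/s "
      f"({velocity_class(100_000)})")
print(f"1B events/day   -> {big_rate:,.0f} events/s "
      f"({velocity_class(1_000_000_000)})")
```

On average, 100K events / day is only about one event per second, while a billion events / day is more than ten thousand per second, which is why the two sit so far on either side of the 1K events / second line.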

Want to see my view of Batch Analytics? Go Here.

Want to see my view of Ad Hoc Analytics? Go Here.

Here are a few other products in this space:

 

Posted in Big Data.



3 Responses


  1. bernhardttom says

    Missing in the CEP space are important players, in any order:
    Drools, Esper, WSO2, Tibco

Continuing the Discussion

  1. Ad Hoc Queries with Big Data or Small Data? – Jim Kaskade linked to this post on May 9, 2013

    [...] Want my view on Real-time analytics? Look here. [...]

  2. Big Data versus Small Data – Jim Kaskade linked to this post on May 9, 2013

    [...] Want to see my view on Real-Time Analytics? Go here. [...]



