

Ad Hoc Queries with Big Data or Small Data?


Do you think you’re working with “Big Data”, or is it “Small Data”? If you’re asking ad hoc questions of your data, you’ll probably need something that supports “query-response” performance, in other words “near real-time”. We’re not talking about batch analytics here, but about interactive, iterative analytics. Think NoSQL, or “near real-time Hadoop” with technologies like Impala. Here’s my view of Big versus Small for ad hoc analytics.

Ad Hoc Analytics: Small Data versus Big Data

Data Volume
  Small Data: Megabytes – Gigabytes
  Big Data: Terabytes (1-100TB)

Data Velocity
  Small Data: Update in near real-time (seconds)
  Big Data: Update in real-time (milliseconds)

Data Variety
  Small Data: 1-6 structured data sources
  Big Data: 6+ structured AND 6+ unstructured data sources

Data Models
  Small Data: Aggregations with tens of tables
  Big Data: Aggregations with hundreds to thousands of tables

Business Functions
  Small Data: One line of business (e.g. sales)
  Big Data: Several lines of business, up to a 360 view

Business Intelligence
  Small Data: Queries are simple, covering basic transactional summaries and reports. Response times are in seconds across a handful of business analysts.

  Example: retrieve a customer’s profile and summarize their overall standing based on current market values for all assets.

  This is representative of the work performed when a business asks the question “What is my customer worth today?”

  The transaction is read-only. Questions vary based on what the business analyst needs to know interactively.

  Big Data: Queries can be as complex as with batch analytics, but generally are still read-only and processed against aggregates. Queries span business functions. Response times are in seconds across large numbers of business analysts.

  Example: retrieve a customer profile and summarize activities across all customer touch points, calculating “Life-Time-Value” based on past and current activities.

  This is representative of the work performed when a business asks the question “Who are my most profitable customers?”

  Questions vary based on what the business analyst needs to know interactively.
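To make the Small Data example concrete, here is a minimal sketch of the “What is my customer worth today?” question as a single read-only aggregate query. It uses Python’s built-in sqlite3 module with an in-memory database purely for illustration. The table names (holdings, market_prices), the column names, and the sample figures are all hypothetical and do not come from any particular product.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE holdings (customer_id INTEGER, asset TEXT, quantity REAL);
        CREATE TABLE market_prices (asset TEXT PRIMARY KEY, price REAL);
        INSERT INTO holdings VALUES (1, 'AAA', 10), (1, 'BBB', 5), (2, 'AAA', 2);
        INSERT INTO market_prices VALUES ('AAA', 20.0), ('BBB', 8.0);
        """
    )
    return conn


def customer_worth(conn, customer_id):
    """Read-only query: sum quantity * current price over one customer's assets."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(h.quantity * p.price), 0.0)
        FROM holdings AS h
        JOIN market_prices AS p ON p.asset = h.asset
        WHERE h.customer_id = ?
        """,
        (customer_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    # Customer 1 holds 10 x AAA at 20.0 and 5 x BBB at 8.0
    print(customer_worth(conn, 1))
```

The query only reads and aggregates, which matches the read-only, seconds-scale profile described above. A Big Data version of the same question would join many more sources, but the shape of the interaction (ask, get an answer, ask again) stays the same.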

Want my view on Batch Analytics? Look here.

Want my view on Real-time analytics? Look here.

Here are a few products in this space:

Posted in Big Data.




Continuing the Discussion

  1. Big Data versus Small Data – Jim Kaskade linked to this post on May 9, 2013

    [...] Want to see my view on ad hoc and interactive analytics? Go here. [...]

  2. Real-time Big Data or Small Data? – Jim Kaskade linked to this post on May 9, 2013

    [...] Want to see my view of Ad Hoc Analytics? Go Here. [...]

  3. Ad Hoc Queries with Big Data or Small Data? – Jim Kaskade | Sykes' Blog linked to this post on July 23, 2013

    [...] See on jameskaskade.com [...]
