When I talk to SMBs about their use of public cloud services, it’s a no-brainer: pay as you go, lower upfront costs, quick time-to-market. With private and public cloud solutions, both big and small companies benefit.
So what about big data platforms? Is there an equivalent opportunity, or is Big Data only suited for Big Companies with Big Problems?
I think Big Data applies to all, and here’s why.
OLTP & DSS Era
When I first started working on Teradata’s next-generation switch fabric, the BYNET, back in the early ’90s, Teradata was around $250M in revenue (now over $2B). The concept of decision support systems (DSS) evolved into data warehousing and grew to become the largest pool of enterprise data (Big Data) for enterprises that had a large enough business to create meaningful amounts of data and the money to invest in their data infrastructure.
Operational systems, known as online transaction processing systems (OLTP), were built on top of smaller data infrastructure powering transaction-oriented applications. Oracle owns this space much like Teradata owns the data warehousing space. Unlike the DSS/Data warehouse space, smaller companies benefited from the use of these operational systems, much like their larger competition.
Analytic Appliance Era
Then came the era of analytic appliances. In the mid-1990s I became involved in an effort to lead in-database analytics (powering further innovations in OLAP and Data Mining). With the vision of pushing the analysis of the data closer to the data itself, many followed suit, and companies like Netezza, Greenplum, ParAccel, AsterData, Kickfire, Vertica, and others entered the market, addressing the need to provide rapid analysis of data volumes scaling into petabytes. The key words here are “rapid” (or real-time), “analysis” (or analytics), and “petabytes” (big data).
Why didn’t the incumbents like Teradata and Oracle seize the opportunity here? Lots of reasons: politics, the inability to respond quickly to changing market dynamics, etc.
Big Data Era
A number of things led to the creation and adoption of the Hadoop/MapReduce framework, one of them being the need for a frictionless data playground where data scientists could simply investigate and discover.
Born of large-scale early adopters such as Google and Yahoo!, Hadoop/MR is now well-positioned to address the needs of medium and small-sized companies.
I argue that we will see the following evolution of Big Data technologies, which originated from the large web-scale companies like Yahoo!, LinkedIn, Twitter, and the like:
- Hardening of the Hadoop ecosystem
- Broad integration with existing toolsets (e.g. BI)
- Real-time enablement
- Further cloud-enablement
- Clear application use-cases / offerings across verticals
This is an obvious exaggeration to emphasize my point that we may see a shift of dollars to new emerging players. But more importantly, the pie will grow to include the creation of new data infrastructure market share for SMBs due to the innovations in the Big Data space.
Companies like Teradata will benefit from new revenues from acquisitions like AsterData, and integration with Hadoop. Companies like Oracle will also benefit from integration with Hadoop.
However, the potentially larger opportunity will be for new startups who are not tied purely to the needs of the Fortune 3,000 and can quickly tap into the burgeoning market of smaller companies seeking data analytics solutions.
New players who can appreciate the needs of these smaller clients and leverage the product of Silicon Valley’s large web-scale companies (built on commodity hardware and open source software) will be able to capture a large, growing, untapped, data-driven market.
Just take a look at some of the events thus far:
- Teradata-Hortonworks Partnership to Accelerate Business Value from Big Data Technologies
- Oracle, Cloudera unveil Hadoop appliance
- IBM has InfoSphere BigInsights, BigSheets, and Streams as their Big Data offering
- HP pulls together Autonomy (algorithms) and Vertica (data warehousing/analytics) to address Big Data with their IDOL server
- Dell has already partnered with Cloudera for their Hadoop solution (CDH 3)
- Cisco provides UCS servers, Nexus switches, and Nexus fabric extenders on which to run your Hadoop distribution. Again, hats off to Cloudera for the certification on UCS
- EMC continues to drive its Big Data strategy through its BI analytics platform with Greenplum and a Hadoop extension
…and then ask yourself this: do you hear the sucking sound? That’s the sound of data coming out of traditional data stores into Hadoop data stores. Yes, there will be some information coming from Hadoop back into Data Warehouses, but where’s the “single source of truth” or “entire view of the customer” going to be in the long run? Just take a look at the new Big Data Warehouse.