Big Data PaaS?
Cloud-based PaaS is pretty high on the hype curve. I’ve been of the opinion that we’ll begin to see vertical PaaS offerings as enterprises come to understand how much a platform can accelerate application development. So, to expand on that idea, how about a Big Data PaaS?
As many will agree, Big Data was originally driven by the need to discover. Yes, there are many practical examples of using the Hadoop framework in “operational” applications. However, we could argue that many of these production applications were born out of discovery sandbox initiatives.
If we’re going to support the many data scientists and their application developer counterparts in an even more experimental, data-driven enterprise, we had better pay attention to the roots of the Hadoop framework.
Giving the enterprise a way to build its own internal sandbox applications from Hadoop building blocks will require the talent of a Hadoop-savvy team.
Ideally, existing staff should be trained and given simple application development environments that support the use of unstructured data technologies.
With an internal deployment of a Big Data PaaS, the organization gets a full, turnkey stack that is not too dissimilar from what Hortonworks, Cloudera, etc. offer, but with perhaps one compelling difference.
The early Hadoop-centric vendors could be compared to the “IaaS providers” in the world of Cloud (e.g. Eucalyptus, Nimbula, Surgient, OpenStack, Enomaly, Cloud.com, etc.). These vendors focus on infrastructure and less on the application developer, and so less on the next level in the stack, the one that completes the PaaS layer.
The compelling difference with a PaaS is that it includes a comprehensive suite of services for the app-dev teams and a robust API that can be expanded as services are developed and added to the platform. It can include a number of other application-centric services, such as:
- Application lifecycle management
- Application-level monitoring/management
- Application metering
- Entitlement management
- Authentication/Authorization
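To make the idea of an expandable platform API a bit more concrete, here is a rough Python sketch. Everything in it is hypothetical and invented for illustration: the `Platform` class and its `register`, `entitle`, and `call` methods are not the API of any real product. It assumes the simplest possible model, in which app-dev teams call named services through one entry point, and the platform checks entitlements and meters usage on the way through.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Set


@dataclass
class Platform:
    """Hypothetical PaaS facade: apps call services by name, never the cluster."""

    services: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    entitlements: Dict[str, Set[str]] = field(default_factory=dict)
    usage: Dict[str, int] = field(default_factory=dict)

    def register(self, service: str, handler: Callable[..., Any]) -> None:
        # New services can be added to the platform as they are developed.
        self.services[service] = handler

    def entitle(self, app: str, service: str) -> None:
        # Entitlement management: record which app may use which service.
        self.entitlements.setdefault(app, set()).add(service)

    def call(self, app: str, service: str, *args: Any, **kwargs: Any) -> Any:
        # Authorization check before anything reaches the infrastructure.
        if service not in self.entitlements.get(app, set()):
            raise PermissionError(f"{app} is not entitled to {service}")
        if service not in self.services:
            raise KeyError(f"unknown service: {service}")
        # Application metering: count each call per app.
        self.usage[app] = self.usage.get(app, 0) + 1
        return self.services[service](*args, **kwargs)
```

A sandbox team would then register a service once, for example `platform.register("word_count", lambda text: len(text.split()))`, entitle an app to it, and call it by name, without ever touching the cluster underneath.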
The end-goal? Abstracting the infrastructure and reducing time-to-market for new BI applications.
Some questions to ponder:
- Is the market ready for a turnkey Big Data platform offering? Is it too early for a Big Data PaaS?
- Is Big Data PaaS just a flavor of Private PaaS?
- What kind of hybrid Big Data architectures will we see?
- Do we first need to see Big Data “killer apps”? The killer use-case in the private Cloud IaaS market was/is “test and dev clouds”, which are essentially sandboxes.
- Production applications are born out of the sandbox cloud environments, and companies are moving to support the development team to facilitate productization. Are we going to see the same in Big Data?
What are your thoughts?