SAP & Big Data
SAP customers are confused about the positioning between SAP Sybase IQ and SAP Hana as it applies to data warehousing. Go figure, so is SAP. You want to learn about their data warehousing offering, and all you hear is “Hana this” and “Hana that”.
It reminds me of the time after I left Teradata when the BI appliances came on the scene. First Netezza, then Greenplum, then Vertica and Aster Data, then ParAccel. Everyone was confused about what the BI appliance was in relation to the EDW. Do I need an EDW, a BI appliance, an EDW + BI appliance?
With SAP, Sybase IQ is supposed to be the data warehouse and Hana is the BI or analytic appliance that sits off to its side. Ok. SAP has a few customers on Sybase IQ, but are they the larger well-known brands? Lets face it….since its acquisition of Sybase in 2010, SAP has struggled with positioning it against incumbents like Teradata, IBM, and even Oracle.
SAP’s move from exploiting it’s leadership position in enterprise ERP to exploring the new BI appliance and Big Data markets has been impressive IMHO. With acquisitions of EDW and RDBMS company, Sybase, in 2010 after earlier acquisition of BI leader, Business Objects, in 2007 was necessary to be relevant in the race to providing an end-to-end data infrastructure story. This was; however, a period of “catch-up” or “late entry” to the race.
The beginning of its true exploration began with SAP Hana and now strategic partnership with Hadoop commercialization company, Hortonworks. The ability to rise ahead of Data Warehouse and database management system leaders will require defining a new Gartner quadrant – the Big Data quadrant.
SAP Product Positioning
Lets look back in time at SAP’s early positioning. We have the core ERP business, the new “business warehouse” business, and the soon to be launched Hana business. The SAP data warehouse equation is essentially = Business Objects + Sybase IQ + Hana. Positioning Hana, as with most data warehouse vendors, is a struggle since it can be positioned as a data mart within larger footprints, or as THE EDW database altogether in smaller accounts. One would think that with proper guidelines, this positioning would be straightforward. But there is more than database size, and complexity of queries, but a very challenging variable of customer organizational requirements and politics that play into platform choice. As shown above, you can tell that SAP struggled with simplifying its message for its sales teams early on.
SAP Hana – More than a BI Appliance
SAP released its first version of their in-memory platform, SAP HANA 1.0 SP02, to the market on June 21st 2011. It was (and is) based on an acquired technology from Transact In Memory, a company that had developed a memory-centric relational database positioned for “real-time acquisition and analysis of update-intensive stream workloads such as sensor data streams in manufacturing, intelligence and defense; market data streams in financial services; call detail record streams in Telco; and item-level RFID tracking.” Sound familiar to our Big Data use-cases today?
As with most BI appliances back then, customers spent about $150k for a basic 1TB configuration (SAP partnered with Dell) for the hardware only – add software and installation services and we were looking at $300K, minimally, as the entry point. SAP started off with either a BI appliance (HANA 1.0) or a BW Data Warehouse appliance (HANA 1.0 SP03). Both of these using the SAP IMDB Database Technology (SAP HANA Database) as their underlying RDBMS.
BI Appliances come with analytics, of course
When SAP first started marketing their Hana analytics, you were promised a suite of sophisticated analytics as part of their Predictive Analysis Library (PAL) which can be called directly in a “L wrapper” within an SQL Script. The inputs and outputs are all tables. PAL includes seven well known predictive analysis algorithms in several data mining algorithm categories:
- Cluster analysis (K-means)
- Classification analysis (C4.5 Decision Tree, K-nearest Neighbor, Multiple Linear Regression, ABC Classification)
- Association analysis (Apriori)
- Time Series (Moving Average)
- Other (Weighted Score Table Calculation)
HANA’s main use case started with a focus around its installed base with a real-time in-memory data mart for analyzing data from SAP ERP systems. For example, profitability analysis (CO-PA) is one of the most commonly used capabilities within SAP ERP. The CO-PA Accelerator allows significantly faster processing of complex allocations and basically instantaneous ad hoc profitability queries. It belongs to accelerator-type usage scenarios in which SAP HANA becomes a secondary database for SAP products such as SAP ERP. This means SAP ERP data is replicated from SAP ERP into SAP HANA in real time for secondary storage.
BI Appliances are only as good as the application suite
Other use-cases for Hana include:
- Profitability reporting and forecasting,
- Retail merchandizing and supply-chain optimization,
- Security and fraud detection,
- Energy use monitoring and optimization, and,
- Telecommunications network monitoring and optimization.
Applications developed on the platform include:
- SAP COPA Accelerator
- SAP Smart Meter Analytics
- SAP Business Objects Strategic Workforce Planning
- SAP SCM Sales and Operations Planning
- SAP SCM Demand Signal Management
Most opportunities were initially “accelerators” with its in-memory performance improvements.
Aggregate real-time data sources
There are two main mechanisms that HANA supports for near-real-time data loads. First is the Sybase Replication Server (SRS), which works with SAP or non-SAP source systems running on Microsoft, IBM or Oracle databases. This was expected to be the most common mechanism for SAP data sources. There used to be some license challenges around replicating data out of Microsoft and Oracle databases, depending on how you license the database layer of SAP. I’ve been out of touch on whether these have been fully addressed.
SAP has a second choice of replication mechanism called System Landscape Transformation (SLT). SLT is also near-real-time and works from a trigger from within the SAP Business Suite products. This is both database-independent and pretty clever, because it allows for application-layer transformations and therefore greater flexibility than the SRS model. Note that SLT may only work with SAP source systems.
High-performance in-memory performance
HANA stores information in electronic memory, which is 50x faster (depending on how you calculate) than disk. HANA stores a copy on magnetic disk, in case of power failure or the like. In addition, most SAP systems have the database on one system and a calculation engine on another, and they pass information between them. With HANA, this all happens within the same machine.
SAP HANA is not a platform for loading, processing, and analyzing huge volumes – petabytes or more – of unstructured data, commonly referred to as big data. Therefore, HANA is not suited for social networking and social media data analytics. For such uses cases, enterprises are better off looking to open-source big-data approaches such as Apache Hadoop, or even MPP-based next generation data warehousing appliances like Pivotal Greenplum or similar.
SAP’s partnership with Hortonworks enables the ability to migrate data between HANA and Hadoop platforms. The basic idea is to treat Hadoop systems as an inexpensive repository of tier 2 and tier 3 data that can be, in turn, processed and analyzed at high speeds on the HANA platform. This is a typical design pattern between Hadoop and any BI appliance (SMP or MPP).
SAP “Big Data White Space”?
Where do SAP customers need support? Where is the “Big Data White Space?”. SAP seems to think that persuading customers to run core ERP applications on HANA is all that matters. Are customer responding? Answer – not really.
Customers are saying they’re not planning to use it, with most of them citing high costs and a lack of clear benefit (aka use-case) behind their decision. Even analysts are advising against it – Forrester research said the HANA strategy is “understandable but not appealing”.
“If it’s about speeding up reporting of what’s just happened, I’ve got you, that’s all cool, but it’s not helping me process more widgets faster.”, SAP Customer.
SAP is betting its future on HANA + SaaS. However, what is working in SAP’s favor for the moment is the high level of commitment among existing (european) customers to on-premise software.
This is where the “white space” comes in. Bundling a core suite of well-designed business discovery services around the SAP solution-set will allow customers to feel like they are being listened to first, and sold technology second.
Understanding how to increase REVENUE with new greenfield applications around unstructured data that leverages the structured data from ERP systems can be a powerful opportunity. This means architecting a balance of historic “what happened”, real-time “what is currently happening”, and a combined “what will happen IF” all together into a single data symphony. Hana can be leveraged for more ad-hoc analytics on the combined historic and real-time data for business analysts to explore, rather than just be a report accelerator.
This will require:
- Sophisticated business consulting services: to support uncovering the true revenue upside
- Advanced data science services: to support building a new suite of algorithms on a combined real-time and historic analytics framework
- Platform architecture services: to support the combination of open source ecosystem technologies with SAP legacy infrastructure
This isn’t rocket science. It just takes a focused tactical execution, leading with business cases first. The SAP-enabled Bid Data system can then be further optimized with cloud delivery as a cost reducer and time-to-value enhancer, along with a further focus around application development. Therefore, other white space includes:
- Cloud delivery
- Big Data application development
SAP must keep its traditional customers and SI partners (like CSC) engaged with “add-ons” to its core business applications with incentives for investing in HANA, while at the same time evolving its offerings for line of business buyers.
Some think that SAP can change the game by reaching/selling to marketers with new analytics offerings (e.g. see SAP & KXEN), enhanced mobile capabilities, ecosystem of start-ups, and a potential to incorporate its social/collaboration and e-commerce capabilities into one integrated offering for digital marketers and merchandisers.
Is a path to define a stronger CRM vision for marketers? It won’t be able to without credible SI partners who have experience with new media, digital agencies and specialty service providers who are defining the next wave of content- and data-driven campaigns and customer experiences.
Do you agree?