The COIL team in PAL has been working with colleagues across solution marketing, EIM, and our Ecosystem and Channels partner managers to find the best way to enable co-innovation with partners in response to the rising demand for Big Data Hadoop deployments in 2012.
Before getting too deep into how best to enable co-innovation project work around Big Data, and how HANA factors into a Hadoop deployment, I would first point anyone not yet familiar with Hadoop to the scads of resources on the Internet that discuss the topic from many angles. You can begin here. If you want to know more about Big Data in general, you can start with Wikipedia before spending the next six months googling the term and trying to discern what everyone means when they refer to Big Data. Caution: it's similar to trying to grok "Cloud". What you should understand first is that big data is generally thought of across three dimensions: the amount of data, the input and output speed of data, and the variety of data.
From a COIL perspective, our goal is not simply to support a single, definitive project, but to create a platform capable of enabling and sustaining a number of big data projects that will best support SAP's Big Data strategy. This means accommodating an exploration of the technologies necessary for crafting a solution architecture featuring SAP and SAP Sybase products, and arriving at an optimal solution architecture that can enhance and simplify a Hadoop deployment. We need to work with partners to do both. A platform approach therefore considers the differentiation possible for a given solution across reporting needs (e.g. visualization), analytic modeling, and the platform itself, be it distributed bare metal and/or virtual computing, local storage resources, and bandwidth.
The virtual project team here in COIL is already actively developing projects and looking to engage with a variety of Hadoop software distribution firms as well as our different alliance technology partners. We are already developing use cases featuring the Hadoop architecture and HDFS. The first proposed project (and demo) has two components. The first is simply to use BI4 to visualize data within HDFS, with no requirement to move the data. The second is to select some segment of the data set and bring it into HANA for deeper, faster analysis that can then be visualized. An effort would be made to leverage large repositories of both corporate and external data to demonstrate an ability to uncover trends, present meaningful statistics, and obtain information that management can act upon. One example would be to pull vast amounts of customer sentiment data from a public database (like Amazon) and examine it alongside the product details found in ECC that map to the uncovered product sentiments.
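To make the second use case a little more concrete, here is a minimal sketch in plain Python of the general shape of that analysis: select a segment of the sentiment data, aggregate it per product, and line it up against product details. Everything in it is an assumption made for illustration. The record layouts, the field names (`product_id`, `rating`, `year`), the sample values, and the helper functions are all hypothetical, and the in-memory lists merely stand in for data that would really live in HDFS and ECC. It is not how BI4 or HANA would actually be wired up.

```python
from collections import defaultdict

# Hypothetical sample records standing in for sentiment data held in HDFS.
reviews = [
    {"product_id": "P100", "rating": 5, "year": 2012},
    {"product_id": "P100", "rating": 2, "year": 2012},
    {"product_id": "P200", "rating": 4, "year": 2012},
    {"product_id": "P200", "rating": 1, "year": 2010},
    {"product_id": "P300", "rating": 3, "year": 2012},
]

# Hypothetical product master data standing in for product details from ECC.
products = {
    "P100": {"name": "Widget", "category": "Hardware"},
    "P200": {"name": "Gadget", "category": "Hardware"},
}


def select_segment(records, year):
    """Keep only the slice of the data set we want to analyze more deeply."""
    return [r for r in records if r["year"] == year]


def average_rating_by_product(records):
    """Aggregate the selected records into one average rating per product."""
    totals = defaultdict(lambda: [0, 0])  # product_id -> [sum, count]
    for r in records:
        totals[r["product_id"]][0] += r["rating"]
        totals[r["product_id"]][1] += 1
    return {pid: total / count for pid, (total, count) in totals.items()}


def join_with_products(averages, product_master):
    """Attach product details; products missing from the master are skipped."""
    return [
        {"product_id": pid, "name": product_master[pid]["name"], "avg_rating": avg}
        for pid, avg in sorted(averages.items())
        if pid in product_master
    ]


segment = select_segment(reviews, 2012)
report = join_with_products(average_rating_by_product(segment), products)

for row in report:
    print(row)
```

The point of the sketch is only the division of labor: the bulk data stays where it is, a chosen segment is pulled out for faster analysis, and the result only becomes useful to management once it is joined to the corporate product data.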
We want to understand how to identify the most interesting use cases to explore, the ones that can increase customer confidence in, and the value obtained from, big data technologies. There is no question that we are exploring how HANA and Hadoop can and should work together. It is becoming more and more common for big data discussions to center on how to use Hadoop, despite the fact that there are many other non-Hadoop approaches to high performance cluster computing. We intend to explore both realms. It is all still relatively new. Only roughly 1 percent of all enterprises have even attempted to deploy Hadoop into production environments, but the trend to play around and explore what is possible is accelerating. There are still many production hurdles to clear, ranging from security concerns to change management, performance, and TCO issues. The challenge is to identify the leading types of problems businesses are trying to solve and the use cases that can drive useful POC projects. One thing is certain: even with increased use of big data technologies, companies are finding a growing need to attract and develop better business intelligence expertise to help in developing and using much-needed analytic modeling.
If you are a big data subject matter expert or just an aficionado, what sort of big data projects would you envision SAP doing with partners? What are your ideas for the sorts of use cases that could prove interesting? Who should we work with and why? How would you choose which partners and which scenarios to examine first? The project work is just spinning up in earnest now, but the efforts will span all of 2012 and run well into 2013. If you have ideas or an interest in all of this, please comment.