
SAP Predictive Analytics


This component adds a Pareto Plot to SAP Predictive Analytics.

It also outputs the aggregated data that underlies the plot, so that the transformed data can be used elsewhere.



Please note that this component is not an official release by SAP and that it is provided as-is without any guarantee or support. Please test the component to ensure it works for your purposes.


  • Libraries dplyr and gplots have to be installed.



Please let me know should you encounter any limitations.



These parameters can be set by the user:

Categorical Variable

The label column by which the numerical variable will be summarised.

Numerical Variable

The column that will be summarised by the labels found in the categorical variable.


Output columns:


The values found in the categorical variable.

Value: The numerical variable summarised by the row's label. The data is sorted in descending order on this column.
ValueCumulated: The cumulated value.
Percent: The row's percentage contribution to the total sum of the numerical variable.
PercentCumulated: The cumulated percentage.
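
To make the output columns concrete, here is a minimal Python sketch of the same aggregation. It is not the component's own R code: the function name `pareto_table`, the `Label` key (the original name of the label column is not given above) and the sample figures are assumptions made for illustration.

```python
from collections import defaultdict


def pareto_table(rows, category_key, value_key):
    """Aggregate rows by category and add cumulated and percentage columns."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[category_key]] += row[value_key]

    # Sort descending on the summarised value, as described for the Value column.
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    grand_total = sum(value for _, value in ordered)
    if grand_total == 0:
        raise ValueError("the numerical variable sums to zero")

    table = []
    running = 0.0
    for label, value in ordered:
        running += value
        table.append({
            "Label": label,  # hypothetical key name for the label column
            "Value": value,
            "ValueCumulated": running,
            "Percent": 100.0 * value / grand_total,
            "PercentCumulated": 100.0 * running / grand_total,
        })
    return table


# Made-up passenger counts, for illustration only.
sample = [
    {"Region": "Asia", "PassengerCount": 300},
    {"Region": "Europe", "PassengerCount": 600},
    {"Region": "Asia", "PassengerCount": 100},
]
for line in pareto_table(sample, "Region", "PassengerCount"):
    print(line)
```

The last row's PercentCumulated always reaches 100, which is a quick sanity check on the aggregated data.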


How to Implement

The component can be downloaded as a .spar file from GitHub and deployed as described here. You only need to import it through the option "Import/Model Component", which you will find by clicking the plus sign at the bottom of the list of available algorithms.


You can try the pareto plot on the attached dataset, which lists how many passengers embarked on an airplane at San Francisco airport. Select "PassengerCount" as numerical variable and choose a categorical column.


You can see the resulting pareto plot. Here the plot is broken down by geographical region.


You can also see the aggregated data that was used to build the plot.


You could use that data for instance in the "Prepare" tab to easily produce a more interactive pareto plot.


We are almost done with 2015 and the predictive community has been very active and attractive this year! On behalf of the predictive team I wanted to thank you for your interest and engagement here. As a retrospective, please find below the top 10 articles the community voted for in 2015. Enjoy discovering or rediscovering them and Season’s Greetings to you all!


#1 Official Product Tutorials – SAP Predictive Analytics, SAP Predictive Analysis and SAP InfiniteInsight

One of the most essential assets! This document gathers all the official training on SAP Predictive Analytics 2.0 to 2.4, as well as for previous versions of the products SAP Predictive Analysis 1.0.1 to 1.21 and SAP InfiniteInsight 7.0.


#2 Introducing SAP Predictive Analytics 2.0!

This blog post introduced the first version of our new product SAP Predictive Analytics (a combination of SAP Predictive Analysis and SAP InfiniteInsight and their advanced predictive capabilities). It is full of tips and delivers the essentials: the why, what it brings to you, what’s new, how to get started and what’s next!


Three versions have followed, and we are now at version 2.4:

Announcing SAP Predictive Analytics 2.2!

Discover all the new capabilities released in the version 2.2 of our product whether you are a business analyst with the Automated Analytics mode or a data scientist with the Expert Analytics mode!


SAP Predictive Analytics 2.3 Released!

Another blog post full of news and tips, this time about version 2.3 of SAP Predictive Analytics! Learn more, in particular, about the added control for predictive model comparison in Expert Analytics and HANA views support in Recommendation and Social in Automated Analytics.


Announcing SAP Predictive Analytics 2.4!

The new product capabilities of the latest version are here, including Sentiment Analysis and integration of SAP HANA demand forecasting techniques!


#3 14 Examples on How to Use Predictive Analytics Solutions

14 short and fun videos covering a large spectrum of key predictive possibilities on common business cases that can also give you fresh ideas on how to address critical questions whatever your company’s industry!


#4 Basket Analysis with SAP Predictive Analytics

Market Basket Analysis (MBA) is a common predictive use case. It lets you find relationships between purchases when many products are involved. The three-part article explains how you can do MBA with SAP Predictive Analytics and SAP HANA.


Part 1: Basket Analysis with SAP Predictive Analysis and SAP HANA - Part 1

Part 2: Basket Analysis with SAP Predictive Analysis and SAP HANA - Part 2: Visualisation of Results

Part 3: Enhancing Market Basket Analysis with PA 2.0 and SAP HANA


#5 What is the SAP Automated Predictive Library (APL) for SAP HANA?

A detailed, very complete article about (APL) – a major milestone in our efforts to integrate and embed our advanced analytics services everywhere and into everything!


#6 SAP Predictive Analytics 2.0 30-Day Trial Now Available!

The eagerly awaited SAP Predictive Analytics 2.x trial is available and is updated as soon as a new version is released. Try it now!


#7 How does Automated Analytics do it? The magic behind creating predictive models automatically

How is it possible that a tool can create predictive models automatically? How can steps be automated, that would take a highly skilled expert a lot of time and effort to produce manually? Learn more about the fundamental concept behind the automated approach!


#8 Frequently Asked Questions - Downloading, Installing and Activating SAP Predictive Analytics

The essential links are here! You should find an answer to your question! If not, poke Antoine on the forum ; )


#9 Extending SAP Predictive Analytics Functionality

Power Users can extend and customize the functionality of the “Expert Analytics”-mode by adding their own R-Scripts. The logic is encapsulated in so called “Custom R Components”, which casual users can easily work with without having to have any R skills.


#10 Predictive Smackdown: Automated Algorithms vs The Data Scientist

3 profiles, 1 tool. Whether you are a Data Scientist, a Business Analyst or a Business User, SAP Predictive Analytics will meet your needs and expectations. Learn more through a funny analogy!


There are many predictive resources available on SCN and sap.com/predictive! Here are 3 ways to get engaged:

  • Follow the SAP Predictive Analytics community to be informed as soon as there is something new posted or discussed here
  • Check the ‘Content’ tab to make discoveries here
  • Follow your favorite authors to be informed when they publish a new piece


But please don’t prevent yourself from also:


Hope you enjoyed this blog and wish you a very Happy New Year 2016!


SCN Questions

Posted by Antoine CHABERT Dec 22, 2015

Dear SCN Predictive user community,


I just wanted to quickly let you know that, in an effort to ease future question monitoring and handling, I did a global review today of all pending or unanswered questions and flagged the oldest questions as "Assumed Answered" or "Answered" depending on their latest status.


I recently had the honor of becoming an SCN moderator for SAP Predictive Analytics, and I am now fully entering the role ;-)


The intent is to make sure we are not left with very old questions and start fresh in 2016!


Of course, you are always welcome to post new questions. Please note that, according to the SCN policy, it is good practice not to resurrect old threads.


Best regards, and happy predictive! Wishing all of you a happy holiday season!




PS/ Stay tuned for more 2015 posts!

Dear readers,

Four weeks ago, I documented my first steps in PA here: http://scn.sap.com/community/predictive-analytics/blog/2015/11/23/my-first-steps-in-sap-pa--time-series-analysis
My questions (http://scn.sap.com/message/16376119 ) were all answered, unfortunately I did not have much time since then to continue a more detailed analysis.

One more fact contributed to my neglecting the topic: the tool is just not handy enough for my purpose!

I don’t want to spend hours finding out how to use the KxShell scripting language, so I got quite annoyed after repeating all the required single steps to predict the share values for each company.

In the end, I decided to write a second and, for the time being, last blog post about “PA – how it should work”:

Most important thing is: I want to analyze more sets of data (e.g. different companies) at once, uploaded from ONE single csv file!

That means, I would like to see an additional characteristic e.g. between Target and Weight, like “Category” which differentiates the individual data sets.

In my case it would be the WKN (ISIN) which organizes the split of companies to be analyzed:


The idea is that I want to be able to execute the steps like loading the Source data, loading the Description file (which is identical for all sets) and selecting my variables (also the same for all sets) only once.

I guess there are reasons I don’t understand yet, but in my personal opinion, the PA engine should be able to use the different data sets to recognize the overall trend, too. Not necessarily always, but e.g. on demand by another checkbox, for independent or inter-dependent analysis.

Finally of course, the Forecasts should be enhanced by the category, which enables a Filter to select one set after the other and analyze the outcome. E.g.:


In that way it would be possible to check the results for e.g. 10 companies within seconds. As of now repeating all steps over and over again to create the forecast graphs takes ~5 minutes for one. And this each time!
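
The "configure once, run per category" idea can be sketched outside the tool in a few lines of Python. This is only an illustration of the wish described above, not how SAP Predictive Analytics or KxShell works: the column names, the sample rows and the `naive_forecast` placeholder model are all assumptions.

```python
import csv
import io
from collections import defaultdict


def split_by_category(csv_text, category_col, date_col, target_col):
    """Return {category: [(date, target), ...]}, sorted by date within each category."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row[category_col]].append((row[date_col], float(row[target_col])))
    return {key: sorted(points) for key, points in series.items()}


def naive_forecast(points, horizon):
    """Placeholder model: repeat the mean of the last three observations."""
    recent = [value for _, value in points[-3:]]
    return [sum(recent) / len(recent)] * horizon


# One CSV file holding several companies, distinguished by a made-up WKN column.
data = """WKN,Date,Target
A1,2015-11-02,10
B2,2015-11-02,100
A1,2015-11-09,12
B2,2015-11-09,90
A1,2015-11-16,14
B2,2015-11-16,110
"""

# The same settings (columns, horizon) are applied once to every category.
forecasts = {
    wkn: naive_forecast(points, horizon=2)
    for wkn, points in split_by_category(data, "WKN", "Date", "Target").items()
}
print(forecasts)
```

The resulting dictionary is keyed by category, which is what a filter on the forecast output would select from.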


In addition I found two more “bugs”, for which I don’t have an explanation or don’t know the root cause?!

I saved a data model 4 weeks ago, and now wanted to create the chart again, but it looks like this:


Whereas 4 weeks ago it looked like this (the red line above is the green one below):


It seems PA recognizes the current date (Operating system date?) and finds 4 missing weeks from the source data to today and therefore does not work properly anymore. Could that be?

I just loaded the model and displayed the chart. The current version with outliers in every week never ever appeared before.


Then, two more things were strange…

I tried to use the PA test version two days before expiration date… SAP, tell me please what is this?

Expiration message_klein.png

An information message which exceeds the monitor screen’s height, where you cannot see the bottom or any “OK”-button?

Second strange thing was that I could not load my files anymore.

Every time I tried to browse the Folder or the Data Set the whole program closed without any message. Neither warning nor error message.

I’m just not able to create a new time series analysis again. Only loading my old model still works.

Final summary: Even if - or maybe especially because - I’m from the IT department and know how “ergonomic” software should work, I’m quite disappointed by the features (I used version 2.2).

The quality of my predicted data result seems to vary tremendously. Either there are up to +/- 40% min/max values or sometimes also none.

The MAPE value however seems always good (~ 0,0xx) if I skip those variables which PA identifies as suspicious.
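
For readers unfamiliar with the measure, MAPE (mean absolute percentage error) is commonly defined as the average of |actual - forecast| / |actual|. The sketch below follows that textbook definition and returns a fraction; whether the tool computes its MAPE in exactly this way is an assumption, and the figures are made up.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    # Note: an actual value of zero makes the measure undefined (division by zero).
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up weekly share values and forecasts.
print(mape([100.0, 200.0], [110.0, 190.0]))
```

A low MAPE only says that the forecast was close on average over the evaluated period; it does not by itself rule out wide error bars on individual future points.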

Of course I must admit that my evaluation is solely based on time series analysis, but in the current state I cannot see any added value which justifies the license costs.

Maybe I need to collect daily data for better results. But the effort here is huge, especially when prediction can only be done for one set after another. And I don’t want a day-based forecast but I want results and a trend (always!) for the next 12 to 24 weeks.

Thanks for reading,


PS: Feel free to share a document/blog post if you know one, about “Creating a KxShell script for dummies” that describes what I would like to have had with my “category” above.


SAP Predictive Analytics greatly simplifies the consumption of algorithms on SAP HANA. Continuing on this path, version 2.4 now surfaces a robust demand forecasting technique called HANA Demand Forecasting. This component is based on Unified Demand Forecast (UDF), which has so far been used by SAP Customer Activity Repository. With this component, it is now possible to forecast demand for multiple products and locations by configuring it just once, while the powerful UDF under the hood manages all the complexities.


HANA Demand Forecasting builds models on historical data and then forecasts for the requested time window. To ensure robust forecasting, the following points should be observed:

  • The historical transactional data must have daily granularity.
  • To ensure better interpretation of demand influencing factors (DIFs) such as seasonality and trend, historical data for two years or more should be provided.


The analysis tries to explain the impact that each DIF had on customer demand based on the historical sales data provided. This is then used to forecast the effects of similar DIF occurrences in the future and forecast the demand. The forecast is for a combination of products and locations as specified in the input.


Using HANA Demand Forecasting in analysis

As with all HANA AFL algorithms, we’ll start with connecting to a HANA instance and selecting a dataset that has historical sales data… once in the canvas, the HANA Demand Forecasting component can be found under Time Series category of algorithms:


Step 1: The component can be added to the chain by double clicking on it or dragging it to the component it needs to be added to:

UDF in Panel.jpg














Step 2: Configure the HANA Demand Forecasting component by double clicking or through the context menu (cog icon).


The feature mapping is auto-filtered by expected data types, so you don't have to scroll through all columns, which reduces mapping errors:








Step 3: Review/configure the Advanced settings…

Hovering over the parameters shows the description:



Step 4: Once the component has been configured, click the run button to execute the training of the component:



Step 5: Now click  OK to view the results

Result Grid:




Product ID


Location ID


Timestamp From


Timestamp To


Actual Unit Sales


Forecast Confidence Index (FCI)


Forecasted Unit Sales


Intercept of the time series decomposition component.


Trend of the time series decomposition component.


Seasonality of the time series decomposition component.


Day-of-week of the time series decomposition component.


Holiday of the time series decomposition component.


Sales Promotion of the time series decomposition component.


Product-location specific future price on a daily basis. Historical price calculated based on sales and unit price.


Price Elasticity. Measures the responsiveness of the quantity demanded of a good or service to a change in its price.


Provides additional information that explains the Forecast Confidence Index (FCI).


Provides additional information that explains the demand influencing factors that impact the forecast.
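
As a reminder of what the price elasticity column expresses, here is a small Python sketch of the textbook point elasticity: the percentage change in quantity divided by the percentage change in price. It is not the UDF algorithm, whose internals are not described here; the function name and the numbers are assumptions for illustration.

```python
def price_elasticity(q_old, q_new, p_old, p_new):
    """Point price elasticity of demand: % change in quantity / % change in price."""
    pct_quantity = (q_new - q_old) / q_old
    pct_price = (p_new - p_old) / p_old
    if pct_price == 0:
        raise ValueError("price did not change, elasticity is undefined")
    return pct_quantity / pct_price


# Made-up example: price rises from 10 to 11 (+10%), unit sales fall from 100 to 80 (-20%).
elasticity = price_elasticity(q_old=100, q_new=80, p_old=10, p_new=11)
print(elasticity)
```

An absolute value above 1 is usually read as elastic demand (sales react more than proportionally to price), a value below 1 as inelastic.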



The summary of the forecast shows how many product/location combinations are elastic and to what degree:


The charts show the forecast graphically:


General Introduction

SAP Predictive Analytics is a statistical analysis and data mining solution that enables you to learn and operationalize predictive models to discover hidden insights and relationships in your data, from which you can make predictions about future events.

SAP Predictive Analytics, desktop version combines SAP InfiniteInsight and SAP Predictive Analysis in a single desktop installation.

SAP Predictive Analytics, desktop version includes two user interfaces, Automated Analytics and Expert Analytics.

SAP Predictive Analytics can also be deployed in client/server mode.

Additional components include Model Manager and the SAP HANA Automated Predictive Library.


Most useful links

Try it!

SAP Predictive Analytics – Free 30-day Trial!

Learn it!

SAP Predictive Analytics Help (help.sap.com/pa)

Official Product Tutorials – SAP Predictive Analytics, SAP Predictive Analysis and SAP InfiniteInsight

Inofficial Tutorials on SAP Predictive Analytics

Predictive Pearls of Wisdom

Predictive Analytics Customer Successes

Know what's next

SAP Product Roadmap for SAP Predictive Analytics

Implement it!

SAP Predictive Analytics Supported Platforms (PAM)

Get Involved!

SAP Predictive Analytics Community

SAP Predictive Analytics Idea Place


SAP Predictive Analytics 2.4

SAP Predictive Analytics 2.4 - Useful Links


SAP Predictive Analytics 2.3

Useful links for SAP Predictive Analytics 2.3


SAP Predictive Analytics 2.0

Introducing SAP Predictive Analytics 2.0!

Maintenance Strategy

For information on the maintenance duration and the maintenance strategy, refer to here.


In case you encounter problems when installing, upgrading or running SAP Predictive Analytics 2.4, report an incident using the component BI-RA-PA.

Yesterday I installed the latest version of SAP Predictive Analytics (download a trial here:  Welcome | SAP) and I couldn't wait to try out the new 'sentiment analysis' component in Expert Analytics, one of the most intriguing novelties in this release.


The component looks at an SAP HANA table or view with a text column and, for each record, determines whether the text conveys a positive, negative or neutral sentiment. It also identifies the sentiment of emoticons, problem statements and profanities (which you might want to filter out of your analysis automatically). Behind the scenes, this component drives the SAP HANA Text Analysis engine and exposes it through a simple interface.


Let's see what I could do in a few minutes as a first exploration of the functionality.


The initial requirement is that you need to have SAP HANA and, in HANA, you need a table (or view) with a column containing text.


My colleague and friend Jayanta Roy provided me with a sample dataset containing tweets on football (related to Manchester United -#MANU- and Liverpool -#LFC- )


After launching SAP Predictive Analytics 2.4 I created a new document 'connected' to SAP HANA, selected the table containing the tweets and isolated only a few fields which I wanted to use for my test: the tweet text, its hashtag and the country where the tweet originated.

data selection.jpg

In the next screenshot you can see some of the content. The dataset contains more than 300 000 records.



Now I was ready to go to the Predict room and drag the Sentiment Analysis component into the project page and apply it to the dataset. Notice that this component is found under the Data Preparation blocks.

This is quite important: sentiment analysis is not necessarily a goal per se but is rather intended as a step to enrich data in a predictive project.


Opening the Configure Settings page of the component I declared that the text field to analyze is 'TWEET'


then in the Advanced tab I set that the analysis didn't need to identify problem statements and just had to return me three sentiment values: Positive, Negative and Neutral.

I did so by simply typing the return value text into each possible sentiment detected by the tool (in the example here below I just say that all different 'positive' levels of sentiment map to the Positive keyword, the same for Negative and Neutral)
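
Conceptually, this mapping is a lookup table from detailed sentiment levels to three return values. The Python sketch below shows the idea only; the detailed label names are assumptions and may differ from what the component actually lists in its Advanced tab.

```python
# Assumed detailed sentiment levels (illustrative names) mapped to three return values.
SENTIMENT_MAP = {
    "StrongPositiveSentiment": "Positive",
    "WeakPositiveSentiment": "Positive",
    "NeutralSentiment": "Neutral",
    "WeakNegativeSentiment": "Negative",
    "StrongNegativeSentiment": "Negative",
}


def simplify(label):
    """Return the coarse sentiment for a detailed label; unknown labels pass through."""
    return SENTIMENT_MAP.get(label, label)


print(simplify("WeakPositiveSentiment"))
```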


After the mappings were typed in,  I clicked on Done and then executed the project.

In the picture below you can see an excerpt of the output showing how the sentiment analysis component has identified the sentiment of the text, the 'token' (the word or concept in the text, which justifies the sentiment choice) and the Parent_Token which gives the context into which the token has been used.



The data generated here could now be used to enrich an existing dataset (e.g. the number of positive or negative tweets about  a football team could be related to the renewal rate of subscriptions to the team magazine or to the number of home match tickets sold).

The sentiment analysis component can also be a source for another component (e.g. if you want to filter it or  write back the analysis to SAP HANA for further processing).


In my small and unrepresentative dataset I just wanted to visualize the results and have a very basic summary of the tweets.

Going in the Visualize room of SAP Predictive Analytics I created a new measure from the "Sentiment" dimension with the Count aggregation and then defined a barchart graph where the Sentiment was plotted against the hashtag.


I could see that the larger number of tweets was related to the #LFC hashtag and that, in general, there were many more positive tweets than negative ones.
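
The measure behind that bar chart is just a count of records per hashtag and sentiment. A minimal Python sketch of the same aggregation, using made-up rows rather than the real tweet dataset:

```python
from collections import Counter

# Made-up (hashtag, sentiment) pairs standing in for the component's output.
tweets = [
    ("#LFC", "Positive"),
    ("#LFC", "Positive"),
    ("#LFC", "Negative"),
    ("#MANU", "Positive"),
    ("#MANU", "Neutral"),
]

counts = Counter(tweets)                          # count per (hashtag, sentiment)
per_hashtag = Counter(tag for tag, _ in tweets)   # total tweets per hashtag

for (hashtag, sentiment), n in sorted(counts.items()):
    print(hashtag, sentiment, n)
```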

Using a tag cloud I also isolated the 30 most used token words. Here again, positive ones appear most often (and if you look closely, you can see that I blurred out a profanity from the image because I forgot to enable the profanity filter in the Configure Settings panel beforehand :-)).


That's it. In about 10 minutes I was able to run a very simple project to analyze some tweets in SAP Predictive Analytics by manipulating the SAP HANA Text Analysis engine without a single line of code.

This visual approach saved me a lot of time and reduced the trial-and-error phase I would have gone through had I done everything from scratch within SAP HANA Studio.

Now that I understood what the sentiment analysis component of SAP Predictive Analytics does and which results I could get, it is time to look around me and see how I can apply it to improve my business. I hope you are going to do the same!

Our latest release SAP Predictive Analytics 2.4 has been delivered on SAP Support Portal today, November 27, one month before 2015 is coming to an end!

PA Desktop 2.4.PNG


If you are not yet a SAP Predictive Analytics user, you can download your own 30-day trial of SAP Predictive Analytics 2.4, desktop version here.


I recommend starting with Ashish Morzaria's release announcement, where you will get useful tips and much more about this new version!

Our product managers are blogging about this release:


Here is a curated collection of useful links for SAP Predictive Analytics 2.4:


Links for APL 2.4:

We are currently preparing the next edition of our newsletter. Register now to know more!



Enjoy SAP Predictive Analytics 2.4, ask questions, send us feedback, start discussions in our SCN Predictive Analytics user community.

We look forward to hearing from you!


In the United States, the cliché “Christmas comes early” is often applied to the American Thanksgiving holiday, which occurs each year on the fourth Thursday of November. However, in the case of SAP Predictive Analytics, the development teams (they don’t seem to like being called “elves”…) are in Ireland and France, so things are clockwork-like "business as usual". That means the delivery of SAP PA 2.4 really *is* an early Christmas present for us all!



We are proud to announce that SAP Predictive Analytics 2.4 has been formally released and is now available on SMP for licensed customers. For those of you who are not already using it, you can download the 30-day trial here.



What’s New in PA 2.4?


You can learn more about everything in SAP Predictive Analytics 2.4 in the What’s New document, but here are some highlights of this release:

Even Bigger, Better Support for Predictive in SAP HANA:


  • HANA SPS10: Certified support for SAP HANA SPS10 using both Automated and Expert interfaces as well as the SAP HANA Automated Predictive Library (APL).
  • APL Training Delegation: Automated Analytics now supports model training delegation to the SAP HANA Automated Predictive Library (find out more about the APL here: What is the SAP Automated Predictive Library (APL) for SAP HANA? )
  • APL Stored Procedures: The APL now comes with SQLScript stored procedures to take care of signature tables, table types, and wrappers – dramatically reducing the time and code required to develop with the APL.


New Predictive Features:


Expert Analytics has been enhanced to include additional native-to-HANA components:


  • SAP HANA Demand Forecasting Component: Perform high volume, near real-time processing of algorithms to create sales predictions, future demand, data on price elasticity, natively in HANA.
  • SAP HANA Optimization Function Component: Create reusable objective functions with linear constraints to solve complex optimization functions like how to maximize profits on a product.
  • SAP HANA Sentiment Analysis Component:  Analyze complex streams of text to determine the opinions and influencing factors of discovered entities and transform your unstructured data into easily understandable categories.


Updates and Refinements:


  • Updated database support: Automated Analytics now supports IBM PureData System for Analytics 7.1.x.
  • Improved installation: Now apply licenses automatically during silent installations.
  • Improved support process: Automated Analytics can now generate log files from errors to simplify obtaining support from SAP.



Wanna See More?


Our product management team has also prepared videos to show you more about what's awesome in PA 2.4:










How To Get Started?

SAP PA 2.png


  1. Download the trial!
  2. Check out the online materials and tutorials:
  3. Participate in the SCN Community: SAP Predictive Analytics
    • Learn, ask questions, get answers!


Looking Back on 2015 and Forward to 2016!




This has been a banner year for us on the predictive team. We released *five* versions of SAP Predictive Analytics, unleashed our Automated Predictive Library for HANA-native predictive workloads, and put the final touches on SAP HCP predictive services (which will launch early next year).


But we are not resting. Our development team has already started work on the next version within our PA 2.x line and internal development on our next generation SAP Predictive Analytics 3.x platform is ongoing.  We “predict” that 2016 will be an even bigger year for us with a new server platform, new cloud capabilities, and of course continuing innovation on the core components that make SAP Predictive Analytics one of the best predictive tools available on the market. 


Stay tuned to SCN @ SAP Predictive Analytics for more news, webinars, and whitepapers. Better yet, sign up for “Email Notifications” on the top right of this page!


SAP Predictive Analytics Newsletter

Finally, don't forget to sign up for our recently launched newsletter that brings together the most popular articles, blog posts, and tutorials on SAP Predictive Analytics right to your inbox.


>> SUBSCRIBE Here <<

Dear community,

I recently installed PA 2.2 for testing purposes of the "Time series analysis".


This blog describes my steps to the final result plus I have some questions, since the final outcome seems very poor to me.


After having watched this video: http://scn.sap.com/docs/DOC-62239


I prepared a list of ~130 German companies with ~20 stock market key figures for the last 115 consecutive weeks.

The import file contained weekly data from CW 36/2013 to CW 46/2015, and my expected outcome was the share value by CW 5/2016.


Fortunately I had good help of a working student, who developed the "structure" file for me.

It seems to work but we are not sure if it is the best setup (further information is appreciated).


The blog is focusing on one example company "SAP", for which a trend line was generated.

Question 1: Why do some results show trend lines and others don't?


In my variables I used only "total value" key figures and avoided mixing them with percentage key figures.


I chose 12 future weeks to predict:


Warning message shown:


Obviously it can "predict" only 4 weeks?

Question 2: What does this warning mean? I found some warnings with 2 or 3, this with 4 as maximum horizon.


UPDATE: I forgot to include the following screenshot:


However, I continued and this is the result ... quite ...hm...  strange ... or ridiculous :-D

SAP Forecast.png

The table shows the whole "catastrophe"... almost only 40% variance between minimum and maximum.


This and several other results look as if the forecast had been found by rolling dice.

Another highlight, Lufthansa: up & down and up& down:

LH_Wuerfeln.png FC_vs_Signal_LH.png


Finally I have some more questions and would love to learn more about the tool and "Time series analysis":

3) How can the structure file be optimized? Is there a how-to or SCN document/blog available?


4) Is there a way to analyze more than only one company at a time?

I would like to load the whole DAX (German main index) and use the same ~20 key figures of all companies for finding the results per company.

Since all shares have the "same attention" (like when in DAX or SDAX or MDAX) I would like to use additional "trends" within the market for analysis.

Is there somehow a "learning effect" I can initiate in the tool by using different data with same variables?

5) Is there a way, as in question 4, to analyze several companies at once and get only the trend lines per company as the result, not the individual predicted values?

6) How do I find out which of the 20 key figures I should keep or change for better results?

7) How does PA deal with mixed input of "total values" and "percentage key figures"?

8) How can I tell PA which relationships exist between key figures, e.g. those which are an outcome of a formula using the weekly share value.

9) I checked the logs and found statements like:

"The automatic variable selection process discarded all the extra-predictable variables when estimating the trend(<list-of-variables>)" or

"The trend model (Regression<list-of-variables>...has been discarded from the competition." What does this mean?

Are all my 20 key figures in the file neglected and the forecast is based only on the historic share values? What could be the reason?

Thanks for reading... and any feedback is appreciated :-)

Best regards,


The next major release of SAP Predictive Analytics (3.x product line) is currently planned for release in the first half of 2016.

SAP plans to remove Windows 32-bit operating system support in this major release, in order to speed up product innovation deliveries on other operating systems, including Windows 64-bit operating systems.

The last SAP Predictive Analytics version that will provide Windows 32-bit operating system support is a minor release (part of the 2.x product line) that is currently planned for release during the first quarter of the year 2016. 

For customers who wish to continue installing and using SAP Predictive Analytics on 32-bit operating systems, critical fixes for the SAP Predictive Analytics 2.x product line will be available until the 10th of February 2017.

For more information about this communication, feel free to contact me directly via email (see my SCN profile).


Legends of the Fall

Posted by Antoine CHABERT Nov 9, 2015

The Rugby World Cup (RWC) 2015 is over! I can't wait until 2019!

This has been a fantastic edition, fully packed with emotion.

I enjoyed the Brave Blossoms' resilience, I was inspired by the fighting spirit of “Los Pumas”, I was delighted by the wonderful moves of the All Blacks “golden generation” and I was sad about the wrecking of “Les Bleus”.

In part 1 and part 2 of this blog series, I used SAP Predictive Analytics to create my predictive model based on historical data and tested some scenarios.

I now apply the predictive model to determine the players that will make it to my hall of fame due to their overall RWC performance. The focus is not really on those new talents that emerged across this particular edition, as my data is summing up the performances across the different editions of the RWC.


I reload the model I had created and saved, then I click on Run and Apply Model.

Load a Model.png

Apply Model.png


In the Applying a model screen:

  • The Application Data Set is the data set on which I will apply my model to determine which players should be considered legends. Mine is named RWC 2015 Player List; it contains the figures across the different editions for the players that participated in RWC 2015.
  • The Generation Options determine the output that is generated from the model. In this case I am selecting the Probability & Error Bars option. If a player is given a probability greater than 0,5 in the resulting file, he should be considered a legend. There are more generation options possible, but I find probabilities quite easy to interpret.
  • The Results Generated by the Model is the place where I output the results. Here I am generating the results into an Excel file.
  • I click on Apply so that the file gets generated.

Applying the Model.png

I open the Excel file and look at column D, which holds the probability of each player being considered (by me) a legend. A probability is a figure between 0 and 1: a value close to 0 means that the player is very probably not a legend (for me!), and a value close to 1 means that the player is very probably a legend.
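To illustrate how such probabilities turn into a yes/no decision, here is a minimal Python sketch of the 0,5 cut-off (written 0.5 in code). The player names and scores are made up for illustration, and the ScoredPlayer class and function names are hypothetical; the real figures are the ones in the generated Excel file.

```python
from dataclasses import dataclass

THRESHOLD = 0.5  # cut-off used in this post (written 0,5 in the text)


@dataclass
class ScoredPlayer:
    name: str
    probability: float  # value read from the probability column of the output


def is_legend(player, threshold=THRESHOLD):
    """A player counts as a legend only if the probability is strictly above the threshold."""
    return player.probability > threshold


def select_legends(players, threshold=THRESHOLD):
    """Rank players by descending probability and keep those above the threshold."""
    ranked = sorted(players, key=lambda p: p.probability, reverse=True)
    return [p for p in ranked if is_legend(p, threshold)]


# Hypothetical scores, NOT taken from the real output file.
sample = [
    ScoredPlayer("Player A", 0.91),
    ScoredPlayer("Player B", 0.42),
    ScoredPlayer("Player C", 0.50),
    ScoredPlayer("Player D", 0.67),
]

legends = select_legends(sample)
print([p.name for p in legends])  # prints ['Player A', 'Player D']
```

A score of exactly 0.5 is left out here because the rule is "more than 0,5"; a player at 0.42 falls short as well.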

My New Rugby Legends.png


I loaded the Excel file into SAP Lumira, and selected the players with a probability of more than 0,5:

  • The list contains 24 players in total.
  • Most of the players originate from Southern Hemisphere teams and the All Blacks are well represented!


Frederic Michalak appears in a high position (#26) in the overall list. He falls a bit short of becoming one of my legendary players, with a probability equal to 0,42. OK, I’ll give him a bonus because he is a French guy ;-).


Neither Jonathan Sexton nor Sergio Parisse is a legend for me yet.



Now I’ll remove the players that I already considered legends in RWC history.

My list previously included Dan Carter, Bryan Habana, Richie McCaw, Fourie du Preez, Drew Mitchell, Kieran Read and Victor Matfield. All these legendary players shone this year and throughout their RWC careers!

Old Legends.PNG

My final shortlist includes 17 players, from 5 different countries.

Player by Country.PNG

10 All-Blacks:

3 Australian players:

2 South-African players:

1 Argentinian player:

1 Irish player:
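The shortlist arithmetic can be double-checked with a small Python sketch. The seven names and the per-country counts are the ones given in this post; the new_legends helper and the "Player A"/"Player B" placeholders are hypothetical and only illustrate the filtering step.

```python
# Players I already considered legends before RWC 2015 (names from this post).
previous_legends = {
    "Dan Carter", "Bryan Habana", "Richie McCaw", "Fourie du Preez",
    "Drew Mitchell", "Kieran Read", "Victor Matfield",
}

# Shortlist size per country, as listed in this post.
shortlist_by_country = {
    "New Zealand": 10,
    "Australia": 3,
    "South Africa": 2,
    "Argentina": 1,
    "Ireland": 1,
}

PREDICTED_TOTAL = 24  # players with a probability above the cut-off


def new_legends(predicted, already_known):
    """Keep predicted legends that are not already known, preserving order."""
    return [name for name in predicted if name not in already_known]


# 24 predicted legends minus the 7 known ones leaves the 17 on the shortlist.
shortlist_size = PREDICTED_TOTAL - len(previous_legends)
assert shortlist_size == sum(shortlist_by_country.values()) == 17

# Hypothetical example of the filtering step with placeholder names.
example = new_legends(["Dan Carter", "Player A", "Kieran Read", "Player B"],
                      previous_legends)
print(example)  # prints ['Player A', 'Player B']
```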


I do agree with most predictions:


As we have seen, it’s very easy to apply a predictive model to generate results on new data samples. 


I hope you enjoyed my blogs and the RWC 2015!

What are your personal RWC legends? 


You can follow me on Twitter: @ChabertAntoine

Fellow SCN Predictive Enthusiasts,


This post aims to familiarize you with the process of building predictive models using APL (Automated Predictive Libraries) through an example.


For those who have not heard of APL yet: SAP APL is a native C++ implementation of the automated predictive capabilities of SAP Predictive Analytics, running directly in SAP HANA. The key differentiator for SAP APL over other predictive components within SAP HANA is the “A” for “automated”. Using APL, you can run real-time automated predictive algorithms on your data stored in SAP HANA without requiring a data extraction process.


Another advantage of an APL-based model is that it simply needs to be set up and told what type of data mining function to apply. APL then takes over from there by composing its own models, creating and selectively eliminating metadata as required, and ultimately coming up with the best model for the data we provided, in a mostly automated way.


I have put together a document which shows, through a step-by-step example, how an insurance company can analyze past insurance fraud data in order to create a predictive model in SAP HANA using the Automated Predictive Libraries (APL) to identify potential future fraudulent auto insurance claims.


You can also see this example in action in this recorded webinar. In it, we cover an overview of predictive analytics in SAP HANA and give a live demonstration of SAP Predictive Analytics and the APL in action.




D-7 before SAP TechEd 2015! Held in Barcelona from November 9-12, this is a great opportunity for you to catch up on our latest predictive innovations, learn new skills by taking a workshop, see what other SAP customers are doing in this domain, or simply network with your peers. To help you get the most out of it, I put together a list of sessions, workshops, and activities you should put on your agenda now.

My Top 5 List of Predictive Sessions


Tuesday, November 10


BA160 Use SAP Predictive Analytics with SAP Business Warehouse on SAP HANA

14:30-18:30 - Hands-On Workshop

Learn how you can use SAP Predictive Analytics software in combination with your data from SAP Business Warehouse powered by SAP HANA.


Wednesday, November 11


BA111 Become a Data-Driven Business: Exploratory and Prescriptive Analytics

11:15-12:15 - Lecture

The automation of predictive analytics is the key basis of two new categories of analytics: Exploratory analytics (for showing executives what is really driving business) and prescriptive analytics (to improve operations).


BA806 Road Map Q&A: SAP Predictive Analytics

16:00-17:00 - Road Map Session

Join us for an exclusive introduction to our SAP Predictive Analytics strategy and road map, followed by a Q&A.


BA272 Automated Predictive Analytics Integration and Scripting

16:45-18:45 - Hands-On Workshop

Discover the integration and scripting capabilities offered by the Automated Analytics module of SAP Predictive Analytics software.


Thursday, November 12


BA112 Predictive Maintenance and Service: Practical Internet of Things Experience

16:45-17:45 - Lecture

The SAP Predictive Maintenance and Service solution is in the domain where customers merge large amounts of machine sensor and failure event data with structured ERP data. Learn what more than a dozen SAP customers have done in this domain.


Don't Miss...

SS34 Predictive Demo on the Showfloor

Tuesday, Wednesday and Thursday

A chance to see live demos and chat 1:1 with members of the predictive team.

DG107 Developer's Garage

Bring your laptop and ask questions of our predictive expert, Abdel Dadouche. Abdel will also showcase how you can build an application on SAP HANA Cloud Platform and use cool predictive services.


Looking for more?

We have over 24 lectures, hands-on workshops, sessions and demos showcasing predictive.

View all the TechEd predictive sessions here.


The SAP Predictive Analytics team look forward to seeing you in Barcelona!

