
College Basketball Analysis Powered by SAP Lumira & Predictive Analytics

2015 College Basketball Analysis and Predictions

With SAP Lumira and SAP Predictive Analytics




Using SAP Lumira & SAP Predictive Analytics, the SAP Data Viz team analyzed data from the top 68 teams to fill out a bracket and determine who will reach the Final Four. Each selection was made based on data, removing any gut feel or school bias from the equation.

Get Involved and Viz the Madness!

Think you know which teams will make it to the finals? Try SAP Lumira and share your insights and predictions via SAP Lumira Cloud.


Here is how you can join the action:


  1. Download SAP Lumira and/or SAP Predictive Analytics
  2. Download the SAP Lumira college basketball dataset or use your own
  3. Publish your visualization to SAP Lumira Cloud
  4. Share a link to your public analysis and selections via the SAP Community Network and tweet your URL with #VizTheMadness @SAPAnalytics
  5. Join the SAP Data Viz Team bracket pool and let the trash talk begin! See how your team stacks up against our basketball data viz experts and be entered to win bragging rights for 2015 and an official #VizTheMadness certificate.


View and Interact with the Insights and Analysis

View this interactive visualization powered by SAP Lumira Cloud and see what insights you can come up with.

Here is the 2015 bracket:

[Image: 2015 bracket (Bracket - 4.jpg)]


Follow @SAPAnalytics on Twitter for details. Learn more about SAP Lumira and try SAP Lumira Edge for your team or department today.


What was the Data and How was it Analyzed?


The Data Viz and Predictive Analytics team pulled publicly available data from multiple sources, including Wikipedia and GPS Visualizer. The key statistics include College Basketball Power Index (BPI), Strength of Schedule, Conference Rank, Win/Loss in the last 12 games, and Net Points vs. Average. Each individual match-up was compared and analyzed on a team-by-team basis.


Variable Selection

SAP Predictive Analytics identifies the significant measures that contribute most to our predictive model. Of all 351 Division I basketball teams, we isolate the 68 tournament teams for a robust analysis.



Looking at initial results, we see that Average Scoring Margin and Strength of Schedule together make up more than 25% of our model.


Next, we isolate the variables that give us more than 90% model certainty.


We use the KxIndex results to rank and choose the teams in our bracket. Several matchups, however, are too close to pick confidently, so we are diving deeper with additional analysis.
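The rank-and-pick step can be sketched in a few lines of Python. This is only an illustration of the idea, not the SAP Predictive Analytics workflow: the team names, the scores, and the `CLOSE_MARGIN` threshold are all made-up assumptions, and `pick_winner` is a hypothetical helper.

```python
# Hypothetical model scores (higher = stronger). These are made-up values,
# not actual KxIndex output from SAP Predictive Analytics.
scores = {
    "Team A": 0.91,
    "Team B": 0.88,
    "Team C": 0.62,
    "Team D": 0.35,
}

# Assumed threshold below which a matchup counts as "too close to call".
CLOSE_MARGIN = 0.05


def pick_winner(team_1, team_2, scores, close_margin=CLOSE_MARGIN):
    """Return (predicted winner, needs_review) for a single matchup."""
    s1, s2 = scores[team_1], scores[team_2]
    winner = team_1 if s1 >= s2 else team_2
    needs_review = abs(s1 - s2) < close_margin
    return winner, needs_review


matchups = [("Team A", "Team D"), ("Team B", "Team A"), ("Team C", "Team D")]
for t1, t2 in matchups:
    winner, needs_review = pick_winner(t1, t2, scores)
    note = " (close: review further)" if needs_review else ""
    print(f"{t1} vs {t2}: {winner}{note}")
```

Matchups flagged for review are the ones that would get the extra analysis described in the sections that follow.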


Offensive / Defensive Performance and Scoring Margin

Looking at scoring margin (net points vs. average) and offensive / defensive performance (quotient), we can better understand a team’s playing style, its effectiveness, and how it wins. Notre Dame, for example, relies mainly on its scoring ability, while a team like Wichita State is a stronger defensive team.



Quality of Wins & Losses vs. Quality Opponents



How do teams play against quality opponents? Quality wins against tough opponents can be a great predictor. While Kentucky has gone unbeaten this year, it has played only 5 games against top-25 ranked teams. Kansas is seemingly battle-tested, having played 14 top-25 ranked opponents and won 9 of those games.
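As a quick worked example, the two records quoted above can be turned into comparable numbers. The figures come from the text (Kentucky's 5-0 mark follows from its unbeaten season); the `top25_summary` helper is a hypothetical name used only for this sketch.

```python
# Records against top-25 opponents, taken from the figures quoted above.
# Kentucky's 5 wins in 5 games follows from its unbeaten season.
top25_records = {
    "Kentucky": {"wins": 5, "games": 5},
    "Kansas": {"wins": 9, "games": 14},
}


def top25_summary(record):
    """Return (games played, win rate) against top-25 opponents."""
    win_rate = record["wins"] / record["games"]
    return record["games"], round(win_rate, 3)


for team, record in top25_records.items():
    games, win_rate = top25_summary(record)
    print(f"{team}: {games} games, win rate {win_rate:.3f}")
```

Kentucky has the better win rate, but Kansas has nearly three times the sample of games against ranked opponents, which is the trade-off this section is pointing at.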


Recent Wins, Winning Streak & Quality Wins


We are also watching hot teams streaking into the tournament. Lower-ranked teams like Stephen F. Austin are particularly dangerous since they’ve had to win their conference tournaments to qualify.


Tournament Teams by School Location

Traditional powerhouse conferences like the SEC and ACC are well represented this year. Midwest teams are strong, with 7 teams (Michigan State, Wisconsin, Purdue, Indiana, etc.) from the Big Ten.


We then turned to SAP Predictive Analytics to crunch through the data, building a cluster analysis to get an idea of how the teams split out based on their rankings:


Then we looked at each cluster by strengths and weaknesses.
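For readers who want to try a similar grouping outside of SAP Predictive Analytics, here is a minimal sketch using plain k-means. It does not reproduce the clustering algorithm or the data used above: the team names and the two normalized features are made-up, and seeding the centroids from evenly spaced points is a simplification chosen to keep the example deterministic.

```python
import math

# Hypothetical team profiles on a 0-1 scale:
# (normalized power index, normalized strength of schedule).
teams = {
    "Team A": (0.95, 0.90),
    "Team B": (0.90, 0.85),
    "Team C": (0.55, 0.50),
    "Team D": (0.50, 0.60),
    "Team E": (0.15, 0.20),
    "Team F": (0.20, 0.10),
}


def kmeans(points, k, iterations=20):
    """Plain k-means; assumes len(points) >= k."""
    # Simplified, deterministic seeding: evenly spaced points as centroids.
    step = max(1, len(points) // k)
    centroids = points[::step][:k]
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        new_centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = [
        min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
    ]
    return labels, centroids


names = list(teams)
labels, centroids = kmeans([teams[n] for n in names], k=3)
for cluster_id, center in enumerate(centroids):
    members = [n for n, label in zip(names, labels) if label == cluster_id]
    print(cluster_id, members, [round(c, 3) for c in center])
```

Each centroid is the average profile of its cluster, so comparing centroids feature by feature is one simple way to read off a cluster's relative strengths and weaknesses.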


The team then built a Decision Tree on the variables from the regression analysis to check whether it lined up with the cluster breakout found in the first two steps:



Be sure to check back for weekly updates to the bracket during the tournament.

How’d we do in 2014?

  • Final Four: 2 out of 4 teams correct
  • Sweet Sixteen: 11 out of 16 teams correct
  • Overall: 40 out of 63 teams correct
  • Finished in top 15% of all NCAA brackets


2014 Analysis and Data Story Examples:

*Disclaimer: This analysis and these predictions are an attempt to showcase SAP technology, not to accurately predict sports outcomes.

