Model Evaluation

Former Member
0 Kudos

Hello Everyone,

I am a new user of SAP Predictive Analytics. I usually do my analysis in R, so I am really excited about the possibility of using R in SAP Predictive Analytics. There are some things about using PA that I don't understand, however. One of them is the evaluation of a model. Normally, to evaluate a model I use k-fold cross-validation to obtain the MSE or MSPE, use measures such as AIC and BIC, or calculate confidence intervals with the bootstrap method. So I am not sure which statistical measure is behind the built-in KI and KR values, or how they are calculated. It would be great to have some additional information on these statistics.

Best Regards

Viktor

Accepted Solutions (1)

achab
Product and Topic Expert
0 Kudos

Hi Viktor, here is a more visual way to understand how KI and KR are determined. I think this diagram is currently missing from our documentation; I'll ask for it to be added in the future.

Our documentation states the following

Best regards,

Antoine

Former Member
0 Kudos

Hello Antoine,

Thank you very much for this diagram. I now have a better understanding of how KI and KR are calculated. Two small things are still unclear to me. The documentation mentions that the training data set is split into an estimation set and a validation set. From the diagram I see that the KI is calculated for both the estimation and the validation set. So how is the overall KI for the training set generated? Is it the sum of the KI for the estimation set and the KI for the validation set? The second thing is the calculation of the Wizard area A of a perfect model. Is it calculated using the principle of a convex hull? I have not found any information about this calculation in the documentation.

Thanks & regards

Viktor

achab
Product and Topic Expert
0 Kudos

Hi Viktor,

The calculation of the Predictive Power relies on the Validation dataset - the one displayed by default in the Model Graphs curve. There is a typo in the doc as it states the Estimation dataset is the default plot.


About the curves themselves:


  • The green curve shows the maximum possible profit (obtained by using the target variable itself as a model). For example, if 25% of your population had the target category of the target variable, then the best model would correctly classify all 25% of the target category with 25% of the population.

  • The red curve shows the minimum profit (obtained by a random model). By randomly taking 50% of the population, you would identify 50% of the target category of the target variable.

  • The blue curve shows the profit generated using the model on the validation set. This curve shows the lift of the model over the random curve.
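
To make the three curves concrete, here is a minimal sketch in Python. It is not the product's implementation: the function name `gain_curve` and the toy data (8 rows, 2 of them in the target category) are made up for illustration. It only reproduces the construction described in the bullets: sort the population by a score and accumulate the share of the target category found.

```python
import numpy as np

def gain_curve(y_true, score):
    """Share of the target category found (y) versus share of the
    population selected (x), with rows sorted by descending score."""
    y_true = np.asarray(y_true, dtype=float)
    order = np.argsort(-np.asarray(score, dtype=float), kind="stable")
    hits = np.cumsum(y_true[order])
    x = np.arange(len(y_true) + 1) / len(y_true)
    y = np.concatenate(([0.0], hits / hits[-1]))
    return x, y

# Hypothetical toy data: 25% of the rows belong to the target category.
y = np.array([1, 0, 0, 1, 0, 0, 0, 0])
model_score = np.array([0.9, 0.8, 0.1, 0.7, 0.3, 0.2, 0.4, 0.05])

x, perfect = gain_curve(y, y)          # green: the target itself used as the score
_, model = gain_curve(y, model_score)  # blue: a model's score
random_curve = x                       # red: the diagonal of a random selection

# With 25% of the population the perfect curve has found every target row.
print(x[2], perfect[2])  # 0.25 1.0
```

On this toy data the model curve reaches 100% of the target category only after 3 of the 8 rows, so it sits between the red and the green curves.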

I'll share a concrete example of this curve, based on our Census sample dataset.

Thanks & regards


Antoine

achab
Product and Topic Expert
0 Kudos

The Predictive Power value of 0.8097 is the ratio of the blue area to the green area. The green area contains the blue one, which is why it is not fully visible.
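
As a rough illustration of such an area ratio, here is a minimal Python sketch. It assumes that both areas are measured between the respective curve and the random diagonal, which is my reading of the graph rather than a statement of how the product computes it. The function names and the toy data are hypothetical, and the result is unrelated to the 0.8097 of the Census example.

```python
import numpy as np

def gain_curve(y_true, score):
    """Cumulative share of target rows found, rows sorted by descending score."""
    y_true = np.asarray(y_true, dtype=float)
    order = np.argsort(-np.asarray(score, dtype=float), kind="stable")
    hits = np.cumsum(y_true[order])
    x = np.arange(len(y_true) + 1) / len(y_true)
    return x, np.concatenate(([0.0], hits / hits[-1]))

def area_above_random(x, curve):
    """Trapezoidal area between a gain curve and the random diagonal."""
    diff = curve - x
    return float(np.sum((x[1:] - x[:-1]) * (diff[1:] + diff[:-1]) / 2))

def predictive_power(y_true, score):
    """Assumed definition: blue area (model) divided by green area (perfect)."""
    x, model = gain_curve(y_true, score)
    _, perfect = gain_curve(y_true, y_true)  # the target used as its own score
    return area_above_random(x, model) / area_above_random(x, perfect)

# Hypothetical toy data: 8 rows, 2 of them in the target category.
y = np.array([1, 0, 0, 1, 0, 0, 0, 0])
model_score = np.array([0.9, 0.8, 0.1, 0.7, 0.3, 0.2, 0.4, 0.05])

ki = predictive_power(y, model_score)
print(round(ki, 4))  # 0.8333
```

Under this definition a perfect model gives a ratio of 1 and a model whose curve follows the diagonal gives 0.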

Answers (2)

achab
Product and Topic Expert
0 Kudos

Hi Viktor, is your question answered? We will add the diagram to our documentation. Thanks & regards Antoine

achab
Product and Topic Expert
0 Kudos

Hi Viktor,

Can you please start with the following sources of information?

The guide here http://help.sap.com/businessobject/product_guides/pa25/en/pa25_class-clust_user_en.pdf on pages 38 and 39.

The whole article here, as well as the specific section related to model selection and variable selection.

Warm welcome to the SAP Predictive Analytics community!

Thanks & regards

Antoine