Alvaro Tejada Galindo

R meets HANA

Posted by Alvaro Tejada Galindo Jan 29, 2012

If you read my last blog called HANA meets R you will remember that we read data from HANA into R directly, without having to download an .csv file, but using ODBC. This time, we're going to read data from HANA as well, but after do some nice tricks on R, we're going to post back the information into HANA.

 

Keep in mind, that is not an standard SAP solution. This only relies on a custom R package that can work with ODBC enabled tables, and like any custom packages, there are many limitations...anyway...this should be fixed when SAP released the official R into HANA integration.

 

In my previous blog Prediction model with HANA and R we create a stored procedure in HANA to populate a table called TICKETS_BY_YEAR, then on R we calculate the prediction for the next year and generate a nice graphic showing both the real data and the prediction. So...of course I'm not going to repeat all that.

 

This is the R code that we need to use...

 

R_Meets_HANA.R

library("RODBC")

ch<-odbcConnect("HANA",uid="P075400",pwd="***")

Flight_Tickets<-sqlFetch(ch,"P075400.TICKETS_BY_YEAR")

period=Flight_Tickets$PERIOD

tickets=Flight_Tickets$TICKETS

var_year=substr(period[1],1,4)

var_year=as.integer(var_year)

var_year=var_year+1

var_year=as.character(var_year)

new_period=gsub("^\d{4}",var_year,period)

next_year=data.frame(year=new_period,stringsAsFactors=FALSE)

prt.lm=lm(tickets ~ period)

pred=predict(prt.lm,next_year,interval="none")

period=next_year

tickets=pred

PREDICTION_TICKETS<-data.frame(period,tickets)

sqlDrop(ch,"PREDICTION_TICKETS",errors=FALSE)

sqlSave(ch,PREDICTION_TICKETS,rownames="id")

odbcClose(ch)


After we execute this code, we can check on HANA that our new table called PREDICTION_TICKETS was created...

 

R_Meets_HANA_01.png

 

And the data was populated as expected...

 

R_Meets_HANA_02.png

 

You may wonder...which are the limitations? Everything seems to work like a charm? Easy...not a lot, but important limitations...

 

* We don't have a way to validate if the table exists or not.

* We must delete the table before doing the insert, otherwise is not going to work.

* Even when the date field was called PERIOD, R named it "year" and pass it into HANA.

* We can't specify the type of the fields, nor the lenght

* We are forced to have an additional column with a numeric index, that we can nicely call "Id"...

 

As I said early...this is just a custom package that allows us to play...this shouldn't be used as a final solution, but as a playground. Enjoy!

Alvaro Tejada Galindo

HANA meets R

Posted by Alvaro Tejada Galindo Jan 26, 2012

In my previous HANA and R blogs, I have been forced to create .csv files from HANA and read them on R...an easy but also boring procedure...specially if your R report is supposed to be run on a regular basis...having to create an .csv file every time you need to run your report it's not a nice thing...

 

After spending some time reading and researching R...I finally came to a library that can read data from any relational database and being HANA, ODBC capable, the work is just a piece of cake -;)

 

For this examples, we must install two libraries: RODBC and Plotrix and create the DSN connection as shown here...

 

HANA already provides us a driver, so we're cool

HANA_Meets_R_02.png

 

Here we're going to "Add..." a new "System DSN"

 

HANA_Meets_R_01.png

 

Assign a name for the "Data Source Name", "Description" is optional and "Server:Port" should be of course filled.

 

Now...we're ready to go to our HANA studio an create a table and a stored procedure...

 

HANA_Meets_R_03.png

 

 

GetTicketsByYearMonth

CREATE PROCEDURE GetTicketsByYearMonth(IN var_year NVARCHAR(4),IN var_month NVARCHAR(2))

LANGUAGE SQLSCRIPT AS BEGIN

select count(bookid), carridfrom sflight.snvoice

where year(fldate) = VAR_YEAR

  and month(fldate) = VAR_MONTH

group by carrid

into TICKETS_BY_YEAR_MONTH;

END;

 

 

After we run our Stored Procedure...we have all the information in the table...Ok...only two fields...today was a hard day...I'm tired -:P

 

HANA_Meets_R_04.png

 

Finally...we can code some R! First, we're going to create a Fan Plot (The Plotix library is needed for that one) and then a Bar Plot...I used the same code for both, so just replace the comment on one by the other one and run it again...I know...I'm being lazy again...but at least I'm not reinveting the wheel -;) Two codes with only 1 different line? No thanks...

 

 

Fan_Plot_Graph.R

library("plotrix")

library("RODBC")

ch<-odbcConnect("HANA",uid="P075400",pwd="***")

res<-sqlFetch(ch,"P075400.TICKETS_BY_YEAR_MONTH")

fan.plot(res$TICKETS,labels=res$CARRIER,

         main="Tickets for December 2011")

#barplot(res$TICKETS,names.arg=res$CARRIER)

odbcClose(ch)

 

The code is very simple...we call the libraries we need, we stablish a communication to our DSN, we fetch the data from the table, we create the graphics and finally we close the connection.

 

And here come the graphics...

 

image

image

I will keep investigating on this way to connect HANA and R...more blogs should be on the way -;)

Hello SCN,

 

In this blog, we will discuss about the structural changes of In-Memory Cube when compared with Standard Info Cube and also discuss about different advantages and limitations of using In-Memory technology with Cube. This blog is in continuation with my previous blog: *BW on HANA: Working of In-Memory DSO (BW on HANA: Working of In-Memory DSO)

 

Let us now discuss about existing schema of Info cube:

Extended Star Schema of Standard Info Cube:


 

For detailed understanding of extended star schema, please read my wiki: Extended Star Schema

 

How Does In Memory Cube look like?

 

As we have an additional boost of HANA computation power. Dimension tables are removed and Master Data tables are directly linked with F table.

 

There is no more E table. You can see the structure of In-Memory Cube below:

How does a Fact table of In Memory look like?

 

The fact table (F) no longer contains Dim Id (Except packet dim id). We can now see SID's in Fact table as shown below:

 

 

Dictionary of Standard Info Cube Tables:

Example: We can see in SE11:Or we can use the Report "RSDRI_CONVERT_CUBE_TO_INMEMORY”.
What does this conversion do?

 

 

Thank you for reading this blog

Hello SCN,

We will discuss about HANA Architecture with BW and how it improves performance with
BW and understand all we need to know about In-Memory DSO in this blog.

 

How does today’s Standard DSO work?

 

Data Store Objects are used for staging purposes and also for operational reporting in SAP BW. The Standard DSOs consists of 3 Tables Active table, Activation Queue and Change log table.

 

1.jpg

 

Activation of DSO means calculating the delta and updating the current data snapshot
into Active table and writing the history of changes made into change log table.

 

 

We can see a detailed flow of the same in the following diagram:

 

Let us take an example scenario:

 

We have an ATM1 in which we have a initial
balance of 400 and an delta load of balance 100.

 

2.jpg

 

How activation of Request takes place?

 

The request activation takes place on BW side and saves the changes to the database as depicted in the figure below:

 

3.jpg

Where is the today’s DSO performance critical now?

 

As discussed above, we have 3 tables which DSO uses for calculating Delta. Application Server performs delta calculation and executes every SQL statement in the database. Thus for every calculation Application server needs to communicate with the DBMS, thereby creating heavy load on RDBMS server. These round trips makes the activation time very high, which takes more than 80% of runtime.

 

As we know about HANA computation power, SAP has decided to move the calculation part to SAP HANA DATABASE to optimize the activation time.

 

5.jpg

 

What is IMCE (HANA) and what do you need to know about column store?

 

 

Main Storage:

It contains main parts of the data which is not only highly compressed but also read optimized.

 

History Storage:

It helps in versioning of data (History).

 

Delta Storage:

It exists only in main memory. Change operations are stored here and are used for delta management.

 

4.jpg

 

What is Delta Merge operation?

 

  1. It moves data from Delta Storage to Main Storage because
    Main Storage is read optimized.
  2. It uses MVCC to maintain the consistency in the database for read & write operations. MVCC is discussed in detail in the later sections.

 

Write Operation:

 

  1. It is used only on delta Storage as it is write-optimized.
  2. The updating of database is performed by inserting a new entry into delta storage.

 

Read Operations:

 

  1. It reads from Main & Delta storages and merges the results.
  2. IMCE uses MVCC (Multi Version Concurrency Control) to ensure consistent read operations.

 

How this architecture with BW does improve the activation times?

 

The request activation along with updating of database takes place on SAP HANA DB side as
depicted in the figure below:

 

5.jpg

This means there will be no more roundtrips to database for saving the updated information. As the activation is taking place in the database
itself. Hence reduces the activation time.

 

Note: We can make DSO as an In-Memory based DSO only if it is of Standard type.

 

Only DSO of type “Standard” can be converted to Standard DSO. You can find the same in the screenshots below:

 

Standard DSO:

6.jpg

 

You can see the same option is not available for other types of DSO’s.

 

Direct Update:

7.jpg

Write-Optimized DSO:

8.jpg

What is MVCC?

 

Multi Version concurrency control (MVCC) for ensuring consistent concurrent access to the real time databases.

 

 

The purpose of MVCC is:

 

  1. To ensures database consistency.
  2. To resolve read-write and write-write conflicts.
  3. To prioritize transactions and avoid deadlocks.

 

How does MVCC work?

 

We have a Table in database which looks like this:

 

ObjectValue
A10

 

Let us consider Read operation as R (A) and Write operation as W (A).The timestamps registered, are used by MVCC to lock and ensure consistency. Ti, Where i = 1,2,3,4 and 5 (Timestamps 1<2<3<4<5)

 

The order of transactions takes place as the below Diagram:

10.jpg

In the above manner, MVCC ensures consistent read operations.

 

How does In Memory DSO work?

 

In IMCE, the request activation takes place at IMCE side as shown below. And BW server will have a view on these tables at SAP HANA DB.

 

In Memory DSO consists of same tables i.e. Activation Queue, Change log and Active Data table. Along with control tables /BIC/A<ODSNAME>70 and /BIC/A<ODSNAME>80. </p><p>Change log and Active Data table consists of additional field IMO_INT_KEY.

 

Request Activation in In Memory DSO:

Below figures will depict the process of request activation in In Memory DSO.

11.jpg


Now let us take the same example of a sample transaction which was discussed earlier using Standard DSO.

Example:

We have an ATM1 in which we have an initial balance of 400 and a delta load of balance 100.

Step 1:

12.jpg

 

Step 2:

13f.PNG

Hope you understood the working of InMemory DSO. Now let us discuss migration concepts of DSO to InMemory DSO. Different options available for migration:

 

Simple Migration:

 

It is a Simple Migration:

 

Only Active data gets converted and change log history gets deleted. Conversion is faster and requests cannot be rolled back
after migration.

 

2. Full Migration:

 

Active data gets converted along with change log history. Conversion is slower but there is no restriction on rollback of requests.

 

This is about HANA Architecture with BW and how
it improves performance with SAP BW and working of InMemory DSO.

 

We will discuss about working of In-Memory Cube, in my next blog.

BW on HANA: Working of In-Memory Info Cube

 

 

 

Related content:

 

Please find the links
below which helped me in writing this blog

 

 

<a href="http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/70d16119-ad21-2e10-de8b-eaaedf86b9cd?QuickLink=events&overridelayout=true" target="_top">SAP High-Performance Analytic Appliance 1.0 (SAP HANA)</a>


<a class="jive_macro jive_macro_blogpost" href="" __jive_macro_name="blogpost" modifiedtitle="true" __default_attr="58152"></a>

Actions

Filter Blog

By author:
By date:
By tag: