It's live now and we already see people talking about it! We are excited to announce that SAP HANA SPS7 revision 70 is now available as the new developer edition.

 

The process works just like before; however, we've made some massive changes! That's also the reason you don't immediately see an "update" volume. We added so much pre-loaded content, so many new configuration settings and a whole brand new experience that an "update" would probably conflict with your existing systems and cause you lots of problems.

 

The first thing we did was put up the brand new SAP HANA Studio, here. Some of you will notice that there is no Mac version there. This is because we don't have a supported Mac version at the moment; however, it's already on its way with some very cool features for all of us Mac users out there. Internally we are already testing it and getting things squared away.

 

Next, we only updated the AWS images for revision 70. There are several reasons for this, but we can save them for another discussion. SPS7 is available in two sizes and, again, is fully pre-configured to meet the needs of any developer; it even has pre-loaded content from Thomas Jung and Rich Heilman to help you get started as quickly as possible.

 

  • 4 vCPUs, 34.2 GB RAM, 154 GB disk
  • 8 vCPUs, 68.4 GB RAM, 154 GB disk

 

We've also modified the instance to run the XS Engine through port 80, so there's no need to worry about changing or re-configuring port numbers or anything like that. This time, once you start your instance, you can copy the hostname into your browser and, instead of the traditional "XS Engine is running" screen, you'll be prompted to log in using the user "SYSTEM" and the default password "manager", and you'll get a quick-start landing page with some links already in place to get you moving along!

 

Screen Shot 2014-02-03 at 5.27.38 PM.png

 

Some of the configuration changes we made were automatically enabling the debugger, developer_mode and more, as well as giving "SYSTEM" some of the default roles necessary to use a lot of the built-in tools like the XS Web IDE, the admin tools, the transport system, etc.

 

We are very excited about this change and we hope you enjoy it as well!

 

Grab your new developer edition today, just 10 quick steps to start the process!

 

  1. Go to http://aws.amazon.com (you can use your existing Amazon login) and be sure to sign up (you gotta give them your credit card number) for Amazon EC2 (Elastic Compute Cloud)
  2. Once you have signed up, be sure to go to your management console https://console.aws.amazon.com/console/home?# and on the left side select EC2 – once this page loads you will need to select the region you are working or will be working in (top right corner) - for example "Ireland" for the EU https://console.aws.amazon.com/ec2/v2/home?region=eu-west-1#
  3. On the left side scroll down until you see "key pairs" https://console.aws.amazon.com/ec2/v2/home?region=eu-west-1#KeyPairs: here you will need to create one – you'll need it to jump onto the server if you have large datasets or want to install the R language. The name of the key pair can be whatever you want.
  4. Now select your "account" page https://portal.aws.amazon.com/gp/aws/manageYourAccount within Amazon AWS and copy your account ID to the clipboard.
  5. Now head over to http://developers.sap.com and choose the HANA section http://scn.sap.com/community/developer-center/hana (oh make sure you are logged in)
  6. At the top of the HANA section you'll see the current SP and Revision that is available. Choose the "Developer Edition" http://scn.sap.com/docs/DOC-31722 – if you go direct to Amazon you'll get our productive instance, which has an additional 0.99 USD per hour charge. You'll also want to download the HANA Studio and HANA Client from here as well. The Studio is how we interact with the HANA server (admin, monitoring, development, modeling); the Client is needed for native development and for connecting local tools (e.g. Excel).
  7. We have multiple hosts that provide the servers but in this case we'll sign up for the AWS one https://sapsolutionsoncloudsapicl.netweaver.ondemand.com/clickthrough/index.jsp?solution=han&provider=amazon
  8. Enter the Account ID that you copied earlier, select the region you chose as well, and then "Accept" the license.
  9. You will be redirected to AWS once your information is verified.
  10. Then follow the wizard

I am a newbie who has just started exploring SAP HANA and Sybase Event Stream Processor. I was trying to figure out how I can load data from a text file into an SAP HANA server in real time. This file is not located on the SAP HANA server. Assume there is a text file – it can be a log file or any file that we continuously keep appending records to – and we need to load these records into the HANA server in real time.

 

Then I found Sybase Event Stream Processor and installed the free trial version. Sybase ESP has an SAP HANA Output Adapter that uses an ODBC connection to load information from Event Stream Processor into the SAP HANA server.

 

In this example, I thought of a scenario in which there is a log file that contains transaction logs. Each transaction is a line in the text file, formatted as:

 

Transaction ID|Transaction Type|Transaction Status|Transaction Date|Details

 

So, I created a simple Java project to generate random transaction data and write this data to a log file called transactionLog.csv.


To be able to load data into the HANA server from ESP, you first need to configure the HANA ODBC data source. Open the ODBC Data Source Administrator and add the HANA ODBC driver.

odbc.png

Figure 1: Creating a new HANA ODBC data source

hanaodbc.png

Figure 2: Adding and testing datasource


After you have configured the ODBC data source successfully, go to %ESP_HOME%\bin\service.xml and add the following entry to your service.xml file.


<Service Name="HANAODBCService" Type="DB">

  <Parameter Name="DriverLibrary">esp_db_odbc_lib</Parameter>

  <Parameter Name="DSN">HDB</Parameter>

  <Parameter Name="User">***user***</Parameter>

  <Parameter Name="Password">***password***</Parameter>

</Service>


Then, I created an ESP Project and added a .ccl file. I used three tools from the palette.

  1. File/Hadoop CSV Input Adapter
  2. Input Window
  3. SAP HANA Output Adapter

 

 

diagram.png

Figure 3: Event Processing


1. File/Hadoop CSV Input Adapter is added to read the transactionLog.csv file.

TransactionAdapter Configuration:

    • Directory: Directory path of the data file at runtime.
    • File: File which you want the adapter to read (you can specify regex pattern as well)
    • Dynamic Loading Mode: The adapter supports three modes, namely static, dynamicFile and dynamicPath. You need to use either dynamicFile or dynamicPath mode if you want to keep polling the file for newly appended content. I set this parameter to dynamicFile.
    • Column Delimiter: | for this example
    • Has Header: False (the text file that I generated doesn’t contain the descriptions of the fields).
    • Poll Period (seconds): Period to poll the specified file. It is set to 5 seconds in this example.


2. Input Window has a schema, which defines the columns in the events. In this example, we can say that each transaction is an event. TransactionInputWindow’s schema has columns transactionId, transactionType, transactionDate, status and description.


3. SAP HANA Output Adapter is used to load data rapidly from Event Stream Processor into SAP HANA database table.

TransactionHANAOutputAdapter Configuration:

    • Database Service Name: HANAODBCService (service name defined in %ESP_HOME%\bin\service.xml)
    • Target Database Schema: Target schema in the HANA server
    • Target Database Table Name: Table where the data is loaded into.


Finally, I created a corresponding HANA database table into which the output adapter loads transaction data. Then, I ran my log generator and Event Stream Processor. Transaction data was loaded successfully into the table.
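For reference, a HANA table matching the input window schema could look roughly like the following. This is only a sketch: the actual schema name, table name and column types are not shown in this post (they only appear in the screenshots), so treat them as assumptions.

-- Hypothetical target table matching the TransactionInputWindow schema
CREATE COLUMN TABLE "ESP_DEMO"."TRANSACTION_LOG" (
    "transactionId"   INTEGER,
    "transactionType" NVARCHAR(50),
    "transactionDate" TIMESTAMP,
    "status"          INTEGER,
    "description"     NVARCHAR(500)
);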


Before running the log generator, the log file is empty, no events are streamed into ESP, and the HANA database table is empty, as shown in the figures.

sapemptytable2.png

Figure 4: Transaction Input Window before generating data (0 rows)

hanaemptytable2.png

Figure 5: Result of select query on HANA table before generating data (No rows retrieved)


After running the log generator, the transaction data written into the log file is streamed into ESP via the input adapter and loaded into HANA via the HANA output adapter. 37,770 transaction records are added to the table.

espstreamdata2.png

Figure 6: Streaming transaction events after running log generator


count2.png

Figure 7: Number of rows in HANA table after running log generator


Keep the log generator running... Newly appended data is loaded into the HANA table; the number of transactions has increased to 44,733, as seen in the figure.

countupdated2.png

Figure 8: Number of rows in HANA table after running log generator


After making sure that I am able to load the data into HANA, I created an attribute view and a calculation view.

attributeView3.png

Figure 9: Attribute View


An attribute view is created and calculated columns are added to format transaction date and status information.

Transaction Status:

0: Error

1: Warning

2: Success

 

The Case() function under Misc Functions is used to format the status information, as illustrated below.
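For readers who prefer SQL, the mapping implemented with the graphical Case() function corresponds to a plain SQL CASE expression like the sketch below. The column name "status" comes from the input window schema above; the table name is the hypothetical one used earlier, and whether the status codes are stored as numbers or strings depends on your actual column type.

-- Illustrative SQL equivalent of the Case() mapping in the attribute view
SELECT "transactionId",
       CASE "status"
            WHEN 0 THEN 'Error'
            WHEN 1 THEN 'Warning'
            WHEN 2 THEN 'Success'
            ELSE 'Unknown'
       END AS "statusText"
FROM "ESP_DEMO"."TRANSACTION_LOG";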

case.png

Figure 10: Case function

 

calculationView2.png

Figure 11: Calculation View


A Calculation View is created. Transaction data is grouped by transaction status.
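Conceptually, the calculation view performs an aggregation equivalent to the following SQL. The view itself is modeled graphically; this sketch only shows the grouping logic, again using the hypothetical table name from above.

-- Illustrative SQL for the grouping performed by the calculation view
SELECT "status", COUNT(*) AS "TRANSACTION_COUNT"
FROM "ESP_DEMO"."TRANSACTION_LOG"
GROUP BY "status";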


After creating views, I created OData services to expose the views.


transactionView.xsodata

service namespace "experiment.services" {

       "experiment.model::VW_TRANSACTION"

       as "Transactions"

       keys generate local "ID";

}

 

status.xsodata

service namespace "experiment.services" {

       "experiment.model::CV_STATUS"

       as "TransactionStatus"

       keys generate local "ID";

}


Since the data is exposed, let’s consume it. I created a transaction view under my SAPUI5 project and added a table and a viz chart to show transaction data.

 

transactionsview.png

Figure 12: transaction.view.js

transactionshtml.png

Figure 13: transaction.html


Below is the final UI for this example.

transactionspage.png

Figure 14: Transactions page

If you're new to this HANA thing and you're a Basis consultant, you normally start by reading the installation guides and the related and referenced OSS Notes before beginning the installation of a new solution.

 

So did I.

 

While preparing our company for RDS Qualification of Business Suite on HANA Migration, I started with downloading the RDS materials and then I downloaded the guides for SAP HANA installation Version 1.0 SPS07 from https://help.sap.com/hana_appliance.

 

Also, as I'm not going to install the system on certified hardware, I read the blog How to install the HANA server software on a virtual machine for installing the system on Amazon Web Services EC2 instances.

 

But at the beginning, I noticed a contradiction between the guides and materials. Here is the list of materials and the contradiction between them.

 

Material ID | Link to Material
Material 1 (M1) | http://help.sap.com/hana/SAP_HANA_Server_Installation_Guide_en.pdf
Material 2 (M2) | How to install the HANA server software on a virtual machine
Material 3 (M3) | 1793303 - Install and configure SUM for SAP HANA

 

According to M1, page 16 (2.3.1 Recommended File System Layout), and M2, your HANA system's file systems have to be as follows:

 

File System | Default Path | Recommendations
Root | / | Recommended disk space = 10 GB
Installation path | /hana/shared/ | The installation path (mount directory) requires disk space equal to the default system RAM. (GB = RAM)
System instance | /usr/sap | The system instance directory requires at least 50 GB disk space
Data volume | /hana/data/<SID> | The data path requires disk space equivalent to four times the size of the system RAM. (GB = 4*RAM)
Log volume | /hana/log/<SID> | The log path requires disk space equivalent to the default system RAM. (GB = RAM)

 

According to M3 (which was updated on 19.03.2013):

Standard locations of HANA components as of HANA SPS04:

  • SAP HANA database - /usr/sap/<SID> - this can be just a link but it must exist
  • SAP HANA client - /usr/sap/hdbclient
  • SAP HANA studio - /usr/sap/hdbstudio
  • SAP HANA studio update repository - /usr/sap/hdbstudio_update
  • SAP Host Agent - /usr/sap/hostctrl

If the SAP HANA database uses a non-standard directory layout, SUM for SAP HANA cannot support it and the system has to be reinstalled.

If some of the other components are installed in different locations, you should remove them as the script and the subsequent update with SUM for SAP HANA will install them in the proper locations.

 

If we ignore M2 (which is a blog and not written by an SAP employee), how can we explain the difference between the guide and the SAP Note?

 

Is there any official statement for this? If yes, can you please share it?

This is a very strong and important feature of SAP HANA.

 

Follow these steps to implement sentiment analysis. I am using the data from my previous blog for sentiment analysis.

 

Following are the steps to implement sentiment analysis.

 

1. Create a column table

 

Sentiment analysis only accepts VARCHAR, NVARCHAR, NCLOB, CLOB and BLOB columns for applying analysis on the text or document column.


Sentiment analysis is supported only for English, French, German, Spanish and Chinese.

 

CREATE COLUMN TABLE "TEST"."TEST" (ID NVARCHAR(50), TEXTANA NVARCHAR(5000), PRIMARY KEY(ID))


"TEST"."TEST"-- > SchemaName.TableName

2. Insert some records into the TEST table


insert into "TEST"."TEST" values('1','Barely hours after sitting on a hunger strike at the Jantar Mantar in Delhi')

insert into "TEST"."TEST" values('2','off-spinner Ravichandran Ashwin admitted that India have been below par in the ODIs against New Zealand and they want to avoid another series defeat when they face the Black Caps in the fourth ODI here Tuesday.')

insert into "TEST"."TEST" values('3','Ashwin said the tied match at the Eden Park in Auckland was disappointing.')

insert into "TEST"."TEST" values('4','Pune weather is good')

insert into "TEST"."TEST" values('5','Bangalore weather is also good')

insert into "TEST"."TEST" values('6','good better best')


3. Create a full-text index:

a.Create FullText Index "TEST"."SENTIMENT" On "TEST"."TEST"("TEXTANA")

TEXT ANALYSIS ON

CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER';


b.Create FullText Index "TEST"."SENTIMENT" On "TEST"."TEST"("TEXTANA") ASYNC FLUSH EVERY 1 MINUTES LANGUAGE DETECTION ('EN') TEXT ANALYSIS ON;

 

Both of the above SQL queries do the same thing: in the first we let the SAP HANA engine use the default (synchronous) processing, and in the second we explicitly use ASYNC (asynchronous) processing. Don't be confused when you are trying to execute the query.

 

4. Once you execute the query mentioned in step 3, SAP HANA will generate the analysis table in the same schema where your source table resides.

 

5. You will find another table prefixed with $TA_ (here "$TA_SENTIMENT", named after the index) which will contain the results. Please check the attachment for the result table.
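To inspect the generated results you can query the $TA_ table directly. A minimal sketch, assuming the index name "SENTIMENT" used in step 3 (TA_TOKEN and TA_TYPE are standard columns of the generated table):

-- Sentiment tokens extracted for each source row
SELECT "ID", "TA_TOKEN", "TA_TYPE"
FROM "TEST"."$TA_SENTIMENT"
WHERE "TA_TYPE" LIKE '%Sentiment%';   -- e.g. StrongPositiveSentiment, WeakPositiveSentiment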

 

6. This analysis will give you the positive, negative or neutral sentiment of your text or document. As an example, take the text below:

'bad good best'

bad- StrongNegativeSentiment

good-WeakPositiveSentiment

best-StrongPositiveSentiment


We can use sentiment analysis to read customers' sentiment about products and more.

 

Please check the output table.

rahul kumar

Text Analysis In SAP HANA

Posted by rahul kumar Jan 28, 2014

SAP HANA has introduced some new features, like Text Analysis, from SPS05 onwards.

 

With a few steps you can implement Text Analysis in an SAP HANA environment.

 

Following are the steps to implement Text Analysis In SAP HANA.

 

1. Create a column table (Text Analysis only accepts VARCHAR, NVARCHAR, NCLOB, CLOB and BLOB columns)

 

CREATE COLUMN TABLE "TEST"."TEST" (ID NVARCHAR(50), TEXTANA NVARCHAR(5000), PRIMARY KEY(ID))


"TEST"."TEST"-- > SchemaName.TableName


2. Insert some records into the TEST table


insert into "TEST"."TEST" values('1','Barely hours after sitting on a hunger strike at the Jantar Mantar in Delhi')

insert into "TEST"."TEST" values('2','off-spinner Ravichandran Ashwin admitted that India have been below par in the ODIs against New Zealand and they want to avoid another series defeat when they face the Black Caps in the fourth ODI here Tuesday.')

insert into "TEST"."TEST" values('3','Ashwin said the tied match at the Eden Park in Auckland was disappointing.')

insert into "TEST"."TEST" values('4','Pune weather is good')

insert into "TEST"."TEST" values('5','Bangalore weather is also good')



3. Create FullText Index "TEST"."TEST_ANA" On "TEST"."TEST"("TEXTANA")

TEXT ANALYSIS ON

CONFIGURATION 'EXTRACTION_CORE';

 

4. Once you execute the query mentioned in step 3, SAP HANA will generate the analysis table in the same schema where your source table resides.

 

5. You will find another table named $TA_TEST_ANA (the index name prefixed with $TA_) which will contain the results. Please check the attachment for the result table.
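The result table can be queried like any other table. A minimal sketch, using the standard columns of the generated $TA_ table:

-- Tokens and entity types extracted by the EXTRACTION_CORE configuration
SELECT "ID", "TA_TOKEN", "TA_TYPE", "TA_NORMALIZED"
FROM "TEST"."$TA_TEST_ANA"
ORDER BY "ID";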

 

6. This is a very important point: now that you have created the index, if you insert new values into your table, those values will be reflected in the result ($TA_TEST_ANA) table as well. This way you can apply Text Analysis to real-time data.

 

  

 

Note: For the above example I have used SYSTEM as the user in the HANA DB (HDB).

For those that need a rich tool for managing data flows between SAP and non-SAP systems (supported by SAP), the first stop will probably be the SAP Data Services tool, formerly known as BODS.

 

Here are a couple of useful links:

 

https://help.sap.com/bods

Getting ready for Big Data with Data Services Webcast - Part 1

Configuring Data Services and Hadoop - Enterprise Information Management - SCN Wiki

 

 

With that said though, open-source HADOOP also has a tool for moving large amounts of data between HADOOP and RDBMS systems, known as SQOOP.

 

If you need support beyond your own IT organisation then the leading HADOOP vendors (such as Cloudera and Hortonworks) offer support contracts, and will presumably enable you to enhance the tool as required.

 

 

Sqoop currently has 2 flavours: Version 1 and Version 2 (which is almost production ready).

Version 1 is a command line tool which has been integrated with OOZIE to enable SQOOP to be easily used within a HADOOP workflow.

Version 2 has enhanced security and UI support but isn't yet integrated with OOZIE (though that is apparently in the pipeline). As a workaround it can theoretically be used with OOZIE now if incorporated within shell scripts or a wrapper JAVA program.

 

To demonstrate sqoop I first need to create some test data in HANA:
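The screenshot of that step isn't reproduced here, but based on the table names used in the Sqoop commands later in this post (HADOOP.HANATEST1 and HADOOP.HANATEST2), the test data could be created along these lines. This is only a sketch; the actual column layout used in the original post is an assumption.

-- Hypothetical test tables; HANATEST1 is exported to HDFS, HANATEST2 receives the export back
CREATE COLUMN TABLE "HADOOP"."HANATEST1" ("ID" INTEGER PRIMARY KEY, "FIELD1" NVARCHAR(100));
CREATE COLUMN TABLE "HADOOP"."HANATEST2" ("ID" INTEGER PRIMARY KEY, "FIELD1" NVARCHAR(100));

INSERT INTO "HADOOP"."HANATEST1" VALUES (1, 'hello hadoop');
INSERT INTO "HADOOP"."HANATEST1" VALUES (2, 'hello hana');
-- HANATEST2 is left empty; it is the target of the export from HDFS later on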

 





SQOOP2 (Version 2)


Sqoop2 has a new UI which has been added to the HADOOP User Interface (HUE) from version 2.5 onwards.


The HUE website Hue - Hadoop User Experience - The Apache Hadoop UI - Tutorials and Examples for Hadoop, HBase, Hive, Impala, Oozie, Pig… has some nice videos demonstrating its features.


NOTE: To use SQOOP2 with HANA you first need to have copied the HANA JDBC driver ngdbc.jar (from the HANA Client download) to the SQOOP2 directory on your HADOOP cluster (e.g. /var/lib/sqoop2)


At the bottom of the HUE screen capture below you can see I've created 2 jobs for IMPORTING and EXPORTING from SAP.




When you create the first job to HANA you will need to create a connection, which can be shared with subsequent jobs:


Add new connection:




With a connection to HANA created then the Job can be defined.



First let's define and run an IMPORT from SAP to HDFS.

Note: the 'Extractors' section enables the data to be extracted in parallel (in this case 5 parallel tasks)


Click Save and Run.


(I've skipped the detailed logging screens)



Finally the data is downloaded to HADOOP in 5 separate files (representing the 5 parallel tasks).

One of the task files is:


Don't worry about the files being split like this, with funny names; HADOOP loves it like this.


These files can now be very easily used by HADOOP  HIVE or PIG etc. for Batch processing OR combined with HANA Smart Data Access to be brought back into HANA as a Virtual Table.

Smart Data Access with HADOOP HIVE & IMPALA

 

 

Now let's repeat the process in reverse to load the data back into HANA, into a different table.


NOTE: SQOOP does NOT have complex field mapping rules, so SOURCE and TARGET must have the same column structure.

If you need complex mapping rules then you might be best off using SAP Data Services.

Alternatively you could use HADOOP PIG to first reformat the data into the correct TARGET format, prior to using SQOOP.


Now let's define and run an EXPORT from HDFS to SAP.



After 'Save and Run' we then have the following results in HANA:




Pretty Easy stuff really.


Once Sqoop2 is officially production ready, it's definitely worth doing a bit more stress testing with it.



SQOOP1 (Version 1)

 

Sqoop1 is a command-line tool which should achieve similar results.

 

The following statements are used:

 

Import from HANA:

sqoop import --username SYSTEM --password xxxyyyy --connect jdbc:sap://xxx.xxx.xxx.xxx:30015/ --driver com.sap.db.jdbc.Driver --table HADOOP.HANATEST1 --target-dir /user/sqoop2/ht1001 --split-by id

 

NOTE: I'm not sure if it's just my HADOOP setup, but the Sqoop1 import fails for me with the following error: 'java.io.IOException: SQLException in nextKeyValue'. For the moment I'm happy with Sqoop2 for imports, so I'm not that fussed to investigate, but if anyone has the answer then I welcome the feedback.

 

 

Export to HANA:

sqoop export -D sqoop.export.records.per.statement=1 --username SYSTEM --password xxxxyyyy --connect jdbc:sap://xxx.xxx.xxx.xxx:30015/ --driver com.sap.db.jdbc.Driver --table HADOOP.HANATEST2 --export-dir /user/admin/HANATEST1_SQOOP1

 

Sqoop1 Export works for me. The results in HANA were:

 

NOTE: Sqoop1 and Sqoop2 appear to  handle strings slightly differently when exporting and importing so you just need to be careful with your format.

 

 

Sqoop1's advantage over Sqoop2 is that the command line can easily be added to an OOZIE workflow to enable a full workflow scenario to be processed. For a bit more detail on using OOZIE with HANA, see Creating a HANA Workflow using HADOOP Oozie.

 

 

Do give Sqoop a try (whichever flavour) and let me know how you get on.

I have just started to learn SAP HANA. While going through the openSAP course I found a certain deviation in the XSJS debugging configuration.


Reasons for deviation:

  • I was going through the first SAP HANA tutorial material on openSAP (Introduction to Software Development on SAP HANA, May-June 2013)
  • And I used SAP HANA version SPS6, Rev 68 (the SAP HANA Studio and Client installed are developer edition Revision 68)

 

I fixed the above deviation by referring to the SAP HANA discussion forum, and then I thought it would be better to share my findings here so that they help other new learners like me.

 

  1. Changing the SAP HANA system configuration. Here we need to add a debugger section to "xsengine.ini".
    • We just need to add the "enabled = true" parameter to the debugger section.
    • The "listenport" parameter is no longer required.
    • HANA_SYS.png
  2. Creating Debug Configuration for XS JavaScript in DEBUG
    • Earlier we were supposed to provide listenport (from the HANA system configuration) as the port.
    • But now we have to provide the actual HTTP port 80XX (XX = instance number) (e.g. for me it's 8000).
    • Screenshot 2014-01-25 09.05.48.png
  3. Please add the "sap.hana.xs.debugger::Debugger" role. To do so, execute the query below from the HANA system SQL console.
    • CALL GRANT_ACTIVATED_ROLE('sap.hana.xs.debugger::Debugger','<ROLE_NAME>'); 


References:

debug server side JavaScript -- socket connection problem

Hana AWS - Serverside Javascript Debug timeout | SAP HANA

 

Complete Guide to XSJS Debugging:

http://help.sap.com/openSAP/HANA1/openSAP_HANA1_Week_05_Unit_05_Debugging_XSJS_Presentation.pdf

 

I hope that this content is useful for new learners. All experienced developers, please let me know in case I need to add more points here, or share your thoughts on how to improve this content.

 

-----------------------

Prakash Saurav

We at SAP Research Boston have been using SAP HANA's data analytics capabilities for quite some time now in our research with medical datasets (e.g. the well-known MIMIC2 datasets). Thus, in our line of work, we regularly need to import data from various sources into our SAP HANA databases.

 

Luckily, SAP HANA provides a handful of features to import data in different ways from different kinds of sources. Importing data using .csv files is one of the most effective methods for migrating data to SAP HANA from the popular DBMSs. HANA can also be extremely fast when importing the .csv files on the server side, using its control (.ctl) files.

 

Much has already been written all around the web on these processes of importing .csv files into SAP HANA (a very good overview can be found here: http://wiki.scn.sap.com/wiki/display/inmemory/Importing+CSV+files+into+SAP+HANA). However, one challenge that may not have been thoroughly discussed is dealing with .csv files that contain free-text or unrestricted natural language as the vital data to be imported. In this blog, I will be presenting the issues one may encounter when dealing with free-text data, and also the details of how to preprocess this free-text data to prepare the .csv files, so that they are ready to be imported to SAP HANA with zero problems. I will be using the PostgreSQL database as the source database for my examples. A similar methodology can be applied to import such free-text data from the other popular database systems as well (e.g. Oracle, MySQL, SQL Server etc.).

 

The Problem

To migrate data to SAP HANA from any other database systems, we first need to build the table structures in SAP HANA, representing the data-types of the fields/columns correctly following SAP HANA’s standard. The list of supported data-types in SAP HANA can be found here: http://help.sap.com/hana/html/_csql_data_types.html.

 

After the table structures are built on SAP HANA, the next step is to prepare the .csv files using the source database system, where each .csv file contains the data of one table.

 

All the common database systems are equipped with the feature of exporting the data of a table as a .csv file. These .csv files usually follow the same structure, where each record (or row) is delimited by line-break or a newline character (\n). Moreover, the text-type values are usually enclosed by double quote characters ("), and in case a double-quote character appears within a text-type value, it is usually escaped by another double-quote character (") appearing immediately before it.

 

Now, PostgreSQL, like many other database systems, allows one to choose any character for this escape character. However, like most other databases, it always delimits the records by a newline character (\n), with no option to choose otherwise.

 

In contrast, when importing .csv files, SAP HANA allows one to choose any character that has been used for delimiting the records, which is generally chosen to be a newline character (\n) in most cases. However, when importing .csv files, SAP HANA uses a backslash character (\) as the only escape character, with no option to choose otherwise.

 

Therefore, when exporting a table of any database system, like PostgreSQL, as a .csv file, one should pay attention to the above restrictions if the .csv file is meant to be imported into SAP HANA. Thus, the command to use in PostgreSQL is as follows:

 

COPY schema_name.table_name TO '/path/table_name.csv'

WITH CSV QUOTE AS '"' ESCAPE AS '\';

 

A .csv file exported with the above command usually gets imported to SAP HANA with no problem, when using its control file-based server-side CSV import feature. However, dealing with free-text can be a little harder. Here's the reason why: Text-type fields often hold unrestricted natural language or free-text values, which can contain line-breaks or the newline characters (e.g. \n, or \r, or a combination of both) and also rarely backslash characters (\) that do not get escaped in the exported .csv files.

 

This creates a problem for SAP HANA when importing these .csv files, as its CSV parser (which is used during the control file-based import) wrongly assumes the start of a new record as soon as it encounters a newline character, even if it appears within the enclosure of a text-type value.

 

 

Solution

To solve this problem, we need to preprocess the source data in order to replace these newline characters that appear within the text-type values with “something” that will not confuse SAP HANA’s CSV parser. In our case, we chose to insert the html line-break tag (</br>) instead of the newline characters. Moreover, we also need to cleanup (i.e. remove) the backslash characters (\) appearing within the text-type values.

 

To apply this solution, some may choose to preprocess data on the exported .csv files, which I find to be cumbersome, as it requires processing these mammoth .csv files with a powerful (regular expression-based) text-file processing engine, that needs to be able to differentiate between newline characters appearing within text-values and the newline characters used to delimit records.

 

The solution I present here will preprocess the data on the source database system, and then output the .csv file in a way that is ready to be imported on SAP HANA without any problems.

 

The following are the steps of preprocessing the data on a PostgreSQL database:

 

STEP 1: Create a Temporary Copy of the Table

On PostgreSQL’s console, enter the following SQL to first create a temporary schema, and then create a copy of the table to be exported in this temporary schema:

 

CREATE SCHEMA temporary_schema;

CREATE TABLE temporary_schema.table_name AS

(SELECT * FROM original_schema.table_name);

 

 

STEP 2: For Each Text-type Field/Column in the Table:

The text-type fields/columns are of data-types text, char, varchar, varchar2, nvarchar, nvarchar2 etc. Now, do the following for each such text-type field/column in the table:

 

STEP 2.1: Remove All Backslash Characters (\) from the Values of the Text-type Field:

Enter the following SQL on PostgreSQL’s console to remove all the backslash characters (\) from the values of the text-type field:

 

UPDATE temporary_schema.table_name

SET field_name = REPLACE(field_name, '\','');

 

STEP 2.2: Replace All Newline Characters from the Values of the Text-type Field:

Enter the following SQL on PostgreSQL’s console to replace all the newline characters from the values of the text-type field, with the custom string "</br>":

 

UPDATE temporary_schema.table_name

SET field_name = REGEXP_REPLACE(field_name, E'[\\n\\r]+', '</br>', 'g');

 

 

Thus, repeat steps 2.1 and 2.2 for each text-type field.

 

 

STEP 3: Export the Preprocessed Data as CSV:

Enter the following SQL on PostgreSQL’s console to export the preprocessed data of the table as a .csv file:

 

COPY temporary_schema.table_name TO '/path/table_name.csv'

WITH CSV QUOTE AS '"' ESCAPE AS '\';

 

The “table_name.csv” file containing the preprocessed data will now be saved to “/path/” on the machine hosting the PostgreSQL database.

 

 

Note: All the SQL used in the steps 1 to 3 can be combined into an SQL script and be sequentially executed altogether. Similar SQL commands can be used to preprocess the data on other database systems as well (like Oracle).

 

 

STEP 4: Transfer the CSV File To the HANA Server

Use SCP or any FTP client to transfer the “table_name.csv” file to “/path-on-hana-server/” on the server hosting the SAP HANA database.

 

STEP 5: Prepare the Control File

Prepare a plain text file, named "table_name.ctl" with the following contents:

 

IMPORT DATA

INTO TABLE target_schema."table_name"

FROM '/path-on-hana-server/table_name.csv'

RECORD DELIMITED BY '\n'

FIELDS DELIMITED BY ','

OPTIONALLY ENCLOSED BY '"'

ERROR LOG 'table_name.err'

 

Then, save the “table_name.ctl” also on the server hosting the SAP HANA database. In this example, I will be saving it in the same location as the “table_name.csv” file, which is “/path-on-hana-server/”.

 

STEP 6: Execute the Import on SAP HANA

As mentioned earlier, please make sure that you already have an empty table, called “table_name”, in the “target_schema” on your SAP HANA instance. Please also make sure that this empty table has a correctly translated table structure with the SAP HANA data-types correctly identified. Please note that the list of supported data-types in SAP HANA can be found here: http://help.sap.com/hana/html/_csql_data_types.html
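As an illustration, such an empty target table might be created as sketched below. The column names and types here are hypothetical; in practice they must mirror the source table, translated to the corresponding SAP HANA data-types.

-- Hypothetical target table; adjust columns and types to match your source table
CREATE COLUMN TABLE "target_schema"."table_name" (
    "id"        INTEGER,
    "note_text" NCLOB     -- free-text column whose newlines were replaced by </br>
);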

 

Now, execute the following command on SAP HANA Studio’s SQL Console:

 

IMPORT FROM '/path-on-hana-server/table_name.ctl';

 

This will start loading the preprocessed data on to the SAP HANA database.

 

If followed correctly, these steps will successfully load all the data of a table containing free-text, on to the SAP HANA database. However, please check the “table_name.err” file on the SAP HANA server to confirm that no error has occurred during this process.

 

Happy migrating your database to SAP HANA.

SAPUI5 VIZ Charts are great but in some scenarios you may need functionality not yet supported:

 

For example:

D3 Path Transitions

 

1.gif

Above is an animated GIF of a HANA XS HTML page that calls a HANA XSJS service every second and appends the latest result on the FAR RIGHT, shifting the results to the LEFT.

 

The HANA XSJS simply calls "select rand() from dummy".

 

The code to replicate this:

 

random.xsjs

function getRandom() {

 

 

  var list = [];

 

 

 

  function getRandom(rs) {

  return {

  "random" : rs.getDecimal(1)

  };

  }

 

  var body = '';

 

  

  try {

  var query = "select rand() from dummy";

  var conn = $.db.getConnection();

  var pstmt = conn.prepareStatement(query);

  var rs = pstmt.executeQuery();

 

 

  while (rs.next()) {

  list.push(getRandom(rs));

  }

 

 

  rs.close();

  pstmt.close(); }

  catch (e) {

  $.response.status = $.net.http.INTERNAL_SERVER_ERROR;

  $.response.setBody(e.message);

  return;

  }

 

 

 

  body = JSON.stringify({

  "entries" : list

  });

 

 

 

  $.response.contentType = 'application/json; charset=UTF-8';

  $.response.setBody(body);

  $.response.status = $.net.http.OK;

}

 

 

 

 

getRandom();

 

 

dynamicChart.html

<html><head> 

    <meta http-equiv='X-UA-Compatible' content='IE=edge' /> 

    <title>Hello World</title> 

 

    <script id='sap-ui-bootstrap'

        src='http://xxxx.xxxxx.xxxx.xxxx:8001/sap/ui5/1/resources/sap-ui-core.js'

        data-sap-ui-theme='sap_goldreflection' 

        data-sap-ui-libs='sap.ui.commons'></script>  

 

   <script src="http://xxxx.xxxxx.xxxx.xxxx:8001/sap/ui5/1/resources/sap/ui/thirdparty/d3.js"></script>

  

   <style>

    @import url(../style.css?aea6f0a);

 

  .x.axis line {

   shape-rendering: auto;

  }

 

  .line {

   fill: none;

   stroke: #000;

   stroke-width: 1.5px;

  }

  </style>

 

 

<script> 

 

 

 

 

 

 

  var vRandom = 0;

 

  var n = 40,

  random = d3.random.normal(0, .2);

 

 

  function chart(domain, interpolation, tick) {

 

   var data = d3.range(n).map(random);

 

 

   var margin = {top: 6, right: 0, bottom: 6, left: 40},

       width = 960 - margin.right,

       height = 120 - margin.top - margin.bottom;

 

 

   var x = d3.scale.linear()

       .domain(domain)

       .range([0, width]);

 

 

   var y = d3.scale.linear()

       .domain([-1, 1])

       .range([height, 0]);

 

 

   var line = d3.svg.line()

       .interpolate(interpolation)

       .x(function(d, i) { return x(i); })

       .y(function(d, i) { return y(d); });

 

 

   //var svg = d3.select("body").append("p").append("svg")

   // Custom Mode

   var svg = d3.select(".TickChart").append("svg")

       .attr("width", width + margin.left + margin.right)

       .attr("height", height + margin.top + margin.bottom)

       .style("margin-left", -margin.left + "px")

     .append("g")

       .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

 

 

   svg.append("defs").append("clipPath")

       .attr("id", "clip")

     .append("rect")

       .attr("width", width)

       .attr("height", height);

 

 

   svg.append("g")

       .attr("class", "y axis")

       .call(d3.svg.axis().scale(y).ticks(5).orient("left"));

 

 

   var path = svg.append("g")

       .attr("clip-path", "url(#clip)")

     .append("path")

       .data([data])

       .attr("class", "line")

       .attr("d", line);

 

 

   tick(path, line, data, x);

  }

 

 

 

 

  var html1 = new sap.ui.core.HTML("html1", {

        // the static content as a long string literal

        content:

                "<div class='TickChart'>" +

  "</div>"

                ,

        preferDOM : false,                     

        // use the afterRendering event for 2 purposes

        afterRendering : function(e) {

       

                        

            // Call the Chart Function   DYNAMIC CHART

  chart([0, n - 1], "linear", function tick(path, line, data) {

 

  //

  var aUrl = '../services/random.xsjs';

  var vRand = 0;   //random();

 

     jQuery.ajax({

        url: aUrl,

        method: 'GET',

        dataType: 'json',

        success: function (myJSON) {

                vRandom = myJSON.entries[0].random;                      

              },

        error: function () { sap.ui.commons.MessageBox.alert("Error calling random.xsjs"); }  // report a failed service call

     });

     //sap.ui.core.BusyIndicator.show();

     //sap.ui.core.BusyIndicator.hide();

    

 

 

 

 

   // push a new data point onto the back

   data.push(vRandom); // random()

 

   // pop the old data point off the front

   data.shift();

 

   // transition the line

   path.transition()

       .duration(1000) // wait between reads    //1000 = 1 Second Refresh

       .ease("linear")

       .attr("d", line)

       .each("end", function() { tick(path, line, data); } );

  });

               

 

        }

    });

 

 

    html1.placeAt('content'); 

</script>

 

 

</head>

<body class='sapUiBody'>

    <div id='content'></div>

</body>

</html>

There are several standard ways that HANA Procedures can be scheduled:

e.g:

1) HANA SPS7 XS  Job Scheduling (Job Scheduling | SAP HANA)

2) SAP DataService   (How to invoke SAP HANA stored procedures from D... | SAP HANA)

 

 

For those that use open-source HADOOP for managing Big Data, OOZIE can also be used to execute HANA procedures in a workflow.

 

For a good overview of HADOOP terms and definitions please refer to:

SAP HANA - Hadoop Integration # 1

 

 

A Big Data workflow integrating HADOOP and HANA might be:

 

Point c)  on the diagram (using Sqoop for data transfer to HANA) is covered in more detail in another blog

Exporting and Importing DATA  to HANA with HADOOP SQOOP

 

 

The focus of the remaining part of this blog is only to demonstrate how HANA server-side JavaScript (XSJS) can be used to execute HANA procedures [point d) in the diagram above] via an OOZIE workflow:

 

Oozie is currently described in Wikipedia as

" a workflow scheduler system to manage Hadoop jobs. It is a server-based Workflow Engine specialized in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs. Oozie is implemented as a Java Web-Application that runs in a Java Servlet-Container.

For the purposes of Oozie, a workflow is a collection of actions (e.g. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Direct Acyclic Graph). A "control dependency" from one action to another means that the second action can't run until the first action has completed. The workflow actions start jobs in remote systems (Hadoop or Pig). Upon action completion, the remote systems call back Oozie to notify the action completion; at this point Oozie proceeds to the next action in the workflow.

Oozie workflows contain control flow nodes and action nodes. Control flow nodes define the beginning and the end of a workflow (start, end and fail nodes) and provide a mechanism to control the workflow execution path (decision, fork and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions: Hadoop MapReduce, Hadoop file system, Pig, SSH, HTTP, eMail and Oozie sub-workflow. Oozie can be extended to support additional types of actions.

Oozie workflows can be parameterized (using variables like ${inputDir} within the workflow definition). When submitting a workflow job, values for the parameters must be provided. If properly parameterized (using different output directories), several identical workflow jobs can run concurrently. "

 

 

I think I've also read that Oozie was originally designed by Yahoo (now Hortonworks) for managing their complex HADOOP workflows.

It is open source and can be used with all distributions of HADOOP (e.g. Cloudera, Hortonworks, etc.).

 

Oozie workflows can be defined in XML, or visually via the Hadoop User Interface (Hue - The UI for Apache Hadoop).

 

Below I will demonstrate a very simple example workflow of HANA XSJS being called  to:

 

A)  Delete the Contents of a Table in HANA

B)  Insert a Single Record in the Table

 

To call procedures in HANA from HADOOP I created 2 small programs:

1) in HANA a generic HANA XSJS for calling procedures (callProcedure.xsjs)

2) In HADOOP a generic JAVA program for calling HANA XSJS (callHanaXSJS.java)

 

The HANA XSJS program has up to 7 input parameters:

iProcedure - is the procedure to be called  (mandatory)

iTotalParameters - is the number of additional input parameters used by the Procedure (Optional - default 0)

iParam1 to iParam5  - are the input parameters of the procedure.

 

In the body of the response I provide the basic input and output info (including DB errors) in JSON format.

 

HANA: callProcedure.xsjs

var maxParam = 5;

var iProcedure       = $.request.parameters.get('iProcedure');

var iTotalParameters = $.request.parameters.get("iTotalParameters");

var iParam1          = $.request.parameters.get("iParam1");

var iParam2          = $.request.parameters.get("iParam2");

var iParam3          = $.request.parameters.get("iParam3");

var iParam4          = $.request.parameters.get("iParam4");

var iParam5          = $.request.parameters.get("iParam5");

 

var output = {};

 

output.inputParameters = {};

output.inputParameters.iProcedure = iProcedure;

output.inputParameters.iTotalParameters = iTotalParameters;

output.inputParameters.iParam1 = iParam1;

output.inputParameters.iParam2 = iParam2;

output.inputParameters.iParam3 = iParam3;

output.inputParameters.iParam4 = iParam4;

output.inputParameters.iParam5 = iParam5;

 

output.Response = [];

var result = "";

 

// Check inputs

//if (iProcedure === '') {

if (typeof iProcedure  === 'undefined' ) {

  result = "ERROR: '&iProcedure=' Parameter is Mandatory";

  output.Response.push(result);

  $.response.status = $.net.http.INTERNAL_SERVER_ERROR;

  $.response.setBody(JSON.stringify(output));

 

}

 

else {

 

  var conn = $.db.getConnection();

  var pstmt;

 

  if (typeof  iTotalParameters === 'undefined') {

  iTotalParameters = 0;

  }

 

  var sql = "call \"_SYS_BIC\".\"" + iProcedure + "\"(";

 

  if (iTotalParameters > 0 && iTotalParameters <= maxParam) {

  var i;

  for (i=0;i< iTotalParameters;i++) {

  if (i===0) { sql += "?"; }

  else {sql += ",?"; }

  }

  }

  else {

  if (iTotalParameters !== 0 ) {

  result = "WARNING: '&iTotalParameters=' Parameter should be between 0 and " + maxParam;

  output.Response.push(result);

  }

  }

 

  sql += ")";

 

  output.inputParameters.sql = sql;

 

  try{

  //pstmt = conn.prepareStatement( sql );   //used for SELECT

  pstmt = conn.prepareCall( sql );          //used for CALL

 

  if (iTotalParameters >= 1) { pstmt.setString(1,iParam1);}

  if (iTotalParameters >= 2) { pstmt.setString(2,iParam2);}

  if (iTotalParameters >= 3) { pstmt.setString(3,iParam3);}

  if (iTotalParameters >= 4) { pstmt.setString(4,iParam4);}

  if (iTotalParameters >= 5) { pstmt.setString(5,iParam5);}

 

  // var hanaResponse = pstmt.execute();

  if(pstmt.execute()) {

  result = "OK:";

  var rs = pstmt.getResultSet();

  result += JSON.stringify(pstmt.getResultSet());

  }

 

  else {

         result += "Failed to execute procedure";

     }

  } catch (e) {

     result += e.toString();

  }

 

  conn.commit();

 

 

  conn.close();

 

  //var hanaResponse = [];

 

 

  output.Response.push(result);

  $.response.setBody(JSON.stringify(output));

}

 

 

The HADOOP Java program accepts a minimum of 4 input arguments:

arg[0]  - URL of a HANA XSJS, accessible via the HADOOP cluster

arg[1]  - HANA User name

arg[2]  - HANA Password

arg[3]  - HADOOP HDFS Output directory for storing response

arg[4 to n] - are used for the input parameters for the HANA XSJS called

 

HADOOP:  callHanaXSJS.java

package com.hanaIntegration.app;

 

/**

* Calls a HANA serverside javascript (xsjs)

*  INPUTS: (mandatory) HANA XSJS URL, output logfile name, username & password

*          (optional) n parameters/arguments

*  OUTPUT: writes HANA XSJS response to a the logfile on HDFS

*

*/

 

import java.io.IOException;

import java.io.InputStream;

import java.io.OutputStream;

import java.net.HttpURLConnection;

import java.net.URL;

import org.apache.commons.io.IOUtils;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.fs.FSDataOutputStream;

import org.apache.hadoop.fs.FileSystem;

import org.apache.hadoop.fs.Path;

 

 

public class callHanaXSJS

{

  public static void main(String[] args) throws IOException

    {

 

  String sUrl = args[0];

 

  //Append XSJS command parameters

  if (args.length > 4) {

  //append first parameter

  sUrl += "?" + args[4];

  //add subsequent

  for(int i= 5;i < args.length;i++) {

  sUrl += "&" + args[i];

  }

  }

 

  System.out.println("HANA XSJS URL is: " + sUrl);

        URL url = new URL(sUrl);

 

        HttpURLConnection conn = (HttpURLConnection)url.openConnection();

 

        String userpass = args[1] + ":" + args[2];   //args[0] user  args[1] password

        String basicAuth = "Basic " + javax.xml.bind.DatatypeConverter.printBase64Binary(userpass.getBytes());

 

 

        conn.setRequestProperty ("Authorization", basicAuth);

 

        conn.connect();

        InputStream connStream = conn.getInputStream();

 

 

        // HDFS Output

 

        FileSystem hdfs = FileSystem.get(new Configuration());

        FSDataOutputStream outStream = hdfs.create(new Path(args[3], "HANAxsjsResponse.txt"));

        IOUtils.copy(connStream, outStream);

 

        outStream.close();

 

 

 

        connStream.close();

        conn.disconnect();

    }

}

NOTE: HADOOP Java programs are compiled as JARs and stored on HADOOP HDFS prior to execution by OOZIE.

 

With the small programs in place I will now show the setup in Oozie using HUE.

 

Below are screenshots from my small Hortonworks Hadoop HDP 2.0 cluster running on EC2.

(For setting up your own cluster or downloading a test virtual machine, see HDP 2.0 - The complete Hadoop 2.0 distribution for the enterprise.)

 

 


Zoomed in a bit to the 2 workflow tasks:




The definition of the first workflow task is:

             


The JAR I created was:

/apps/hanaIntegration/callHanaXSJS-WF/lib/callHanaXSJS-1.0-SNAPSHOT.jar


The arguments passed to call a delete procedure (no parameters) in HANA are:

${hanaXsjsUrl}  ${user} ${password} ${outputDir} ${procedure_delete}

 

As this is the first task, I also delete and re-create a directory to store the log files of each task.

This will store the JSON returned by the HANA XSJS.

 

 

The Second workflow task is:

The arguments passed to call an INSERT procedure (with two parameters) in HANA are:

${hanaXsjsUrl}  ${user} ${password} ${outputDir} ${procedure_insert} ${insertTotalParams}  ${insertId}  ${insertField1}


The following XML workflow is then created at runtime:



I can then submit/schedule the workflow:


In my Test I passed the following parameters to the workflow:

(NOTE: unfortunately the order of input parameters via HUE is currently messy. If manually creating the XML, this can be tidied up in a more logical order.)



In a more logical sequence, this is:

The following are used by both tasks:

${hanaXsjsUrl} http://ec2-xx-xxx-xx-xxx.compute-1.amazonaws.com:8000/OOZIE/OOZIE_EXAMPLE1/services/callProcedure.xsjs

${user} HANAUserID

${password} HanaPAssword

${outputDir} hdfs://ip-xx-xxx-xx-xx.ec2.internal:8020/apps/hanaIntegration/callHanaXSJS-log


Used by Delete task

${procedure_delete} iProcedure=OOZIE.OOZIE_EXAMPLE1.procedures/deleteTestTable


Used by Insert task

 

${procedure_insert} iProcedure=OOZIE.OOZIE_EXAMPLE1.procedures/create_record

${insertTotalParams} iTotalParameters=2

${insertId} iParam1=10

${insertField1} iParam2=fromHADOOP



Once the Workflow runs we see the following:



The following log files were created by each task:


The  Insert task Log file shows as:




Finally we can check in HANA to confirm the record has been created:
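The check itself is just a query against the table that the create_record procedure writes to. A sketch only, where the schema and table name are hypothetical (the actual names are visible only in the screenshot); the expected values come from the workflow parameters above:

-- Hypothetical check that the OOZIE-triggered insert arrived
SELECT * FROM "OOZIE"."TEST_TABLE" WHERE "ID" = 10;   -- expect FIELD1 = 'fromHADOOP'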

 




OK, for inserting one record this isn't very exciting and is a bit of an overkill, but conceptually it enables the power of HADOOP and HANA to be harnessed and combined in a single workflow.

 

The Java program for calling HANA XSJS is available at https://github.com/AronMacDonald/callHanaXSJS

Sometimes you just wake up on a Saturday morning with an idea: “I wonder if I can connect to HANA via speech from my Mac”. It would mean I could fire off a request to HANA, get a response back and have the results returned to me via speech.

 

Now, I have shown you on a number of occasions that this can be done on the iPhone by using Siri, but I was wondering how difficult it would be to do it from a Mac. It turns out it's a piece of cake.

 

The first thing I did was look for an ODBC driver for OS X. Now I pretty much expected that we do not have one, but I tried anyway:

 

Untitled.png

 

 

And lo and behold, my buddy and fellow mentor Carsten responded. No ODBC driver… Now normally that would have ended my adventure, but as HANA is, amongst many things, a developer platform, it's pretty easy to use RESTful services (XS, OData) to connect to HANA. Why would anyone still need ODBC, I asked myself:

 

 

Untitled.png

 

So with that said, let’s script!

 

A quick Google shows the Mac has something called “speakable items”. You can use voice commands to have your Mac do whatever you want. Well, that means it can also speak to HANA.

 

 

Untitled2.png

 

So whenever I say “Mac + a command”, the command is executed and this little fellow will pop up executing my every wish:

 

Untitled.png

 

How cool is that!

 

 

But how do I attach a command to my speakable items? It is simple: AppleScript.

 

AppleScript is Apple’s own scripting language which can automate a ton of stuff on your Mac. The cool part is that once you save your AppleScript in a specific folder called “speakable items”, you can actually trigger actions by voice. Just name the script after a voice command and we should be good to go. The fun starts ;-)

 

First off, I need a JSON parser for AppleScript. A simple Google search gives me one:

 

JSON Helper

 

It turns out to be a fantastic piece of software which runs in the background waiting for AppleScript to ask it to do something, just like Apple's own System Events does. A brief look at the samples gives me an idea on how I need to create my script. In the end, 5 lines of code is all I need:

 

tell application "JSON Helper"

       set myRecord to fetch JSON from "http://smartwatchplusapp.appspot.com/temperature" name "admin" password "welcome"

       set myResult to |TEMPERATURE| of item 1 of results of d of myRecord as string

       say ("The temperature of the HANA Alerter is " & myResult & " °C")

end tell

 

 

Basically I call JSON Helper from my script, I feed it my XS service (the mentioned URL is the XS URL I rerouted to a public domain; the original would look like this: http://54.246.85.50:8000/alerter/services/temperature.xsodata/TEMPERATURE/?$orderby=TIMESTAMP%20desc&$top=1&$format=json), along with a username and password for basic authentication, and finally I format the result into something readable. The last step in my script (say) speaks the output of my service. Executing HANA services by voice just became a reality!

 

I just used one of my services which was still active (from my TechEd HANA Alerter demo). Of course you can do whatever you want with this technology. You can make any HANA table or view accessible via voice on your Mac!

 

Below is a clip of how it all looks in the end:

 

 

Have fun and let me know in the comments which use cases you build! I'm sure you can be tremendously creative.

 

Thank you for reading and take care,

 

Ronald.

Shreepad Patil

SAP TechEd and me

Posted by Shreepad Patil Dec 31, 2013

Greetings!!!

 

I got a chance to attend SAP TechEd Bangalore 2013. It was three days of knowledge overflow. We had a stall at TechEd and I was most of the time involved in the stall activities.

 

I will share the day by day experiences that I had during this amazing TechFest.

 

Day 1:

 

First things first - never trust the traffic conditions in Bangalore. I planned to reach at 9 but reached the TechEd premises at 12 and missed the keynote from Vishal Sikka. But thanks to TechEd Live I was able to attend it virtually later.

I got a chance to visit all the other stalls and SCN club house. Got the SCN badges and got my photos clicked from the professional photographer.
The most important thing on this day was DemoJam. The wait for this event was killing me for the whole day. Got a sneak peek into the future from the DemoJam. Six amazing demos all back to back!!!

I couldn't stay for the keynote from Saurav Ganguly, as I could never have reached home on the same day if I had attended it.

 

Day 2:

 

This day I planned my travel and reached there before 9.

I had planned to visit a couple of sessions today but could attend just one hands-on session due to the crowded stall.

 

Hands-On Session:  Predictive Analytics and SAP HANA.
This session was about using the advanced AFL functionality in SAP HANA. I learnt how to use the AFL wrapper using SQL as well as the graphical 'AFL model'. SAP provides some (I think 20) basic procedures with the SAP HANA AFL library. One can either use these procedures or opt for a separate R server approach, where an R server running on a separate Linux machine executes the functions as required.

This was a really informative session. James Michael Amulu, along with his great team, conducted the session really well.

 

Apart from this, I also got a chance to visit demo pods and know more about current affairs in SAP. To name a few I got to visit the SAP Lumira, SAP HANA 101, SAP HANA Marketplace, SAP Certification, SAPUI5, Fiori and UX.

My deduction from the overall experience was that SAP HANA and new UI paradigms dominated most of the TechEd. SAP is now reinventing their solutions with SAP HANA and SAP Fiori; the user experience in business should be as good as the personal experience.

 

Day 3:

 

I again had planned to visit a couple of sessions on this day but was unable to attend even one of them. However, I got to interact with many people on this day. Ravindra Channe, an expert in HANA, also visited our stall and I could interact with him.

I also paid a quick visit to the other stalls and gathered knowledge as much as I could.

 

I had opted for certification on SAP HANA and this was the date I had been assigned for the certification exam. The exam was scheduled after the TechEd main event. It was delayed a bit, but I was able to clear the certification, which made my day.

 

Overall it was an overwhelming experience at TechEd and it left me exhausted over the weekend till Monday.

Waiting for SAP TechEd 2014 already.

 

A very Happy New Year to all and a big thanks for reading this blog.

HANA Text Analysis with Custom Dictionaries


Prerequisites:

  • How to create a developer workspace in HANA Studio.
  • How to create & share a project  in HANA Studio
  • Run HANA Text Analysis on a table

 

With the release of HANA SPS07, a lot of new features are available. One of the main features is the support for custom dictionaries in Text Analysis. By default HANA comes with three configurations for text analysis:

  • Core Extraction
  • Linguistic Analysis
  • Voice of Customer

 

One of the main issues you may come across while working with HANA Text Analysis is defining your own custom configuration for the Text Analysis engine to work with. The following steps show how to create your own custom dictionary so that you can get more out of HANA's text analysis capabilities.

 

Scenario:

 

Assume that your company manufactures laptops and has recently launched a new laptop series. You want to know whether the consumers who bought the machines are facing any problems. They will certainly be tweeting, posting, and blogging about the product on social media.

 

You are now harvesting a massive amount of unstructured data from social media, blogs, forums, e-mails and other channels. The main motivation is to understand customer perception of the products (the laptops): you want early warning of product defects and shortfalls, and you want to hear channel- and market-specific customer concerns and delights.

 

With HANA SPS07 we can create custom dictionaries that detect occurrences of words, terms, and phrases that would not be detected when running Text Analysis without one.

 

Follow these steps to get started with custom dictionaries:

 

1. Create the source XML file

 

I have created some dummy data in a table with “ID” and “TEXT” columns.

 

User_tweets table structure:

ID | TEXT
1  | The #lenovo T540 laptop's latch are very loose.
2  | my laptop's mic is too bad. It can't record any voice. will not be buying #lenovo in near future
3  | LCD display is gone for my T520. Customer care too is pathetic.
4  | T530 performance is awesome. Only problem I am facing is with microphone. :-(
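
If you want to follow along, here is a minimal SQL sketch for creating and loading this sample table. The table name and column types are my own assumptions; any column table with a text column will work.

-- Hypothetical sample table for this scenario (name and types are assumptions)
CREATE COLUMN TABLE "USER_TWEETS" (
   "ID"   INTEGER PRIMARY KEY,
   "TEXT" NVARCHAR(500)
);

-- Load the four sample tweets shown above
INSERT INTO "USER_TWEETS" VALUES (1, 'The #lenovo T540 laptop''s latch are very loose.');
INSERT INTO "USER_TWEETS" VALUES (2, 'my laptop''s mic is too bad. It can''t record any voice. will not be buying #lenovo in near future');
INSERT INTO "USER_TWEETS" VALUES (3, 'LCD display is gone for my T520. Customer care too is pathetic.');
INSERT INTO "USER_TWEETS" VALUES (4, 'T530 performance is awesome. Only problem I am facing is with microphone. :-(');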

 

The mycustomdict.xml file has the following structure:

 

<?xml version="1.0" encoding="UTF-8"?>
<dictionary name="LAPTOP_COMPONENTS">
   <entity_category name="Internal Parts">
      <entity_name standard_form="Inverter Board">
            <variant name="InverterBoard"/>
            <variant name="InvertrBoard"/>
      </entity_name>
      <entity_name standard_form="LCD Cable">
            <variant name="lcdcable"/>
            <variant name="cable lcd"/>
      </entity_name>
   </entity_category>
</dictionary>

 

Please refer to the SAP HANA Text Analysis Extraction Customization Guide (http://help.sap.com/hana/SAP_HANA_Text_Analysis_Extraction_Customization_Guide_en.pdf) to learn more about creating the source XML file for custom dictionaries.

 

Using the above custom dictionary, the HANA text analysis engine will detect “Inverter Board” and “LCD Cable” (and their listed variants) as entities of the category Internal Parts of a laptop.

 

2. Compile the mycustomdict.xml file to a .nc file

 

First, copy the XML file to your HANA machine using an FTP client.

I have copied mycustomdict.xml to the /home/root/customDict folder.

 

You can find the dictionary compiler “tf-ncc” in your HANA installation at:

/<INSTALLATION_DIR>/<SID>/HDB<INSTANCE_NO>/exe/dat_bin_dir

 

Text analysis configuration files can be found at the following path:

/<INSTALLATION_DIR>/<SID>/SYS/global/hdb/custom/config/lexicon/lang

 

Run the compiler on the source mycustomdict.xml file. First set the library path, then invoke tf-ncc:

export LD_LIBRARY_PATH=/<INSTALLATION_DIR>/<SID>/SYS/exe/hdb:/<INSTALLATION_DIR>/<SID>/SYS/exe/hdb/dat_bin_dir

 

/<INSTALLATION_DIR>/<SID>/HDB<INSTANCE_NO>/exe/dat_bin_dir/tf-ncc \
   -d /<INSTALLATION_DIR>/<SID>/SYS/global/hdb/custom/config/lexicon/lang \
   -o /<INSTALLATION_DIR>/<SID>/SYS/global/hdb/custom/config/lexicon/lang/mycustomdict.nc \
   /home/root/customDict/mycustomdict.xml

 

After executing the above command, a file named mycustomdict.nc will be generated in the /<INSTALLATION_DIR>/<SID>/SYS/global/hdb/custom/config/lexicon/lang folder; this file will later be used by the text analysis engine.

 


3. Create a custom HANA Text Analysis configuration file

 

After compiling the XML file, we need to create a custom text analysis configuration that refers to the compiled .nc file from the previous step. The configuration file specifies the text analysis processing steps to be performed and the options to use for each step.

 

In HANA Studio, create a workspace and then create and share a project. Under this project, create a new file with the extension “.hdbtextconfig”. Copy into it the contents of one of the predefined configurations delivered by SAP (mentioned above); they are located in the HANA repository package “sap.hana.ta.config”. For this scenario, I have copied the contents of the configuration file “EXTRACTION_CORE_VOICEOFCUSTOMER”.

 

For details, see “Creating a Text Analysis Configuration”, section 10.1.3.2.1 of the SPS07 HANA Developer Guide: http://help.sap.com/hana/SAP_HANA_Developer_Guide_en.pdf

 

After copying, modify the “Dictionaries” entry under the configuration node named “SAP.TextAnalysis.DocumentAnalysis.Extraction.ExtractionAnalyzer.TF” and add a child <string-list-value> node that references the compiled dictionary:

 

<string-list-value>mycustomdict.nc</string-list-value>


config.png

 

Now save, commit, and activate the .hdbtextconfig file. After activation, we can run the Text Analysis engine with the custom configuration by creating a full-text index using the following SQL command:

 

CREATE FULLTEXT INDEX <indexname> ON <tablename> (<columnname>) CONFIGURATION '<custom_configuration_file>'
TEXT ANALYSIS ON;
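
For the scenario table above, the statement might look like the sketch below. The index name is hypothetical, and I am assuming the custom configuration is referenced by its repository package and file name (shown here as myPackage::mycustomconfiguration); check the developer guide section mentioned above for the exact reference format on your revision.

-- "TEXT" is the column to analyze; the configuration reference is an assumption
CREATE FULLTEXT INDEX "TWEETS_TA_INDEX" ON "USER_TWEETS" ("TEXT")
CONFIGURATION 'myPackage::mycustomconfiguration'
TEXT ANALYSIS ON;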

The text analysis results are written to a generated table named “$TA_<indexname>”. For our scenario table, the contents of this table are:

 

fulltext_index.png

 

As you can see, the Text Analysis engine has identified LCD, latch, and mic as internal parts. These results can then be used for data mining or analytical purposes.
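
For example, a minimal SQL sketch for pulling only the laptop-part hits out of the generated table could look like this. It assumes the hypothetical index name used above and that the dictionary's category name ("Internal Parts") appears in the TA_TYPE column of the $TA_ table.

-- Count how often each recognized internal part is mentioned in the tweets
SELECT "TA_TOKEN" AS "PART", COUNT(*) AS "MENTIONS"
FROM "$TA_TWEETS_TA_INDEX"
WHERE "TA_TYPE" = 'Internal Parts'
GROUP BY "TA_TOKEN"
ORDER BY COUNT(*) DESC;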

This post is an extension of my previous post on analytic views: SAP HANA - How to Create Analytic Views.

In this post we will look at calculated columns and restricted columns in analytic views.

H1.png

CALCULATED COLUMN:

A calculated column in an analytic view can be of two types: 1. Measure  2. Attribute

In this example we are going to create a new calculated column measure called TOTAL_AMOUNT, which is calculated by multiplying Order Quantity (from the Data Foundation) by Material Unit Price (from the attribute view).

 

1. Right-click on the CALCULATED COLUMNS node under Output (screenshot above).

2. Provide the necessary information in the pop-up window: Name, Data Type, and Column Type (here, Measure).

H1.png

   In the expression editor, drag and drop the fields from the ELEMENTS list that are needed for the calculation. In this example we select QUANTITY from the output columns of the Data Foundation and UNIT PRICE from the output columns of the attribute view AV_MATERIAL_INFO, as shown below. The editor also lists the different operations and functions available.

H1.png

3. Click OK, then check and activate the analytic view.

4. The newly created calculated column is now available for reporting.

H1.png
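
For readers who prefer to think in SQL, the calculated measure is conceptually just a row-level expression that is evaluated before aggregation. Here is a minimal sketch over hypothetical base tables (all table and column names below are assumptions, not the actual model):

-- TOTAL_AMOUNT as modelled above is QUANTITY * UNIT_PRICE, computed per row and then aggregated
SELECT o."MATERIAL_ID",
       SUM(o."QUANTITY" * m."UNIT_PRICE") AS "TOTAL_AMOUNT"
FROM "SALES_ORDER_ITEMS" o
JOIN "MATERIALS" m
  ON m."MATERIAL_ID" = o."MATERIAL_ID"
GROUP BY o."MATERIAL_ID";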

RESTRICTED COLUMN:

A restricted column restricts a measure based on a condition on master data. This feature lets us push that filtering down into HANA instead of doing it on the client side.

In this example we are going to create a restricted measure of Net Amount only for the product category "Flat Screens".

 

1. Right-click on Restricted Columns in the Output pane of the analytic view.

2. Provide the necessary information as shown below: select the measure to restrict (here, Net Amount), add the restriction column (here, Product Category), and select the value "Flat Screens".

H1.png

3. Click OK, then check and activate. In the Data Preview you can see that a new restricted column has been added; it shows values only where Product Category = "Flat Screens" and is null (?) for all other rows.

H1.png

In the Analysis tab:

H1.png
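
To make the semantics of the restricted column concrete, here is a minimal plain-SQL sketch of the same logic; the table and column names are hypothetical, and the point is only that the modelled restricted column performs this conditional aggregation inside HANA for you.

-- NET_AMOUNT_FLAT_SCREENS carries NET_AMOUNT only where the category matches and is NULL otherwise
SELECT "PRODUCT_CATEGORY",
       SUM(CASE WHEN "PRODUCT_CATEGORY" = 'Flat Screens'
                THEN "NET_AMOUNT" END) AS "NET_AMOUNT_FLAT_SCREENS"
FROM "SALES_ORDERS"
GROUP BY "PRODUCT_CATEGORY";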

Fundamental to any serious development is code that is organized and easy to maintain. That is why I am choosing the topic of using libraries in XS as the first post in my XS/SAPUI5 blog series.

 

The actual mechanism for creating and using a library within XS is quite straightforward. First, the library function is placed in an .xsjslib file. For this example, I am calling the library SimpleLib.xsjslib.

 

////////////////////////////////////////////////////////////////////////////////////////////////////
// David Brookler's XS/SAPUI5 Blog
// 001 -- Using libraries in XS
// SimpleLib.xsjslib
////////////////////////////////////////////////////////////////////////////////////////////////////
 
function simpleLibFunction(s) {
  return "This came from simpleLibFunction() -- " + s;
}


Next, we need to import the library. The $.import command takes two parameters: the first is a string representing the path to the library, with a period as the directory separator; the second is the name of the library without the .xsjslib extension. See line 8 below.

 

///////////////////////////////////////////////////////////////////////////////////////////////////
// David Brookler's XS/SAPUI5 Blog
// 001 -- Using libraries in XS
// simpleLibCall.xsjs
////////////////////////////////////////////////////////////////////////////////////////////////////
 
// import the library
$.import("blog.b001", "SimpleLib");
 
// create a variable for simpler access to the library
var SimpleLib = $.blog.b001.SimpleLib;
 
// create a response object
var oResponse = {};
oResponse.directCall = $.blog.b001.SimpleLib.simpleLibFunction('direct call');
oResponse.callThroughLibVariable = SimpleLib.simpleLibFunction('call through variable');
 
// send the response object as a JSON string
$.response.setBody(JSON.stringify(oResponse));
$.response.status = $.net.http.OK;

 

Once the library has been imported, a call to a library function simply involves prefacing the call with the full path to the library, including the library's name. The root is represented by $. See line 15 above.

 

It is also possible to create a variable to reference the library.  See line 11 above.  This variable can then be used to simplify access to the library.  See line 16 above.

 

The result of calling the XS web service will be:

 

{
     "directCall": "This came from simpleLibFunction() -- direct call",
     "callThroughLibVariable": "This came from simpleLibFunction() -- call through variable"
}

 

In the next post, I plan to look at how to create libraries that control what is exposed as their interface.
