uladzislau_pralat
Contributor
I came across a number of text mining examples, and all of them dealt with text from outside my area of expertise, for example microbiology. Such examples do not hold your attention because you cannot judge from your own knowledge whether text mining is working. So I created SAP TechEd 2015 Session Catalogue data to play with HANA Text Mining. Any SAP professional is familiar with this text and can form their own opinion about HANA Text Mining.

So here is my data. As you can see, there are SESSION and TITLE information fields, a CATEGORY field that classifies each document, and a DESCRIPTION field that contains the text for text mining (the full-text index is built on this field).



I prepared a number of examples. Let's see text mining in action.

 

Find Similar Documents

For example, find sessions similar to DEV260 'Building Applications with ABAP Using Code Pushdown to the Database'
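To make the idea of "similar documents" concrete, here is a minimal Python sketch that ranks documents by cosine similarity of TF-IDF vectors. It is an illustration of the general technique only, not the HANA query from query.sql, and HANA's actual weighting may differ. The document IDs and texts are invented placeholders, and the function names are hypothetical.

```python
import math
from collections import Counter

# Invented toy descriptions; not taken from the session catalogue.
DOCS = {
    "S1": "abap code pushdown to the database with core data services",
    "S2": "sqlscript procedures push code down into the database",
    "S3": "design fiori apps with a responsive user interface",
}

def tfidf_vectors(docs):
    """Return {doc_id: {term: tf * idf}} using a smoothed idf."""
    tokens = {d: text.split() for d, text in docs.items()}
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for words in tokens.values() for t in set(words))
    n = len(docs)
    return {
        d: {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in Counter(words).items()}
        for d, words in tokens.items()
    }

def cosine(a, b):
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def similar_documents(doc_id, docs):
    """Rank all other documents by similarity to doc_id, best first."""
    vecs = tfidf_vectors(docs)
    scores = [(other, cosine(vecs[doc_id], vecs[other]))
              for other in docs if other != doc_id]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# S2 shares "code", "the" and "database" with S1, so it ranks first.
print(similar_documents("S1", DOCS))
```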



 

Find Relevant Terms

For example, find terms relevant to DEV260 'Building Applications with ABAP Using Code Pushdown to the Database' session
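As a rough illustration of what "relevant terms" means, the Python sketch below scores each term of one document by TF-IDF: a term is relevant when it is frequent in the document but rare in the rest of the collection. This is an assumption about the general idea, not HANA's implementation, and the toy texts and the `relevant_terms` helper are hypothetical.

```python
import math
from collections import Counter

# Invented toy descriptions; not taken from the session catalogue.
DOCS = {
    "S1": "abap code pushdown abap database",
    "S2": "sqlscript code database procedures",
    "S3": "fiori apps user interface code",
}

def relevant_terms(doc_id, docs, top=3):
    """Return the top terms of one document, weighted by tf * log(n / df)."""
    tokens = {d: text.split() for d, text in docs.items()}
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for words in tokens.values() for t in set(words))
    n = len(docs)
    weights = {
        term: count * math.log(n / df[term])
        for term, count in Counter(tokens[doc_id]).items()
    }
    # Highest weight first; ties are broken alphabetically.
    return sorted(weights.items(), key=lambda pair: (-pair[1], pair[0]))[:top]

# "code" occurs in every document, so its weight is zero and it ranks last.
print([term for term, _ in relevant_terms("S1", DOCS, top=2)])  # ['abap', 'pushdown']
```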



 

Find Related Terms

For example, find terms related to 'Fiori'
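One simple way to think about "related terms" is co-occurrence: terms that appear in the same documents as the query term. The Python sketch below counts such co-occurrences. It is a deliberately simplified stand-in for illustration; how HANA scores related terms is not described in this post. The toy texts and the `related_terms` helper are hypothetical.

```python
from collections import Counter

# Invented toy descriptions; not taken from the session catalogue.
DOCS = [
    "fiori apps use ui5 controls",
    "build fiori launchpad tiles with ui5",
    "hana sqlscript procedures",
]

def related_terms(term, docs, top=3):
    """Count how often other terms share a document with `term`."""
    term = term.lower()
    counts = Counter()
    for text in docs:
        words = set(text.lower().split())
        if term in words:
            counts.update(words - {term})
    # Most frequent co-occurring terms first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:top]

print(related_terms("Fiori", DOCS, top=1))  # [('ui5', 2)]
```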



 

Find Relevant Documents

For example, find documents relevant to the term 'Fiori'
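Finding documents relevant to a term can be pictured as ranking documents by how prominent the term is in each of them. The Python sketch below uses the term's relative frequency as the score, which is an assumption made for illustration and not necessarily how HANA ranks results. The toy texts and the `relevant_documents` helper are hypothetical.

```python
# Invented toy descriptions; not taken from the session catalogue.
DOCS = {
    "S1": "fiori apps fiori launchpad",
    "S2": "hana views and fiori",
    "S3": "hana sqlscript procedures",
}

def relevant_documents(term, docs):
    """Rank documents containing `term` by its share of the document's words."""
    term = term.lower()
    scores = {}
    for doc_id, text in docs.items():
        words = text.lower().split()
        hits = words.count(term)
        if hits:
            scores[doc_id] = hits / len(words)
    # Highest score first; ties are broken by document ID.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))

print(relevant_documents("Fiori", DOCS))  # [('S1', 0.5), ('S2', 0.25)]
```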



 

Categorize Documents

For example, you have a new session that you have to assign to the proper category. I took the SAP TechEd 2014 session DEV161 'SQLScript – Push Code Down into SAP HANA to Achieve Maximum Performance', which belongs to the 'Development and Extension Platform for SAP HANA and Cloud' category, and classified it using the SAP TechEd 2015 catalog documents. Let's see if the document is classified correctly.



As you can see, the document was classified correctly.
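Classifying a new document against already categorized ones can be sketched as a k-nearest-neighbors vote: find the most similar reference documents and let their categories vote, weighted by similarity. The Python below is a minimal illustration under that assumption, using plain word counts and cosine similarity. The training texts, the category labels and the `categorize_knn` helper are invented placeholders, not the catalogue data or the query from query.sql.

```python
import math
from collections import Counter

# Invented reference documents with placeholder category labels.
TRAINING = [
    ("sqlscript procedures push code into the database", "Development"),
    ("abap code pushdown with core data services", "Development"),
    ("fiori apps and user interface design", "User Experience"),
    ("responsive user interface patterns for apps", "User Experience"),
]

def cosine(a, b):
    """Cosine similarity of two word-count Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0

def categorize_knn(text, training, k=3):
    """Return the category with the highest similarity-weighted vote among the k nearest documents."""
    query = Counter(text.lower().split())
    scored = sorted(
        ((cosine(query, Counter(doc.lower().split())), category)
         for doc, category in training),
        reverse=True,
    )
    votes = Counter()
    for score, category in scored[:k]:
        votes[category] += score
    return votes.most_common(1)[0][0]

print(categorize_knn("sqlscript code pushdown into hana database", TRAINING))  # Development
```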

 

You can import the attached TM_DEMO-sap.com.tgz delivery unit into your HANA system and play with the examples and data. Once the delivery unit is imported, you will have the following objects created in the tm_demo package:



Note: the examples described above are in the query.sql file

 

Execute the install.sql script to assign the proper authorizations, fix the data in the session table, and create the full-text index for text mining.

The following catalog objects will be created:



Note: for the text mining functions to work correctly, you need to be on HANA SPS10

 

Installation instructions:

  1. Import the TM_DEMO-sap.com.tgz delivery unit

  2. Execute the install.sql script


Here is the content of the delivery unit:

TM_DEMO.hdbschema

TM_DEMO_ROLE.hdbrole

install.sql

query.sql

session.csv

session.hdbdd

session.hdbti

session_fix.hdbprocedure

 

Note: here is a helpful link: How to Import Delivery Unit to HCP HANA MDC