
I've been inspired by Cloudera's example of using Hadoop to collect and collate seismic information, by HANA's recent geospatial improvements, and by the geographic mapping capability of Mike Bostock's amazing D3.

I thought it would be interesting to combine these powerful tools into an end-to-end example, using:

1)  Hadoop to collect seismic information

2)  HANA to graphically present the data using HANA XS, SAPUI5 & D3.


The following example was built using a Hortonworks HDP 2.0 cluster and HANA SPS7, both running on AWS.


The final result:

[Screenshot: SAPUI5 controls alongside a rotatable D3 globe of recent quakes]

The controls are provided by SAPUI5, and the rotatable globe is drawn with D3. This was developed in approximately 800 lines of code. See below for full code details.


[Video: a brief example of the rotating quake globe in action]
Before I could present the information, I first used Hadoop to automate the following:

a)  Collect the data from the Earthquake Hazards Program

b)  Reformat the data and determine the deltas since the last run

c)  Export to HANA

d)  Execute a HANA procedure to update geospatial location information

[Diagram: the Hadoop tools used in this workflow]

Each of the steps is summarised below. They were scheduled in a single workflow using Hadoop Oozie.


a) Get data: I used Cloudera's Java example to collect recent seismic info: cloudera/earthquake · GitHub

      Note: I modified Cloudera's example slightly in order to capture the place information relating to each quake:

     AronMacDonald/earthquake · GitHub


      The source data is supplied by the US Geological Survey: Earthquake Archive Search & URL Builder



b) Pig scripts were used to:

          1) Reformat the data into tab-delimited files (easier for importing text into HANA)

          2) Prepare a delta file, comparing the data previously sent to HANA with the new data (the idea is sketched below)

      [Note: for a simplified version of using Pig with HANA see Using HADOOP PIG to feed HANA Deltas]


     The Pig scripts I created for this more complex example are available at AronMacDonald/Quake_PIG · GitHub
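
Conceptually, the delta step is just a set difference keyed on the quake id: keep only the records HANA has not seen yet. The real logic lives in the Pig scripts linked above (which work on the full tab-delimited records, and also have to cope with records being updated at source), so this is purely an illustration of the idea in JavaScript, with hypothetical sample data:

     // Hypothetical sample data -- stand-ins for the previous export and the new extract
     var previouslySent = [{ id: "us1001" }, { id: "us1002" }];
     var newRecords     = [{ id: "us1002" }, { id: "us1003" }];

     // Index the ids already sent to HANA
     var sentIds = {};
     previouslySent.forEach(function (rec) { sentIds[rec.id] = true; });

     // The delta is everything HANA has not seen yet -> [{ id: "us1003" }]
     var delta = newRecords.filter(function (rec) { return !sentIds[rec.id]; });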


c) Sqoop was used to export the delta records to HANA

      [Note: for an overview of using Sqoop with HANA see Exporting and Importing DATA to HANA with HADOOP SQOOP]

    

     The Sqoop export statement for this tab-delimited file was:

      sqoop export -D sqoop.export.records.per.statement=1 \
        --username SYSTEM --password manager \
        --connect jdbc:sap://zz.zz.zz.zzz:30015/ \
        --driver com.sap.db.jdbc.Driver \
        --table HADOOP.QUAKES \
        --input-fields-terminated-by '\t' \
        --export-dir /user/admin/quakes/newDelta


    The target table in HANA is:

create column table quakes (
     time      timestamp,
     latitude  decimal(10,5),
     longitude decimal(10,5),
     depth     decimal(7,4),
     mag       decimal(4,2),
     magType   nvarchar(10),
     nst       integer,
     gap       decimal(7,4),
     dmin      decimal(12,8),
     rms       decimal(7,4),
     net       nvarchar(10),
     id        nvarchar(30),
     updated   timestamp,
     place     nvarchar(150),
     type      nvarchar(50)
);



d) Execute a HANA procedure (from Hadoop) to populate geospatial location information for the new records

      [Note: for a simplified example of calling HANA procedures from Hadoop see Creating a HANA Workflow using HADOOP Oozie]

   

Geospatial information is stored in the following table in HANA:

create column table quakes_geo (
     id        nvarchar(30),
     location  ST_POINT
);

      In order to populate the locations, a HANA procedure (populateQuakeGeo.hdbprocedure) was created, which performs the following statement:

   insert into HADOOP.QUAKES_GEO
       (select Q.id, new ST_Point( Q.longitude, Q.latitude )
        from HADOOP.QUAKES as Q
        left outer join HADOOP.QUAKES_GEO as QG
        on Q.id = QG.id
        where QG.id is null );

Finally, an Oozie workflow tying the above steps together was created on the Hortonworks HDP 2.0 cluster; its runs can be monitored via the execution log in the Hadoop User Experience (HUE) web interface.


I then got to work building the HTML5 webpage on HANA XS.

These were the main references I used for building the D3 rotating globe:

Rotating Orthographic

Rotate the World

Current Global Earthquakes
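
Pulling those references together: the essence of the globe is an orthographic projection plus a timer that nudges the rotation. The following is only a minimal sketch of that idea, using the D3 v3 API of the time; the element id, service URL, and parameter name are illustrative, not the exact ones from my project:

     var width = 600, height = 600;

     // An orthographic projection gives the globe effect;
     // clipAngle(90) hides the far side of the sphere
     var projection = d3.geo.orthographic()
         .scale(280)
         .translate([width / 2, height / 2])
         .clipAngle(90);

     var path = d3.geo.path().projection(projection);

     var svg = d3.select("#globe").append("svg")     // "#globe" is illustrative
         .attr("width", width)
         .attr("height", height);

     // Outline of the globe itself
     svg.append("path")
         .datum({type: "Sphere"})
         .attr("class", "sphere")
         .attr("d", path);

     // Fetch quake GeoJSON from the XSJS service (sketched further below);
     // d3.geo.path renders each Point feature as a small circle
     d3.json("quakeLocation.xsjs?minMag=4", function(error, quakes) {
         svg.selectAll(".quake")
             .data(quakes.features)
           .enter().append("path")
             .attr("class", "quake")
             .attr("d", path);
     });

     // Spin the globe by nudging the projection's rotation on every tick
     d3.timer(function() {
         var r = projection.rotate();
         projection.rotate([r[0] + 0.2, r[1]]);
         svg.selectAll("path").attr("d", path);
     });

Since each quake arrives as a GeoJSON Point feature, the circle size can also be scaled by magnitude via path.pointRadius.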

To serve up the quake information in a form that D3 can easily consume (GeoJSON), a custom server-side JavaScript service (quakeLocation.xsjs) was created.

The basis of the GeoJSON output was the following statement, with the date range and quake magnitude filters driven by the SAPUI5 controls:

select Q.id, Q.mag, Q.place, Q.time, QG.location.ST_AsGeoJSON() as "GeoJSON"
from HADOOP.QUAKES as Q
left outer join HADOOP.QUAKES_GEO as QG
on Q.id = QG.id
where QG.id is not null

For a simplified version of using D3 with HANA, including an example of how to create GeoJSON in XSJS, see Serving up Apples & Pears: Spatial Data and D3.
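
The actual quakeLocation.xsjs is included in the project download below. Purely as a hedged sketch of the approach (the minMag parameter and magnitude filter are illustrative additions, the date-range handling is omitted, and the exact GeoJSON structure may differ from the project code), the service runs the statement above and wraps each row in a GeoJSON Feature:

     // quakeLocation.xsjs -- illustrative sketch only, not the exact project code
     var minMag = parseFloat($.request.parameters.get("minMag") || "0");

     var conn  = $.db.getConnection();
     var pstmt = conn.prepareStatement(
         'select Q.id, Q.mag, Q.place, QG.location.ST_AsGeoJSON() as "GeoJSON" ' +
         'from HADOOP.QUAKES as Q ' +
         'left outer join HADOOP.QUAKES_GEO as QG on Q.id = QG.id ' +
         'where QG.id is not null and Q.mag >= ?');
     pstmt.setDouble(1, minMag);

     var rs = pstmt.executeQuery();
     var features = [];
     while (rs.next()) {
         features.push({
             type: "Feature",
             geometry: JSON.parse(rs.getString(4)),   // the ST_AsGeoJSON() column
             properties: {
                 id:    rs.getString(1),
                 mag:   rs.getDecimal(2),
                 place: rs.getString(3)
             }
         });
     }
     rs.close();
     pstmt.close();
     conn.close();

     // D3 can consume this FeatureCollection directly (see the globe sketch above)
     $.response.contentType = "application/json";
     $.response.setBody(JSON.stringify({ type: "FeatureCollection", features: features }));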

The complete HANA XS project (including the above-mentioned XSJS, procedure and HTML5 source code) is available to download here:

HadoopQuakes.zip - Google Drive


I hope you found this example interesting, and that it inspires you to automate your Hadoop-to-HANA workflows with Oozie, as well as to explore the graphical visualisation capabilities of SAPUI5 & D3.
