Hana Smart Data Integration - Adapters

werner_daehn · ‎12-08-2014

This post is part of an entire series

The foundation to Data Integration is being able to connect to various sources. Looking at the SPS08 Smart Data Integration option and its connectivity, you can see the usual suspects: Oracle, SQL Server etc.
With SPS09 and its Adapter SDA-extension not much did change, except one: There is an Adapter SDK now and you can write your own Adapters in Java!

The question towards connectivity is an obvious one. Without the ability to connect to a source system directly a workaround has to be used, e.g. writing files, copying them and then loading them into Hana. That files are cumbersome to handle is obvious as well. Or do you support Twitter? Workaround might be to use its underlying restful protocol. Common to all these workarounds is that they put all the development burden on the user. The user has to write a program to create files. The user has to create and parse the restful messages.

For the common sources that is no problem, all tools support relational databases of the various vendors. But even there you might find features unique to one database that is either supported or not.

While that is the case for Smart Data Integration Adapters as well, thanks to the Adapter SDK every Java developer can write adapters for Hana without compromising the Hana stability.

Architecture

The most important change, from an architectural point of view, was to move as much code as possible out of Hana's IndexServer into separate processes.

All that remains in the IndexServer is the optimizer code, translating the user entered SQL into an execution plan containing remote execution and local execution in - hopefully - the most efficient manner. The part of the SQL that should be sent to the source is handed over to the Data Provisioning Server process, which is another Hana process. This contain all the logic common to all adapters. Most important, it contains the communication protocol in order to talk to an agent process, the host of all adapters.

This architecture has multiple positive side effects:

If anything happens to the remote source, Hana Index Server is not impacted. Since the Index Server is the core process of Hana, any core dump in any line of code could have brought down the entire Hana instance.
Because the agent is an installation of its own, you can install the agent anywhere. One option is to place it on the Hana server itself. But that might not be preferred because then the entire middleware of all sources has to be installed there and the network has to allow passage of those middleware protocols. More likely the agent will be installed on the specific source and talk to Hana via the network. No problem as one Hana instance can have as many agents as required. Or a server of its own is used - possible as well.
Because the agent can be installed anywhere, it can be even installed on-premise and be connected to a Hana cloud. instance. Not even a VPN tunnel has to be used as the supported protocol includes https as well. In this case the agent does establish a https connection to the Cloud Hana instance just as any other web browser would do.
Developing an Adapter is much easier. Hana Studio has a plugin so it acts as an agent and now the developer can watch the internals easily.

Deploying existing Adapters

All the SAP provided Adapters are part of the Hana Agent installation, which is a separate download in SMP.

see Installations and Upgrades -> Index H -> SAP HANA SDI

in A- Z Index | SAP Support Portal

Once the agent is installed deploying an adapter is as easy as copying a jar file.

see SAP HANA Enterprise Information Management Administration Guide - SAP Library for details

Writing your own Adapter

The most important question is how difficult it is to write your won adapter. The goal of development was to make it as simple as possible of course but most important, you do not need to be an expert in Java, Hana and the Source system. Any Java developer should be able to write new adapters easily, just by implementing or extending some Adapter base classes.

Frankly it is quite simple, you start a new project in Hana Studio, a Java osgi/Equinox plugin project, a wizard builds the base class for you and your task is to add the code to open a connection to the source, list the source tables and their structure and the such. The SAP help portal has an excellent manual describing all step by step.

Create a Custom Adapter Using a New Plug-in Project - SAP HANA Data Provisioning Adapter SDK - SAP L...

Building an Adapter by Extending the BaseAdapterClass - SAP HANA Data Provisioning Adapter SDK - SAP...

Using the Adapters

All Adapters follow the Hana Smart Data Access paradigm.

In Hana Studio you can create a new remote source, you browse the remote tables. And for selected tables you will create virtual tables so that these look and feel like any other Hana table and can be queried. All the complexity underneath is hidden.

Just imagine the power of this! You deploy the e.g. File Adapter on your file server.

Obviously the Adapter needs to follow certain security rules, like in the case of the FileAdapter the developer decided that you can query not the entire server but only files and directories within a specified root directory. And in addition the adapter requires the consumer to authenticate himself as a valid user.

Then the allowed Hana instance, local or remote or in the cloud, can see files as regular tables with a user defined table layout. Each of these tables can be instantiated as virtual table and when selecting from a virtual table, all files - or the files specified in the where clause of the query - are parsed using the defined file format and their data is shown.