Spotfire on HANA (and a bit of a comparison)

RonaldKonijnenb · ‎08-23-2012

Spotfire on HANA (and a bit of a comparison)

After writing my “Tableau on HANA” blog I got curious again to see how a similar 3^rd party application would behave when connecting it to HANA. It did not take a lot of time to find another subject for my investigation: Tibco Spotfire. I started off by Googling “Spotfire HANA”. The first hit is immediately a very interesting one:

SAP Commits To Cloud, Beefs Up Mobile And In-Memory Options

The following paragraph in this blog from Cindi Howson (BI Scorecard) keeps me thinking for a minute or two:

…… One of the most interesting announcements in this area was planned support by third-party vendors including Tableau and Tibco Spotfire. Tableau has long had a philosophy and architecture of not requiring customers to replicate data into their in-memory engine (an option added in 2010 through Tableau Version 6). So support for SAP’s HANA makes sense.…… Spotfire, in contrast, has typically had an extract-and-load-into-memory approach (with a drill on demand option), so I will be curious to see how this is implemented.

What I read here is that both applications will get a native connection in the near future, meaning that for the time being a “classic” ODBC connection needs to be used. What Cindy also mentions is that a “live” connection (like in Tableau) is not the way Spotfire is designed, the data is extracted and loaded into the application after connecting. Time to investigate what this all means!

It always starts with a download

Getting a version of Spotfire is very easy. Just go to the website and download a trial version. This version will give you the possibility to try out the software for 15 days and will give the possibility to upload your created reporting to a trial server on the web.

Spotfire connections to HANA

It seems that there are actually two ways of connection Spotfire to HANA (or any other database):

Via ODBC
Via an information link by using the so-called Information Designer

This second option is interesting as it will give several means to limit the number of records which are retrieved by the user (e.g. via prompts). The information link connection is based on the Java JDBC connector (specifically for HANA: ngdbc.jar), not the ODBC connector. This is however not an option for me to explore, as the required software is server based and not available in a trial version from the Spotfire website. This leaves me with the option to connect via ODBC, which is in fact exactly the same way I have connected Tableau to HANA in my previous blog.

The ODBC connection

Connection Spotfire to HANA is a walk in the park. Just follow below steps and your good to go.

1. Create a user or system DSN via the Windows control panel

2. Fire up Spotfire and goto File --> Open from Database and select your DSN

3. Press OK, go to your schema and select your table or view

I will create two examples. One where I will load a full table of 5 million records and one example where I will take a selection from my large, 120 million records, “Sales” table.

Example 1 – Sales Dashboard for Adobe (selection from 120 million records)

After selecting your table or view, Spotfire automatically generates the full load statement. You can easily adjust this, which I did in the below example by putting the “Where” clause:

After pressing OK, Spotfire only loads the data for customer “Adobe”

In total it took 1 second to load and create a sample report. Impressive!

After fiddling around for an hour or so with my data set, I managed to create the following sales dashboard:

Spotfire has a simple interface which makes creating reports quite easy. Combining different graphs into different views and changing graph types is a simple matter of drag and drop, point and click. Also making use of filter and sliders is pre-configured without any code. It’s not all flawless (ofcourse), I had some annoyances like not being able to create a “Top N” easily. I had to use a so-called “Graphical table”, limit the number of records and sort on value which seems a bit much. I also could not find an easy way to sort my time dimension…

I actually combined two tables in my example: my “Sales” table and “Opportunity” table. To add a table, simply press File --> Add Data Tables. You could even create some transformation if you need to.

If you want to know where these tables originate from, just click here and here for full details 😉

Publishing

As the finishing touch, you can also save your report in a library which makes it available on the web to end user via the web player:

You can try out my created report here. Please note that this is viewing only, there is no refresh button.

Example 2 – “Confirmation” reporting

You might know that I was part of a team that competed in the HANA InnoJam a couple of months ago. I am re-using one of the tables for my next test.

The confirmation table is fairly simple (couple of characteristics showing per hour basically), but the table contains over 5 million records which I want to use in this example. Connecting and selecting is the same as in my previous example. As I want to show my data against “hour”, its best to select a nice “Line Chart” showing the trend:

Spotfire automatically generates an average line (black one) and also creates some filters (dropdown, slider, bullet,…) is a matter of right click.

Press here to jump to the report online. There is no delay in loading the report nor navigating through these 5 million records.

Reload!

Now let’s do a reload on the client side

It takes 23 minutes to refresh the data.

This might look like a lot of time, but let's compare this to SAP Visual Intelligence and Tableau:

SAP Visual Intelligence

First thing for me to do it create an analytical view out of my table to be able to load it into "Visi". 5 minute job, nothing to it.

When it comes to connecting to HANA, there are two ways to connect to a view (cube):

HANA Cube
HANA Cube Online

The first option will load the data into the application, but will force you to make a restriction (limits the load to 2.500.000 records)

Loading and refreshing 2.5 million records in an Analytical view takes about 15 minutes on my machine:

Now lets load that Table into Tableau and choose for the second option "Import all data"

Data loads in 21 minutes

What about that second option in Visual Intelligence: HANA Cube Online?

Well ladies and gentlemen this is where you will get that big smile on your face. The data refreshes instantly:

Data loads in a split second with all the nice options off changing graph types, online filtering, creating ranges, the whole 9 yards. Very, very impressive!

Conclusion

I guess I can almost quote my previous blog. The behavior of Spotfire is similar to Tableau and Visual Intelligence when loading the data in the application. Performance wise, all three are not particularly keen on loading millions of records without a proper selection. As shown in the first example, a select on a customer loads at lighting speed, a full load of millions of records takes a substantial amount of time. Exactly the behavior as in Tableau and Visual Intelligence. Once it’s in the application, it is all fast and flawless again. Take my measurements with a grain of salt though. I have not done this in a lab environment with controlled conditions and my test case is not even the same (5 million vs. 2.5 million records, table vs. analytical view). It's just to give an idea on how all three handle big data loads into the application. Don't expect your millions of records to load instantly in order for you to start your journey of "visual data discovery". If your selections are not smart, you'll have to get a cup of coffee (or two).

All changes when you use the option to connect to a HANA Cube Online, data refreshes instantly. "So Ronald, why would you ever go for that option to load data into the application"? Because there is a very smart option when not going for an online connection, you can transform (read clean!) your data. Best of both worlds. Good data: online connection, data to be cleansed: load and transform.

So SAP Visual Intelligence has the big advantage that it has a working option to use data online. Tableau and Spotfire lack that at the moment (look at my previous blog to see the disadvantages when using a live connection with Tableau). Curious to see if that will change once they get that native connection.

There is a another difference which is in the use off analytical views and tables. Visual Intelligence (by design) can handle views, but not tables (a bit of a shame if you just want to take a quick look at your data in a freshly modeled table), while Tableau and Spotfire can handle tables but not analytical views.

Analytical views are simply not supported which can even be verified by running a select statement on the view in HANA studio:

Could not execute 'SELECT * FROM "_SYS_BIC"."0-s0007457730/AV_CONFIRMATION"'

SAP DBTech JDBC: [7] (at 25): feature not supported: cannot execute select * on olap cube: 0-s0007457730/AV_CONFIRMATION: line 1 col 26 (at pos 25)

No doubt though that views will be supported in Tableau and Spotfire once the new connectors are released.

As in terms of functionality I cannot choose between Tableau, Spotfire or Visual Intelligence. Spotfire brought me up to speed a little faster, while Tableau and Visual Intelligence had some out of the box features I was looking for (like the Top N). For me without a doubt all three are awesome visualization and data discovery tools!

Thank you for reading this and take care,

Ronald.

Spotfire on HANA (and a bit of a comparison)

SAP PI for Beginners

ABAP 7.40 Quick Reference

Fiori: technical installation and configuration of one app from A - Z