
So late last year, I had the opportunity to load some reasonable data volumes into SAP HANA. We got about 16GB of data or 160m records, and ran some tests.

Well now I have closer to 256GB of the same data to load, and I only have a 512GB SAP HANA appliance to spare, which already has a SAP BW system and a bunch of other things on it! So I thought it was time to try to optimise the data model and make it slicker.

We cut a few corners last time around because some of the data was a bit corrupt, so we had text fields where there could otherwise have been dates. In addition, we were not judicious with the use of smaller integer data types where there were only small numbers of distinct values.

I'm not sure how much value this has in SAP HANA because of the way it creates indexes, but text fields certainly take up a lot of unnecessary space. Today I'm going to do a bunch of testing with a 42m row test data set, and then use this as the basis of the final model - loading the full 2bn+ rows in-memory. And we'll see how it performs!

Step 1: Data Model optimisation

I now have no text values in my main fact table - only TINYINT, SMALLINT, INT, REAL, DATE and TIME. I'm going to load this file into the old fact table which isn't optimised, and compare for space to see how much we have saved. The model is already well normalised so I'm not expecting to be able to reduce this any further.

So we are moving from:

create column table "POC"."FACT"( "Col1" INTEGER, "Col2" INTEGER, "Col3" INTEGER, "Col4" INTEGER, "Col5" INTEGER, "Col6" INTEGER, "Col7" INTEGER, "Col8" INTEGER, "Col9" REAL, "Col10" INTEGER, "Col11" INTEGER, "Col12" INTEGER, "Col13" VARCHAR (32) default '', "Col14" VARCHAR (16) default '', "Col15" INTEGER, "Col16" INTEGER, "Col17" INTEGER, "Col18" INTEGER, "Col19" INTEGER);

To:

create column table "POC"."FACT2"( "Col1" INTEGER, "Col2" DATE, "Col3" TINYINT, "Col4" TINYINT, "Col5" SMALLINT, "Col6" SMALLINT, "Col7" TINYINT, "Col8" TINYINT, "Col9" REAL, "Col10" INTEGER, "Col11" TINYINT, "Col12" TINYINT, "Col13" REAL, "Col14" TIME, "Col15" SMALLINT, "Col16" TINYINT, "Col17" TINYINT, "Col18" SMALLINT, "Col19" DATE);

Now I've loaded the same data into both fact tables.

Fact Table 1: 1,327,747kb

Fact Table 2: 778,737kb

Wow - I'm really surprised by this. I'm guessing it's the text fields that were the killer, but this has nearly halved the table size. This will make a big difference! I checked this, and since the column store compresses integer values anyway, the smaller types like TINYINT have little impact on their own - it's removing the text fields that makes the difference.

What's even more interesting is that we have reduced the load time from 88 seconds down to 26 seconds. I guess those text inserts were expensive in the column store.
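A quick sanity check on these numbers - this is pure arithmetic on the figures quoted above, nothing HANA-specific:

```python
# Figures taken from the text above: table sizes in KB and load times in seconds.
old_kb, new_kb = 1_327_747, 778_737
old_load_s, new_load_s = 88, 26

reduction_pct = 100 * (1 - new_kb / old_kb)  # space saved by the new model
speedup = old_load_s / new_load_s            # load-time improvement

print(f"size reduced by {reduction_pct:.0f}%")  # size reduced by 41%
print(f"load speedup: {speedup:.1f}x")          # load speedup: 3.4x
```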

Step 2: Load optimisation

HANA has two main load optimisation levers - the number of parallel processes and the amount of data consumed before commit. This SAP HANA box is a Medium, which means 40 CPUs and 512GB RAM. In theory you should see an improvement up to 40 threads, tailing off after that.

This is really important right now because I know that the original 160m row test case takes 9 minutes to load. So we're looking at at least 2 hours to load the full set. Do you think we can make this much less?
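As a back-of-envelope check, assuming load time scales linearly with row count:

```python
# The original test case from the text: 160m rows in 9 minutes.
rows_tested = 160_000_000
seconds_tested = 9 * 60
rate = rows_tested / seconds_tested  # ~296k rows/s

# Linear extrapolation to the full 2bn-row data set.
full_rows = 2_000_000_000
estimate_min = full_rows / rate / 60
print(f"estimated full load: {estimate_min:.1f} minutes")  # estimated full load: 112.5 minutes
```

Just under two hours, so "at least 2 hours" is about right once you allow for overhead.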

 

| Threads \ Batch size | Default | 10k | 50k | 200k | 500k |
|----------------------|---------|-----|-----|------|------|
| 1 Thread             | 7m11s   |     |     |      |      |
| 2 Threads            | 3m33s   |     |     |      |      |
| 4 Threads            | 1m57s   |     |     |      |      |
| 8 Threads            | 1m13s   |     |     | 38s  |      |
| 16 Threads           | 44s     | 34s | 28s | 25s  | 26s  |
| 32 Threads           | 47s     | 32s | 27s | 25s  | 27s  |
| 40 Threads           | 47s     | 32s | 43s | 25s  | 42s  |

The learning here is really clear - it's really easy to get down from 7m to about 30 seconds without being very clever, and then another 10-15% performance can be had by fine-tuning. It also looks like there is no benefit in moving past about 16 load threads or a batch size of more than 200k, at least on this system. What's more, when you move to 40 threads the results start becoming inconsistent. I have no idea why.
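For context, the best cell in the table works out to an impressive per-second rate (simple arithmetic on the figures above, with the 7m11s single-thread default run as the baseline):

```python
# Best configuration from the table: 16 threads, 200k batch, 42m rows in 25s.
rows = 42_000_000
best_s = 25
default_s = 7 * 60 + 11  # 7m11s baseline: 1 thread, default batch

print(f"{rows / best_s / 1e6:.2f}m rows/s")                # 1.68m rows/s
print(f"speedup over default: {default_s / best_s:.1f}x")  # speedup over default: 17.2x
```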

Step 3: HANA Database Version

So I'm interested to see what happens if we update our HANA Database version. I'm told that there are improvements in both compression and performance as new releases come out. What do you think?

Well I looked into it and unfortunately the current version of HANA SP03 is Patch 25, which has a problem with data load performance. Guess this is going to have to wait until another day.

Update: well we were on HANA SP03 Patch 20 and it wasn't as stable as I'd have liked, so Lloyd upgraded us to Patch 25. This seems to be so much better and I'd really recommend keeping up to date. I'm not sure it actually fixed any of our problems but it stopped a whole load of errors from appearing in the logs.

Step 4: The main load

OK - now we're ready to do the main load. I don't think this system can be optimised any further and by my calculations of 40m records in 25 seconds, we should be able to load the full 2bn in 20 minutes or so. Are you ready? Let's see.

The load is split up into 17 files - details for the load are below.

 

| File    | Size     | Rows  | Time  |
|---------|----------|-------|-------|
| File 1  | 3925MB   | 42m   | 25s   |
| File 2  | 1510MB   | 16m   | 11s   |
| File 3  | 14304MB  | 152m  | 120s  |
| File 4  | 13196MB  | 140m  | 144s  |
| File 5  | 14239MB  | 152m  | 164s  |
| File 6  | 13049MB  | 139m  | 107s  |
| File 7  | 13569MB  | 144m  | 105s  |
| File 8  | 15511MB  | 166m  | 177s  |
| File 9  | 14156MB  | 151m  | 144s  |
| File 10 | 14960MB  | 160m  | 156s  |
| File 11 | 13449MB  | 146m  | 218s  |
| File 12 | 15316MB  | 166m  | 419s  |
| File 13 | 19843MB  | 214m  | 163s  |
| File 14 | 17037MB  | 176m  | 415s  |
| File 15 | 16399MB  | 183m  | 247s  |
| File 16 | 15275MB  | 163m  | 198s  |
| File 17 | 11959MB  | 128m  | 178s  |
| Total   | 227697MB | 2312m | 2991s |

 

As you can see, the performance is great initially and then starts to tail off in the later files. This leads to a fairly disappointing load performance (still massively faster than any other database, remember!) of 2991s, or 50 minutes, to load 2.3bn rows into memory. I was hoping for more than double this speed.
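The headline numbers work out as follows (arithmetic on the totals in the table above):

```python
# Totals from the load table: 2312m rows in 2991 seconds.
total_rows = 2_312_000_000
total_s = 2991

print(f"{total_s / 60:.0f} minutes")                # 50 minutes
print(f"{total_rows / total_s / 1e3:.0f}k rows/s")  # 773k rows/s
```

Less than half the ~1.68m rows/s we saw on the small test set, which is what prompted the investigation below.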

The reason for this seems to be that HANA puts the rows initially into a delta store, and after that has loaded, it then automatically merges them into the main column store. This happens concurrently with the load and seems to kill performance. So what I did was to join all 17 files into one big file, and then try loading it. Let's see what happens:

 

| File   | Size     | Rows  | Time |
|--------|----------|-------|------|
| File 1 | 227690MB | 2312m | 51m  |

Curiously it's not any faster, and it turns out that's because the delta merge still runs during the load. You can disable this with the following statement:

ALTER TABLE "POC"."FACT2" WITH PARAMETERS ('AUTO_MERGE' = 'OFF');

Note that whilst auto merge is off (turn it back on when you're done loading) you can do manual merges with:

MERGE DELTA OF "POC"."FACT2";

Step 5: Learning about partitions

It's at this point that I learnt there is a 2bn row limit in a single SAP HANA table partition. Curiously it allows you to load more than 2bn rows, but then fails on the subsequent merge delta. This sounds like a pain, but it's actually a blessing in disguise for two reasons. First, it turns out that as the amount of data in the main store increases, so does the cost of a merge delta; and second, using partitions allows you to get the best out of HANA. If you turn automatic merge off and do it manually, look what happens:

An early merge takes just a minute:

Statement 'IMPORT FROM '/hana/FACT_05_2010.ctl' WITH THREADS 16 BATCH 200000' successfully executed in 1:50.816 minutes  - Rows Affected: 0

Statement 'MERGE DELTA OF "POC"."FACT2"' successfully executed in 1:04.592 minutes  - Rows Affected: 0

Wait a while and you see that the merge time has increased by a factor of 4, whilst the load time into the delta store remains more or less linear. This makes sense of course, because the merge process has to insert the records into a compressed store, which is computationally expensive. It appears to increase at O(m·log n), where n is the size of the main store and m is the size of the delta store, which more or less makes sense based on my knowledge of search and sort algorithms.
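Here is a toy version of that cost model - a sketch of the behaviour described above, not HANA internals - which also hints at why partitioning (coming next) helps: merges into a fixed-size monthly store stay flat, while merges into one ever-growing store get steadily dearer.

```python
import math

def merge_cost(m, n):
    # Author's model: merging m delta rows into an n-row main store
    # costs roughly m * log(n). Illustrative only.
    return m * math.log2(max(n, 2))

m = 150_000_000  # roughly one monthly file of rows

# Unpartitioned: the main store grows under each successive merge...
big = [merge_cost(m, i * m) for i in range(1, 25)]

# ...whilst with monthly partitions, each merge hits a constant-size store.
part = [merge_cost(m, m) for _ in range(1, 25)]

print(big[-1] > big[0])     # True: per-merge cost keeps climbing
print(part[-1] == part[0])  # True: per-merge cost stays flat
```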

Statement 'IMPORT FROM '/hana/FACT_09_2011.ctl' WITH THREADS 16 BATCH 200000' successfully executed in 2:23.475 minutes  - Rows Affected: 0

Statement 'MERGE DELTA OF "POC"."FACT2"' successfully executed in 4:11.611 minutes  - Rows Affected: 0

And since it turns out that you can partition easily by month, I emptied the table and repartitioned it like this. I now have 26 monthly partitions, plus an extra OTHERS partition for anything that doesn't fit.

ALTER TABLE "POC"."FACT2" PARTITION BY RANGE (DATE)

(PARTITION '2010-01-01' <= VALUES < '2010-02-01',

……

PARTITION '2012-02-01' <= VALUES < '2012-03-01',

PARTITION OTHERS);
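The monthly clauses elided above all follow one pattern, so rather than typing out all 26 they can be generated with a short script (a sketch; `month_bounds` is a hypothetical helper mirroring the DDL used in this post):

```python
def month_bounds(start_year, start_month, months):
    # Yield one HANA range-partition clause per calendar month.
    y, mth = start_year, start_month
    for _ in range(months):
        ny, nm = (y + 1, 1) if mth == 12 else (y, mth + 1)
        yield f"PARTITION '{y:04d}-{mth:02d}-01' <= VALUES < '{ny:04d}-{nm:02d}-01'"
        y, mth = ny, nm

# 26 months from January 2010, plus the catch-all partition.
clauses = list(month_bounds(2010, 1, 26)) + ["PARTITION OTHERS"]
ddl = ('ALTER TABLE "POC"."FACT2" PARTITION BY RANGE (DATE)\n('
       + ',\n'.join(clauses) + ');')

print(clauses[0])   # PARTITION '2010-01-01' <= VALUES < '2010-02-01'
print(clauses[-2])  # PARTITION '2012-02-01' <= VALUES < '2012-03-01'
```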

Note that HANA makes it easy to add new partitions and move data between partitions. Data management, even in large volumes, won't be a problem, you will be glad to know. And look at the load times - the delta merge stays completely linear right up until the last partition. This is a major improvement over a legacy RDBMS, where you get very slow batch load times unless you drop indexes - which massively slows concurrent read performance.

Statement 'IMPORT FROM '/hana/FACT_02_2012.ctl' WITH THREADS 16 BATCH 200000' successfully executed in 1:46.066 minutes  - Rows Affected: 0

Statement 'MERGE DELTA OF "POC"."FACT2"' successfully executed in 53.618 seconds  - Rows Affected: 0

What's more it loads a total of 3.7bn rows in 73 minutes - including the merge delta exercise, which I wasn't even counting before.

Step 6 - Using Table locks

Another HANA tip is to spend some time reading the SQL Reference Manual. It has lots of stuff in it, much of which you won't find documented anywhere else. I found a little option called TABLE LOCK which should allow you to load data faster. Let's try it; the SQL syntax looks like this:

IMPORT FROM '/hana/FACT_02_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

Note that you only want to do this on initial batch loads because it locks the entire table, but it's unlikely you will want to load billions of rows in one go after the initial load. What's really interesting here is that this feature puts the data directly into the main table - bypassing the delta table and the need to do a merge delta - but it is some 16% slower than a regular load followed by a merge delta. Why? Who knows!

Step 7 - Conclusions

Well I'm sure you have figured out a lot of this as you read this blog, but here are the takeaways that I got out of this exercise. First and foremost: despite being fast out of the box, SAP HANA definitely rewards the time you spend optimising your scenario.

1) Spend time on optimising your HANA Data Model. It will reduce the size of your database for the big tables, improve performance and also reduce cost, because HANA is licensed by appliance memory size.

2) Test to optimise your load parameters. But don't spend too much time here. It's not that sensitive to detail changes so get it nearly right and move on.

3) Choose your partitioning scheme carefully. If you are loading a lot of data, split the load into one file per partition and do manual merges in between. You don't have to, but it will speed up end-to-end load performance and allow for speedier issue resolution.

SAP HANA is pretty amazing technology and if you have worked with any other kind of RDBMS you will know that even the slowest of these times is far faster than anything else.

Above all, two things stick with me. First, tuning SAP HANA is just as important as with any other system - with a bit of work you can achieve performance that you wouldn't have believed before. And second, performance tuning SAP HANA is different from other systems - you tune for parallelisation and memory usage rather than for I/O. But remember this: performance tuning is about finding the performance envelope of a system and working around the weakest point. In that sense SAP HANA is no different to any other computer system in the world.

And in case you are in any way confused, here is the final SQL I used to create and load the table optimally:

drop  table "POC"."FACT2";

create column table "POC"."FACT2"(

          "Col1" INTEGER,

          "Col2" DATE,

          "Col3" TINYINT,

          "Col4" TINYINT,

          "Col5" SMALLINT,

          "Col6" SMALLINT,

          "Col7" TINYINT,

          "Col8" TINYINT,

          "Col9" REAL,

          "Col10" INTEGER,

          "Col11" TINYINT,

          "Col12" TINYINT,

          "Col13" REAL,

          "Col14" TIME,

          "Col15" SMALLINT,

          "Col16" TINYINT,

          "Col17" TINYINT,

          "Col18" SMALLINT,

          "Col19" DATE) NO AUTO MERGE;

ALTER TABLE "POC"."FACT2" PARTITION BY RANGE (DATE)

(PARTITION '2010-01-01' <= VALUES < '2010-02-01',

PARTITION '2010-02-01' <= VALUES < '2010-03-01',

PARTITION '2010-03-01' <= VALUES < '2010-04-01',

PARTITION '2010-04-01' <= VALUES < '2010-05-01',

PARTITION '2010-05-01' <= VALUES < '2010-06-01',

PARTITION '2010-06-01' <= VALUES < '2010-07-01',

PARTITION '2010-07-01' <= VALUES < '2010-08-01',

PARTITION '2010-08-01' <= VALUES < '2010-09-01',

PARTITION '2010-09-01' <= VALUES < '2010-10-01',

PARTITION '2010-10-01' <= VALUES < '2010-11-01',

PARTITION '2010-11-01' <= VALUES < '2010-12-01',

PARTITION '2010-12-01' <= VALUES < '2011-01-01',

PARTITION '2011-01-01' <= VALUES < '2011-02-01',

PARTITION '2011-02-01' <= VALUES < '2011-03-01',

PARTITION '2011-03-01' <= VALUES < '2011-04-01',

PARTITION '2011-04-01' <= VALUES < '2011-05-01',

PARTITION '2011-05-01' <= VALUES < '2011-06-01',

PARTITION '2011-06-01' <= VALUES < '2011-07-01',

PARTITION '2011-07-01' <= VALUES < '2011-08-01',

PARTITION '2011-08-01' <= VALUES < '2011-09-01',

PARTITION '2011-09-01' <= VALUES < '2011-10-01',

PARTITION '2011-10-01' <= VALUES < '2011-11-01',

PARTITION '2011-11-01' <= VALUES < '2011-12-01',

PARTITION '2011-12-01' <= VALUES < '2012-01-01',

PARTITION '2012-01-01' <= VALUES < '2012-02-01',

PARTITION '2012-02-01' <= VALUES < '2012-03-01',

partition others);

IMPORT FROM '/hana/FACT_02_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_03_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_05_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_06_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_07_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_08_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_09_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_10_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_11_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_12_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_01_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_02_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_03_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_04_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_05_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_06_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_07_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_08_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_09_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_10_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_11_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_12_2011.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_01_2012.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

IMPORT FROM '/hana/FACT_02_2012.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;

MERGE DELTA OF "POC"."FACT2";

ALTER TABLE "POC"."FACT2" WITH PARAMETERS ('AUTO_MERGE' = 'ON');
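The repeated IMPORT/MERGE pairs above can likewise be generated from the file-naming pattern rather than written out by hand (a sketch using the file names from this post; note there was no FACT_04_2010 file in the load set):

```python
# (year, month) pairs matching the 24 load files used above.
months = ([(2010, m) for m in (2, 3, 5, 6, 7, 8, 9, 10, 11, 12)]
          + [(2011, m) for m in range(1, 13)]
          + [(2012, m) for m in (1, 2)])

stmts = []
for y, m in months:
    stmts.append(f"IMPORT FROM '/hana/FACT_{m:02d}_{y}.ctl' "
                 f"WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;")
    stmts.append('MERGE DELTA OF "POC"."FACT2";')

print(stmts[0])  # IMPORT FROM '/hana/FACT_02_2010.ctl' WITH THREADS 16 BATCH 200000 WITH TABLE LOCK;
```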

Next Steps

Well, what's next? I think you know - I've optimised this data model for load performance, but how did I do for query performance? I'm already wondering about this, because HANA is a parallel processing engine. If you have 27 partitions and run a query over a wide date range, it should kick off a bunch of parallel processes. If you query a single month, you will hit less data, but with fewer parallel processes. Which wins?

What's the optimal data model for query performance and how does how you setup your joins, CE Functions and SQL Script affect this?

Acknowledgements and thank yous.

As is often the case with this sort of post, there are thanks to dole out. Lloyd Palfrey from Bluefin, for being the HANA Ops guru, tuning the box and fixing the database every time I filled up disks or corrupted it. Vijay Vijaysankar, Vitaliy Rudnytskiy and David Hull from IBM, HP and SAP for bouncing ideas around. Margaret Anderson and Serge Mutts from SAP CSA for helping with issue resolution. Markus Friedel from SAP AGS for helping resolve errors - mostly between my keyboard and chair.

And let's not forget Aleks Barilko, Marilyn Pratt, Gail Kling Schneider and the rest of the SCN crew for recovering most of this blog, which got lost in the move from the old to the new SCN whilst I was on vacation.

