on 08-05-2015 8:21 PM
Hi All,
Has anyone tried the option SPLITTED_LOAD for importing table data parallely?
This is the documentation which i can see in the SAP note.
Each R3load process creates a copy of the target
table during import and uses DB2 LOAD to load this copy. Before creating the
first index on the target table, all data is copied from the table copies to the
target table using DB2 LOAD FROM CURSOR. If you use the option SPLITTED_LOAD,
you can make time when using the duplicate loading process: LOAD FROM CURSOR is
significantly faster than LOAD via the CLI client, and the loading of table
copies can happen in parallel. You can use the environment variable
DB6LOAD_TEMP_TBSPACE to specify, in which table space the table copies are
created.
A large LOAD FROM CURSOR ... SELECT statement is
used to copy the data from the table copies to the target table. The SELECT part
of this statement can be accelerated by activating SMP parallelism (SAP Note 2047006). To do so, it is
sufficient to set INTRA_PARALLEL=YES and MAX_QUERYDEGREE='ANY'. You can leave
the value for DFT_DEGREE as 1.
Hi,
I am also very much interested in more feedback about this option from customer tests. I got positive feedback from consulting teams but I do not have performance numbers I could share.
SPLITTED_LOAD certainly provides more room for errors if the R3load processes and tasks are not executed in the right order ( all data pieces first , primary keys and indexes last ... ) and it may require more tuning ( query degree ... ) . Also it is not clear if it will be faster compared to sequential LOAD of the data pieces on all systems for example if not enough CPU resources are available. Thats why this option is marked as an expert option.
Regards
Frank
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Frank,
Guten Tag!
Thanks so much for getting back to me.
We are following the same protocol frank .First tables commented out the primary key and indexes waiting for the data load to complete.
Will this loadargs work frank?
loadArgs=-stop_on_error -loadprocedure fast SPLITTED_LOAD:COMPRESS_STATIC_ALL -nolog -c 100000
Hi,
the loadArgs should work. As mentioned -nolog -c 100000 have no effect on LOAD.
The LOAD FROM CURSOR should happen when the primary key or first index is created by a seperate R3load process that runs after all data packages have been loaded. Although this R3load process may only create indexes it still needs to be started with "-loadprocedure fast SPLITTED_LOAD ... " . Otherwise it won't look for temporary tables to be merged.
Regards
Frank
Hi Frank,
Yes true commit count is waste in db2load as you have mentioned, i forgot to remove it actually .
Right now i just kept ign for primary key and the indexes in the tasks till the data load is complete. Once done I will change them and retrigger the import so that the index build would happen and data merge would happen as you have told.
Once will merge into the main table take a long time? Will it be longer than the db2load into the temprorary tables?
Hallo Frank,
Guten Tag!
We do have the CPU's available, can you please throw some light on this topic which you have mentioned please.
it may require more tuning ( query degree ... )
We are ready to give it a shot.
DB6CONV is ruled for extremely large tables, we have to go for an export/import option which is much faster.
Frank,
Finally it started loading of the data.
But i see the performance of merging is not that great.
What would be your advise be on the below values.
DB6LOAD_CPU_PARALLELISM=<n>
CPU_PARALLELISM parameter.
DB6LOAD_DATA_BUFFER_SIZE=<n>
BUFFER_SIZE parameter.
DB6LOAD_DISK_PARALLELISM=<n>
DISK_PARALLELISM parameter.
DB6LOAD_INDEXING_MODE =<n>
INDEXING_MODE parameter, where 0
AUTOSELECT (Default), 1 equals REBUILD, 2 equals INCREMENTAL, and 3
DEFERRED.
DB6LOAD_FORCE_LOAD
Hi,
well ... tons of questions ... some of them would be better answered by migration specialists that have used SPLITTED_LOAD in migration projects. Let's try anyways.
The default for DB6LOAD_CPU_PARALLELISM is 4 in current R3load versions . This should be for the CLI LOAD into the temporary tables. If you fell that you have free CPU resources for the LOAD FROM cursor, you can try to set DB6LOAD_SPLITTED_CPU_PARALLELISM to a somewhat higher value.
I do not advice to change DB6LOAD_DISK_PARALLELISM and DB6LOAD_INDEXING_MODE for SPLITTED_LOAD. Only for sequential LOAD of data pieces into one table DB6LOAD_INDEXING_MODE=2 is sometimes beneficial. You can set DB6LOAD_DATA_BUFFER_SIZE to a fraction of your memory available in UTIL_HEAP . Increasing UTIL_HEAP explicitly while your R3load is running may also be a good idea. For productive database UTIL_HEAP=AUTOMATIC is a good setting but for LOAD allocating some memory explicitly is a good idea. If UTIL_HEAP is large enough LOAD should be able to pick a good default for DB6LOAD_DATA_BUFFER_SIZE automatically.
Since SPLITTED_LOAD is using LOAD anyways you can ignore DB6LOAD_FORCE_LOAD.
A lot of useful hints are already contained in the note mentioned at the beginning. For example using INTRA_PARALLEL=YES is a good idea for the SPLITTED_LOAD merge phase. Also putting the temporary tables into a seperate tablespace using DB6LOAD_TEMP_TBSPACE is a good idea.
Thats almost all I know about SPLITTED_LOAD. Hope other can provide more hints...
Regards
Frank
Frank,
Thanks so much for the information.
First what would you advise regarding CPU parallelism.
lparstat -i |grep CPU
Online Virtual CPUs : 40
Maximum Virtual CPUs : 64
Minimum Virtual CPUs : 1
Maximum Physical CPUs in system : 64
Active Physical CPUs in system : 64
Active CPUs in Pool : 64
Shared Physical CPUs in system : 64
Physical CPU Percentage : 50.00%
Desired Virtual CPUs : 40
What do you think is a good figure to start with?
lparstat -i | grep 'Desired Capacity'
Desired Capacity : 20.00
For the other points, will respond back shortly.
Hi Frank,
Sure point taken, would take iterative methods first with 8 and then 16.
Have started with 30 processes and from db2top i see the below stats.
Memory Memory Percent Current High Percent Maximum # of
Type Level Pool Total Size WaterMark Max Size Pool(s)
----------- ---------- -------------------- ------- ------------ ------------ ------- ------------ -------
Instance DB2EP1 Monitor 0.08% 82.0M 82.3M 21866.67% 384.0K 1
Instance DB2EP1 FCMBP 0.00% 832.0K 832.0K 100.00% 832.0K 1
Instance DB2EP1 Other 0.25% 267.0M 271.6M 279.58% 95.5M 1
Database EP1 Applications 0.12% 130.7M 160.1M 24.67% 530.0M 530
Database EP1 Database 0.88% 946.0M 949.5M 470.39% 201.1M 1
Database EP1 Lock Mgr 2.33% 2.4G 2.4G 100.00% 2.4G 1
Database EP1 Utility 0.31% 337.0M 354.1M 172.58% 195.3M 1
Database EP1 Package Cache 0.53% 572.0M 648.8M 108.45% 527.5M 1
Database EP1 Catalog Cache 0.02% 21.3M 21.6M 213.75% 10.0M 1
Database EP1 Other 0.00% 192.0K 192.0K 0.94% 20.0M 1
Database EP1 BufferPool 94.88% 100.1G 100.1G 100.00% 100.1G 5
Database EP1 SharedSort 0.00% 2.2M 2.0G 0.05% 4.4G 1
Database EP1 ApplShrHeap 0.04% 45.6M 151.1M 58.40% 78.1M 1
Application EP1 Applications 0.12% 130.7M 160.1M 24.67% 530.0M 530
Application EP1 Other 0.44% 475.4M 549.3M 0.00% 103.5T 530
from topas and vmstat i dont see waits and cpu percentage being high.
Hi Frank,
Thanks a ton for the information. That's good to know.
During the testing when checked we were able to cut short time little bit with CPU_PARALLELISM 8.
Probably increasing to 16 will enhance the performance.
One question by increasing the CPU_PARALLELISM how does it affect the load time.
Wanted to know control flow.
CPU_PARALLELISM controls how many theads are working in parallel on one LOAD operations. If you have a hardware with N cores and you are running M R3load processes in parallel, I would not set CPU_PARALLELISM much higher than N/M . The R3load internal default for CPU_PARALLELISM is 4. If you would let DB2 choose an automatic default for CPU_PARALLELISM, it would take N.
Regards
Frank
Frank,
Guten Tag!
Thanks a ton for the response.
Few questions we have basically three step process right.
--> Create temp tables
--> Load data into temp tables
--> Merge data into main table
--> Build indexes and runstats right
For the load data we can increase CPU_PARALLELISM.
Merge data happens using cursor right, how can we improve its performance?
How can we improve the performance of building indexes?
Sorry for pestering with so many questions.
User | Count |
---|---|
98 | |
11 | |
11 | |
10 | |
10 | |
8 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.