on 01-02-2015 3:55 PM
Can anybody help me with advantages and disadvantages of full and delta load and which one to choose in which case.
Thanks.
Hi Swati,
Full load is the method every time truncating the target data and load the whole data.
If your source contains very less records and you no need to maintain the history then full load is the better.
The main disadvantage is performance. If we take transactional data , it may have millions of records . If we want to load full load , it takes lot of time . We cannot maintain the history of the source changes in the target as everything deleting the target data.
Delta Load :
Delta load we will use in the 2 scenarios .
1. Maintain the master data history .
2. Load the transaction data .
Maintain the master data history :
In the data warehousing if we want o maintain SCD2 ( slowly changing dimensions) , we will go with table comparison ,history preserving ,map operations and key generations . Slowly changing means changes occur once in while not frequently like address changes . As it is master data it contains less data only and changes are less . In this case we can compare the source and target tables and load the delta part and maintain the history.
This scenario will not workout for transactional delta as source having huge data and we will not going to maintain the history for transactional data.
Load the transaction data
If we want to load the delta of the transaction data , we will go with database level CDC . It works on log based or trigger based. Mostly data service with log base CDC techniques. Whenever changes occur in the transaction CDC will push the records into CDC database . We will read the data from CDC base and load into the target . In this scenario we will use map_cdc transform.
It is very helpful to save the loading time . Here some negative point is most of the databases creates CDC tables in the operation databases. Most of the client don't allow to create the tables in the production database. We need to go with Sybase replication server like that options.
I hope you understand the delta load and full load.
Thanks & Regards,
Ramana.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Ramana,
Need your help on one scenario:
(INPUT)Souce TBL:
Cust id Tax id Cus Name
100 300 AAAA
OUTPUT:
if will change cust id or tax in source then in Target should be:
100 300 AAAA
101 300 AAAA
(here data is inserted)
if will change cus Name then output should be:
100 300 BBBB
(data is just update)
I tried to use DELTA Load like
Source->Query->TablComparison->MapOP->target tbl and try to use query mapping
but am not getting exact output as above output requirement
Please help me on it.
Regards
Varsha
Hi Varsha,
I have some confusion in your requirement.
You have 3 fields in your source system. Cust id, Tax id & Cus Name .
In the first step you are trying to changes either Cust id or Tax id.
In the second step you are trying to change Cus Name.
What is the key field here . It seems you are not maintaining the key field here. You should have primary key field in your source system.
Your requirement like this.
(INPUT)Souce TBL:
Cust id Tax id Cus Name
100 300 AAAA
OUTPUT:
if will change cust id or tax in source then in Target should be:
100 300 AAAA
101 300 AAAA
CUST ID is from the source field . It should not increase the value . You should use the surrogate key and increment the surrogate key.
As per my assumption In your input Cust_ID is the primary key column.
In the table comparison you should use CUST ID as key column and TAX id & CUS Name as comparable columns.
In the target table you should add the surrogate key addition to the source columns. Better to add the valid from, valid to date as maintaining history records .
Please share your correct requirement or share the atl with requirement and will correct the flow.
Thanks & Regards,
Ramana.
Hi Ramana/Gokul
Many many thanks for your response...
In Source Cust_ID& TaxId is the primary key column and , should be inserted as new record if its changed from source and Cust Name as updated record if changed.
Job Flow:
Source->Query->TablComparison->Target Table
In the table comparison i used CUST ID & Tax as key column and also CustId & TaxId as comparable columns.
In the target table got only inserted and updated records.
Thanks Again
Regards
Varsha
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
If your source data changes frequently then go for delta load otherwise full load.Well,there are more parameters on which you have to decide whether to go for full load or delta load
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
93 | |
10 | |
10 | |
9 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.